Article: Latency: The Hidden Risk Stalking Your Migration – and Why I Waste So Much Time Getting Groceries
by Andrew Shoop
Andrew Shoop is a Consulting Architect at Fulcrum Technology Solutions. Shoop has been a part of the Networking team since 2013.
I’m not the most organized person in the world, so I often find even mundane meal preparation requires multiple trips to the grocery store. Sure, I’ll remember the main stuff, but then we’ll be out of butter or milk, we have every cheese under the sun except the one we actually need, or the eggs are too old to be trustworthy. Then after it’s all done and we’re in the clean-up phase, I realize there are no garbage bags left.
A lot of applications are the same way. If they need cilantro, the developer writes the code to send off for cilantro. When they need onions, the code goes to fetch onions. Multiple trips, but unlike people, packets never complain. Maybe it would be better if they did. If packets made a habit out of whining to developers and system architects, maybe your cloud, datacenter or SD-WAN migration wouldn’t run into half the issues it would normally encounter.
Most people have a limited number of stores they get their supplies from, and applications are the same way. Shampoo comes from this store, produce from that one, and meat from that specialty butcher shop over there. User authentication from that server, database queries from these servers, input files from that SAN over there.
And then someone migrates the application to the cloud. Or changes datacenters. Or converts from expensive MPLS circuits to lower-cost SD-WAN. This is like moving in real life. When you move to your new neighborhood, you find new stores for all your supplies. But packets don’t think – and they don’t complain. Unless someone explicitly figured out a mechanism to assign those new suppliers, what generally happens is the application keeps using the same old stores … it just means travelling a few hundred miles to do it. No big deal, right? It’s just a little added latency.
Latency is the amount of time it takes for a single packet to make it from sender to recipient. Latency is relatively inflexible, generally a built-in component of the architecture and telecommunications infrastructure the application is running on. Until we invent teleportation or quantum communications, there is no amount of budget or design that will increase the speed of light in fiber, radio waves in wireless, or electrical signals in a wire. A network engineer can typically only “fix” latency when an alternative path is available – such as replacing wireless with wired, satellite with terrestrial, or an overloaded link with one that is not.
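To put a rough number on that physical floor, here is a minimal sketch in Python. It assumes light in optical fiber travels at about two-thirds of its vacuum speed, roughly 200,000 km per second, and that the fiber runs in a straight line with no routers, queuing, or detours along the way. Real paths will be slower; the function name and the sample distances are purely illustrative.

```python
# Rough lower bound on latency from distance alone.
# Assumption: light in optical fiber covers about 200,000 km per second
# (roughly two-thirds of its speed in a vacuum).
SPEED_IN_FIBER_KM_PER_S = 200_000
KM_PER_MILE = 1.609344


def min_round_trip_ms(distance_miles: float) -> float:
    """Best-case round-trip time over a straight fiber run, in milliseconds."""
    one_way_seconds = (distance_miles * KM_PER_MILE) / SPEED_IN_FIBER_KM_PER_S
    return 2 * one_way_seconds * 1000


if __name__ == "__main__":
    # Sample distances are illustrative, not measurements.
    for miles in (1, 300, 1000, 5000):
        print(f"{miles:>5} miles: at least {min_round_trip_ms(miles):.2f} ms round trip")
```

Under these assumptions, a server 300 miles away costs a little under 5 milliseconds per round trip before any equipment gets involved, and no hardware upgrade will buy that time back.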
Going to Chicago for your horribly unhealthy breakfast cereal is only a good idea if you’re roughly in that area to begin with. This is true more or less regardless of how fast your car drives, whether you arrive during rush hour, whether or not the road is under construction, how high the speed limit is – even if you happen to own a private plane, it’s still breathtakingly inefficient. In these cases, it isn’t the underlying transport system that’s causing poor performance – the real culprit is the choice to make that trip in the first place.
But we’re talking about electronic communications, not grocery trips – surely the transit time is negligible, right? Well, that depends. If your application is going to wait for a response for each request, and it’s going to make a thousand trips … what can be considered “negligible” just dropped. By a factor of one thousand. And we’ve seen plenty of apps that make a thousand trips per user submission. We’ve seen apps with tens of thousands. We’ve seen apps that make a separate query to the server for every pixel the mouse pointer moves across the screen. Even a vanilla single-screen laptop generally has a little over 2 million pixels on it – that’s 50,000 queries with a single flick of the wrist.
Latency can sneak up on application owners because testing often takes place in low-latency environments. Testing from the developer’s PC to a server in the same building likely has a round-trip time of less than one millisecond. However, a wireless user coming in via VPN could well see 20 to 25 times the latency – and these are high-quality connection numbers from within the same state. Intercontinental users or satellite transmission will push this metric beyond 100 milliseconds. That trivial transaction time that was too fast to detect in testing becomes a much bigger problem when multiplied by 100 or more!
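The arithmetic is simple enough to sketch. The Python below assumes the application waits for each response before sending the next request, so the waits add up one after another. The round-trip times are round illustrative figures taken from the ranges above (1 ms in the building, 25 ms over VPN, 100 ms intercontinental), not measurements, and the function and scenario names are made up for this example.

```python
def total_wait_seconds(round_trips: int, rtt_ms: float) -> float:
    """Time spent purely waiting on the network when trips happen one at a time."""
    return round_trips * rtt_ms / 1000


# Illustrative round-trip times in milliseconds (assumed, not measured).
SCENARIOS = {
    "same building": 1,
    "wireless user on VPN, same state": 25,
    "intercontinental or satellite": 100,
}

if __name__ == "__main__":
    for trips in (1, 1_000, 50_000):
        print(f"{trips:,} sequential round trips")
        for name, rtt_ms in SCENARIOS.items():
            print(f"  {name}: {total_wait_seconds(trips, rtt_ms):,.1f} seconds")
```

With these figures, the thousand-trip submission that costs one second in the test lab costs 25 seconds for the VPN user and 100 seconds from another continent, and not one line of code has changed.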
Most of the time, latency’s impact on performance requires architectural solutions. Terminal services, Citrix, and other similar remote technologies can be used to put the client and the server closer to each other, often in the same building or in the same cloud region. Standing up regional servers around the globe is also a common method for dealing with latency. From an application design perspective, making fewer queries for larger amounts of information is another method to latency-proof an application.
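That last point, fewer queries for larger amounts of information, is the software version of writing a shopping list. The sketch below is a hypothetical illustration in Python: FakeRemoteStore is a made-up stand-in for any remote service, and it counts round trips rather than touching a real network. The 25 ms round-trip time is an assumed figure. A real service would need to offer some kind of bulk query for the second approach to work.

```python
from typing import Dict, Iterable, List


class FakeRemoteStore:
    """Hypothetical stand-in for a remote service; counts trips instead of sleeping."""

    def __init__(self, items: Dict[str, str], rtt_ms: float) -> None:
        self._items = items
        self.rtt_ms = rtt_ms
        self.round_trips = 0

    def fetch_one(self, key: str) -> str:
        self.round_trips += 1  # one trip per item
        return self._items[key]

    def fetch_many(self, keys: Iterable[str]) -> List[str]:
        self.round_trips += 1  # one trip for the whole list
        return [self._items[key] for key in keys]

    @property
    def time_waiting_ms(self) -> float:
        return self.round_trips * self.rtt_ms


PANTRY = {"cilantro": "herb", "onions": "vegetable", "butter": "dairy", "milk": "dairy"}


def shop_one_at_a_time(store: FakeRemoteStore, shopping_list: List[str]) -> List[str]:
    # The chatty pattern: a separate trip for every ingredient.
    return [store.fetch_one(item) for item in shopping_list]


def shop_with_a_list(store: FakeRemoteStore, shopping_list: List[str]) -> List[str]:
    # The batched pattern: ask for everything in a single request.
    return store.fetch_many(shopping_list)


if __name__ == "__main__":
    shopping_list = list(PANTRY)
    chatty = FakeRemoteStore(PANTRY, rtt_ms=25)
    batched = FakeRemoteStore(PANTRY, rtt_ms=25)
    shop_one_at_a_time(chatty, shopping_list)
    shop_with_a_list(batched, shopping_list)
    print(f"one at a time: {chatty.round_trips} trips, {chatty.time_waiting_ms} ms waiting")
    print(f"with a list:   {batched.round_trips} trip, {batched.time_waiting_ms} ms waiting")
```

Both approaches bring home the same groceries. The difference is that the batched version pays the round-trip cost once, so its waiting time stays flat as the list grows, while the chatty version pays it again for every item.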
All latency problems can be foreseen. And usually when something can be foreseen, it can be avoided. It’s just a question of performing the analysis before your migration begins, or your application rolls into production. While packets don’t complain, your application users tend to be a little less sanguine about the matter.
The Fulcrum Difference
At Fulcrum Technology Solutions, we differentiate ourselves from other technology- and business-consulting firms with a unique guarantee: when you hire Fulcrum, we commit to finish the job. Whether working under a time-and-materials contract or a cost-plus arrangement, we will not leave until we’ve delivered exactly what we said we’d do. Our word defines us, and motivates us to give you the service that you deserve!