Web app developers spend most of our time not thinking about how data is actually transmitted through the bowels of the network stack. Abstractions at the application layer let us pretend that networks read and write whole messages as smooth streams of bytes. Generally this is a good thing. But knowing what's going on underneath is crucial to performance tuning and application design. The character of our users' internet connections is changing, and some of the rules of thumb we rely on may need to be revised.
In reality, the Internet is more like a giant cascading multiplayer game of pachinko. You pour some balls in, they bounce around, lights flash, and (usually) they come out in the right order on the other side of the world.
It's common to talk about network connections solely in terms of "bandwidth". Users are segmented into the high-bandwidth who get the best experience, and low-bandwidth users in the backwoods. We hope some day everyone will be high-bandwidth and we won't have to worry about it anymore.
That mental shorthand served when users had reasonably consistent wired connections and their computers ran one application at a time. But it's like talking only about the top speed of a car or the MHz of a computer. Latency and asymmetry matter at least as much as the notional bits-per-second and I argue that they are becoming even more important. The quality of the "last mile" of network between users and the backbone is in some ways getting worse as people ditch their copper wires for shared wifi and mobile towers, and clog their uplinks with video chat.
It's a rough world out there, and we need to do a better job of thinking about and testing under realistic network conditions. A better mental model of bandwidth should include:
The quantum of internet transmission is not the bit or the byte, it's the packet. Everything that happens on the 'net happens as discrete pachinko balls of regular sizes. A message of N bytes is chopped into ceil(N / 1460)
packets [1] which are then sent willy-nilly. That means there is little to no difference between sending 1 byte or 1,000. It also means that sending 1,461 bytes is twice the work of sending 1,460: two packets have to be sent, received, reassembled, and acknowledged.
Listing 0: Byte 1,461, aka The Byte of Doom (Packet #1 carries a full 1,460-byte payload; Packet #2 carries the lone straggler byte).
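If you want to eyeball the packet count for a given message, a line of shell arithmetic will do it. Integer division rounds down, so add 1,459 before dividing to get the ceiling; N here is just a stand-in for your message size:

$ N=1460; echo $(( (N + 1459) / 1460 ))   # 1 packet
$ N=1461; echo $(( (N + 1459) / 1460 ))   # 2 packets: the Byte of Doom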
Crossing the packet line in HTTP is very easy to do without knowing it. Suppose your application uses a third-party web analytics library which, like most analytics libraries, stores a big hunk of data about the user inside a long-lived cookie tied to your domain. Suppose you stuff a little bit of data into that cookie too. This cookie data is thereafter echoed back to your web server upon each request. The boilerplate HTTP headers (Accept, User-Agent, etc) sent by every modern browser take up a few hundred more bytes. Add in the actual URL, Referer header, query parameters... and you're dead. There is also the little-known fact that browsers split certain POST requests into at least two packets regardless of the size of the message.
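A rough way to see how close your own app is to the line is to replay a request with curl and ask how many bytes actually went out. The URL and cookie below are made-up placeholders, and bear in mind that a real browser tacks on more headers than curl does:

$ curl -s -o /dev/null -w '%{size_request} bytes sent\n' \
    -H 'Cookie: analytics_id=abc123xyz; prefs=compact' \
    'http://www.example.com/some/page?utm_source=newsletter'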
One packet, more or less, who cares? For one, none of your fancy caching and CDNs can help the client send data upstream. TCP slow-start means that the client will wait for acknowledgement of the first packets before sending the rest. And as we'll see below, that extra packet can make a large difference in the responsiveness of your app when it's compounded by latency and narrow upstream connections.
Packet latency is the time it takes a packet to wind through the wires and hops between points A and B. It is roughly a function of the physical distance (at 2/3 of the speed of light) plus the time the packet spends queued up inside various network devices along the way. A typical packet sent on the 'net backbone between San Francisco and New York will take about 60 milliseconds. But the latency of a user's last-mile internet connection can vary enormously [2]. Maybe it's a hot day and their router is running slowly. The EDGE mobile network has a best-case latency of 150msec and a real-world average of 500msec. There is a semi-famous rant from 1996 complaining about 100msec latency from substandard telephone modems. If only.
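You don't have to guess at your own last-mile latency; a couple of old standbys will measure it (www.example.com stands in for whatever host you care about):

$ ping -c 20 www.example.com      # round-trip times; min/avg/max summary at the end
$ traceroute www.example.com      # per-hop timings, to see where the delay lives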
Packet loss manifests as packet latency. The odds are decent that a couple packets that made up the copy of this article you are reading got lost along the way. Maybe they had a collision, maybe they stopped to have a beer and forgot. The sending end then has to notice that a packet has not been acknowledged and re-transmit.
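Your TCP stack keeps score of the packets it had to send twice. On most Unix-ish systems something like the following will show the retransmission counters; the exact wording varies by OS:

$ netstat -s | grep -i retrans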
Wireless home networks are becoming the norm and they are unfortunately very susceptible to interference from devices sitting on the 2.4GHz band, like microwaves and baby monitors. They are also notorious for cross-vendor incompatibilities. Another dirty secret is that consumer-grade wifi devices you'll find in cafés and small offices don't do traffic shaping. All it takes is one user watching a video to flood the uplink.
Internet providers lie. That "6 Megabit" cable internet connection is actually 6 Mbps down and 1 Mbps up. The bandwidth reserved for upstream transmission is often 20% or less of the total available. This was an almost defensible thing to do until users started file sharing, VoIPing, video chatting, etc en masse. Even though users still pull more information down than they send up, the asymmetry of their connections means that the upstream is a chokepoint that will probably get worse for a long time.
We need a way to simulate high latency, variable latency, limited packet rate, and packet loss. In the olden days a good way to test the performance of a system through a bad connection was to configure the switch port to run at half-duplex. Sometimes we even did such testing on purpose. :) Tor is pretty good for simulating a crappy connection but it only works for publicly-accessible sites. Microwave ovens consistently cause packet loss (my parents' old monster kills wifi at 20 paces) but it's a waste of electricity.
Figure 0: It's popcorn for dinner tonight, my love. I'm doing science!
The ipfw tool on Mac OS X and FreeBSD comes in handy for local testing. The commands below will approximate an iPhone on the EDGE network with a 350kbit/sec throttle, a 5% packet loss rate, and 500 msec of latency. Use sudo ipfw flush to deactivate the rules when you are done.
$ sudo ipfw pipe 1 config bw 350kbit/s plr 0.05 delay 500ms
$ sudo ipfw add pipe 1 dst-port http
Here's another that will randomly drop half of all DNS requests. Have fun with that one.
$ sudo ipfw pipe 2 config plr 0.5
$ sudo ipfw add pipe 2 dst-port 53
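To double-check which rules are in effect, and to put things back the way they were:

$ sudo ipfw list
$ sudo ipfw flush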
To measure the effects of latency and packet loss I chose a highly-cached 130KB file from Yahoo's servers. I ran a script to download it as many times as possible in 5 minutes under various ipfw rules [3]. The "baseline" runs were the control with no ipfw restrictions or interference.
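Something along these lines reproduces the basic setup. It's a sketch of the idea, not the original script, and the URL is a placeholder; each curl invocation opens a fresh TCP connection, which matches the methodology described in [3]:

$ END=$(( $(date +%s) + 300 ))                 # run for five minutes
$ while [ $(date +%s) -lt $END ]; do
    curl -s -o /dev/null -w '%{time_total}\n' 'http://example.com/static/130kb-file'
  done >> times.txt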
Just 100 milliseconds of packet latency is enough to cause a smallish file to download in an average of 1,500 milliseconds instead of 350 milliseconds. And that's not the worst part: the individual download times ranged from 1,000 to 3,000 milliseconds. Software that's consistently slow can be endured. Software that halts for no obvious reason is maddening.
Yahoo's web performance guidelines are still the most complete resource around, and they are backed by real-world data. The key advice is to reduce the number of HTTP requests, reduce the amount of data sent, and order requests in ways that use the observed behavior of browsers to best effect. However, the guidelines simplify by bucketing users into high-bandwidth, low-bandwidth, and mobile categories, which doesn't address poor-quality connections across every class of user. A user's connection quality is often very bad and getting worse, and that changes the calculus of which techniques to employ. In particular, we should weigh connection quality, not just nominal bandwidth.
Assuming that a large but unknown percentage of your users labor under adverse network conditions, here are some things you can do:
Keep client-side state after the fragment mark: /blah#foo=bar instead of /blah?foo=bar. Nothing after the # mark is sent to the server.

This article is the first in a series and part of ongoing research on bandwidth and web app performance. Next, we will dig deeper into some of the open questions we've posed, examine real-world performance in the face of high latency and packet loss, and suggest more techniques for making your apps work better in adverse conditions, based on the data we collect.
[1] ceil(N / 1460) is the same algorithm you use to figure out how many trips it takes to carry your laundry down the stairs. (ceil is geekspeak for rounding up.) Say you have 50 pounds of clothes and the basket holds 13 pounds. 50 / 13 = 3 remainder 11, so you need to make 4 trips. The bigger the basket the fewer the trips. So why not use huge packets? On private networks you might see configurations for "Jumbo frames". But in the wild you have to consider the cost of packet loss, typical message sizes, old or incompatible routers, etc.
That specific number (aka Maximum Segment Size) comes from the maximum packet size (aka Maximum Transmission Unit) of 1,500 octets (aka bytes) set by the Ethernet standard, minus the 40 bytes reserved for the IP and TCP headers: source and destination addresses, flags, etc. IPv6, which has been coming any day now since the Clinton administration, will probably converge on an MSS of 1,220 or 1,440 in the wild. Point being, we're stuck with tiny packets for the rest of our lifetimes.
[2] DNS can also cause latency. We tend to take hostname lookups for granted, but an ISP's DNS resolvers are often unloved. It once took me several years to convince BellSouth's customer service that one of their DNS resolvers was actually off the network. User DNS problems are doubly nasty because we as application developers can't control or even detect them.
[3] The script was single-threaded and used a new TCP connection for each request. A single restriction was used per run, ie X milliseconds latency or Y% packet loss. The wifi was a Linksys WRT54g at a distance of 5 meters, with standard firmware in 802.11g mode and WPA2 encryption. The uplink was a "6mbps" home cable connection about 50 miles and ten network hops away from the nearest Yahoo caching server, during off-peak hours.
[4] The Google mobile team recently put out an interesting fact: "On an iPhone 2.2 device, 200k of JavaScript held within a block comment adds 240ms during page load, whereas 200k of JavaScript that is parsed during page load added 2600 ms."
Image Credit: Pachinko by the_toe_stubber on Flickr.