The Etiquette of Internet Service Quality

One of the central issues in the development of Internet policy is the matter of priorities. Proponents of network neutrality insist that service providers should not be able to sell “paid prioritization”, for example. While most of the arguments in favor of such a ban are technically incoherent – the fear of “fast lanes” is peculiar in a market where Content Delivery Networks are indispensable – some have endeavored to dig into the technical literature to see whether there’s a way to domesticate prioritization and make its sale acceptable. While this has mostly not been done to my satisfaction, it’s commendable that some proponents of strict broadband regulation have at least given it a try.

Scott Jordan, the chief technologist at the FCC, did a credible job of explaining this issue – which is called “Quality of Service” (QoS) in engineering – at an ITIF event in 2010. I covered this presentation on HTF when it happened.

Barbara van Schewick, head of the Stanford Center founded by Larry Lessig, has submitted a 194-page Stanford Law Journal paper to the FCC, but it’s quite a bit less successful than Jordan’s presentation. The paper, Network Neutrality and Quality of Service: What a Nondiscrimination Rule Should Look Like, makes the traditional arguments that QoS is only helpful when networks are congested and that carrier attempts to determine the QoS requirements of applications are bound to fail and should therefore not be allowed. It comes down in favor of something called “application-agnostic QoS”, but it’s not crystal clear what this means in a world in which QoS requirements are application-specific.

In several sections of the paper van Schewick makes this complaint:

…requiring network providers to take action before an application can get the Quality of Service or differential treatment it needs violates the principle of innovation without permission and reduces the chance that new applications actually get the type of service they need.[p. 123]

This is an incorrect understanding of QoS that colors the policy outcome in an improper way. In the first place, no application has a need to be treated “better” or “worse” than any other application in a strict sense. So the framing of QoS in terms of priorities under congestion isn’t actually correct; while QoS can be implemented as a prioritization scheme, it can be implemented in other ways as well, and usually is.

QoS is a construct that expresses application needs to the network. Those needs can take several different forms: an application may desire low delay, low cost, or low loss rate, relative to some abstract average and in various combinations, but few applications are concerned with what other applications are doing as long as they get the performance they require.
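To make that concrete, networks already have a standard vocabulary for these requests: DiffServ code points (DSCP). Below is a minimal sketch, in Python, of an application marking its own traffic. The code point values are the standard ones from the RFCs, but the mapping of apps to classes is my illustration, and nothing obliges a network to honor the marks.

```python
import socket

# Standard DiffServ code points (RFC 2474 / RFC 4594).
DSCP_EF = 46    # Expedited Forwarding: low delay, low loss (voice)
DSCP_AF11 = 10  # Assured Forwarding: ordinary elastic traffic
DSCP_CS1 = 8    # Lower effort: backups and patch downloads

def mark_socket(sock: socket.socket, dscp: int) -> None:
    """Ask the network for a class of service by setting the DSCP
    field on outgoing packets (IPv4; requires OS support for IP_TOS)."""
    # DSCP occupies the top six bits of the old TOS byte.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp << 2)

# A VoIP app expressing a low delay preference:
voip_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
mark_socket(voip_sock, DSCP_EF)
```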

The best way to think about low delay QoS is in terms of a pair of supermarket analogies. Some items for sale in the supermarket have expiration dates, like milk and eggs, and others, like salt and vodka, don’t. Frozen items need to be transported from the market to the home before thawing, so they’re a special case at the extreme end of the scale because their value depends on quick transport. Frozen foods get special handling in the store and at home, especially during a hot summer. In fact they’re handled in a special way between the distributor and the store as well, moving in refrigerated trucks and trains. So the low delay form of QoS is familiar to all of us.

Low delay is good for real-time applications like Voice over IP (Skype and Vonage, chiefly), video conferencing (Cisco Telepresence), and combat gaming. The expiration date of a Skype packet is two tenths of a second after creation, which means the receiving ISP has a bit more than 150 milliseconds to deliver it after it gets it. Because of “hot potato routing”, the receiving ISP gets the packet from the sending ISP very close to the sending user.
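In code, the expiration date is just a playout deadline check at the receiver. Here’s a toy sketch, assuming the packet carries its creation timestamp and the two clocks are roughly synchronized; real VoIP stacks use jitter buffers and RTP timestamps instead.

```python
import time

PLAYOUT_DEADLINE = 0.200  # seconds: a voice packet's "expiration date"

def handle_voice_packet(created_at: float, payload: bytes):
    """Deliver a packet only if it's still fresh; otherwise discard it.

    Real-time apps never retransmit: a packet that misses its playout
    deadline is worthless, like milk past its date.
    """
    age = time.time() - created_at
    if age > PLAYOUT_DEADLINE:
        return None  # expired: skip it and keep up with the speaker
    return payload
```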

Low delay in supermarkets is facilitated by some checkout line etiquette. If you want to get out of the store fast, you can buy 10 or 15 items or less and take the express checkout line. If there’s no express line, you depend on the generosity of others. We’ve probably all had the experience of being waved through the line by a shopper with a huge load of groceries when we only have one or two. This bit of etiquette works because the shopper with a full cart knows that letting the light shopper through isn’t going to change their exit time from the store very much – like one or two percent – while getting around the full cart shopper makes a big difference to the other guy. It’s simply civilized behavior.

Low loss is like hitting the bull’s eye every time you throw a dart. Unlike real-time applications, Web surfing can’t tolerate any lost data, so each Web packet that fails to be delivered is retried. If VoIP loses a packet, it just moves on to the next one because it doesn’t want to fall behind the speaker’s voice.

When the Web retries a packet there’s a cost, however. The Web uses TCP, and TCP is responsible for congestion management on the Internet (because most congestion is created by apps that use TCP, among other reasons). When TCP loses a packet, it guesses that the Internet is congested and slows down. So you want low loss when you’re surfing the web so that your pages will load fast. The web has an argument for low loss because web pages load so fast that any congestion they cause quickly ends – pages load in a second or two, and then the congestion is over. VoIP calls last much longer, but even while they’re underway, the load they put on the Internet is light. Video conferencing is different; we’ll get to that later.
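The slowdown works roughly like the toy rendition below of TCP’s classic additive-increase, multiplicative-decrease rule. Real stacks (Reno, CUBIC, and their kin) are far more elaborate, so treat this as a sketch of the idea, not an implementation.

```python
def on_ack(cwnd: float, ssthresh: float) -> float:
    """No loss this round trip: grow the congestion window."""
    if cwnd < ssthresh:
        return cwnd * 2.0   # slow start: double every round trip
    return cwnd + 1.0       # congestion avoidance: +1 segment per RTT

def on_loss(cwnd: float) -> tuple[float, float]:
    """A lost packet is read as congestion: cut the window in half."""
    ssthresh = max(cwnd / 2.0, 2.0)
    return ssthresh, ssthresh   # (new cwnd, new ssthresh)
```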

So web packets can tolerate more delay than VoIP because they’re more averse to loss than delay. Of course, every app wants to run fast, but there are tradeoffs. The shopper with the full cart is willing to accept a bit of delay because she knows the day will come when she only has a carton of milk and a loaf of bread and wants to get home quick.

Low-cost applications are like shoppers who go to Walmart or Costco. They’re willing to tolerate poor service and long lines because their time is not very valuable (in a relative sense) and the deals are great. The Walmart shoppers of the Internet are patch updates and disk backups, both of which typically take place in the dead of night.

Video is yet another mode because it limits its own sending rate like voice does, but with a bias toward a combination of low delay and low cost. Video comes in two flavors: a real-time variant used by conferencing apps and a clumpy, buffered form used by entertainment apps like TV and movie streaming.

Movie streaming from Netflix or Amazon is a buffered process in which the video player software in the entertainment box in your house grabs data from the remote video server as fast as it can but plays it out to your TV set (or equivalent) at a fixed rate. The TV plays a series of images, replacing an old one with a new one 30 or 60 times a second, regardless of how fast it’s getting new pictures (as long as it has a new one at the expected time).

Netflix fills a buffer (storage for a series of pictures and their related audio) before it starts playing because the buffer is protection from packet loss and delay; this takes a few seconds. Amazon buffers too, but in my experience its buffer filling time is barely perceptible because it’s a more cleverly designed system that will likely take Netflix down.
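In code, the player is two independent clocks: the network fills the buffer as fast as it can, and the display drains it at a fixed rate. A toy model follows; the startup threshold and frame rate are illustrative, and real players adapt both on the fly.

```python
from collections import deque

class PlayoutBuffer:
    """Fill fast from the network, drain at a fixed frame rate."""

    FRAME_RATE = 30        # frames per second shown on the TV
    STARTUP = 5 * 30       # ~5 seconds buffered before play begins

    def __init__(self):
        self.frames = deque()
        self.playing = False

    def on_network_data(self, frame):
        # Arrivals come as fast as the network allows: bursty.
        self.frames.append(frame)

    def on_display_tick(self):
        # Called FRAME_RATE times a second, regardless of the network.
        if not self.playing and len(self.frames) >= self.STARTUP:
            self.playing = True   # buffer filled: start the show
        if self.playing and self.frames:
            return self.frames.popleft()
        return None               # buffer ran dry: a visible stall
```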

Applications have peculiar traffic signatures that allow them to be recognized by the size, number, and frequency of the packets they send and receive. Van Schewick expresses a lot of concern about “deep packet inspection” in her paper because she’s been told it’s a nefarious practice, but signatures make it unnecessary. Video streaming apps send packets in bunches of a few hundred at a time, interspersed with long pauses. Each bunch is enough to fill the player’s buffer, and once the buffer drains to a low-water mark, it’s filled again.
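To make “signature” concrete, here’s a hypothetical classifier that looks only at packet sizes and timing, never at payloads. The thresholds are invented for illustration; they don’t come from any real product.

```python
def burstiness(gaps: list[float]) -> float:
    """Ratio of the longest pause to the median gap: high for clumpy
    flows like streaming video, low for steady flows like voice."""
    ordered = sorted(gaps)
    median = ordered[len(ordered) // 2]
    return max(gaps) / median if median > 0 else float("inf")

def classify_flow(pkt_sizes: list[int], gaps: list[float]) -> str:
    """Guess an application from its traffic signature alone; no
    payload inspection required. Assumes a non-empty sample."""
    avg_size = sum(pkt_sizes) / len(pkt_sizes)
    avg_gap = sum(gaps) / len(gaps)

    if avg_size < 300 and avg_gap < 0.03:
        return "voip"           # small packets on a steady 20 ms clock
    if avg_size > 1200 and burstiness(gaps) > 5.0:
        return "video-stream"   # big packets in clumps, long pauses
    return "web/other"
```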

This bursty interaction uses the video server’s most constrained resource – the solid state disk – most efficiently, but it’s hell on any non-video apps sharing the pipe, especially voice apps. It’s as if every shopper has two full baskets, there’s no express line, and you need to get out of the store in a hurry with a tub of Rocky Road in the middle of summer for your pregnant wife. Bad news.

In reality, the voice app could cut in line every time without making anyone else have a bad day and without the network operator figuring out who’s watching porn and who’s watching Jane Austen movies. Not a problem at all.

There are ways to juggle these applications so that performance is maximized for the low delay and low loss apps without noticeably degrading the clumpy video apps: practice checkout line etiquette by letting the voice packets cut in line, and intermingle the web packets with clumpy video, with web packets incurring more loss and clumpy packets incurring more delay. For complicated reasons, this balance maximizes overall welfare (defined as everyone getting what they want).
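Here’s a sketch of what that etiquette might look like as a packet scheduler. To be clear, this is my illustration, not any real ISP’s queuing discipline, and the queue limits are invented: voice gets strict priority, web gets a shallow queue (more loss, less delay), and clumpy video gets a deep queue (more delay, no loss).

```python
from collections import deque

class EtiquetteScheduler:
    """Checkout line etiquette as a packet scheduler (a sketch)."""

    WEB_QUEUE_LIMIT = 50      # shallow: overflow drops, TCP retries
    VIDEO_QUEUE_LIMIT = 2000  # deep: clumps wait, nothing is lost

    def __init__(self):
        self.voice = deque()
        self.web = deque()
        self.video = deque()

    def enqueue(self, pkt, kind: str) -> None:
        if kind == "voice":
            self.voice.append(pkt)       # always admitted
        elif kind == "web":
            if len(self.web) < self.WEB_QUEUE_LIMIT:
                self.web.append(pkt)     # else dropped: sender backs off
        else:
            if len(self.video) < self.VIDEO_QUEUE_LIMIT:
                self.video.append(pkt)   # clumps queue up and wait

    def dequeue(self):
        # Voice cuts in line: tiny load, huge benefit, no one else
        # delayed by more than a packet or two.
        for q in (self.voice, self.web, self.video):
            if q:
                return q.popleft()
        return None
```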

So I’m suggesting that the packets on shared broadband links will generally benefit from being shuffled among queues that determine their order of transmission. Rather than doing simple “first-in, first-out” queuing, ISPs should do signature-based delay and loss allocation. So far, I’m not suggesting that anyone charge for this.

There is another traffic signature where charging for better-than-standard service might be appropriate, however: video conferencing. This is not a buffered system like video streaming, because conferencing packets need to arrive as soon as possible. You can’t react to the other party if there’s too much delay, obviously.

Supermarket etiquette doesn’t apply because video conferencing carries a heavy load. What we need here is a special checkout line with an ultrafast robotic checker, which the conferencing app should be willing to pay for. In simple terms, it’s extra capacity to be used by video apps when they’re running and by the general public the rest of the time. These apps need to be authenticated as well, so charging and extra capacity can piggyback on authentication. I suppose this is how a personal shopper works.
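A sketch of how that might work, with the pool size and the authentication check standing in for whatever a real ISP and conferencing service would negotiate:

```python
class AdmissionControl:
    """Paid, authenticated capacity for conferencing (a sketch)."""

    def __init__(self, pool_mbps: float = 50.0):
        self.free = pool_mbps              # capacity set aside
        self.reservations: dict[str, float] = {}

    def request(self, flow_id: str, token: str, mbps: float) -> bool:
        """Reserve bandwidth for an authenticated (billable) flow."""
        if not self.authenticate(token):   # no account, no express lane
            return False
        if mbps > self.free:
            return False                   # pool exhausted: best effort
        self.free -= mbps
        self.reservations[flow_id] = mbps
        return True

    def release(self, flow_id: str) -> None:
        # Capacity returns to the general public when the call ends.
        self.free += self.reservations.pop(flow_id, 0.0)

    def authenticate(self, token: str) -> bool:
        return token.startswith("paid:")   # stand-in for a real check
```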

Low cost applications would simply use network capacity that’s left over after these other forms of QoS are carried out.

In the next part, I’ll explain the dynamics of Internet congestion, another subject the FCC’s Open Internet order and its intellectual sources haven’t fully grasped.