Internet Congestion 201


This is the second part of an examination of the nature of congestion on packet switched networks such as the Internet. In the first part, Internet Congestion 101, we looked at an idea expressed on Chris Marsden’s blog regarding the assumption of a “reasonable level of backhaul.” As Chris acknowledges in a comment, the task of pinning down the level of shared capacity (backhaul is shared by its nature) that’s reasonable falls on the regulator rather than the engineer. The reason for this is that the way supply and demand are brought into balance on packet switched networks is dynamic. On a circuit switched network, by contrast, demand is static per call, so the operator simply has to provision enough shared capacity to supply the number of subscribers that are likely to make calls at the network peak (probably Mother’s Day afternoon in the US). The consequence of demand exceeding supply is the inability to make calls, and that’s clearly unacceptable.

On the packet switched network, however, the units of traffic that have to be supported with sufficient bandwidth are packets, which differ from calls primarily in their duration, their number, and their tolerance for delay. An overloaded telephone network rejects calls at the network’s entry point (this technique is called “admission control” in networking), while a slightly overloaded packet switched network simply delays packets in queues and a more overloaded one discards packets. Since packets are generated by machines which have a memory of each packet that’s in transit, discarded packets are presented to the network over and over until they’re successfully transmitted (or until the sender loses interest in the conversation). So overload on a packet network has an entirely different set of consequences from overload on a circuit network.
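The difference between a slightly overloaded and a heavily overloaded packet network can be sketched with a toy model. The function below is only an illustration, not a description of any real router: it assumes time moves in fixed ticks, that the link can send a fixed number of packets per tick, and that the queue holds a fixed number of waiting packets. The names simulate_queue, service_per_tick, and buffer_size are invented for this example, and retransmission of dropped packets is deliberately left out.

```python
def simulate_queue(arrivals, service_per_tick, buffer_size):
    """Toy FIFO link: returns (delivered, dropped, peak_queue).

    arrivals         -- packets offered to the link in each tick
    service_per_tick -- packets the link can send in one tick (assumed fixed)
    buffer_size      -- packets that may wait in the queue (assumed fixed)
    """
    queued = 0
    delivered = 0
    dropped = 0
    peak_queue = 0
    for arriving in arrivals:
        backlog = queued + arriving
        # The link sends as much of the backlog as it can this tick.
        sent = min(backlog, service_per_tick)
        delivered += sent
        backlog -= sent
        # Whatever does not fit in the buffer is discarded.
        overflow = max(0, backlog - buffer_size)
        dropped += overflow
        queued = backlog - overflow
        peak_queue = max(peak_queue, queued)
    return delivered, dropped, peak_queue


if __name__ == "__main__":
    # Slight overload: packets wait in the queue but none are lost.
    print(simulate_queue([12, 12, 8, 8], service_per_tick=10, buffer_size=5))
    # Heavier overload: the queue fills and packets are discarded.
    print(simulate_queue([20, 20, 0, 0], service_per_tick=10, buffer_size=5))
```

In the first run every packet is eventually delivered, only later than it would have been on an idle link. In the second, the buffer fills and the excess is thrown away; on a real network the senders would offer those packets again.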

As the design goal of packet switching is to make all provisioned network bandwidth available for use, and the potential load that can be offered by the users of a packet network will always exceed supply by a considerable margin (more on this later), queuing delay and packet discards aren’t as much pathologies as symptoms of a network operating within desirable design margins. Provisioning more capacity than users will consume is a waste of money, but provisioning so little that users are frustrated by discernible delays will generally result in lost customers for the network operator. Packet networks are designed, therefore, to be operated at or close to overload for short periods of time punctuated by short periods of underload when the network catches up with queued packets and then goes idle.

Packet networks have a fractal property where traffic conditions exhibit patterns that repeat at different time scales. Each packet consumes 100% of network capacity, but for a very short time, on the order of milliseconds, typically followed by a short period of quiet. Streams of packets load the network for seconds or longer, followed by equivalent periods of quiet. Swarms of files transferred using peer-to-peer protocols by many users at once can extend periods of load into the minutes or hours.
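A little arithmetic shows how a link can be 100% busy one packet at a time and still look lightly loaded on average. The sketch below assumes a 1500-byte packet and a 10 Mbps link purely for illustration; neither figure comes from any particular network, and the function names are made up for this example.

```python
def serialization_time_ms(packet_bytes, link_bits_per_second):
    """Time the link is fully occupied by one packet, in milliseconds."""
    return packet_bytes * 8 / link_bits_per_second * 1000.0


def average_utilization(packet_count, packet_bytes, link_bits_per_second,
                        window_seconds):
    """Fraction of the window during which the link was busy sending."""
    busy_seconds = packet_count * packet_bytes * 8 / link_bits_per_second
    return busy_seconds / window_seconds


if __name__ == "__main__":
    # Illustrative assumptions: 1500-byte packets on a 10 Mbps link.
    print(serialization_time_ms(1500, 10_000_000))
    # 100 such packets spread over one second.
    print(average_utilization(100, 1500, 10_000_000, 1.0))
```

Under these assumptions each packet holds the whole link for a little over a millisecond, while a hundred of them in a second leave the link idle most of the time. The same pattern of full load and quiet shows up again at the scale of streams and swarms.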

When we discuss congestion on packet networks we’re actually talking about packet loss and delay, and we’re making assumptions about the applications that people are using. Interactive applications like the Web are perfectly suited for packet networks because they exhibit short periods of activity followed by (relatively) long periods of silence; streaming is less well suited, and telephony is well suited only when it’s relatively low bandwidth.

So the question we’re actually asking when we want to know how much capacity is “reasonable” is what assumptions the operator can make about the mix of applications on the network. If everyone is using the same application for the same period of time every day, we can easily create an Erlang formula that defines network capacity. If people are free to run diverse applications with different requirements from the network and which place different degrees of load on the network, the exercise in prediction becomes a lot harder.
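For the uniform case, the calculation might look something like the sketch below. It assumes the Erlang B model, which is one member of the Erlang family and treats a blocked call as simply lost; the text above doesn’t say which Erlang formula applies, so take the choice as an assumption. The function names and the 1% blocking target in the example are likewise illustrative.

```python
def erlang_b(offered_erlangs, channels):
    """Blocking probability for a given offered load and channel count.

    Uses the standard recurrence
        B(0) = 1,  B(n) = A * B(n-1) / (n + A * B(n-1))
    where A is the offered load in erlangs.
    """
    blocking = 1.0
    for n in range(1, channels + 1):
        blocking = offered_erlangs * blocking / (n + offered_erlangs * blocking)
    return blocking


def channels_needed(offered_erlangs, target_blocking):
    """Smallest channel count whose blocking is at or below the target."""
    channels = 0
    while erlang_b(offered_erlangs, channels) > target_blocking:
        channels += 1
    return channels


if __name__ == "__main__":
    # Illustrative assumption: 10 erlangs of peak load, 1% of calls blocked.
    print(channels_needed(10.0, 0.01))
```

The calculation works only because every call is assumed to place the same demand on the network. Once the applications differ in bandwidth, duration, and tolerance for delay, there is no single offered load to plug in.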

We probably wouldn’t want a regulator dictating rules to users about which applications they can run and how often, but setting down a “reasonable capacity” standard for operators amounts to the same thing. Apart from the policy implications this sort of thing may have, as an engineering exercise this is a difficult problem because users are so different and trends are so fickle. In 2007, there was an explosion of peer-to-peer traffic that required much greater upstream capacity (from the user to the Internet core). Investing heavily in upstream would have been a mistake, however, as the P2P fad has crested and video streaming from content delivery networks has taken over as the fastest growing type of traffic, and it requires more downstream capacity.

Ultimately, dictating a minimum level of capacity to the packet network operator is the same as telling the user what applications he can run. If we’re willing to accept that, we can go on to a discussion about what the reasonable type of usage might be.