How Did China Telecom Snag 15% of the Internet?

Iljitsch van Beijnum explains how China Telecom was able to snag a big part of the Internet’s traffic for a few minutes last spring. The Internet’s routing system is fundamentally insecure and error-prone, so things like that happen all the time.

Understanding the Internet’s insecure routing infrastructure

So what exactly happened in China that caused 15 percent of the Internet’s prefixes (not 15 percent of the traffic) to get rerouted to that nation in April? And was it an accident or something more malicious? I wasn’t at the China Telecommunications Corporation office to observe the incident as it unfolded, so I can’t say for sure whether this was a diabolical plan executed to perfection or a network engineer doing something really, really stupid. But I’m betting on the latter, and not just because of Hanlon’s Razor.

One common BGP screw-up is leaking the entire routing table. There are currently some 341,000 prefixes making up the Internet and, in order to be able to reach them all, BGP routers need to have all of these in their routing tables. If, for some reason, a BGP router doesn’t have any filters, it will simply send a copy of that full table to all the routers in neighboring ASes that it’s connected to.
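To make the role of a filter concrete, here is a minimal sketch in Python. It is not router configuration: the route table, the "self", "customer", and "upstream" labels, and both function names are invented for illustration, and the prefixes are documentation ranges. It assumes a common export policy of announcing only your own and your customers' prefixes.

```python
# Hypothetical route table: prefix -> where the route was learned from.
# All prefixes below are documentation ranges, not real announcements.
ROUTES = {
    "192.0.2.0/24": "self",
    "198.51.100.0/24": "customer",
    "203.0.113.0/24": "upstream",
}


def own_and_customer_only(source):
    """Assumed export policy: announce only our own and customer routes."""
    return source in ("self", "customer")


def routes_to_announce(routes, export_filter=None):
    """Return the prefixes a router would send to a neighboring AS."""
    if export_filter is None:
        # No filter configured: everything in the table goes out.
        return sorted(routes)
    return sorted(p for p, src in routes.items() if export_filter(src))


leaked = routes_to_announce(ROUTES)
filtered = routes_to_announce(ROUTES, own_and_customer_only)

print("without a filter:", leaked)
print("with a filter:   ", filtered)
```

With the filter in place, the route learned from the upstream stays home. Without it, the neighbor receives the whole table, which is the leak described above.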

Leaking a full table is a mistake that happens fairly regularly, and it looks like this is what happened in China. Here’s what may have gone down.

A filter can stop working when it is updated. Usually, this is caught by a “maximum prefixes” filter of last resort, which kills a BGP session if more than a predetermined number of prefixes is received. But, even without this, such a leak usually isn’t too devastating because a detour through (for instance) China means that additional ASes are traversed, and BGP prefers shorter paths over longer ones. BGP can make this comparison because, for each prefix, the ASes on the way to that destination are recorded in an “AS path.”
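Both safeguards can be sketched in a few lines of Python. This is a simplification under stated assumptions: the function names are invented, the AS numbers come from the range reserved for documentation, and real BGP weighs other attributes (such as local preference) before it ever looks at AS path length.

```python
def session_survives(received_prefixes, max_prefixes):
    """Mimic a maximum-prefix limit: the session is torn down once the
    neighbor sends more prefixes than the configured ceiling."""
    return len(received_prefixes) <= max_prefixes


def best_path(candidates):
    """Pick the candidate with the shortest AS path.

    Assumes every other BGP attribute is equal, which is a simplification.
    """
    return min(candidates, key=len)


# Documentation AS numbers stand in for real networks.
direct = [64500, 64510]                # neighbor AS, then origin AS
leaked = [64501, 64502, 64500, 64510]  # same origin, detour via a leaking AS

print("preferred path:", best_path([leaked, direct]))
print("50 prefixes, limit 100:", session_survives(list(range(50)), 100))
print("500 prefixes, limit 100:", session_survives(list(range(500)), 100))
```

The leaked route is two ASes longer than the direct one, so it loses the comparison. That is why an ordinary leak tends to attract little traffic even when no prefix limit is configured.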

However, the leak of the full table, or at least a sizable fraction of it, was exacerbated by a curious design decision by China Telecom. That decision made it look to the outside world as if China Telecom had also stripped the AS path from all the prefixes that it had leaked. So to China Telecom’s peers, destinations such as CNN appeared to be located inside China Telecom’s network, rather than merely reachable through it.

This is why a relatively large number of ASes started sending their traffic to China. Stripping off the AS paths happens when information in BGP is exported into another routing protocol that is used locally, and then re-exported back into BGP. This practice is considered dangerous because of earlier incidents like the one discussed here. There is also not really any good reason to do it (there are a few not-so-good reasons), and I can’t think of any way that this would happen by accident.
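The effect of that round trip is easy to model. The sketch below is an illustration, not a description of China Telecom’s actual setup: the AS numbers are documentation values, the function names are invented, and the local routing protocol is reduced to a single step that drops the path, on the assumption that it has nowhere to store one.

```python
LOCAL_AS = 64500  # hypothetical AS doing the redistribution


def redistribute_via_igp(prefix, as_path):
    """Model BGP -> local routing protocol -> BGP.

    The local protocol is assumed to carry no AS path, so the path learned
    from BGP is lost and the route later looks locally originated.
    """
    return prefix, []  # nothing of the original path survives


def announce(prefix, as_path, local_as=LOCAL_AS):
    """A router prepends its own AS number when announcing to another AS."""
    return prefix, [local_as] + as_path


prefix = "203.0.113.0/24"
learned_path = [64501, 64502, 64510]  # the real origin is AS 64510

ordinary_leak = announce(prefix, learned_path)
stripped_leak = announce(*redistribute_via_igp(prefix, learned_path))

print("ordinary leak:", ordinary_leak[1])
print("stripped leak:", stripped_leak[1])
```

In the ordinary leak the real origin is still the last entry in the path and the path is long. In the stripped leak the path has a single entry, the leaking AS itself, so it looks like the origin and beats nearly every legitimate path on length.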

So just the leaking of a full or partial BGP table in itself isn’t too suspicious, although networks the size of China Telecom should know better. But doing so with the AS paths removed may be construed as a reason for a moderate level of suspicion by those so inclined.

It will happen again and again, due to human error as much as anything.