Bringing Privacy into the Open

We don’t write about Internet privacy a great deal because it’s much more a matter of pure policy than one of technology. Privacy policy has more to do with consent, retention, protection, and dissemination than with how computers and networks actually work, so most of it is outside the scope of a technology blog such as this one. But there are some important technical issues involved in privacy that come to the surface when legislators and regulators consider how to tailor privacy policy to the various players in the Internet space. As we see in so many policy battles, groups of firms aligned by business models fight to protect their common interests from regulation while seeking to impose harsh restrictions on firms with different business models.

The first time I testified before Congress, back in 2009, the topic was privacy. Advertisers and advertising networks are the obvious candidates for privacy regulation because their businesses get a boost from detailed knowledge of user preferences. At the most basic level, advertisers don’t want to waste money showing Lexus ads to poor people or showing ads for fancy women’s shoes to teenaged males. In principle, the better the targeting, the more advertisers are willing to pay. A lead sheet is more valuable than a billboard.

At the Congressional hearing, advertisers pushed back on the Internet subcommittee’s desire to regulate their industry by claiming ISPs had access to more information than they did. While this was true in a superficial sense – all the packets going to Amazon, Google, and eBay pass through an ISP at some point – it was also disingenuous because ISPs have a business model that depends on subscription fees rather than ad sales. So even if your ISP knows the details of every transaction you make on the Internet, the information doesn’t translate into revenue.

Or at least it didn’t until recently. ISPs have discovered that they can reduce the prices they need to charge users if they can sell some information to ad networks. This allows the ISPs to connect the data they’re in a position to harvest with a means of using it to pitch products to consumers, something the ISPs can’t do on their own since they can’t alter the information that goes to your web browser. Google taught this lesson to the ISPs by offering extremely low cost, very high speed Internet service so that it can harvest more information to use in its advertising auctions. AT&T notably followed by matching Google’s price – $70 a month for a gigabit pipe – on the condition that it can harvest personal information and sell it to advertisers. The same service is available with privacy for $100 a month. Most people choose the $70 plan, of course.

Differential privacy regulation got a boost when the FCC reclassified Internet access under Title II of the Communications Act because that move split privacy between the FCC and the Federal Trade Commission according to Internet business models. The FCC controls privacy for Title II businesses and the FTC has the ball for advertisers and web sites. Consequently, the advertisers are now pressing the FCC to impose severe restrictions on ISP use of personal information while arguing before the FTC that industry self-regulation is the way to go.

This double-standard approach came to a head last week when a collection of self-styled privacy advocates presented a letter to the FCC arguing that ISPs are uniquely positioned to spy on consumers and hence must be harshly controlled:

Providers of broadband Internet access service, including fixed and mobile telephone, cable, and satellite television providers, have a unique role in the online ecosystem. Their position as Internet gatekeepers gives them a comprehensive view of consumer behavior and until now privacy protections for consumers using those services have been unclear. Nor is there any way for consumers to avoid data collection by the entities that provide Internet access service.

While ISPs are first and last to carry bits between consumers and Internet services, the claims about “gatekeeping” and the helplessness of consumers to guard personal information from ISPs are factually challenged. There are many gates and many gatekeepers on the Internet, of course. Many transactions begin with a Google search, many gaming sessions take place over Facebook, and many purchases are mediated by Amazon, eBay, or PayPal. Payment services have the most luscious information of all: what we purchase, where we purchased it, and how much we paid for it. That’s not exactly small potatoes, and it makes more sense to use PayPal with its top-notch security than some tiny web site of uncertain reputation to handle your credit card information.

Peter Swire, the privacy czar in the Clinton Administration, pushed back on the privacy hawk letter with a technical analysis of the two main claims, which he finds technically dubious for the same reasons I do. In the first place, many web services now use HTTPS, which encrypts the information exchanged between user and web site. If I search for “cats” on Google, the URL looks something like this:

https://www.google.com/search?q=cats&rlz=1C5CHFA_enUS563US566&oq=cats&aqs=chrome.0.69i59j0l5.2170j0j9&sourceid=chrome&es_sm=91&ie=UTF-8

This causes the browser to do a DNS query for www.google.com and to send the rest of the request in encrypted form to the IP address returned by DNS. Hence, the only information the ISP can see is the destination IP address and its equivalent domain name; the search terms and everything else after the domain name are hidden. So the degree of access the ISP has is determined largely by the service that’s being used (Google in this case). A lot can be made of the fact that the DNS query is sent in plain text, but that’s about to change.
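To make the split concrete, here is a minimal Python sketch of which parts of a URL an on-path observer such as an ISP could read. The function name split_visibility and the shortened example URL are illustrative assumptions, not anything from Swire's analysis, and the model is deliberately simple: it treats the hostname as visible and everything after it as encrypted whenever the scheme is https.

```python
from urllib.parse import urlsplit, parse_qs


def split_visibility(url):
    """Return (visible, hidden) parts of a URL from an on-path observer's view.

    Simplified model (an assumption for illustration): with HTTPS only the
    hostname is exposed; with plain HTTP the path and query are exposed too.
    """
    parts = urlsplit(url)
    visible = {"hostname": parts.hostname}
    hidden = {"path": parts.path, "query": parse_qs(parts.query)}
    if parts.scheme != "https":
        # No encryption: the path and query travel in the clear as well.
        visible.update(hidden)
        hidden = {}
    return visible, hidden


# Shortened version of the search URL above (hypothetical parameters trimmed).
visible, hidden = split_visibility("https://www.google.com/search?q=cats&ie=UTF-8")
print("ISP can see:   ", visible)
print("ISP cannot see:", hidden)
```

Running the same function on an http:// URL moves the path and query into the visible set, which is the difference the move to HTTPS makes.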

The IETF has a DNS Privacy working group developing a means of cloaking DNS queries going to external resolvers operated by Google and a few others; it’s called “DPRIVE.” DPRIVE started last summer with an informational document, RFC 7626, setting out goals and principles. It addresses some of the myths about DNS:

2.1.  The Alleged Public Nature of DNS Data

   It has long been claimed that "the data in the DNS is public".  While
   this sentence makes sense for an Internet-wide lookup system, there
   are multiple facets to the data and metadata involved that deserve a
   more detailed look.  First, access control lists and private
   namespaces notwithstanding, the DNS operates under the assumption
   that public-facing authoritative name servers will respond to "usual"
   DNS queries for any zone they are authoritative for without further
   authentication or authorization of the client (resolver).  Due to the
   lack of search capabilities, only a given QNAME will reveal the
   resource records associated with that name (or that name's non-
   existence).  In other words: one needs to know what to ask for, in
   order to receive a response.  The zone transfer QTYPE [RFC5936] is
   often blocked or restricted to authenticated/authorized access to
   enforce this difference (and maybe for other reasons).

   Another differentiation to be considered is between the DNS data
   itself and a particular transaction (i.e., a DNS name lookup).  DNS
   data and the results of a DNS query are public, within the boundaries
   described above, and may not have any confidentiality requirements.
   However, the same is not true of a single transaction or a sequence
   of transactions; that transaction is not / should not be public.  A
   typical example from outside the DNS world is: the web site of
   Alcoholics Anonymous is public; the fact that you visit it should not
   be.

This puts the focus on cloaking the lookup, and the working group is developing the means to do that. When DNS lookups are cloaked, the ISP loses its special power to know which domains you’re interested in visiting.
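To see what there is to cloak, the following Python sketch builds a conventional DNS query in the standard wire format and shows that the name being looked up sits in the packet as readable bytes. The helper name build_dns_query and the fixed query ID are assumptions made for illustration; nothing is sent over the network.

```python
import struct


def build_dns_query(name, query_id=0x1234):
    """Build an unencrypted DNS query for an A record in standard wire format."""
    # Header: ID, flags (recursion desired), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (Internet).
    question = qname + struct.pack("!HH", 1, 1)
    return header + question


packet = build_dns_query("www.google.com")
print(packet)
```

The labels "www", "google", and "com" appear verbatim in the printed bytes, so anyone carrying the packet can read the domain. Wrapping the same query in an encrypted channel to the resolver is the kind of cloaking the working group is after.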

Next, you can also cloak web queries – the HTTP stuff – and off-domain references by using VPNs, as many people do when working from home. VPNs carry a performance penalty that can make pokey web sites even pokier, so not that many people use them. They also have a cost, which is really what this battle is all about.

Companies that are heavily invested in advertising already provide DNS service to keep destinations hidden from ISPs, and the same firms that do that can easily provide VPNs for attractive prices – like free – for the same purpose. The fact that they don’t currently do this does not change the fact that the second claim in the privacy letter is false. It is simply not the case that “consumers [cannot] avoid data collection by the entities that provide Internet access service.” VPNs are such a means. They aren’t popular, but they exist. In many countries, VPNs are essential to using Netflix, which is a bit embarrassing as they’re a means of bypassing content licenses.

If DNS as it currently works is a privacy vulnerability, this says something interesting about the nature of the service offered by ISPs, as a matter of fact. And the fact that DNS can be provided by third parties also says something interesting about differential privacy regulations.

As the FCC sees it, collecting information from DNS queries answered by an ISP is regulated under Title II because DNS is inseparable from the “offer” of Internet service. But the FCC’s logic also says that collecting information from DNS queries submitted to Google is exempt from Title II and FCC jurisdiction because Google is not an ISP (except in a few towns with an unknown number of users).

Does that make sense? If the FCC and the FTC adopt uniform regulations for DNS privacy, the double standard goes away, but that will take us into territory where the FCC’s Title II regulations extend beyond the scope of ISPs and into the fabric of the Internet itself.

Perhaps this slippery slope is inevitable, but it raises a host of questions.

 

Comments
  • imispgh

    You did not mention the possibility of man-in-the-middle attacks (MTM)
    when there is no VPN. Also, neither the ISPs nor folks downstream go out
    of their way to suggest VPNs even exist. I bet less than 10% of the
    public knows what they are and what they actually do.

    • Richard Bennett

      DNSSEC is the solution to MTM attacks on DNS queries.
