- Disclose how the operator responds to Web browser “do not track” signals or other mechanisms that provide consumers the ability to exercise choice regarding the collection of personally identifiable information about an individual consumer’s online activities over time and across third-party Web sites or online services, if the operator engages in that collection.
- Disclose whether other parties may collect personally identifiable information about an individual consumer’s online activities over time and across different Web sites when a consumer uses the operator’s Web site or service.
Though these requirements sound straightforward, in practice it can be difficult to determine which activities trigger the law’s disclosure requirement. This problem is compounded by the law’s loose relationship to the W3C Web Tracking Protection specification, the draft technology standard that established the “Do Not Track” mechanism and instigated the CalOPPA amendment. That specification, which has not yet graduated into a final standard, also attempts to define what kinds of tracking are acceptable and how they should be disclosed to users, but CalOPPA’s requirements do not map directly to the specification’s.
This article is intended to serve as a practical guide to applying CalOPPA, grounded in the technology and policy behind Do Not Track.
How web requests work
To understand how users are tracked online and how the Do Not Track mechanism addresses the issue, it helps to understand the basics of how web browsers interact with websites. When a user navigates to a website using a browser, the browser sends an HTTP request to the website’s server that looks something like this:

    GET /index.html HTTP/1.1
    Host: example.com

The server then sends back a response in a similar format. If the request was successful, the server will provide the page requested, in the form of a text file in HTML format. The server can also instruct the browser in its response to store a cookie, i.e. a small text file provided by the server, on the user’s computer.
The HTML file provided by the server will typically contain links to other files on the web server (including image files, stylesheets, and JavaScript code) which provide some of the page’s content, formatting, and functionality. For each of these linked resources, the browser sends an additional HTTP request and receives a corresponding response like those above. The browser then renders the web page by assembling all of these pieces as instructed by the HTML file.
Most modern websites also include resources from third-party sites. For example, a page might embed an Instagram image or a Facebook “Like” button, or some JavaScript that allows the operator to monitor user interactions with Google Analytics. The browser will retrieve these in the same way, by sending HTTP requests to the third-party servers that host them. Those servers provide the content directly to the browser, and may also instruct the browser to store cookies. This all goes on in the background: the user may never see that his or her browser connected to these third-party services.
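The request-and-response exchange described in this section can be sketched in a few lines of Python. This is only an illustration, not how any particular browser or server is implemented: the helper names `build_request` and `cookies_from_response`, the sample response, and the `session_id` cookie are all hypothetical and chosen for the example.

```python
from http.cookies import SimpleCookie
from typing import Dict


def build_request(host: str, path: str = "/index.html") -> str:
    """Return the text of a minimal HTTP/1.1 GET request (hypothetical helper)."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"


def cookies_from_response(raw_response: str) -> Dict[str, str]:
    """Collect the cookies a server asked the browser to store via Set-Cookie."""
    head, _, _body = raw_response.partition("\r\n\r\n")
    cookies: Dict[str, str] = {}
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(":")
        if name.strip().lower() == "set-cookie":
            jar = SimpleCookie()
            jar.load(value.strip())
            for key, morsel in jar.items():
                cookies[key] = morsel.value
    return cookies


# Hand-written sample response, for illustration only.
SAMPLE_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Set-Cookie: session_id=abc123; Path=/\r\n"
    "\r\n"
    "<html><body>Hello</body></html>"
)

if __name__ == "__main__":
    print(build_request("example.com"))
    print(cookies_from_response(SAMPLE_RESPONSE))
```

A third-party server embedded in the page receives its own request of the same shape and can answer with its own `Set-Cookie` header, which is what makes cross-site tracking possible.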
Privacy concerns around tracking
When we refer to “tracking” online, we might be talking about any number of activities, ranging from the commonplace and benign to the creepy or predatory. On the benign end of that range are things like:
- The use of session cookies to keep track of a user’s activity over the course of a single visit to a website, for example to maintain a “shopping cart” for users who are not logged in.
- The short-term logging of visitors’ IP addresses, to identify abusive activity (e.g. denial-of-service attacks) or troubleshoot technical issues.
The Do Not Track signal
The Do Not Track signal is a browser option that indicates that the browser’s user wishes not to be tracked. When the browser sends an HTTP request to a website, the user’s Do Not Track preference is communicated as part of the request. A request containing a Do Not Track header looks something like this:

    GET /index.html HTTP/1.1
    Host: example.com
    DNT: 1

Here, the Do Not Track (DNT) signal is “on,” meaning the user requests not to be tracked. But an HTTP request is just that: a request. How the web server responds to the request, including the Do Not Track header, depends on how it’s configured. Right now, most websites probably are not configured to respond to Do Not Track headers at all, largely because the technical standard that is supposed to determine how they should respond has not been finalized.
CalOPPA accounts for this uncertainty by limiting its requirements to disclosure: operators need not respond to Do Not Track signals in any particular way, but they must tell users how they respond. But without a final standard to navigate by, even this requirement puts operators in a difficult position. Should they create comprehensive Do Not Track policies and configure their systems to implement them (a potentially resource-intensive process for many sites) despite the risk that the final standard will vary significantly from the current draft? Should they wait for the final standard and, in the meantime, comply with CalOPPA by simply stating in their privacy policies: “This site does not respond to Do Not Track signals”? Many sites are opting for the latter, but to those unfamiliar with the law, a statement like this can make it look like the site is flouting privacy concerns. In a recent article questioning the privacy claims of secret-sharing websites Whisper and Secret, for example, Wired pointed out that “Secret even admits in its privacy policy that its website ignores the advertising industry’s ‘do not track’ option built into browsers.”
CalOPPA’s definition of “tracking” also differs from the W3C specification’s. CalOPPA requires operators to disclose how they respond to Do Not Track if they engage in “the collection of personally identifiable information about an individual consumer’s online activities over time and across third-party Web sites or online services.” The specification, on the other hand, permits operators to track user activity over time, even if the user has requested not to be tracked, so long as the operator does not share information about the user’s interactions with third parties. Thus, an operator that does not share data with third parties can comply with the specification without altering its behavior in response to Do Not Track signals, but may still be required by CalOPPA to notify users that it does not respond to Do Not Track signals.
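To make the mechanics of the signal concrete, here is a minimal sketch of how a server might read the DNT header from an incoming request. It assumes the header values defined in the W3C draft (`1` for "do not track", `0` for "tracking permitted", and no header when the user has expressed no preference); the function names `parse_headers` and `dnt_preference` are hypothetical, and nothing here is prescribed by CalOPPA.

```python
from typing import Dict, Optional


def parse_headers(raw_request: str) -> Dict[str, str]:
    """Parse the header lines of a raw HTTP request into a lowercase-keyed dict."""
    lines = raw_request.strip().splitlines()
    headers: Dict[str, str] = {}
    for line in lines[1:]:  # skip the request line ("GET /index.html HTTP/1.1")
        if not line.strip():
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def dnt_preference(headers: Dict[str, str]) -> Optional[bool]:
    """Return True if the user asked not to be tracked, False if the user
    permitted tracking, and None if no valid preference was sent."""
    value = headers.get("dnt")
    if value == "1":
        return True
    if value == "0":
        return False
    return None


if __name__ == "__main__":
    request = "GET /index.html HTTP/1.1\r\nHost: example.com\r\nDNT: 1\r\n\r\n"
    print(dnt_preference(parse_headers(request)))
```

Reading the header is the easy part. What the operator does with the result is a policy decision, and it is that decision CalOPPA requires the operator to disclose.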
Implementing the W3C Specification
Despite the unfinished state of the W3C specification, some operators will prefer to implement it rather than tell users that their Do Not Track preferences will be ignored. The specification has two sets of requirements: those that apply to “first party” websites and those that apply to “third party” websites. The first party is the site a user primarily intends to interact with. For example, when a user accesses google.com directly, by either entering “google.com” in the browser’s location bar or clicking a link to google.com, the first party is Google. If a user visits nytimes.com and that site displays ads provided by Google AdSense, the New York Times is the first party and Google is a third party.
The first party’s basic obligation upon receiving a Do Not Track signal is simple: do not share information about the user’s visit with third parties. As long as the operator only uses the information internally, the specification places no limits on the information it may collect. If the first party engages a service provider to assist in processing the user’s data, but that provider is bound by contract not to share the data or use it for any other purpose, sharing with that service provider is not restricted regardless of the user’s Do Not Track preference.
A third party to the user’s visit may only:
- collect, share, or use data related to that interaction; or
- use data about previous network interactions in which it was a third party
if one of the following conditions is met:
- the user has explicitly consented to the collection, sharing, or use;
- the data is collected for one of several limited “permitted uses” (frequency capping, financial logging, security, debugging, or audience measurement) and is subject to certain limitations; or
- the data is de-identified according to the procedure defined in the specification.
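The first-party and third-party rules described in this section can be summarized as a small decision procedure. The sketch below is a simplification under stated assumptions, not the specification's own logic: it reads the three conditions listed above (consent, a permitted use, de-identification) as the only circumstances in which a third party may act on data when Do Not Track is on, it assumes no restriction applies when the signal is absent, and it does not model the "certain limitations" attached to permitted uses. All class and function names are hypothetical.

```python
from dataclasses import dataclass

# Permitted uses named in the article; the limitations attached to each
# are NOT modeled here.
PERMITTED_USES = frozenset({
    "frequency capping",
    "financial logging",
    "security",
    "debugging",
    "audience measurement",
})


@dataclass
class ThirdPartyInteraction:
    """Hypothetical record of one request received by a third party."""
    dnt_on: bool
    user_consented: bool = False
    purpose: str = ""
    de_identified: bool = False


def third_party_may_use_data(interaction: ThirdPartyInteraction) -> bool:
    """Decide whether a third party may collect, share, or use the data."""
    if not interaction.dnt_on:
        # Assumption: without a DNT signal, this sketch imposes no restriction.
        return True
    return (
        interaction.user_consented
        or interaction.purpose in PERMITTED_USES
        or interaction.de_identified
    )


def first_party_may_share(dnt_on: bool, recipient_is_bound_service_provider: bool) -> bool:
    """Decide whether a first party may pass visit data to another party."""
    if not dnt_on:
        # Assumption: without a DNT signal, this sketch imposes no restriction.
        return True
    # With DNT on, sharing is limited to contractually bound service providers.
    return recipient_is_bound_service_provider


if __name__ == "__main__":
    print(third_party_may_use_data(ThirdPartyInteraction(dnt_on=True, purpose="security")))
    print(first_party_may_share(dnt_on=True, recipient_is_bound_service_provider=False))
```

Note that neither function restricts what a first party collects for its own internal use, which mirrors the gap between the specification and CalOPPA's disclosure requirement.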