
Largest Study of Online Tracking Proves Google Really Is Watching Us All

Google’s Web trackers are present on the majority of the Web’s top million sites.

May 18, 2016

If you read or clicked anything online today, some part of Google probably knows about it.

That’s one lesson from the largest study yet of the technology that tracks people’s movements around the Web. When Princeton researchers logged the use of tracking code on the Internet’s million most popular websites, Google code was found on a majority of them.

Google Analytics, a product that logs visitors to websites and integrates with the company’s ad-targeting systems, was found on almost 70 percent of sites. DoubleClick, a dedicated ad-serving system from Google, was found on close to 50 percent of sites. The top five most common tracking tools were all Google-owned.

The new study, which was conducted using an open-source tool, also uncovered a stealthy new technique used by some small tracking companies that exploits the way browsers process audio, using it to “fingerprint” computers so they can be tracked around the Web. The researchers say the technology they used to collect their findings could help regulators and the developers of tools that block tracking, like some ad blockers.

Online publishers use tracking code from Google, Facebook, and many other companies to help target advertising. When a company’s code is embedded on many different sites, it can build up detailed profiles on individuals as they move around the Web, assigning them unique identifiers so they can be recognized again.
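To make that mechanism concrete, here is a minimal Python sketch of how code embedded on several sites could tie visits together under one identifier. It is a toy model, not any company's actual system: the class, the method name, and the example domains are all hypothetical, and a real tracker would typically store the identifier in a browser cookie.

```python
import uuid
from collections import defaultdict


class ThirdPartyTracker:
    """Toy model of a tracker whose code is embedded on many sites (hypothetical)."""

    def __init__(self):
        # Maps each unique identifier to the list of sites where it was seen.
        self.profiles = defaultdict(list)

    def handle_request(self, cookie_id, site):
        """Record a page visit and return the identifier the browser should keep."""
        if cookie_id is None:
            # First sighting of this browser: assign a fresh unique identifier.
            cookie_id = uuid.uuid4().hex
        self.profiles[cookie_id].append(site)
        return cookie_id


tracker = ThirdPartyTracker()
cookie = None  # the browser starts with no identifier from this tracker
for site in ["news.example", "shop.example", "health.example"]:
    # Each site embeds the same tracker, so the same identifier is sent back each time.
    cookie = tracker.handle_request(cookie, site)

browsing_profile = tracker.profiles[cookie]
```

After the loop, `browsing_profile` holds all three sites under a single identifier, which is the kind of cross-site profile the researchers describe.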

Researchers and privacy advocates say this practice deserves close scrutiny.

For example, a tracker that knows you recently browsed articles related to pregnancy, baby clothes, and miscarriage might well be able to find an ad you are likely to click. Because the systems that do this are automated, it is likely that no human ever looks at the data. But you may not feel comfortable with the idea of that information bouncing around a tangle of companies and algorithms outside your control.

Research has shown how Google’s ad-targeting system can use information in ways that might be seen as discriminatory: for example, by targeting men but not women with ads for high-paying jobs (see “Probing the Dark Side of Google’s Ad-Targeting System”). Documents leaked by Edward Snowden indicated that the National Security Agency tapped into Google’s online tracking systems as a way to identify surveillance targets.

The new study of online trackers was carried out by Arvind Narayanan, an assistant professor at Princeton, and grad student Steven Englehardt. They surveyed the one million sites using software developed at Princeton, called OpenWPM, which has been released for free. It automatically visits websites using the Firefox browser and logs any tracking technology it encounters.
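A rough sketch of the counting step behind prevalence figures like these might look as follows. It does not use OpenWPM itself; it assumes a crawl has already produced a log of (visited site, requested URL) pairs, and the function names and example domains are invented for illustration. The domain-matching rule is also a deliberate simplification.

```python
from collections import defaultdict
from urllib.parse import urlparse


def base_domain(host):
    # Simplification (assumption): keep the last two labels of the hostname.
    # Careful tools consult the Public Suffix List instead.
    return ".".join(host.split(".")[-2:])


def third_party_prevalence(crawl_log):
    """Return, for each third-party domain, the share of visited sites that load it.

    crawl_log is an iterable of (visited_site_url, requested_url) pairs.
    """
    sites = set()
    sites_by_third_party = defaultdict(set)
    for site_url, request_url in crawl_log:
        site = base_domain(urlparse(site_url).hostname)
        requested = base_domain(urlparse(request_url).hostname)
        sites.add(site)
        if requested != site:
            # The resource comes from a different domain than the page itself.
            sites_by_third_party[requested].add(site)
    return {d: len(s) / len(sites) for d, s in sites_by_third_party.items()}


# Made-up crawl log covering three sites.
log = [
    ("https://news.example/", "https://news.example/style.css"),
    ("https://news.example/", "https://cdn.tracker.example/t.js"),
    ("https://shop.example/", "https://cdn.tracker.example/t.js"),
    ("https://shop.example/", "https://ads.example/pixel.gif"),
    ("https://blog.example/", "https://blog.example/index.js"),
]
prevalence = third_party_prevalence(log)
```

Here the hypothetical `tracker.example` appears on two of the three sites and `ads.example` on one; run over a million sites, the same tally gives percentages of the kind reported in the study.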

Narayanan says the study turned up causes for both concern and hope.

For example, the researchers uncovered a new trick in which some companies’ scripts silently feed an audio signal through your browser’s audio-processing code. Minute differences in software and hardware mean the processed result can be used to identify your computer. “There are ever more creative and intrusive types of fingerprinting being deployed,” says Narayanan.
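The idea can be illustrated with a small simulation in Python. This is not the code the trackers use, which runs inside the browser; it only mimics the principle. The `process_audio` function is a hypothetical stand-in for a machine's audio stack, and the tiny `gain` difference is an assumed placeholder for real hardware and software variation.

```python
import hashlib
import math
import struct


def process_audio(gain, n_samples=256):
    """Stand-in for one machine's audio stack (assumption, not a real browser API).

    Generates a 440 Hz sine wave at 44.1 kHz, scaled by a device-specific gain.
    """
    return [gain * math.sin(2 * math.pi * 440 * i / 44100) for i in range(n_samples)]


def fingerprint(samples):
    # Hash the exact bytes of the processed samples, so any tiny
    # numerical difference yields a completely different digest.
    raw = struct.pack(f"{len(samples)}d", *samples)
    return hashlib.sha256(raw).hexdigest()


machine_a = fingerprint(process_audio(gain=1.0))
machine_b = fingerprint(process_audio(gain=1.0 + 1e-12))  # minutely different machine
machine_a_again = fingerprint(process_audio(gain=1.0))    # same machine, later visit
```

The two simulated machines produce different fingerprints, while the same machine produces the same one on a repeat visit. That stability is what lets such a value work as an identifier without anything being stored on the computer.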

But he adds that the dominance of Google and a handful of other giants could also help regulators and citizens keep tabs on the trackers keeping tabs on us. “Only a small number of companies have trackers that are really prevalent,” says Narayanan. “This suggests that external oversight and public pressure can lead to positive change.”

The consolidation of power in Web tracking should also make things easier for the maintainers of tools that block Web trackers. The Princeton study tested one such tool, the browser plug-in Ghostery, and found it effective.