Trackography illustrates which companies track us when we read the news online. They track us by including embedded images and code in websites, which collect and send information to their servers. These companies - such as Google and Facebook - are the "third party trackers" which track our online activity through the use of browser cookies and other tracking technologies.
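To make the mechanism above concrete, here is a minimal sketch of how embedded third-party resources could be spotted in a page. It is not the Trackography collection software. The function names, the sample HTML and the example hostnames are hypothetical, and the "same site" check simply compares the last two labels of each hostname, which is a rough assumption that fails for domains such as example.co.uk.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class EmbeddedResourceParser(HTMLParser):
    """Collects the URLs of embedded images, scripts and iframes."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.resource_urls = []

    def handle_starttag(self, tag, attrs):
        if tag in ("img", "script", "iframe"):
            src = dict(attrs).get("src")
            if src:
                # Resolve relative and protocol-relative URLs against the page.
                self.resource_urls.append(urljoin(self.page_url, src))


def naive_site(hostname):
    """Approximate the site as the last two hostname labels (assumption)."""
    return ".".join(hostname.lower().split(".")[-2:])


def third_party_hosts(page_url, html):
    """Return the sorted hosts of embedded resources served by other sites."""
    parser = EmbeddedResourceParser(page_url)
    parser.feed(html)
    first_party = naive_site(urlparse(page_url).hostname)
    hosts = set()
    for url in parser.resource_urls:
        host = urlparse(url).hostname
        if host and naive_site(host) != first_party:
            hosts.add(host)
    return sorted(hosts)


# Hypothetical news page with one analytics script and one tracking pixel.
sample_html = """
<html><body>
<img src="/images/logo.png">
<script src="https://www.google-analytics.com/analytics.js"></script>
<img src="//tracker.example.net/pixel.gif" width="1" height="1">
<script src="https://static.news.example.com/app.js"></script>
</body></html>
"""
print(third_party_hosts("https://www.news.example.com/story", sample_html))
```

When a browser loads such a page, it sends a request to each of the listed hosts, and that request is what lets those hosts record the visit.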
We ran our distributed data collection software in various countries around the world and identified hundreds of companies which track individuals through media websites. According to our results, some of these companies track individuals in almost all of the countries and media websites that we examined. We call them the "globally prevailing tracking companies".
Our results show that some of the "globally prevailing tracking companies", which track individuals' access to media websites around the world the most, include the following:
According to our tests, Google is the dominant tracking company globally, as it was included in more than 85% of the results that we collected - a percentage much higher than any other tracking company. This is likely due to the fact that most media websites around the world use Google Analytics.
View all of the globally prevailing tracking companies that we have examined so far through our repository on GitHub.
According to the websites of these companies, they engage in (one or more of) the following:
Profiling: the process of constructing a profile, which is a visual display of data associated with a specific individual or group of individuals. Such profiles are generated by computerised data analysis.
Advertising: a business model which uses the Internet to deliver promotional marketing messages to consumers. The online advertising industry is often aided by web analytics, market research and the profiling of customers and potential future customers. Behavioural targeting comprises a range of technologies and techniques used by online advertisers to increase the effectiveness of their advertising based on individuals' web-browsing behaviour.
Market research: involves the collection and analysis of information about target markets and customers. Through the use of statistical and analytical methods on information about individuals and/or organisations, this research aids the advertising industry.
Web analytics: the measurement, collection, analysis and reporting of online data for the purpose of understanding and optimizing the usage of websites.
Web crawling: otherwise known as an Internet bot, a web crawler is a software application which runs automated tasks over the Internet for the purpose of indexing the content of a website or of the Internet as a whole.
These companies argue that they track users so that they can improve the services that they provide. However, how do these companies handle our data?
We analysed the privacy policies of some of the globally prevailing tracking companies to gain insight into how they claim to handle our data. In particular, we collected information on the following fields from their privacy policies:
The types of data they collect (PII, non-PII, technical data)
Whether they provide safeguards to prevent the full identification of individuals' IP addresses
Whether individuals can opt-out from their tracking
Whether they support Do Not Track (DNT)
The types of tracking technologies they use (browser cookies, flash cookies, web beacons/web bugs)
Whether they comply with the U.S.-EU Safe Harbour Framework
Whether they are TRUSTe certified
PII: Personally Identifiable Information (PII) is data which can be used on its own or with other data to identify, contact or locate an individual. Examples of such information include an individual's name, postal address, telephone number, email address and personal pictures.
Non-PII: Non-Personally Identifiable Information (non-PII) is data that might correspond to a particular individual, account or profile, but which is not necessarily sufficient to identify, contact or locate the individual to whom such data pertains.
Technical data: information which relates to an individual's device and online activity, such as a device's operating system, browser type, screen resolution and geographic location based on its IP address.
Safeguards to prevent the full identification of IP addresses: some companies have introduced safeguards that prevent users' full IP addresses from being identified, by clearing the last bits of each address before it is stored or processed.
Opt-out: refers to several methods through which individuals can avoid being tracked through various techniques and technologies, such as browser cookies, that advertising companies use to collect information about an individual's web-browsing behaviour and to send unsolicited product or service information.
Do Not Track (DNT): an HTTP header field that requests that a web application disable its tracking, or its cross-site tracking, of an individual user. The Federal Trade Commission has asked companies to support DNT, but various companies ignore it.
Browser cookies: otherwise known as HTTP cookies or web cookies, browser cookies are computer files made of text that a website sends to your computer's hard drive while you are viewing a website. Companies link the information they store in cookies to your information so that they can personalise your experience when you visit a website.
Flash cookies: otherwise known as Local Shared Objects (LSO), flash cookies are computer files made of text that websites which use Adobe Flash send to your computer's hard drive while you are viewing those websites. Companies link the information they store in flash cookies to your information so that they can personalise your experience when you visit a website.
Web beacons/Web bugs: web tracking techniques which are embedded in web pages, are invisible to users and allow third parties to check that a user has viewed those web pages. Web beacons/Web bugs are commonly used for web page and email tracking, as part of web analytics.
U.S.-EU Safe Harbour Framework: requires companies to comply with the EU Directive 95/46/EC on the protection of personal data. This framework is intended for organisations within the U.S. and EU that store customer data. The International Safe Harbour Privacy Principles are designed to prevent accidental information disclosure or loss.
TRUSTe: a trust mark sales company based in the U.S. which assesses, monitors, and certifies websites, mobile apps, cloud, and advertising channels to ensure that companies "safely collect and use customer data to power their business".
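The IP address safeguard defined above, clearing the last bits of an address, can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, and the number of bits kept (24 for IPv4, 48 for IPv6) is an assumed choice, since the privacy policies we read do not necessarily specify how many bits each company clears.

```python
import ipaddress


def truncate_ip(address, ipv4_prefix=24, ipv6_prefix=48):
    """Zero every bit after the given prefix length (prefix lengths are assumptions)."""
    ip = ipaddress.ip_address(address)
    prefix = ipv4_prefix if ip.version == 4 else ipv6_prefix
    # strict=False lets ip_network() accept an address with host bits set
    # and clears those bits when computing the network address.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


# Addresses taken from the ranges reserved for documentation.
print(truncate_ip("203.0.113.77"))
print(truncate_ip("2001:db8:abcd:1234::1"))
```

A truncated address still reveals the network, and often the approximate location, that a visitor connected from, so this safeguard reduces rather than removes the identifying power of an IP address.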
The data we collected for the above fields, based on the privacy policies of some of the globally prevailing tracking companies, can be viewed through our repository on GitHub.
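As an illustration of what supporting Do Not Track could look like on the server side, the sketch below checks an incoming request's headers for the value "DNT: 1", which signals that the user prefers not to be tracked. The function name and the plain dictionary of headers are hypothetical, and this is not a description of how any of the companies we examined handles the header.

```python
def requests_no_tracking(headers):
    """Return True if the request headers include DNT with the value "1"."""
    # HTTP header names are case-insensitive, so normalise them first.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") == "1"


# Hypothetical request headers as a browser with DNT enabled might send them.
request_headers = {"User-Agent": "ExampleBrowser/1.0", "DNT": "1"}

if requests_no_tracking(request_headers):
    print("User asked not to be tracked: skip tracking cookies and profiling")
else:
    print("No Do Not Track preference expressed")
```

The header is only a request. Nothing in the browser enforces it, which is why it matters whether a company states that it honours the signal.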
1. Out of 25 globally prevailing tracking companies, 19 of them state in their privacy policies that they collect personally identifiable information (PII) and disclose data to third parties, without explicitly prohibiting them from using such data for unspecified purposes.
2. Only 11 out of 25 globally prevailing tracking companies disclose how long they retain data for in their privacy policies.
3. 22 out of 25 globally prevailing tracking companies are based in the United States of America.
4. Only 3 out of 25 globally prevailing tracking companies support Do Not Track (DNT).
5. While all 25 globally prevailing tracking companies state in their privacy policies that users can "opt-out" from online tracking, this option is conditional in some cases, for reasons such as the following:
users can only opt-out if their browser is not configured to block third party cookies
users can only opt-out by cancelling their account with a service
users need to opt-out from every device that they use
users can only opt-out from the browser that they are using, which means that cross-site tracking across other browsers might continue
if users remove tracking cookies, they will not be able to access certain services
if users opt-out, they will have restricted access to content and features
in some cases, users can only opt-out through the Digital Advertising Alliance website
7. While most globally prevailing tracking companies comply with the U.S.-EU Safe Harbour Framework, this does not prevent them from collecting users' data and from sharing it with third parties.
For more information about these companies, view the data we have collected on github. Additionally, read our article about how the prevailing tracking companies handle our data here.
How are intelligence agencies linked to some of the globally prevailing tracking companies?
Various companies track individuals through websites with the aim of creating profiles about them which can aid the advertising business. However, individual and group profiles can also potentially serve as a beneficial asset to law enforcement agencies for national security purposes.
Examples of how intelligence agencies have attempted to gain access to data collected by some of the globally prevailing tracking companies, as described in confidential documents leaked by Edward Snowden, include the following:
1. The NSA collected and mined data in bulk from some of the main global tracking companies identified through this project, including Google, Facebook, Yahoo and AOL.
2. The NSA and GCHQ hacked into the data centres of two of the main global tracking companies identified through this project: Google and Yahoo.
3. GCHQ monitored the activity of users in real time on Facebook and Twitter, two of the main global tracking companies identified through this project.
Trackography illustrates that online tracking is not as harmless as we might think it is. Various third parties track our online activity when we access websites and create profiles about us. However, we largely cannot control how these profiles are created, whether they are accurate, who subsequently gains access to them and how they are used.
We aim to increase transparency about the data collection industry. Help us track the trackers.