Author: Maria Xynou (Researcher, Tactical Tech)
Whether we realise it or not, most of our online activity is part of a big business. The data industry makes billions of dollars from collecting data on who we are and what we are interested in by tracking the websites we access every day and the types of news we read online. But who are these trackers and what happens to our data?
In an attempt to increase transparency with regard to these issues, Tactical Tech launched a new project on global online tracking called “Trackography”. Through this project, Tactical Tech identified hundreds of companies which track individuals around the world when they access media websites. We zoomed in on the privacy policies of the 26 globally prevailing tracking companies to gain some insight into how these companies handle our data. Below we provide an overview of their privacy policies and limitations.
Privacy Policies
Many companies publish privacy policies which describe how they claim to handle the data that they collect. While it remains unclear whether and to what extent companies actually comply with and limit their activities to everything stated in their privacy policies, these policies are worth looking at for the following reasons:
- Examining privacy policies can potentially give us insight into the data industry
- Through their privacy policies, companies provide a certain degree of transparency with regard to how they handle our data
- Companies can potentially be held accountable if it is proven that they have handled data in a way which does not comply with their privacy policies
- Companies' privacy policies are supposed to adhere to their countries' privacy legislation, and when this is not the case, the companies can be fined
- Examining privacy policies can potentially give us insight into what privacy safeguards they do and do not provide
- If we identify means of handling data (such as the use of certain tracking technologies) which are not included in companies' privacy policies, we can demand accountability and seek further transparency
- A comparison of various privacy policies can potentially highlight the companies which provide or fail to provide adequate safeguards
We decided to start with the privacy policies of the “globally prevailing tracking companies”, which are the main companies that track users in all of the results that have been collected through Trackography so far. While reading the privacy policies of 26 globally prevailing tracking companies, we identified various fields of data which we thought were of interest. Such fields include the types of data these companies collect, whether they disclose it to third parties and how long they retain it. Trackography's CSV file on GitHub contains this data, which anyone can access and contribute to.
Types of data collection and potential risks
Companies state in their privacy policies that they collect data which can fall within the following three categories:
- Personally identifiable information
- Non-personally identifiable information
- Technical data
Of the 26 globally prevailing tracking companies, 20 state in their privacy policies that they collect “personally identifiable information”. This is data which – on its own or in conjunction with other data – can be used to identify, contact or locate an individual. Such information can include, for example, your name, postal address, telephone number, email address and personal pictures. As these companies collect personal information about us through various sources, they can potentially link that information to other data collected about us through online tracking and create profiles about us. Such profiles can subsequently be shared with and sold to third parties – including law enforcement agencies and/or the data industry.
Another type of data that is being collected by third party trackers includes “non-personally identifiable information”. Such data can correspond to a particular individual, account or profile, but is not necessarily sufficient to identify, contact or locate the individual to whom such data pertains. This data can include your interests, what articles were clicked on or where your mouse lingered. 25 out of 26 globally prevailing tracking companies state in their privacy policies that they collect such data.
A final type of data that is being collected by third party trackers includes so-called “technical data”. This is information which relates to an individual's device and online activity, such as a device's operating system, browser type, screen resolution and geographic location based on its IP address. 24 out of 26 globally prevailing tracking companies state in their privacy policies that they collect such data. It is noteworthy that only one company out of the 26, Gemius, explicitly states in its privacy policy that it has introduced safeguards to prevent the full identification of IP addresses, by clearing the last bits of each IP address.
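To make the idea of "clearing the last bits" of an IP address concrete, here is a minimal sketch using Python's standard `ipaddress` module. It is not Gemius's implementation: the function name `truncate_ip` and the number of bits kept (24 for IPv4, 48 for IPv6) are our own assumptions, since the privacy policy does not say how many bits are cleared.

```python
import ipaddress


def truncate_ip(address: str, keep_bits_v4: int = 24, keep_bits_v6: int = 48) -> str:
    """Zero the low-order bits of an IP address so it no longer points to one host.

    The defaults (keep 24 bits for IPv4, 48 for IPv6) are illustrative assumptions.
    """
    ip = ipaddress.ip_address(address)
    keep = keep_bits_v4 if ip.version == 4 else keep_bits_v6
    # strict=False lets ip_network accept a host address and mask off the host bits.
    network = ipaddress.ip_network(f"{ip}/{keep}", strict=False)
    return str(network.network_address)


# Addresses from the ranges reserved for documentation.
print(truncate_ip("203.0.113.77"))           # 203.0.113.0
print(truncate_ip("2001:db8:1234:5678::1"))  # 2001:db8:1234::
```

A truncated address still reveals the network, and often the approximate region, that a device connects from, so this kind of safeguard reduces precision rather than removing location information altogether.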
While the collection of non-personally identifiable information and technical data might sound harmless, it is part of a bigger picture. Data which describes an individual can potentially be used to identify that individual. So-called “anonymous data” can be used to de-anonymise profiles and individuals, especially if the data pertaining to an individual includes many variables which also appear in other datasets.
Let's imagine, for example, that third parties collect non-personally identifiable information and technical data about an individual which include the following: an IP address which shows that the individual is using a device in Ghana with regular access to LGBT websites and forums. It is not hard to conclude that there is a high probability that the individual is located in Ghana and interested in LGBT issues. If third parties subsequently identify the LGBT groups in Ghana, they can probably pinpoint the potential individuals the aforementioned data might belong to. And if they gain access to more data about these individuals through other sources – which they can buy from various companies in the data industry – they are likely to identify the specific individual to whom such “anonymous” data pertains.
While companies often imply (or directly state) in their privacy policies that they respect their customers' privacy by only collecting non-personal data, this is questionable since so-called anonymous data can quite easily be de-anonymised – especially in the context of the booming data industry.
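The narrowing-down described above can be sketched in a few lines of Python. Everything in this example is made up for illustration: the records, the field names and the helper functions `candidates` and `narrowing` are hypothetical and are not taken from any tracker's data. The point is only that each additional "anonymous" attribute shrinks the set of people a profile could belong to.

```python
from typing import Dict, List, Tuple

# Entirely made-up records: no names, emails or other direct identifiers.
RECORDS: List[Dict[str, str]] = [
    {"id": "u1", "country": "GH", "browser": "Firefox", "screen": "1366x768", "interest": "cycling"},
    {"id": "u2", "country": "GH", "browser": "Chrome", "screen": "1366x768", "interest": "cycling"},
    {"id": "u3", "country": "GH", "browser": "Firefox", "screen": "1920x1080", "interest": "cooking"},
    {"id": "u4", "country": "DE", "browser": "Firefox", "screen": "1366x768", "interest": "cycling"},
    {"id": "u5", "country": "GH", "browser": "Firefox", "screen": "1366x768", "interest": "cooking"},
]


def candidates(records: List[Dict[str, str]], **known: str) -> List[Dict[str, str]]:
    """Return the records that match every attribute the observer knows."""
    return [r for r in records if all(r[k] == v for k, v in known.items())]


def narrowing(records: List[Dict[str, str]], known: Dict[str, str]) -> List[Tuple[str, int]]:
    """Count the remaining candidates as known attributes are added one at a time."""
    sizes = []
    applied: Dict[str, str] = {}
    for key, value in known.items():
        applied[key] = value
        sizes.append((key, len(candidates(records, **applied))))
    return sizes


steps = narrowing(
    RECORDS,
    {"country": "GH", "browser": "Firefox", "screen": "1366x768", "interest": "cycling"},
)
for attribute, remaining in steps:
    print(f"after adding {attribute}: {remaining} candidate(s)")
```

In this toy dataset, four attributes that are each harmless on their own are enough to leave a single candidate. Real datasets are far larger, but they also contain far more attributes per person.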
How do tracking companies claim that they handle our data?
Companies which are part of the tracking business create profiles about us which can subsequently be sold to the advertising industry and other third parties. 24 out of 26 companies explicitly state in their privacy policies that they disclose the data that they collect to third parties, while only 5 of them prohibit such parties from using it for unspecified purposes.
It is interesting to see that while Google, for example, states in its privacy policy that it only discloses data to third parties for legal reasons and for external processing, it does not explicitly mention that it prohibits such third parties from using that data for unrelated purposes. Facebook states that it primarily discloses collected data to service providers, which are often legally required to hand over data to law enforcement agencies. Quantcast, on the other hand, directly states that it can share non-personally identifiable information with law enforcement agencies.
Interestingly enough, only 4 out of 26 globally prevailing tracking companies state in their privacy policies that they do not disclose personally identifiable information to third parties without user consent. In short, most of the globally prevailing tracking companies state in their privacy policies that they disclose our data to third parties, which can include advertisers, publishers, law enforcement and others.
Once data has been collected about us, companies may give us the option to access it. Most of the 26 globally prevailing tracking companies state in their privacy policies that we can access our own data. However, it remains unclear whether companies do indeed provide us with access to all data collected about us.
Data retention is at the heart of it all. Companies store the data they collect, which allows them to aggregate it, analyze it and to identify and match patterns across time. Only 11 out of 26 globally prevailing tracking companies disclose how long they retain data for. Out of the data retention periods that are disclosed, AddThis – which stores data for 1,825 days (five years) – appears to retain data for the longest period, while Twitter – which stores data for 37 days – appears to retain data for the shortest period. However, this might all be misleading, mainly for the following three reasons:
- Companies often have different retention periods for different types of data (personally identifiable information, non-personally identifiable information and technical data)
- Companies can renew their data retention periods (potentially multiple times) as mentioned in their privacy policies, which means that it is ultimately unclear how long they really retain data for
- Companies share and disclose data to third parties, each of which has its own data retention framework
While it is clear that all 26 globally prevailing tracking companies do indeed store our data, it is extremely unclear for how long. One of the concerns is that security is a black box and that what might not be a crime today might be viewed as a crime tomorrow. Data is part of an evolving society, which raises serious concerns about the various forms of retribution which could potentially result from its unregulated retention across time.
How do tracking companies claim that they protect our data?
In response to privacy concerns, companies argue that they comply with the EU-U.S. Safe Harbor framework and that they are TRUSTe certified. The EU-U.S. Safe Harbor framework requires participating companies to comply with principles based on EU Directive 95/46/EC on the protection of personal data. This framework is intended for U.S. organizations that store the customer data of EU citizens, while the International Safe Harbor Privacy Principles are designed to prevent accidental information disclosure or loss.
However, the main issue revolves around the fact that companies share and disclose data to countless other third parties, which might not necessarily comply with and enforce privacy safeguards. And when data is retained almost indefinitely, it remains unclear which third parties subsequently gain access to it and who they share and disclose such data to. Companies are not always transparent about the uses of their data collection in their privacy policies or when directly asked about it. While it is important for companies to comply with the Safe Harbor framework and its privacy principles, it appears to be inadequate in terms of regulating the use of data by other third parties and in ensuring that companies are indeed transparent about all the uses of the data that they collect.
TRUSTe is a trust mark sales company based in the U.S. which assesses, monitors, and certifies websites, mobile apps, cloud, and advertising channels to ensure that companies "safely collect and use customer data to power their business". 12 out of 26 globally prevailing tracking companies state in their privacy policies that they are TRUSTe certified. While external security certification might be necessary to assess the practices carried out by companies, such certification alone is inadequate in assessing whether companies do indeed safely collect and use customer data in a way which safeguards customers' rights. And even if they do – which is largely debatable – it remains unclear whether all the other third parties which subsequently gain access to collected data take measures to adequately protect it.
More importantly though, companies argue that customers have the ability to “opt-out” from online tracking. “Opt-out” refers to several methods through which individuals can avoid being tracked through various technologies, such as browser cookies. Advertising companies use such technologies to collect information about an individual's web-browsing behaviour and to send unsolicited product or service information. While almost all globally prevailing tracking companies state in their privacy policies that users can "opt-out" from online tracking, this option is largely conditional, for reasons such as the following:
- users can only opt out if their browser is not configured to block third party cookies
- users can only opt out by canceling their account with a service
- users need to opt out from every device that they use
- users can only opt out from the browser that they are using, which means that they can still be tracked when they use different browsers
- if users remove tracking cookies, they will not be able to access certain services
- if users opt out, they will have restricted access to content and features
- in some cases, users can only opt out through the Digital Advertising Alliance website
In short, companies state that we can “opt-out” from their online tracking, but this option appears to be largely illusory. After all, only 3 out of 26 globally prevailing tracking companies state in their privacy policies that they support Do Not Track (DNT). DNT is an HTTP header field through which a browser requests that a web application disable its tracking, or its cross-site tracking, of an individual user. Even though the Federal Trade Commission has asked companies to support DNT, most companies appear to ignore it in practice.
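To show how little is technically required to honour DNT, here is a minimal Python sketch of a server-side check. A browser with the setting enabled sends the header `DNT: 1` with each request. The functions `requests_no_tracking` and `handle_request`, as well as the cookie name and value, are hypothetical and written only for this illustration; they do not describe how any of the companies reviewed here process requests.

```python
from typing import Dict, Mapping


def requests_no_tracking(headers: Mapping[str, str]) -> bool:
    """Return True if the request carries `DNT: 1`.

    HTTP header names are case-insensitive, so they are compared lowercased.
    """
    for name, value in headers.items():
        if name.lower() == "dnt":
            return value.strip() == "1"
    return False


def handle_request(headers: Mapping[str, str]) -> Dict[str, str]:
    """Hypothetical handler: skip the tracking cookie when DNT is set."""
    response_headers = {"Content-Type": "text/html"}
    if not requests_no_tracking(headers):
        # Made-up cookie name and value, for illustration only.
        response_headers["Set-Cookie"] = "track_id=abc123; Max-Age=31536000"
    return response_headers


print(handle_request({"DNT": "1"}))
print(handle_request({"User-Agent": "ExampleBrowser/1.0"}))
```

The header is only a request: nothing in the protocol forces the receiving server to act on it, which is why support for DNT in a company's privacy policy, and in practice, matters.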
Beyond online tracking
Online tracking might seem harmless, but let's not forget that it's part of a bigger picture.
By examining the websites of the “globally prevailing tracking companies”, we noticed that many of them also engage in the profiling business. This means that data that they collect through online tracking can subsequently be matched with other data collected about us through other sources. While one piece of information (such as our IP address) might not necessarily tell a story about us, aggregate data (which can also include the websites we access, our email address, personal pictures and other information) can likely form a comprehensive and accurate one.
Many of us, for example, have Gmail accounts, regularly access YouTube and use Google Maps – all of which are Google services. As a result, Google can link data which has been collected through online tracking – such as our IP address and the websites we are interested in – with other data it has already collected about us through other services, such as YouTube. The vast volumes of data that companies like Google hold about us enable them to create profiles about us which they subsequently sell to third parties.
Companies like Trackography's “globally prevailing tracking companies” are in business to make money – most of which is made out of profiles created about us, largely without our knowledge or consent. By examining their privacy policies, however, we can start to raise further questions and even to take action – especially when and if safeguards are inadequate.
Privacy policies change all the time and those of hundreds of companies have yet to be reviewed. All contributions to our research – which can be accessed via GitHub – are more than welcome.