
Busting the cookies and privacy myth

I am going to try to make today’s post as simple as possible. Note that some explanations may be a little too simple for my technical audience.

I was referenced in the WSJ article on Google’s iPhone tracking, in connection with my cross-domain cookies post.

For my technical audience, here is a detailed version of how my code was actually used.

So let us begin from scratch… what are cookies?

Cookies (data on your computer used to identify you) are stored whenever you visit a site. Usually, this data is used to keep you logged into a site or to figure out if you are a repeat visitor.

This is fine and, for the web to function in its current state, in many ways essential.

Sites store cookies

In the above diagram, when you visit siteA.com, a cookie is set for siteA.com on your computer.

When you visit siteB.com, a cookie is set for siteB.com on your computer.

But siteB.com CANNOT read the data from the cookies set by siteA.com, and vice versa. Thus siteB.com does not know whether you ever visited siteA.com or what you did there.
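For my technical audience, here is a minimal sketch of that isolation. It is a toy model, not how any real browser is implemented: the `CookieJar` class and the cookie names and values are made up purely for illustration. The only point is that cookies are filed under the domain that set them, and a site only gets back its own.

```python
# Toy model of a browser's cookie storage (hypothetical, for illustration only).
# Cookies are filed per domain; a site only ever receives its own cookies.
class CookieJar:
    def __init__(self):
        self._store = {}  # domain -> {cookie name: value}

    def set_cookie(self, domain, name, value):
        self._store.setdefault(domain, {})[name] = value

    def cookies_for(self, domain):
        # Return a copy so callers cannot modify the jar directly.
        return dict(self._store.get(domain, {}))


jar = CookieJar()
jar.set_cookie("siteA.com", "session", "abc123")
jar.set_cookie("siteB.com", "session", "xyz789")

print(jar.cookies_for("siteA.com"))  # {'session': 'abc123'}
print(jar.cookies_for("siteB.com"))  # {'session': 'xyz789'}
```

siteB.com never sees `abc123`, because that value is filed under siteA.com.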

Let’s introduce the ad-companies

Ad-companies started displaying ads on sites depending on the content of the page. So, if you are a webmaster, you add a line of JavaScript code and a neat-looking advertisement is displayed on your page.

Ads are displayed

But the information on the page is not enough to generate targeted ads. So the ad-companies wanted to know where you’ve been before.

Now here is where the privacy issue unfolds.

Tracking your moves

In order to track you as you move from one site to another (both displaying ads from the same ad-company), the companies need to store cookies.

Cookies of one site cannot be accessed by another

The problem is that cookies set on one site CANNOT be read by another site. In other words, cookies set on siteA.com cannot be read by siteB.com.

So the work-around is to store a third-party cookie (i.e. a cross-domain cookie). When a user visits siteA.com, a cookie is set not only for siteA.com but also for a common domain that is accessed by all sites using the ad-company’s code.

siteA.com and siteB.com set cookies for their domain as well as for adsite.com

Thus, siteA.com, siteB.com and all other sites displaying ads from adsite.com, store and access the same third-party cookie.

So when you visit siteA.com, a third-party cookie is created for adsite.com. When you visit siteB.com, the same cookie is read.

siteA.com and siteB.com access their own site cookies and the cookies of adsite.com

Thus adsite.com now knows that you have been to siteA.com before going to siteB.com. On the basis of this knowledge, a targeted ad can be shown. For an ad-company which provides ads to only two sites, this is not very useful. But imagine if you have 70% of the sites using your code.

Then, most of the time, the same cookie can be read and your movements can be tracked as you browse the internet. So now the ad-company knows exactly what kind of sites you are visiting and can serve ads accordingly.
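For my technical audience, the tracking described above can be sketched roughly as follows. This is a simplified simulation, not any ad-company's actual code: `AdServer`, `Browser` and the cookie values are hypothetical names I made up for the example. I am assuming the ad code on each page triggers a request to adsite.com, that the browser attaches adsite.com's cookie to that request, and that the ad server learns which site embedded the ad.

```python
import uuid


class AdServer:
    """Hypothetical ad server running on adsite.com."""

    def __init__(self):
        self.history = {}  # visitor id -> sites where this visitor loaded an ad

    def serve_ad(self, cookie_id, embedding_site):
        # No cookie sent: this browser is new to us, so issue a fresh ID.
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex
        self.history.setdefault(cookie_id, []).append(embedding_site)
        return cookie_id  # sent back to the browser as adsite.com's cookie


class Browser:
    """Toy browser that keeps one cookie value per domain."""

    def __init__(self):
        self.cookies = {}  # domain -> cookie value

    def visit(self, site, ad_server, ad_domain="adsite.com"):
        # First-party cookie for the site itself.
        self.cookies.setdefault(site, "first-party-cookie-for-" + site)
        # The page embeds the ad code, so the browser also contacts the ad
        # domain and sends along only the cookie stored for that domain.
        sent = self.cookies.get(ad_domain)
        self.cookies[ad_domain] = ad_server.serve_ad(sent, site)


ads = AdServer()
browser = Browser()
browser.visit("siteA.com", ads)
browser.visit("siteB.com", ads)

visitor_id = browser.cookies["adsite.com"]
print(ads.history[visitor_id])  # ['siteA.com', 'siteB.com']
```

Note that in this sketch siteA.com and siteB.com never read each other's cookies. The link between the two visits exists only on adsite.com, through the one ID it sees on both requests.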

But why does the WSJ article only talk about Safari?

Safari, by default, does not allow third-party cookies to be created.

However, most major browsers, such as Firefox, Internet Explorer (using the correct header tags) and Chrome, do ALLOW setting third-party cookies.
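For my technical audience, the difference between the two defaults can be sketched as a single check. This is a deliberate simplification and not any browser's real logic: the function names are hypothetical, and I am assuming a cookie counts as third-party simply when its domain differs from the domain of the page you are on (real browsers use more careful domain matching).

```python
def is_third_party(page_domain, cookie_domain):
    # Simplification: exact, case-insensitive comparison of domain names.
    return page_domain.lower() != cookie_domain.lower()


def allows_cookie(block_third_party, page_domain, cookie_domain):
    # A browser that blocks third-party cookies rejects any cookie whose
    # domain is not the domain of the page being visited.
    if block_third_party and is_third_party(page_domain, cookie_domain):
        return False
    return True


# Blocking by default (the Safari-style policy described above):
print(allows_cookie(True, "siteA.com", "adsite.com"))   # False
# Allowing by default (the behaviour described for the other browsers):
print(allows_cookie(False, "siteA.com", "adsite.com"))  # True
# First-party cookies are accepted under either policy:
print(allows_cookie(True, "siteA.com", "siteA.com"))    # True
```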

Where does your technique come in?

The technique I posted two years ago allows setting third-party cookies in Safari as well. This helps in offering a consistent user experience irrespective of the browser used.

Note that other browsers already allow creation of third-party cookies. So Safari is the only exception.

How was this discovered?

A Stanford researcher, Jonathan Mayer, found that large companies such as Google, MIG and PointRoll used a variation of my technique to circumvent Safari’s policy.

In fact, Facebook’s official developer best practices article linked to my post (the page has since been removed).

You must remember that this issue affects Safari users ONLY.

Other browsers already allow third-party cookies, so ad-companies CAN track you there, as those browsers do NOT have any such policy.

How do I protect myself? Should I disable cookies?

NO. Do not disable cookies altogether. This will cause most sites that have a login system to stop working.

Third-party cookies also have a number of other uses, like logging users in to third-party sites, and are NOT always used for tracking. They cannot steal your data (a myth noted by some users in comments).

If you want, you can disable third-party cookies in browsers. Dennis O’Reilly has written a guide on how to disable third-party cookies for major browsers.

You may also want to use incognito/private browsing modes when using a public PC.

Questions/Suggestions?

If you have any questions about how this affects you or need further explanation, feel free to post a comment or contact me.

