Chapter 6: Tracking Through Cookies
What does your typical day spent online look
like?
You might visit a couple of YouTube channels
you always watch, check out local news websites you always follow
and get some entertainment from the same sources as always. The
thing is, we all have our behavior patterns that change very
little, offline and online: same food, same entertainment and so
on. We break out of the rut when we experience major events, such as
moving or buying a house or car, and that's the moment ads aim
for: to jump in right as we're about to make a major course
correction and offer a product or service. The purpose of ads
isn't to convince us to buy a product once, but to make us lifelong
customers.
To do that, ad networks created a scheme to
track users online and analyze their behavior, such as where
users click and how long they stay on each page. But first
things first: let's explain cookies. Visiting a website sets a
cookie, a small, unique text file
that has legitimate uses, such as showing which links we've opened
or keeping us logged in when we come back to the site. When the
internet first became popular in the 1990s, the problem websites
faced was that they had no way to distinguish between users.
Cookies were a solution that created a persistent identity for
users; cookies worked back then, so the concept just remained and
nobody really thought about what would happen when cookies got
exploited.
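To make the round trip concrete, here is a minimal Python sketch of the two halves of a cookie exchange: the header a website sends to set a cookie, and the header a browser sends back on later visits. The cookie name visitor_id and the value abc123 are made up for illustration; real websites choose their own names and values.

```python
from http.cookies import SimpleCookie


def make_set_cookie_header(visitor_id: str) -> str:
    """Build the Set-Cookie value a site could send on a first visit."""
    cookie = SimpleCookie()
    cookie["visitor_id"] = visitor_id      # hypothetical cookie name
    cookie["visitor_id"]["path"] = "/"     # valid for the whole site
    return cookie["visitor_id"].OutputString()


def read_cookie_header(cookie_header: str) -> dict:
    """Parse the Cookie header a browser sends back on later visits."""
    parsed = SimpleCookie()
    parsed.load(cookie_header)
    return {name: morsel.value for name, morsel in parsed.items()}


if __name__ == "__main__":
    # First visit: the site hands the browser an identifier.
    print("Set-Cookie:", make_set_cookie_header("abc123"))
    # Later visit: the browser returns it, so the site recognizes the user.
    print(read_cookie_header("visitor_id=abc123"))
```

The website never has to ask who we are again: as long as the browser keeps returning the same identifier, every visit can be tied to the same person.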
A cookie is self-contained and can only be
read by the website that created it. Cookies can be set to last
for years, ten in some cases, and are removed only when they expire
on their own or when the user clears them. The private information
in a cookie is whatever the user entered, such as form data or the
username and password combination that logs him or her into a
website; everything else is called metadata, or data on data. The way
ad networks hacked cookies is by realizing that a cookie can be set
when merely a single pixel is requested from a website. This
means that an ad company can embed slivers of its content all
across the internet and create a comprehensive surveillance grid
that knows every move of every visitor.
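As a rough sketch of the single-pixel trick, the Python function below builds the response headers a tracker could attach to a 1x1 image. The cookie name uid, the ten-year lifetime and the function name are assumptions chosen for illustration, not the behavior of any particular ad network.

```python
import uuid
from http.cookies import SimpleCookie

# Assumed lifetime for the sketch: roughly ten years, in seconds.
TEN_YEARS_IN_SECONDS = 10 * 365 * 24 * 60 * 60


def pixel_response_headers(request_cookie_header: str = "") -> list:
    """Return the headers a tracker could send along with its 1x1 image."""
    incoming = SimpleCookie()
    incoming.load(request_cookie_header)

    if "uid" in incoming:
        # Returning visitor: keep the identifier the browser sent back.
        uid = incoming["uid"].value
    else:
        # New visitor: mint a fresh random identifier.
        uid = uuid.uuid4().hex

    outgoing = SimpleCookie()
    outgoing["uid"] = uid
    outgoing["uid"]["max-age"] = TEN_YEARS_IN_SECONDS
    outgoing["uid"]["path"] = "/"

    return [
        ("Content-Type", "image/gif"),  # the "pixel" itself is just an image
        ("Set-Cookie", outgoing["uid"].OutputString()),
    ]


if __name__ == "__main__":
    print(pixel_response_headers())              # first request: new uid
    print(pixel_response_headers("uid=abc123"))  # later request: same uid
```

The image is irrelevant; the request for it is what matters, because it gives the tracker a chance to set or refresh its identifier.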
To keep things simple, let's say Coca-Cola
hires an ad company to serve soda ads online. The ad company
approaches websites, such as CNN.com, and pays a couple of cents
for each unique view and a bit more for each ad click that leads to
a purchase of soda. CNN gets millions of visits, so now it only has to
write engaging and truthful content to keep people coming. Thus,
users get interesting content, Coca-Cola gets to sell soda, but ad
companies have the biggest task – they have to psychologically
profile users to figure out which ad is the most appropriate and
justify the millions that Coca-Cola gave them. So far, it's all
pretty innocuous, but we're about to see how quickly this gets out
of hand.
Now let's imagine John, a typical internet
user. John visits CNN.com and gets served CNN's own cookie plus 20 third-party
cookies that have nothing to do with CNN itself but belong to
websites owned by the ad company. Why? Because ads aren't on CNN
itself; they are served from the ad company's websites, and each sets
its own cookie that doesn't have to contain any information
other than the time it was created. With that, John's online
presence is pinpointed in time down to a millisecond.
John sees the soda ads on CNN but doesn't
really feel thirsty. Now he's finished reading the article, and he
goes to YouTube, which sets its own cookie and another 20 third-party
cookies, this time from different ad websites. YouTube also serves
ads, but let’s say it partnered with Nike to sell sneakers. The ad
company that partnered with Nike serves different ads and watches
John's behavior: was the ad watched to
the end? Did John skip it? And so on. All of this helps compile
data not just on John but on his entire demographic, so if John is
32 years old, not married and loves hiking, his behavior can be
used to figure out what other men of that age, marital status and
hobby preference like or dislike and what kind of ad will make them
open their wallet.
Repeat this process enough, and over the
course of the day John will have received hundreds of third-party cookies,
only a few of which were actually necessary to use the websites he
visited; every other cookie is there to track him online by showing
when he visited each website. Websites can also agree to share
other data on users behind the scenes, with users completely
oblivious to the fact. In this way, the two ad companies create a
sprawling web of surveillance, and it all started with trying to
sell soda and sneakers to make everyone happy. Now imagine this same setup multiplied a thousandfold,
with different ad companies competing for data and ad placement,
and you'll get a bit closer to the real picture of what using the
internet is actually like.
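Here is a toy Python model of how the timeline comes together on the ad company's side. It assumes a single ad company has content embedded on every site John visits, and that each request carries the tracker's cookie plus the address of the page that embedded it (browsers usually send this in the Referer header). The class name, cookie IDs, page addresses and timestamps are all invented for the example.

```python
from collections import defaultdict
from urllib.parse import urlsplit


class AdTracker:
    """Toy model of one ad company's server-side log."""

    def __init__(self):
        # cookie ID -> list of (timestamp, site) pairs
        self.history = defaultdict(list)

    def record_request(self, cookie_id, referer, timestamp):
        """Log one request for embedded content (an ad, a pixel, a script)."""
        site = urlsplit(referer).hostname
        self.history[cookie_id].append((timestamp, site))

    def profile(self, cookie_id):
        """Return the sites this cookie was seen on, in time order.

        Assumes ISO-style timestamps, which sort correctly as plain text.
        """
        return [site for _, site in sorted(self.history[cookie_id])]


if __name__ == "__main__":
    tracker = AdTracker()
    tracker.record_request("uid-john", "https://www.cnn.com/article", "2018-10-01T09:00:00")
    tracker.record_request("uid-john", "https://www.youtube.com/watch", "2018-10-01T09:20:00")
    print(tracker.profile("uid-john"))
```

Neither website told the ad company who John is. The same cookie showing up in requests from different pages is enough to stitch his day together.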
It’s hard to overstate how much money is
involved in advertising. In 2017, Pepsi made an ad[15] starring Kendall Jenner where she is shown
posing a couple of times, walking and handing a can of Pepsi to
another actor. She’s on the screen some 30 seconds but apparently
got $400,000-1,000,000 for her role. These companies have enormous
budgets and can afford to drop millions on ads without flinching,
just to get a chance to penetrate another space before the
competition. Imagine having a website and being approached by one
of these companies with the offer of truckloads of money to place
ads. It’s free money and completely legal, so why not do it?
Ad companies approach millions of website
owners and offer them deals through Google ad services, which let
owners make money just by getting visited, thus exposing their
users' behavior. Over time, each user
builds up a completely unique stockpile of cookies that shows their
every move from the moment they got the first cookie. Web browsers
do allow cookies to be deleted manually and usually have a separate
option to reject all cookies, but some websites will detect the
latter and refuse access to such users. Blocking some or all
cookies can also make websites unusable, since it's rare that a
website hosts all of its own content.
So, to recap: cookies are a useful piece of
technology that has become the foundation of how we use the
internet, but third parties have figured out how to exploit cookies
and track users. Note how we qualified these ad company cookies as
being “third party”, as that's the core issue in the entire cookie
tracking problem. In this case, third-party content simply means
anything served to the user from a website other than the one he or
she chose to visit, typically without explicit permission or
knowledge. For example, John visited CNN.com but got 20 third-party
cookies from, let's say, adserve.com, adserver.com, adservices.com
and so on.
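In browser terms, a request is usually called third-party when it goes to a different site than the one in the address bar. The Python sketch below applies a deliberately naive version of that test by comparing the last two parts of each hostname. Real browsers rely on the Public Suffix List instead, so this shortcut gives wrong answers for addresses such as example.co.uk. The page and file addresses are hypothetical.

```python
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Naive 'site' of a URL: the last two labels of its hostname."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def is_third_party(page_url: str, resource_url: str) -> bool:
    """True if the resource comes from a different site than the page."""
    return site_of(page_url) != site_of(resource_url)


if __name__ == "__main__":
    page = "https://www.cnn.com/article"
    # Same site, different subdomain: first-party.
    print(is_third_party(page, "https://cdn.cnn.com/logo.png"))
    # Different site entirely: third-party.
    print(is_third_party(page, "https://adserve.com/pixel.gif"))
```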
By sharing content
and serving a digital potpourri to users, websites have made it
impossible to keep anything private or isolated; it would be like
50 ad executives listening in to every conversation you have with
your friend and cutting in to offer an ad based on what you're
mentioning. How is any of this cookie tracking legal? Don't
websites, in this case, CNN.com, have to disclose that they're
helping third parties track users? They actually do; it's just that
nobody reads any of these privacy policies. It's quite brilliant
because what would otherwise be
surveillance is perfectly acceptable when the user
consents.
In May 2018, the EU began enforcing GDPR, a sweeping
set of rules that covers websites using cookies for tracking and
mandates that users give “informed consent”, so websites simply put
up a huge banner for all incoming EU users that stated, “We're
tracking you using cookies.” The user then dismisses it and
continues being tracked. Now let's examine CNN's own privacy
policy, in particular, the part where cookies are
covered[16]. Privacy policies can
change, but the core meaning will always stay the same. This one is
current as of October 2018 and has a wall of text, but we'll just
focus on the words “third party” – since that reveals the method.
Ready?
“We or a third
party platform with whom we work may place or recognize a
unique cookie on your browser to enable you to receive customized
content, offers, services or advertisements on our Services or
other sites. These cookies contain no information intended to identify you
personally.”
You see how it's done? By simply admitting
that, well, we might be tracking you but it's not intended, CNN is off the hook.
Let's move on.
“We, our third
party service providers, advertisers, advertising networks
and platforms, agencies, or our Partners also may use cookies or
other tracking technologies to manage and measure the performance of advertisements displayed on or
delivered by or through the Turner Network and/or other networks or
Services. This also helps us, our service providers and Partners
provide more relevant advertising.”
There's the admission that the user behavior
is being analyzed to make better ads.
There's just one more paragraph, and we're done.
“Syncing Cookies and Identifiers. We may
work with our Partners (for instance, third party ad platforms) to synchronize unique, anonymous identifiers (such
as those associated with cookies) in order to match our Partners'
uniquely coded user identifiers to our own.”
Can you see it? User profiles are built from
what was visited on CNN and other websites and then compared
behind the scenes with what the ad company knows about the
user.
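A minimal Python sketch of that matching step follows. It assumes each party keeps its own identifier for the same browser and that, at some point, both identifiers appear in a single request so they can be paired. The class, the function and every ID in it are hypothetical; the policy does not describe how the syncing is actually implemented.

```python
class SyncTable:
    """Toy match table kept by one party in a cookie sync."""

    def __init__(self):
        self.partner_to_own = {}

    def sync(self, own_id, partner_id):
        """Remember that the partner's ID and ours belong to one browser."""
        self.partner_to_own[partner_id] = own_id

    def lookup(self, partner_id):
        """Translate a partner's ID into ours, or None if never synced."""
        return self.partner_to_own.get(partner_id)


def merge_profiles(table, own_profiles, partner_profiles):
    """Fold the partner's data into our profiles wherever the IDs match."""
    merged = {own_id: list(sites) for own_id, sites in own_profiles.items()}
    for partner_id, sites in partner_profiles.items():
        own_id = table.lookup(partner_id)
        if own_id is not None:
            merged.setdefault(own_id, []).extend(sites)
    return merged


if __name__ == "__main__":
    table = SyncTable()
    table.sync(own_id="site-42", partner_id="ad-9f3")
    print(merge_profiles(
        table,
        {"site-42": ["cnn.com/politics"]},
        {"ad-9f3": ["youtube.com"], "ad-000": ["example.org"]},
    ))
```

Both identifiers can be called anonymous on their own, yet once they are matched, two separate profiles become one.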
We mentioned YouTube, so let's examine its
cookie policy. Visit YouTube.com, scroll all the way
down, and you'll find a tiny link titled “Privacy” with a lot of good
info, but this is a video site, so search for “A look at cookies”
and hear Google's engineer Maile Ohye explain cookies almost the
same way we did at the start of this chapter. Overall, Google has
put a lot of effort into being honest with its users and is
probably the most transparent company when it comes to tracking.
One thing you'll notice is how few mentions there are of third-party services and
companies. This is because Google is the third party. Google has become so big in
the ad business that it commands the market, and it also allows
users to control these hidden ad profiles to an extent by visiting
the My Activity section of their Google account.
Google also hosts content, such as snippets of code, to help
website owners save money on bandwidth. Isn't that wonderful?
Remember what we said about cookies – if a single pixel is
requested from a third-party website, it gets to set its cookie, so
by hosting content, Google gets a much broader peek into browsing
habits of users. When YouTube videos are embedded into pages, the
cookie is set too, and Facebook, LinkedIn, and other social media
do something similar with their embedded
Like and Share buttons, all of which can be blocked with Adblock
Plus. That covers cookies; now let's examine other content found on
websites, such as JavaScript.