Chapter 6: Tracking Through Cookies
What does your typical day spent online look like?
You might visit a couple of YouTube channels you always watch, check out the local news websites you always follow and get some entertainment from the same sources as always. The thing is, we all have behavior patterns that change very little, offline and online: same food, same entertainment and so on. We break out of the rut when we experience major events, such as moving or buying a house or car, and that's what the ads are aiming for: to jump in right as we're about to make a major course correction and offer their product or service. The purpose of ads isn't to convince us to buy a product once, but to make us lifelong customers.
To do that, ad networks created a scheme to track users online and analyze their behavior, such as where users click and how long they stay on each page. But first things first: let's explain cookies. Visiting a website sets a cookie, a small, unique text file that has legitimate uses, such as showing which links we've opened or keeping us logged in when we come back to the site. When the internet first became popular in the 1990s, the problem websites faced was that they had no way to distinguish between users. Cookies were a solution that created a persistent identity for users; cookies worked back then, so the concept just remained and nobody really thought about what would happen when cookies got exploited.
A cookie is self-contained and can only be read by the website that created it. A cookie's lifetime is set by the website that creates it and can run to ten years or more; it is removed only when it expires on its own or when the user clears it. All private information in a cookie is information the user entered, such as form data or the username and password combination that logs him or her into a website; everything else is called metadata, or data on data. The way ad networks hacked cookies is by realizing that a cookie can be set even if merely a single pixel is requested from a website. This means that an ad company can embed slivers of its content all across the internet and create a comprehensive surveillance grid that knows every move of every visitor.
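To make the single-pixel trick concrete, here is a minimal Python sketch of the response headers a tracker might send along with a 1x1 image. The cookie name uid, the domain adserve.example and the ten-year lifetime are illustrative assumptions, not details of any real ad network.

```python
from http.cookies import SimpleCookie
import uuid

TEN_YEARS_IN_SECONDS = 10 * 365 * 24 * 60 * 60

def build_pixel_headers(visitor_id=None):
    """Return the headers a hypothetical tracker could send with a 1x1 image."""
    # Reuse the ID the browser sent back, or mint one for a first-time visitor.
    visitor_id = visitor_id or uuid.uuid4().hex
    cookie = SimpleCookie()
    cookie["uid"] = visitor_id                     # hypothetical cookie name
    cookie["uid"]["domain"] = "adserve.example"    # hypothetical ad domain
    cookie["uid"]["path"] = "/"
    cookie["uid"]["max-age"] = TEN_YEARS_IN_SECONDS
    return {
        "Content-Type": "image/gif",               # the "single pixel"
        "Set-Cookie": cookie["uid"].OutputString(),
    }

headers = build_pixel_headers()
print(headers["Set-Cookie"])
```

The image itself is irrelevant; what matters is that the browser stores the identifier and sends it back on every later request to the same domain.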
To keep things simple, let's say Coca-Cola hires an ad company to serve soda ads online. The ad company approaches websites, such as CNN.com, and pays a couple of cents for each unique view and a bit more for each ad click that leads to a purchase of soda. CNN gets millions of visits, so it now only has to write engaging and truthful content to keep people coming. Thus, users get interesting content and Coca-Cola gets to sell soda, but ad companies have the biggest task: they have to psychologically profile users to figure out which ad is the most appropriate and justify the millions that Coca-Cola gave them. So far, it's all pretty innocuous, but we're about to see how quickly this gets out of hand.
Now let's imagine John, a typical internet user. John visits CNN.com and gets served CNN's own cookie plus 20 third-party cookies that have nothing to do with CNN itself but belong to websites owned by the ad company. Why? Because the ads aren't on CNN itself; they are served from the ad company's websites, and each sets its own cookie. That cookie doesn't have to contain any information other than the time it was created, and now we've got John's online presence pinpointed in time down to a millisecond.
John sees the soda ads on CNN but doesn't really feel thirsty. Now he's finished reading the article, and he goes to YouTube, which sets its own cookie and another 20 third-party cookies, this time from different ad websites. YouTube also serves ads, but let's say it partnered with Nike to sell sneakers. The ad company that partnered with Nike serves different ads and watches John's behavior: was the ad watched to the end? Did John skip it? And so on. All of this helps compile data not just on John but on his entire demographic, so if John is 32 years old, not married and loves hiking, his behavior can be used to figure out what other men of that age, marital status and hobby preference like or dislike and what kind of ad will make them open their wallets.
Repeat this process enough, and over the course of the day John has received hundreds of third-party cookies, only a few of which were actually necessary to use the websites he visited; every other cookie is there to track him online by showing when he visited each website. Websites can also agree to share other data on users behind the scenes, with users completely oblivious to the fact. In this way, the two ad companies create a sprawling web of surveillance, and it all started with trying to sell soda and sneakers to make everyone happy. Now imagine this same setup increased a thousandfold, with different ad companies competing for data and ad placement, and you'll get a bit closer to the real picture of what using the internet is actually like.
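The bookkeeping on the tracker's side can be sketched in a few lines of Python. This is a toy model under stated assumptions: the Tracker class, the visitor-N identifiers and the timestamps are all hypothetical, and for simplicity a single ad company is assumed to have content embedded on both sites.

```python
from collections import defaultdict
import itertools

class Tracker:
    """A toy third-party tracker whose content is embedded on many sites."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.history = defaultdict(list)  # cookie ID -> [(time, site), ...]

    def handle_request(self, cookie_id, embedding_site, timestamp):
        # No cookie sent: treat the browser as new and assign it an ID.
        if cookie_id is None:
            cookie_id = f"visitor-{next(self._next_id)}"
        # Record which site embedded the tracker's content, and when.
        self.history[cookie_id].append((timestamp, embedding_site))
        return cookie_id  # returned to the browser as the cookie value

tracker = Tracker()
john_cookie = None  # John's browser starts with no cookie from this tracker
visits = [("09:00", "cnn.com"), ("09:20", "youtube.com"), ("12:05", "cnn.com")]
for timestamp, site in visits:
    john_cookie = tracker.handle_request(john_cookie, site, timestamp)

print(john_cookie, tracker.history[john_cookie])
```

The tracker never needs John's name: one identifier that keeps coming back from different sites is enough to reconstruct his day.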
It’s hard to overstate how much money is involved in advertising. In 2017, Pepsi made an ad[15] starring Kendall Jenner where she is shown posing a couple of times, walking and handing a can of Pepsi to another actor. She’s on the screen some 30 seconds but apparently got $400,000-1,000,000 for her role. These companies have enormous budgets and can afford to drop millions on ads without flinching, just to get a chance to penetrate another space before the competition. Imagine having a website and being approached by one of these companies with the offer of truckloads of money to place ads. It’s free money and completely legal, so why not do it?
Ad companies approach millions of website owners and offer them deals through Google ad services, which let owners make money just by getting visited, thereby exposing their users' behavior. Over time, each user builds up a completely unique stockpile of cookies that shows their every move from the moment they got the first cookie. Web browsers do allow cookies to be deleted manually and usually have a separate option to reject all cookies, but some websites will detect the latter and refuse access to such users. Blocking some or all cookies can also make websites unusable, since it's rare that a website hosts all of its own content.
So, to recap: cookies are a useful piece of technology that has become the foundation of how we use the internet, but third parties have figured out how to exploit cookies and track users. Note how we qualified these ad company cookies as being “third party”, as that's the core issue in the entire cookie tracking problem. In this case, third-party content simply means anything served to the user from a website other than the one he or she chose to visit, usually without explicit permission or knowledge. For example, John visited CNN.com but got 20 third-party cookies from, let's say, adserve.com, adserver.com, adservices.com and so on.
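The first-party versus third-party distinction comes down to comparing two hostnames. The Python sketch below is deliberately naive: it assumes a site is identified by the last two labels of its hostname, which is only an approximation (browsers rely on a maintained list of public suffixes to handle cases such as .co.uk). The function names and URLs are illustrative.

```python
from urllib.parse import urlsplit

def site_of(url):
    """Naive 'site' of a URL: the last two labels of its hostname."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])

def is_third_party(page_url, request_url):
    """True if the requested content comes from a different site than the page."""
    return site_of(page_url) != site_of(request_url)

page = "https://www.cnn.com/some-article"
for request in ("https://cdn.cnn.com/logo.png", "https://adserve.com/pixel.gif"):
    label = "third party" if is_third_party(page, request) else "first party"
    print(request, "->", label)
```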
By sharing content and serving a digital potpourri to users, websites have made it impossible to keep anything private or isolated; it would be like 50 ad executives listening in on every conversation you have with your friend and cutting in to offer an ad based on what you're mentioning. How is any of this cookie tracking legal? Don't websites, in this case CNN.com, have to disclose that they're helping third parties track users? They actually do; it's just that nobody reads any of these privacy policies. It's quite brilliant because what would otherwise be surveillance is perfectly acceptable when the user consents.
In May 2018, GDPR took effect in the EU, a sweeping set of rules for websites using cookies for tracking, mandating that users have to give “informed consent”, so websites simply put up a huge banner for all incoming EU users that stated, “We're tracking you using cookies.” The user then dismisses it and continues being tracked. Now let's examine CNN's own privacy policy, in particular, the part where cookies are covered[16]. Privacy policies can change, but the core meaning will always stay the same. This one is current as of October 2018 and has a wall of text, but we'll just focus on the words “third party”, since that reveals the method. Ready?
“We or a third party platform with whom we work may place or recognize a unique cookie on your browser to enable you to receive customized content, offers, services or advertisements on our Services or other sites. These cookies contain no information intended to identify you personally.” 
You see how it's done? By simply admitting that, well, we might be tracking you but it's not intended, just like that, CNN is off the hook. Let's move on.
“We, our third party service providers, advertisers, advertising networks and platforms, agencies, or our Partners also may use cookies or other tracking technologies to manage and measure the performance of advertisements displayed on or delivered by or through the Turner Network and/or other networks or Services. This also helps us, our service providers and Partners provide more relevant advertising.” 
There's the admission that the user behavior is being analyzed to make better ads. There's just one more paragraph, and we're done.
“Syncing Cookies and Identifiers. We may work with our Partners (for instance, third party ad platforms) to synchronize unique, anonymous identifiers (such as those associated with cookies) in order to match our Partners' uniquely coded user identifiers to our own.” 
Can you see it? User profiles are built from what was visited on CNN and other websites, then compared behind the scenes with what the ad company knows about the user.
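The end result of such syncing can be sketched as a simple join between two tables. This Python example shows only the matching step, assuming the two parties have already agreed on which of their identifiers refer to the same browser; how that mapping is obtained is not covered here. All identifiers and profile fields below are made up for illustration.

```python
def sync_profiles(publisher_profiles, partner_profiles, id_map):
    """Merge two profile tables using a publisher-ID -> partner-ID mapping."""
    merged = {}
    for publisher_id, partner_id in id_map.items():
        merged[publisher_id] = {
            **publisher_profiles.get(publisher_id, {}),
            **partner_profiles.get(partner_id, {}),
        }
    return merged

# Hypothetical data: what the news site knows and what the ad platform knows.
publisher_profiles = {"pub-001": {"articles_read": ["politics", "hiking"]}}
partner_profiles = {"ad-9f3": {"ads_clicked": ["sneakers"]}}
id_map = {"pub-001": "ad-9f3"}  # the "synchronized" identifiers

print(sync_profiles(publisher_profiles, partner_profiles, id_map))
```

Neither identifier names a person, yet once the two are matched, each side ends up with a richer profile of the same visitor.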
We mentioned YouTube, so let's examine its cookie policy. Visit YouTube.com and scroll all the way down, and there's a tiny link titled “Privacy” with a lot of good info, but this is a video site, so search for “A look at cookies” and hear Google's engineer Maile Ohye explain cookies almost the same way we did at the start of this chapter. Overall, Google has put a lot of effort into being honest with its users and is probably the most transparent company when it comes to tracking. One thing you'll notice is how few mentions there are of third-party services and companies. This is because Google is the third party. Google has become so big in the ad business that it commands the market. It also allows users to control these hidden ad profiles to an extent by visiting the My Activity section of their Google account.
Google also hosts content, such as snippets of code, to help website owners save money on bandwidth. Isn't that wonderful? Remember what we said about cookies: if a single pixel is requested from a third-party website, it gets to set its cookie, so by hosting content, Google gets a much broader peek into the browsing habits of users. When YouTube videos are embedded into pages, the cookie is set too, and Facebook, LinkedIn and other social media do something similar with their embedded Like and Share buttons, all of which can be blocked with Adblock Plus. This covers cookies; now let's examine other content found on websites, such as JavaScript.