In 2019, Google publicly outlined a possible future it had envisioned for the web: the elimination of third-party cookies in favor of a new behavioral targeting method built around group browsing habits. The proposal came to be known as the Privacy Sandbox, a series of cookie-less protocols intended to work in parallel to deliver the experience that third-party cookies currently uphold, with FLoC (Federated Learning of Cohorts) as its targeting centerpiece. The initiative would open up new horizons in how user data is collected, forming cohorts of a few thousand members or more based on similarities among individual browsing histories.
According to the specifics the tech giant has outlined thus far, cohorts are assigned using the domains of the sites each user has previously visited, which are fed into an algorithm computed locally on the user's machine; no central server is needed. Google also states that cohorts would be recalculated on a weekly basis, which creates enormous potential for measuring how a user's behavior shifts over time. With this in mind, it is unsurprising that such a shift would also bring new difficulties in the privacy arena that we do not currently have to consider.
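To make the local computation concrete, the sketch below imitates the general idea with a SimHash-style locality-sensitive hash over visited domains, so that similar browsing histories tend to collide on the same fingerprint. This is a minimal illustration under our own assumptions: the hash function, bit width, and example domains are placeholders, not Chrome's actual implementation, which reportedly layered additional server-side clustering and anonymity checks on top of the on-device step.

```typescript
// Illustrative sketch only: a SimHash-style cohort fingerprint computed locally
// from the domains a user visited. Not Google's actual algorithm or parameters.

// Toy FNV-1a string hash; Chrome's real feature hashing differs.
function hashDomain(domain: string, seed: number): number {
  let h = 2166136261 ^ seed;
  for (const ch of domain) {
    h ^= ch.charCodeAt(0);
    h = Math.imul(h, 16777619);
  }
  return h >>> 0; // unsigned 32-bit
}

// SimHash: every visited domain "votes" on each bit; the sign of the vote
// total decides that bit of the cohort fingerprint.
function simHashCohort(visitedDomains: string[], bits = 16): number {
  const counts = new Array(bits).fill(0);
  for (const domain of visitedDomains) {
    for (let b = 0; b < bits; b++) {
      // Per-bit seed so each bit gets an independent pseudo-random vote.
      counts[b] += hashDomain(domain, b) & 1 ? 1 : -1;
    }
  }
  let fingerprint = 0;
  for (let b = 0; b < bits; b++) {
    if (counts[b] > 0) fingerprint |= 1 << b;
  }
  return fingerprint; // similar histories tend to land in the same cohort
}

// Example: recomputed locally (e.g., weekly) from the past week's history.
const cohortId = simHashCohort(["news.example", "recipes.example", "shop.example"]);
console.log(cohortId.toString(2).padStart(16, "0"));
```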
Fingerprinting
Fingerprinting, the practice of gathering many distinct pieces of information from a user's browser with the aim of building a stable identifier for that person, is notoriously difficult to stop, and browser vendors have fought long and hard to combat it. FLoC does not remove this threat: a cohort ID on its own should not distinguish you from a few thousand other individuals, but a tracker starting from that benchmark has hundreds of thousands fewer records to search through and compare. Furthermore, the entropy of FLoC cohort data has been estimated at roughly 8 bits, which is all the more potent because a cohort ID is unlikely to be correlated with the other information the browser exposes.
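To see why those 8 bits matter, the back-of-the-envelope sketch below works through the arithmetic with hypothetical numbers of our own choosing: a cohort ID narrows the pool a fingerprinter must distinguish among, and ordinary browser signals can plausibly supply the remaining entropy.

```typescript
// Back-of-the-envelope sketch of why cohort data helps fingerprinters.
// The cohort size and per-signal entropies below are hypothetical, for illustration only.

const cohortSize = 5_000; // "a few thousand" browsers sharing one cohort ID

// Bits a tracker still needs to tell one browser apart from the rest of its cohort:
const bitsStillNeeded = Math.log2(cohortSize); // ≈ 12.3 bits

// Assumed (not measured) entropies of some common fingerprinting surfaces:
const otherSignals: Record<string, number> = {
  userAgent: 5,
  timezone: 3,
  screenResolution: 4,
  installedFonts: 6,
};
const otherBits = Object.values(otherSignals).reduce((a, b) => a + b, 0); // 18 bits

// Because the cohort ID is unlikely to be correlated with those surfaces, its
// ~8 bits simply stack on top of them, leaving ample margin to single a user out.
console.log({ bitsStillNeeded, otherBits, surplus: otherBits - bitsStillNeeded });
```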
Cross-Context Exposure
Secondly, for FLoC to be useful to advertisers, a user's cohort must necessarily reveal information about their behavior. The project's GitHub page acknowledges this directly:
This API democratizes access to some information about an individual’s general browsing history (and thus, general interests) to any site that opts into it. … Sites that know a person’s PII (e.g., when people sign in using their email address) could record and reveal their cohort. This means that information about an individual’s interests may eventually become public.
Though a cohort ID is not harmful by itself, companies that use FLoC would be able to identify a user in other ways, including tying the information FLoC reveals to a user's profile once that user signs in and provides PII such as an email address. Through this pipeline, a company could expose two categories of information: the sites a person has likely visited, and inferences about their demographic background (e.g. race, political affiliation, and so on).
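As a concrete illustration of that pipeline, the hedged sketch below shows how a site with a signed-in user could read the cohort via the document.interestCohort() call proposed in the FLoC explainer and log it against the user's account. The analytics endpoint and sign-in details are hypothetical; this illustrates the risk rather than describing any particular company's practice.

```typescript
// Sketch of the cross-context risk described above, built on the
// document.interestCohort() call from the FLoC explainer. The analytics endpoint
// and user details are hypothetical.

interface InterestCohort {
  id: string;      // cohort label shared with a few thousand other browsers
  version: string; // which clustering run produced it
}

// The proposed API is not in standard DOM typings, so widen the document type.
const flocDocument = document as Document & {
  interestCohort?: () => Promise<InterestCohort>;
};

// A site where the user has signed in already holds PII such as an email address.
// Recording the weekly cohort alongside that PII turns "anonymous" interest data
// into a longitudinal interest profile tied to a real identity.
async function recordCohortForSignedInUser(email: string): Promise<void> {
  if (!flocDocument.interestCohort) return; // API absent or blocked by the browser
  const { id, version } = await flocDocument.interestCohort();
  await fetch("/analytics/cohort", { // hypothetical first-party endpoint
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ email, cohortId: id, cohortVersion: version, recordedAt: Date.now() }),
  });
}
```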
With this in mind, it is essential that privacy-oriented companies keep an eye on how metrics are gathered about their audiences and make sure the needs of their users are met in an ever-evolving market. Download Wave Browser and see how we are making a difference in the browser landscape going forward!