Newly proposed legislation by U.S. Senators Mark R. Warner and Josh Hawley seeks to protect privacy by forcing tech companies to disclose to users the “true value” of their data. Specifically, companies with more than 100 million users would have to provide each user with an assessment of the financial value of that user’s data, as well as reveal revenue generated by “obtaining, collecting, processing, selling, using or sharing user data.” In addition, the bill, known as the DASHBOARD Act, would give users the right to delete their data from companies’ databases.
The bill’s ambition centers on increasing transparency and empowering users, but estimating the value of user data is difficult, and doing so may well not solve privacy issues.
The data collected by tech companies consists not just of traditional identifying information such as name, age and gender. Rather, as Harvard historian Rebecca Lemov has noted, it includes “Tweets, Facebook likes, Twitches, Google searches, online comments, one-click purchases, even viewing-but-skipping-over a photograph in your feed.” In other words, big data contains the mundane yet intimate moments of people’s lives. And, if Facebook captures your interactions with friends and family, Google your late night searches, and Alexa your living room commands, wouldn’t you want to know, as the bill suggests, what your “data is worth and to whom it is sold”?
However, calculating the value of user data isn’t that simple. Estimates of what user data is worth vary widely. They range from less than a dollar for an average person’s data to a slightly more generous US$100 for a Facebook user. One user sold his data for $2,733 on Kickstarter. To reach that figure, he had to share data including keystrokes, mouse movements and frequent screenshots.
Sadly, the DASHBOARD Act doesn’t specify how it would estimate the value of user data. Instead, it explains that the Securities and Exchange Commission, an independent federal government agency, “shall develop a method or methods for calculating the value of user data.” The commission, I believe, will quickly realize that estimating the value of user data is a challenging undertaking.
More than personal
The proposed legislation aims to provide users with more transparency. However, privacy is no longer solely a matter of personal data. Data shared by a few can provide insights into the lives of many. Facebook likes, for example, can help predict a user’s sexual orientation with a high degree of accuracy. Target has used its purchase data to predict which customers are pregnant. The case garnered widespread attention after the retailer figured out a teen girl was pregnant before her father did.
Such predictive ability means that private information isn’t just contained in user data. Companies can also infer your private information from statistical correlations in the data of many other users. How can the value of such data be reduced to an individual dollar value? Aggregated data is worth more than the sum of its parts.
What’s more, this ability to use statistical analysis to identify people as belonging to a group category can have far-reaching privacy implications. If service providers can use predictive analytics to guess a user’s sexual orientation, race, gender or religious belief, what is to stop them from discriminating on that basis?
Having been let loose, predictive technologies will continue to work even if users delete their part of the data that helped create them.
Control through data
The sensitivity of data depends not just on what it contains, but on how governments and companies can use it to exert influence. This is evident in my current research on China’s planned social credit system. The Chinese government plans to use national databases and “trustworthiness ratings” to regulate the behavior of Chinese citizens. Google’s, Amazon’s and Facebook’s “surveillance capitalism,” as author Shoshana Zuboff has argued, also uses predictive data to “tune and herd our behavior towards the most profitable outcomes.”
In 2014, revelations that Facebook had experimented with its feed to influence the emotional state of users provoked a public outcry. However, this instance merely made visible how digital platforms, in general, can use data to keep users engaged and, in the process, generate more data.
Data privacy is as much about big tech’s ability to shape your personal life as about what it knows about you.
Who is harmed
The truth is that datafication, with all its privacy implications, does not affect everyone equally. Big data’s hidden biases and networked discrimination continue to reproduce inequalities around gender, race and class. Women, minorities and the poor are most strongly affected. UCLA professor Safiya Umoja Noble, for example, has shown how Google search rankings reinforce negative stereotypes about women of color. In light of such inequality, how could a numerical value ever capture the “true” value of user data?
The proposed legislation’s lack of specificity is disconcerting. However, even more troubling might be its insistence that data transparency will be achieved by revealing monetary value alone. Numeric assessments of financial worth don’t reflect data’s power to predict our actions or guide our decisions.
The DASHBOARD Act aims to make the business of data more transparent and to empower users. But if lawmakers want to tackle data privacy, they need not just to regulate data monetization but to address more broadly the value and cost of data in people’s lives.
Samuel Lengen is a Research Associate at the Data Science Institute at the University of Virginia.