Personal Data Collection: The Complete WIRED Guide


The information data brokers collect may be inaccurate or out of date. Still, it can be incredibly valuable to corporations, marketers, investors, and individuals. In fact, American companies alone are estimated to have spent over $19 billion in 2018 acquiring and analyzing consumer data, according to the Interactive Advertising Bureau.

Data brokers are also valuable resources for abusers and stalkers. Doxing, the practice of publicly releasing someone’s personal information without their consent, is often made possible because of data brokers. While you can delete your Facebook account relatively easily, getting these firms to remove your information is time-consuming, complicated, and sometimes impossible. In fact, the process is so burdensome that you can pay a service to do it on your behalf.

Amassing and selling your data like this is perfectly legal. While some states, including California and Vermont, have recently moved to put more restrictions on data brokers, they remain largely unregulated. The Fair Credit Reporting Act dictates how information collected for credit, employment, and insurance reasons may be used, but some data brokers have been caught skirting the law. In 2012 the “person lookup” site Spokeo settled with the FTC for $800,000 over charges that it violated the FCRA by advertising its products for purposes like job background checks. And data brokers that market themselves as being more akin to digital phone books don’t have to abide by the regulation in the first place.

There are also few laws governing how social media companies may collect data about their users. In the United States, no modern federal privacy regulation exists, and the government can even legally request digital data held by companies without a warrant in many circumstances (though the Supreme Court recently expanded Fourth Amendment protections to a narrow type of location data).

The good news is, the information you share online does contribute to the global store of useful knowledge: Researchers from a number of academic disciplines study social media posts and other user-generated data to learn more about humanity. In his book, Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are, Seth Stephens-Davidowitz argues there are many scenarios where humans are more honest with sites like Google than they are on traditional surveys. For example, he says, fewer than 20 percent of people admit they watch porn, but there are more Google searches for “porn” than “weather.”

Personal data is also used by artificial intelligence researchers to train their automated programs. Every day, users around the globe upload billions of photos, videos, text posts, and audio clips to sites like YouTube, Facebook, Instagram, and Twitter. That media is then fed to machine learning algorithms, so they can learn to “see” what’s in a photograph or automatically determine whether a post violates Facebook’s hate-speech policy. Your selfies are literally making the robots smarter. Congratulations.


The History of Personal Data Collection

Humans have used technological devices to collect and process data about the world for thousands of years. As far back as 150 BC, Greek scientists developed the “first computer,” a complex gear system called the Antikythera mechanism, to trace astronomical patterns. Two millennia later, in the late 1880s, Herman Hollerith invented the tabulating machine, a punch card device that helped process data from the 1890 United States Census. Hollerith created a company to market his invention that later merged into what is now IBM.

By the 1960s, the US government was using powerful mainframe computers to store and process an enormous amount of data on nearly every American. Corporations also used the machines to analyze sensitive information including consumer purchasing habits. There were no laws dictating what kind of data they could collect. Worries over supercharged surveillance soon emerged, especially after the publication of Vance Packard’s 1964 book, The Naked Society, which argued that technological change was causing the unprecedented erosion of privacy.

The next year, President Lyndon Johnson’s administration proposed merging hundreds of federal databases into one centralized National Data Bank. Congress, concerned about possible surveillance, pushed back and organized a Special Subcommittee on the Invasion of Privacy. Lawmakers worried the data bank, which would “pool statistics on millions of Americans,” could “possibly violate their secret lives,” The New York Times reported at the time. The project was never realized. Instead, Congress passed a series of laws governing the use of personal data, including the Fair Credit Reporting Act in 1970 and the Privacy Act in 1974. The regulations mandated transparency but did nothing to prevent the government and corporations from collecting information in the first place, argues technology historian Margaret O’Mara.

Toward the end of the 1960s, some scholars, including MIT political scientist Ithiel de Sola Pool, predicted that new computer technologies would continue to facilitate even more invasive personal data collection. The reality they envisioned began to take shape in the mid-1990s, when many Americans started using the internet. By the time most everyone was online, though, one of the first privacy battles over digital data brokers had already been fought: In 1990, Lotus Corporation and the credit bureau Equifax teamed up to create Lotus MarketPlace: Households, a CD-ROM marketing product that was advertised to contain names, income ranges, addresses, and other information about more than 120 million Americans. It quickly caused an uproar among privacy advocates on digital forums like Usenet; over 30,000 people contacted Lotus to opt out of the database. It was ultimately canceled before it was even released. But the scandal didn’t stop other companies from creating massive data sets of consumer information in the future.


