This past week, I was sitting at the local watering hole when a stranger came over to me and asked, "Would you give me your name, address, date of birth, social security number and your email address?" I looked at him and said, "Are you nuts?" He sat down and said, "My friend over there," pointing to a guy about twenty meters away, "bet me 10,000 baht I could not get that information from you or other sources in an hour."
We spoke for less than a minute, and he said goodbye and promised to return within the hour with all of my information. Seem unlikely? Well, people often dole out all kinds of personal information on the Internet that allows such identifying data to be deduced. Services like Facebook, Twitter and Flickr are oceans of personal minutiae: birthday greetings sent and received, school and work gossip, photos of family vacations, and movies watched.
Published reports from computer scientists and policy experts say that such seemingly innocuous bits of self-revelation can increasingly be collected and reassembled by computers to help create a picture of a person's identity, sometimes down to the person's name, date of birth, social security number and e-mail address.
"Technology has rendered the conventional definition of personally identifiable information obsolete," said Maneesha Mithal, associate director of the Federal Trade Commission's privacy division. "You can find out who an individual is without it." So far, this type of powerful data mining, which relies on sophisticated statistical correlations, is mostly in the realm of university researchers, not identity thieves and marketers.
But the F.T.C. is worried that rules to protect privacy have not kept up with technology. The agency is convening three workshops on the issue to try to advise trade organizations. The F.T.C.'s concerns are hardly far-fetched. Last fall, Netflix awarded one million dollars to a team of statisticians and computer scientists who won a three-year contest to analyze the movie rental history of 500,000 Netflix subscribers and improve the predictive accuracy of Netflix's recommendation software by at least 10 percent. Netflix has since announced that it was shelving plans for a second contest, bowing to privacy concerns raised by the F.T.C. and a private litigant. In 2008, a pair of researchers at the University of Texas showed that the customer data released for that first contest, despite being stripped of names and other direct identifying information, could often be "de-anonymized" by statistically analyzing an individual's distinctive pattern of movie ratings and recommendations.
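For readers who want to see the idea in miniature, here is a rough Python sketch of how a distinctive pattern of ratings can single out one record in a "nameless" data set. It is a toy illustration, not the researchers' actual method: the records, movie titles, function names, rarity weighting and match threshold are all invented for this example.

```python
import math

# Hypothetical "anonymized" release: record IDs stand in for names.
anonymized = {
    "user_001": {"Popular Hit": 4, "Big Sequel": 3, "Obscure Documentary": 5},
    "user_002": {"Popular Hit": 4, "Big Sequel": 3, "Cult Horror": 2},
    "user_003": {"Popular Hit": 5, "Obscure Documentary": 1, "Cult Horror": 4},
}

# Hypothetical ratings the same person posted publicly under a real name.
public_profile = {"Popular Hit": 4, "Obscure Documentary": 5}


def movie_weights(records):
    """Give rarely rated titles larger weights, since they are more identifying."""
    counts = {}
    for ratings in records.values():
        for title in ratings:
            counts[title] = counts.get(title, 0) + 1
    return {title: 1.0 / math.log(1 + n) for title, n in counts.items()}


def score(known, ratings, weights, tolerance=1):
    """Sum the weights of titles whose ratings agree within `tolerance` stars."""
    total = 0.0
    for title, stars in known.items():
        if title in ratings and abs(ratings[title] - stars) <= tolerance:
            total += weights.get(title, 0.0)
    return total


def best_match(known, records, margin=1.5):
    """Return the record that clearly stands out, or None if no record does."""
    weights = movie_weights(records)
    ranked = sorted(
        ((score(known, ratings, weights), rid) for rid, ratings in records.items()),
        reverse=True,
    )
    (top, top_id), (second, _) = ranked[0], ranked[1]
    # Only claim a match when the best score beats the runner-up by `margin`.
    if top > 0 and (second == 0 or top / second >= margin):
        return top_id
    return None


print(best_match(public_profile, anonymized))
```

In this toy data, knowing only the rating of the widely watched title matches every record equally, so the sketch declines to name anyone. Adding a single rating of the obscure title is enough to make one record stand out.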
In social networks, people can increase their defenses against identification by adopting tight privacy controls on information in personal profiles. Yet an individual's actions, researchers say, are rarely enough to protect privacy in the interconnected world of the Internet. You may not disclose personal information, but your online friends and colleagues may do it for you, referring to your school or employer, gender, location and interests. Patterns of social communication, researchers say, are revealing.
"Personal privacy is no longer an individual thing," said Harold Abelson, a computer science professor at M.I.T. "In today's online world, what your mother told you is true, only more so: people really can judge you by your friends." Collected together, the pool of information about each individual can form a distinctive "social signature," researchers say.
The power of computers to identify people from social patterns alone was demonstrated last year in a study by the same pair of researchers who cracked Netflix's anonymous database: Vitaly Shmatikov, an associate professor of computer science at the University of Texas, and Arvind Narayanan, now a researcher at Stanford University.
By examining correlations between various online accounts, the scientists showed that they could identify more than 30 percent of the users of both Twitter, the microblogging service, and Flickr, an online photo-sharing service, even though the accounts had been stripped of identifying information like account names and e-mail addresses. "When you link these large data sets together, a small slice of our behavior and the structure of our social networks can be identifying," Mr. Shmatikov said.
The F.T.C. and Congress are weighing steps like tighter industry requirements and the creation of a "do not track" list, similar to the federal "do not call" list, to stop online monitoring.
Jon Kleinberg, a professor of computer science at Cornell University who studies social networks, is skeptical that rules will have much impact. His advice: "When you're doing stuff online, you should behave as if you're doing it in public because increasingly, it is."
Oh, by the way, the stranger was back in 44 minutes with all of the information he had earlier requested.