In thinking about the Wikileaks phenomenon this week, I continually come back to the fact that so much information was leaked so easily. Apparently one individual was responsible for leaking both the massive trove of State Department material and the previous haul of Department of Defense documents: more than 250,000 cables from the State Department diplomatic corps and 90,000 DOD documents about the U.S. war in Afghanistan. The information was released to the Wikileaks site in two installments by an Army private named Bradley Manning. Manning had access to the Secret Internet Protocol Router Network (SIPRNET) used by the DOD and the State Department. According to Manning, the data he had access to was not that difficult to copy to his own media and then send to Wikileaks via a personal computer:
“lets just say *someone* i know intimately well, has been penetrating US classified networks, mining data like the ones described … and been transferring that data from the classified networks over the “air gap” onto a commercial network computer … sorting the data, compressing it, encrypting it, and uploading it to a crazy white haired aussie who can’t seem to stay in one country very long”. (http://www.boingboing.net/2010/06/19/wikileaks-a-somewhat.html)
Compared to the famous “Pentagon Papers” leaks, PFC Manning’s actions seem almost effortless. In 1969, Daniel Ellsberg and a friend at the RAND Corporation had to surreptitiously photocopy 47 volumes of a secret DOD history of the Vietnam War during off hours and weekends, sneaking the documents out in small batches that would fit in his briefcase. According to Ellsberg, this took several months, and on several occasions he was convinced that he would be caught. (http://www.mostdangerousman.org/)
The Wikileaks case is another example of how technology has reduced the difficulty of connecting information with people over distance and in small packages. It is now extremely easy to share information, even top secret information, with the world. My concern about the Wikileaks leaks is not so much about the revealing of classified government information. It has more to do with what it says about protecting any information that is stored on a computer network.
Recently, the Federal Trade Commission released a report on digital privacy that raised very legitimate concerns about both public and private data mining activity. The report cites a general lack of knowledge among most people about how much data is routinely collected without explicit consent. Most of the mining is done electronically and automatically through web browser “cookies” that track our online movements and the information we type into web sites. The information is mined by (mostly) private companies that operate in relative anonymity. Most of this data collection exists in a legal gray area that the FTC argues should be better regulated. Furthermore, the mined data is sold across various networks of data brokers that are completely unregulated. The report contains a chart that shows how complicated and pervasive this data network is:
(http://www.ftc.gov/opa/2010/12/privacyreport.shtm)
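For readers who want to see the mechanism, here is a rough sketch of how a tracking cookie can tie together visits to unrelated sites. It is not taken from the FTC report. The `Tracker` class, the `visitor_id` cookie name, and the example page names are all made up for illustration, and real ad networks are far more elaborate.

```python
import uuid
from http.cookies import SimpleCookie


class Tracker:
    """Hypothetical third-party server whose content is embedded on many sites."""

    def __init__(self):
        # Maps a visitor id to the list of pages where that visitor was seen.
        self.profiles = {}

    def handle_request(self, cookie_header, referring_page):
        """Record the visit and return the cookie string the browser will keep."""
        cookie = SimpleCookie(cookie_header)
        if "visitor_id" in cookie:
            # Returning browser: reuse the id it sent back to us.
            visitor_id = cookie["visitor_id"].value
        else:
            # First visit: invent a random id for this browser.
            visitor_id = uuid.uuid4().hex
        self.profiles.setdefault(visitor_id, []).append(referring_page)
        return f"visitor_id={visitor_id}"


tracker = Tracker()
# The first page load has no cookie, so the tracker issues one.
stored_cookie = tracker.handle_request("", "news.example/politics")
# The browser sends the same cookie back from two other, unrelated sites.
tracker.handle_request(stored_cookie, "shop.example/cart")
tracker.handle_request(stored_cookie, "bank.example/login")

for visitor_id, pages in tracker.profiles.items():
    print(visitor_id, pages)
```

The point of the sketch is that the user never typed a name or logged in anywhere, yet all three visits end up in one profile because the browser quietly returned the same cookie each time.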
One of the most basic recommendations in the report is an “opt out” option in browsers that is easy to use. This would give consumers the opportunity to keep their data from being collected or to customize their data sharing options. This is particularly timely because the newest version of the Hypertext Markup Language (HTML5) will make data mining easy and ubiquitous. According to a New York Times article:
“The new Web language and its additional features present more tracking opportunities because the technology uses a process in which large amounts of data can be collected and stored on the user’s hard drive while online. Because of that process, advertisers and others could, experts say, see weeks or even months of personal data. That could include a user’s location, time zone, photographs, text from blogs, shopping cart contents, e-mails and a history of the Web pages visited.” (http://www.nytimes.com/2010/10/11/business/media/11privacy.html?_r=1&scp=1&sq=HTML%205%20and%20provacy&st=cse)
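To make the “opt out” idea concrete, here is a minimal sketch of what honoring such a preference might look like on the server side. It assumes the browser sends the preference as a `DNT: 1` request header, which is one mechanism browsers have used for this purpose; the FTC report does not prescribe a particular format. The function names and the cookie value are hypothetical.

```python
def should_track(headers):
    """Return False when the request carries an opt-out signal."""
    # Assumption: the opt-out preference arrives as a "DNT: 1" header.
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") != "1"


def respond(headers):
    """Build response headers, setting a tracking cookie only if allowed."""
    if should_track(headers):
        return {"Set-Cookie": "visitor_id=abc123"}  # made-up cookie value
    return {}


print(respond({"User-Agent": "ExampleBrowser"}))            # cookie is set
print(respond({"User-Agent": "ExampleBrowser", "DNT": "1"}))  # no cookie
```

Notice that the decision sits entirely with the company receiving the request. A browser setting like this only helps if the sites and data brokers on the other end choose, or are required, to respect it.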
Data mining is the heart of what Eric Stoddart calls “dataveillance”. In his article in the journal Studies in Christian Ethics, Stoddart measures the impact of electronic surveillance through data on our collective sense of privacy and personal liberty. Stoddart’s article is more about which institutions in civil society should be responsible for regulating data mining than about the actual technology behind the data collection. Both analyses are important if we are going to form an accurate picture of how, and how much, data is collected. Stoddart categorizes the societal investment in internal security that informs surveillance responses in two ways: “categorical suspicion” and “categorical seduction.” (Studies in Christian Ethics, December 2008, vol. 21, no. 3, 362-381)
While the need for security against terrorism and criminal activity drives Orwellian categorical suspicion surveillance, Stoddart, privacy advocates and the FTC all claim that the more Huxleyan categorical seduction is more worrisome. It is more worrisome because we invite the surveillance into our lives through our online activity while remaining mostly unaware that it is happening. Added to that is the concern that the various entities that deal in data are almost completely unregulated.
Am I worried? Yes and no. I am worried in a larger societal context that information about our private lives is collected and unregulated. I worry that we live more and more in a society that values privacy less and less.
On a consumer level, I love the ease of use of the World Wide Web. I take standard precautions to minimize identity theft, but I have cookies enabled in all my browser settings and I regularly conduct financial transactions online. I would consider limiting those cookie-enabled transactions an undue burden.
I suspect that my attitudes to my own data are probably common. How do you feel about data mining?
What steps do you employ to limit or deny data mining in your own life?
What measures would you support on the regulation of data mining? Any?
Given that there is no explicitly stated right to privacy in the US Constitution (but there is a presumption of privacy built on precedent in case law), would you support a constitutional change?
Do you have any professional experience with data mining?
Some further resources about privacy in the digital age:
The Electronic Frontier Foundation’s Surveillance Self-Defense Project:
ACLU report on online privacy from 2003:
