March 20, 2006
Data mining and politics
Although this isn't really news anymore, the Washington Post writes that "Democrats' Data Mining Stirs an Intraparty Battle". The big plan is that the Democratic Party will build a database of all its voters, so as to be able to send customized messages and identify new potential voters, but hopefully also to listen to opinions more effectively and to obtain feedback by detecting party swings.
There has been a pile of discussion over the past few years regarding the use of data mining technology in politics. The public perceives data mining as an intrusion on their privacy. As with any kind of technology, it is hard to restrict: it just goes underground. At the same time, new technology used improperly is just like a new gun to a soldier - in the short run he's better off with a better gun, but in the long run everyone's worse off because the enemies will have better guns too. One answer to the efforts to data mine from the top is to data mine from the bottom too. The technology does enable the administration to observe the population, but at the same time the population can use it to observe the actions of the administration. While we hear about the top-down efforts from the media, there are several interesting bottom-up efforts going on, and they demonstrate the possible uses of the technology.
John Walker built a hyperlinked version of parts of the US Code, and identified interesting special provisions, which would probably remain hidden without technology. GovTrack.us collects information about the US Government, linking it with discussions. There are several other web sites with a similar role: http://www.vote-smart.org/ and http://www.opensecrets.org/. PoliticalFriendster maintains profiles of officials along with their connections. The one that seems to have started all this was MIT's Open Government Information Awareness, now accessible only through the web archives. OGIA was itself inspired by DARPA's Information Awareness Office, also accessible only through web archives. We have done some experiments with techniques for presenting the roll call voting records in a slightly more accessible way - and it is fascinating that some of the proximities that appear in PoliticalFriendster also appear in the roll call vote record. Finally, one can view the structure of the budget graphically too.
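To make "proximity" in a roll call record concrete, here is a minimal sketch, assuming a toy record and a plain pairwise agreement rate. The legislators, the votes, and the function names `agreement` and `proximities` are all invented for illustration; this is not necessarily the technique used in the experiments mentioned above.

```python
from itertools import combinations

# Hypothetical toy roll call record: legislator -> votes on five bills.
# 1 = yea, 0 = nay, None = absent. Names and votes are invented.
votes = {
    "A": [1, 1, 0, 1, None],
    "B": [1, 1, 0, 0, 1],
    "C": [0, 0, 1, 0, 1],
}


def agreement(u, v):
    """Share of roll calls where both legislators voted and voted alike."""
    shared = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    if not shared:
        return float("nan")  # no roll call in common, proximity undefined
    return sum(a == b for a, b in shared) / len(shared)


def proximities(record):
    """Agreement rate for every unordered pair of legislators."""
    return {
        (p, q): agreement(record[p], record[q])
        for p, q in combinations(sorted(record), 2)
    }


if __name__ == "__main__":
    # List pairs from most to least similar voting behaviour.
    ranked = sorted(proximities(votes).items(), key=lambda kv: -kv[1])
    for pair, score in ranked:
        print(pair, round(score, 2))
```

Pairs with a high agreement rate are "close" in the voting record, and those are the pairs one could then compare against the connections listed on a site like PoliticalFriendster. Absences are simply skipped here, which is only one of several reasonable choices.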
So, in conclusion, technology is affecting how politics is done. The distinct asymmetry of mass broadcast media contrasts with the somewhat more symmetric World Wide Web.
Posted by Aleks at March 20, 2006 3:03 PM
Comments
I wouldn't say MIT's OGIA started it all. I was part-way through the development of GovTrack when I found out about that site. (Although I did snatch one of their data files to bootstrap my database of congressmen. :-)
Posted by: Joshua Tauberer at March 20, 2006 4:03 PM.
Good opinion. Powerful learning tools may not get equal development for use by all sides, though. The bias in sophisticated data analysis is toward facilitating the work of people who can afford consultants. One way to balance it would be with a well-funded, web-based data exploration service. I think that would be well worth the price to any democracy, wouldn't you? An extension of the public library service, perhaps. That's a need I've thought of for my own kind of data: a central repository of recorded timeline dynamics, with natural system identification tools built in. I'd really love to help people realize the value of that!
Posted by: Phil Henshaw at March 22, 2006 11:52 AM.
Indeed, a "data library" would be a really good thing to have. It is disappointing to only see the analyses and interpretations of the data in journals, but not the actual raw data. Even if one can find it, it comes in a variety of incompatible formats. There's much that could be done with some sort of internet tax.
Posted by: Aleks at March 22, 2006 8:52 PM.