Web-scraping made easy: OutWit

Excerpt: These days, a bonanza of political information is freely available on the internet. Sometimes this information comes in the guise of Excel sheets, comma-separated data or other formats which are more or less readily machine-readable. But more often than not, information is presented as tables designed to be read by humans. This is where the gentle art of screen scraping, web scraping or spidering comes in. In the past, I have used kludgy Perl scripts to get electoral results at the district level off sites maintained by the French ministry of the…


Software for Social Network Analysis: Pajek and Friends

Excerpt: Our project on social (citation and collaboration) networks in British and German political science involves networks with hundreds and thousands of nodes (scientists and articles). At the moment, our data come from the Social Science Citation Index (part of the ISI Web of Knowledge), and we use a bundle of rather eclectic (erratic?) scripts written in Perl to convert the ISI records into something that programs like Pajek or Stata can read. Some canned solutions (WoS2Pajek, Network Workbench, BibExcel) are available for free, but I was not aware of them when I started this project, did not manage to install…


Resolved: French Departements, INSEE, ISO and NUTS-3 codes

Excerpt: If you are interested in subnational politics, France is an interesting case for many reasons. On the one hand, the country is highly centralised and divided into 96 (European) Departements (administrative units) with equal legal rights (though Corsica is a bit of an exception to this). In fact, Departements were created after the revolution in an attempt to replace the provinces of the Ancien Regime with something rational and neat. On the other hand, the Departements are vastly different in terms of their size, population, economic, political and social structure, which gives you a lot of variance that can be…
