Software for Social Network Analysis: Pajek and Friends

Our project on social (citation and collaboration) networks in British and German political science involves networks with hundreds or even thousands of nodes (scientists and articles). At the moment, our data come from the Social Science Citation Index (part of the ISI Web of Knowledge), and we use a bundle of rather eclectic (erratic?) scripts written in Perl to convert the ISI records into something that programs like Pajek or Stata can read. Some canned solutions (Wos2pajek, Network Workbench, BibExcel) are available for free, but I either was not aware of them when I started this project, did not manage to install them properly, or was not happy with the results. Perl is the Swiss Army Chainsaw (TM) of data pre-processing: it is incredibly powerful (my scripts are typically less than 50 lines long, and I am not an efficient programmer), and every time I want to do something in a slightly different way (i.e. I spot a bug), all I have to do is change a few lines in the scripts.
After trying a lot of other programs available on the internet, we have chosen Pajek for doing the analyses and producing those intriguing graphs of cliques and inner circles in Political Science. Pajek is closed source but free for non-commercial use, and it runs on Windows or (via Wine) on Linux. It is very fast, can (unlike many other programs) easily handle very large networks, produces decent graphs and does many standard analyses. Its user interface may be slightly less than straightforward, but I got used to it rather quickly, and it even has basic scripting capabilities.
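To give an idea of what such a conversion step involves, here is a minimal sketch in Python (not the Perl scripts mentioned above, which are not shown here). It assumes a tagged plain-text export in which each field starts with a two-letter tag (`AU` for authors, `TI` for title), indented lines continue the previous field, and `ER` closes a record; real exports may differ in detail. It then counts co-authorships and writes them in Pajek's `*Vertices` / `*Edges` network format. The function names and the sample records are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Invented sample in the assumed tagged layout (not real citation data).
SAMPLE = """\
AU Smith, J
   Jones, A
TI A first paper
ER

AU Jones, A
   Smith, J
   Brown, K
TI A second paper
ER
"""


def parse_records(text):
    """Split a tagged export into records (dicts mapping tag -> list of values)."""
    records, current, tag = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            continue                      # skip blank separator lines
        head, value = line[:2], line[3:].strip()
        if head == "ER":                  # assumed end-of-record marker
            records.append(current)
            current, tag = {}, None
        elif head.strip():                # a new two-letter tag starts a field
            tag = head
            current.setdefault(tag, []).append(value)
        elif tag is not None:             # indented line continues the last field
            current[tag].append(value)
    return records


def coauthor_edges(records):
    """Count how often each pair of authors appears on the same record."""
    edges = Counter()
    for rec in records:
        authors = sorted(set(rec.get("AU", [])))
        for a, b in combinations(authors, 2):
            edges[(a, b)] += 1
    return edges


def to_pajek(edges):
    """Render weighted, undirected edges as the text of a Pajek .net file."""
    names = sorted({name for pair in edges for name in pair})
    index = {name: i for i, name in enumerate(names, start=1)}
    lines = [f"*Vertices {len(names)}"]
    lines += [f'{i} "{name}"' for name, i in index.items()]
    lines.append("*Edges")
    lines += [f"{index[a]} {index[b]} {w}" for (a, b), w in sorted(edges.items())]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_pajek(coauthor_edges(parse_records(SAMPLE))), end="")
```

Author names are used as vertex labels exactly as they appear in the records, so spelling variants of the same person end up as separate nodes; cleaning those up is the tedious part of the job and is not attempted here.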

The only thing that is missing is a proper manual, but even this is not really a problem since Pajek’s creators have written a very accessible introduction to social network analysis that doubles as documentation for the program (order from amazon.co.uk, amazon.com, amazon.de). However, Pajek has been under constant development since the 1990s (!) and has acquired a lot of new features since the book was published. Some of them are documented in an appendix; others are simply listed in the very short document that is the official manual for Pajek. You will also want to go through the many presentations that are available via the Pajek wiki.

Of course, there is much more software available, often at no cost. If you program in Java or Python (I don’t), there are several libraries available that look very promising. Amongst the stand-alone programs, visone stands out because it can easily produce very attractive-looking graphs of small networks. Even more software has been developed in the context of other sciences that have an interest in networks (chemistry, biology, engineering etc.).
Here is a rather messy collection of links to SNA software. Generally, you will want something that is more systematic and informative. Ines Mergel has recently launched a bid to create a comprehensive software list on Wikipedia. The resulting page on social network analysis software is obviously a work in progress but provides very valuable guidance.

2 thoughts on “Software for Social Network Analysis: Pajek and Friends”

  1. I also ran into problems going from Web of Science to Pajek, and have searched through your site to locate the Perl scripts you are using, but without success. Is there a link I have missed? It looks like just the kind of solution I might be looking for.
    Congratulations on your work, by the way.

    • Rodrigo,
      as far as I know, the WoS site is designed to make mass downloads by script difficult, if not impossible. At the end of the day, we asked a research assistant to download records in parcels of 500, which we then fed through a series of idiosyncratic, ill-designed and undocumented scripts to extract the information we were after. These scripts were tuned to the research problem of the day and are therefore not ready for prime time (or indeed for anybody else). I intend to rewrite the whole shebang in Python (which has a much clearer syntax plus pre-built structures to represent network data) if and when I find the time.
