<?xml version="1.0" encoding="UTF-8"?> <rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" ><channel><title>Kai Arzheimer &#187; R</title> <atom:link href="http://www.kai-arzheimer.com/blog/tag/r/feed/" rel="self" type="application/rss+xml" /><link>http://www.kai-arzheimer.com/blog</link> <description>A political science blog</description> <lastBuildDate>Sat, 21 Jan 2012 19:06:37 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=3.3.1</generator> <item><title>Sampling from a Multinomial Distribution in Stata</title><link>http://www.kai-arzheimer.com/blog/sampling-from-a-multinomial-distribution-in-stata/</link> <comments>http://www.kai-arzheimer.com/blog/sampling-from-a-multinomial-distribution-in-stata/#comments</comments> <pubDate>Sat, 09 Apr 2011 22:02:23 +0000</pubDate> <dc:creator>kai</dc:creator> <category><![CDATA[Data and Methods]]></category> <category><![CDATA[My Stuff]]></category> <category><![CDATA[categorical variable]]></category> <category><![CDATA[distribution]]></category> <category><![CDATA[multinomial]]></category> <category><![CDATA[R]]></category> <category><![CDATA[random process]]></category> <category><![CDATA[stata]]></category><guid isPermaLink="false">http://www.kai-arzheimer.com/blog/?p=806</guid> <description><![CDATA[Sometimes, a man&#8217;s gotta do what a man&#8217;s gotta do. Which, in my case, might be a little simulation of a random process involving an unordered categorical variable. In R, sampling from a multinomial distribution is trivial. 
rmultinom(1,1000,c(.1,.7,.1,.1)) gives me a vector of random counts from a multinomial distribution with outcomes 1, 2, 3, and [...]]]></description> <content:encoded><![CDATA[<p>Sometimes, a man&#8217;s gotta do what a man&#8217;s gotta do. Which, in my case, might be a little simulation of a random process involving an unordered categorical variable. In R, sampling from a multinomial distribution is trivial.</p><p><code>rmultinom(1,1000,c(.1,.7,.1,.1))</code></p><p><span id="more-806"></span></p><p>gives me a vector of random counts from a multinomial distribution with outcomes 1, 2, 3, and 4, where the probability of observing a &#8216;1&#8217; is 10 per cent, the probability of observing a &#8216;2&#8217; is 70 per cent, and so on. But I could not find an equivalent function in Stata. Generating the artificial data in R and then importing them into Stata is not very elegant, so I kept digging and found a solution in section M-5 of the Mata handbook. Hidden in the entry on <tt>runiform</tt> is a reference to <code>rdiscrete(r,c,p)</code>, a Mata function which generates an <tt>r*c</tt> matrix of draws from a multinomial distribution defined by a vector <tt>p</tt> of probabilities.</p><p>That leaves but one question: Is wrapping a handful of lines around a Mata call to replace a non-existent Stata function more elegant than calling an external program?</p><div class="su-linkbox" id="post-806-linkbox"><div class="su-linkbox-label">Link to this post!</div><div class="su-linkbox-field"><input type="text" value="&lt;a href=&quot;http://www.kai-arzheimer.com/blog/sampling-from-a-multinomial-distribution-in-stata/&quot;&gt;Sampling from a Multinomial Distribution in Stata&lt;/a&gt;" onclick="javascript:this.select()" readonly="readonly" style="width: 100%;" /></div></div>]]></content:encoded> <wfw:commentRss>http://www.kai-arzheimer.com/blog/sampling-from-a-multinomial-distribution-in-stata/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Statistics and Data links roundup for 
November 23rd through December 29th</title><link>http://www.kai-arzheimer.com/blog/statistics-and-data-links-roundup-for-november-23rd-through-december-29th/</link> <comments>http://www.kai-arzheimer.com/blog/statistics-and-data-links-roundup-for-november-23rd-through-december-29th/#comments</comments> <pubDate>Sun, 10 Jan 2010 20:18:44 +0000</pubDate> <dc:creator>kai</dc:creator> <category><![CDATA[Data and Methods]]></category> <category><![CDATA[Political Science]]></category> <category><![CDATA[data]]></category> <category><![CDATA[datasets]]></category> <category><![CDATA[education]]></category> <category><![CDATA[imputation]]></category> <category><![CDATA[methods]]></category> <category><![CDATA[quantitative]]></category> <category><![CDATA[R]]></category> <category><![CDATA[sna]]></category> <category><![CDATA[statistics]]></category> <category><![CDATA[stats]]></category> <category><![CDATA[teaching]]></category> <category><![CDATA[tutorial]]></category><guid isPermaLink="false">http://www.kai-arzheimer.com/blog/?p=350</guid> <description><![CDATA[Statistics and Data links roundup for November 23rd through December 29th: The Data and Story Library &#8211; DASL (pronounced &#8220;dazzle&#8221;) is an online library of datafiles and stories that illustrate the use of basic statistics methods. We hope to provide data from a wide variety of topics so that statistics teachers can find real-world examples [...]]]></description> <content:encoded><![CDATA[<p>Statistics and Data links roundup for November 23rd through December 29th:</p><ul><li><a href="http://lib.stat.cmu.edu/DASL/DataArchive.html">The Data and Story Library</a> &#8211; DASL (pronounced &#8220;dazzle&#8221;) is an online library of datafiles and stories that illustrate the use of basic statistics methods. We hope to provide data from a wide variety of topics so that statistics teachers can find real-world examples that will be interesting to their students. 
Use DASL&#8217;s powerful search engine to locate the story or datafile of interest.</li><li><a href="http://www.politicaldata.org/?p=14">Drawing graphs using tikz/pgf &amp; gnuplot | politicaldata.org</a> -</li></ul><div class="su-linkbox" id="post-350-linkbox"><div class="su-linkbox-label">Link to this post!</div><div class="su-linkbox-field"><input type="text" value="&lt;a href=&quot;http://www.kai-arzheimer.com/blog/statistics-and-data-links-roundup-for-november-23rd-through-december-29th/&quot;&gt;Statistics and Data links roundup for November 23rd through December 29th&lt;/a&gt;" onclick="javascript:this.select()" readonly="readonly" style="width: 100%;" /></div></div>]]></content:encoded> <wfw:commentRss>http://www.kai-arzheimer.com/blog/statistics-and-data-links-roundup-for-november-23rd-through-december-29th/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Statistics and Data links roundup for November 14th through November 23rd</title><link>http://www.kai-arzheimer.com/blog/statistics-and-data-links-roundup-for-november-14th-through-november-23rd/</link> <comments>http://www.kai-arzheimer.com/blog/statistics-and-data-links-roundup-for-november-14th-through-november-23rd/#comments</comments> <pubDate>Mon, 23 Nov 2009 22:21:10 +0000</pubDate> <dc:creator>kai</dc:creator> <category><![CDATA[Data and Methods]]></category> <category><![CDATA[Political Science]]></category> <category><![CDATA[data]]></category> <category><![CDATA[education]]></category> <category><![CDATA[imputation]]></category> <category><![CDATA[methods]]></category> <category><![CDATA[quantitative]]></category> <category><![CDATA[R]]></category> <category><![CDATA[sna]]></category> <category><![CDATA[statistics]]></category> <category><![CDATA[teaching]]></category> <category><![CDATA[tutorial]]></category><guid isPermaLink="false">http://www.kai-arzheimer.com/blog/?p=339</guid> <description><![CDATA[Statistics and Data links roundup for November 14th through 
November 23rd: SNAP: Network datasets - Network data - CASOS Tools: Network Analysis Data &#124; CASOS &#8211; Network Analysis Data Casos &#38; INSA NodeXL Teaching &#8211; CASCI &#8211; datasets incl. senate Statistics and Data links roundup - It&#8217;s surprisingly difficult to find suitable datasets for a [...]]]></description> <content:encoded><![CDATA[<p>Statistics and Data links roundup for November 14th through November 23rd:</p><ul><li><a href="http://snap.stanford.edu/data/">SNAP: Network datasets</a> -</li><li><a href="http://www-personal.umich.edu/~mejn/netdata/">Network data</a> -</li><li><a href="http://www.casos.cs.cmu.edu/computational_tools/data2.php">CASOS Tools: Network Analysis Data | CASOS</a> &#8211; Network Analysis Data Casos &amp; INSA</li><li><a href="http://casci.umd.edu/NodeXL_Teaching">NodeXL Teaching &#8211; CASCI</a> &#8211; datasets incl. senate</li><li><a href="http://www.kai-arzheimer.com/blog/2009/11/14/statistics-and-data-links-roundup/">Statistics and Data links roundup</a> -</li></ul><p>It&#8217;s surprisingly difficult to find suitable datasets for a sna workshop that are relevant for political scientists.</p><div class="su-linkbox" id="post-339-linkbox"><div class="su-linkbox-label">Link to this post!</div><div class="su-linkbox-field"><input type="text" value="&lt;a href=&quot;http://www.kai-arzheimer.com/blog/statistics-and-data-links-roundup-for-november-14th-through-november-23rd/&quot;&gt;Statistics and Data links roundup for November 14th through November 23rd&lt;/a&gt;" onclick="javascript:this.select()" readonly="readonly" style="width: 100%;" /></div></div>]]></content:encoded> <wfw:commentRss>http://www.kai-arzheimer.com/blog/statistics-and-data-links-roundup-for-november-14th-through-november-23rd/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Statistics and Data links roundup</title><link>http://www.kai-arzheimer.com/blog/statistics-and-data-links-roundup/</link> 
<comments>http://www.kai-arzheimer.com/blog/statistics-and-data-links-roundup/#comments</comments> <pubDate>Sat, 14 Nov 2009 22:21:04 +0000</pubDate> <dc:creator>kai</dc:creator> <category><![CDATA[Data and Methods]]></category> <category><![CDATA[Political Science]]></category> <category><![CDATA[data]]></category> <category><![CDATA[education]]></category> <category><![CDATA[imputation]]></category> <category><![CDATA[methods]]></category> <category><![CDATA[quantitative]]></category> <category><![CDATA[R]]></category> <category><![CDATA[statistics]]></category> <category><![CDATA[teaching]]></category><guid isPermaLink="false">http://www.kai-arzheimer.com/blog/?p=314</guid> <description><![CDATA[Image via Wikipedia Kai Arzheimer: Vorlesung Statistik II - Welcome &#124; Teaching with Data (QSSDL) &#8211; TeachingWithData.org (TwD) is a repository of tools and educational materials designed to improve quantitative literacy skills in social science courses. Built especially for faculty teaching post-secondary courses in such areas as demography, economics, geography, political science, social psychology, and [...]]]></description> <content:encoded><![CDATA[<div class="zemanta-img" style="margin: 1em; display: block;"><div><dl class="wp-caption alignright" style="width: 230px;"><dt class="wp-caption-dt"><a href="http://commons.wikipedia.org/wiki/Image:The_Normal_Distribution.svg"><img title="The re-drawn chart comparing the various gradi..." 
src="http://upload.wikimedia.org/wikipedia/commons/thumb/2/25/The_Normal_Distribution.svg/300px-The_Normal_Distribution.svg.png" alt="300px The Normal Distribution.svg Statistics and Data links roundup" width="220" height="165" /></a></dt><dd class="wp-caption-dd zemanta-img-attribution" style="font-size: 0.8em;">Image via <a href="http://commons.wikipedia.org/wiki/Image:The_Normal_Distribution.svg">Wikipedia</a></dd></dl></div></div><p><span id="more-314"></span></p><ul><li><a href="http://www.kai-arzheimer.com/Statistik-II/">Kai Arzheimer: Vorlesung Statistik II</a> -</li><li><a href="http://teachingwithdata.org/qssdl/welcome.action">Welcome | Teaching with Data (QSSDL)</a> &#8211; TeachingWithData.org (TwD) is a repository of tools and educational materials designed to improve quantitative literacy skills in social science courses. Built especially for faculty teaching post-secondary courses in such areas as demography, economics, geography, political science, social psychology, and sociology, the materials include stand-alone learning activities, tools, and pedagogy services.The goal is to make it easier for faculty to bring real social science data into courses across the curriculum ranging from introductory classes to senior seminars.</li><li><a href="http://www.iq.harvard.edu/blog/sss/archives/2009/08/the_changing_na.shtml">Social Science Statistics Blog: The changing nature of R resources</a> -</li><li><a href="http://gking.harvard.edu/files/abs/pr-abs.shtml">Gary King &#8211; What to do About Missing Values in Time Series Cross-Section Data</a> &#8211; James Honaker and Gary King, &#8220;What to do About Missing Values in Time Series Cross-Section Data,&#8221; American Journal of Political Science, (April, 2010), forthcoming, copy at http://gking.harvard.edu/files/abs/pr-abs.shtml. 
(Paper: PDF)</li></ul><div class="zemanta-pixie" style="margin-top: 10px; height: 15px;"><a class="zemanta-pixie-a" title="Reblog this post [with Zemanta]" href="http://reblog.zemanta.com/zemified/9adbcef4-51b4-44ea-89d8-1bdaa5871e47/"><img class="zemanta-pixie-img" style="border: medium none; float: right;" src="http://img.zemanta.com/reblog_e.png?x-id=9adbcef4-51b4-44ea-89d8-1bdaa5871e47" alt=" Statistics and Data links roundup"  title="Statistics and Data links roundup photo" /></a><span class="zem-script more-related pretty-attribution"><script src="http://static.zemanta.com/readside/loader.js" type="text/javascript"></script></span></div><div class="su-linkbox" id="post-314-linkbox"><div class="su-linkbox-label">Link to this post!</div><div class="su-linkbox-field"><input type="text" value="&lt;a href=&quot;http://www.kai-arzheimer.com/blog/statistics-and-data-links-roundup/&quot;&gt;Statistics and Data links roundup&lt;/a&gt;" onclick="javascript:this.select()" readonly="readonly" style="width: 100%;" /></div></div>]]></content:encoded> <wfw:commentRss>http://www.kai-arzheimer.com/blog/statistics-and-data-links-roundup/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> <item><title>Web-scraping made easy: outwit</title><link>http://www.kai-arzheimer.com/blog/screenscraping-made-easy-outwit/</link> <comments>http://www.kai-arzheimer.com/blog/screenscraping-made-easy-outwit/#comments</comments> <pubDate>Sat, 01 Aug 2009 21:46:53 +0000</pubDate> <dc:creator>kai</dc:creator> <category><![CDATA[Data and Methods]]></category> <category><![CDATA[Political Science]]></category> <category><![CDATA[departements]]></category> <category><![CDATA[france]]></category> <category><![CDATA[outwit]]></category> <category><![CDATA[perl]]></category> <category><![CDATA[python]]></category> <category><![CDATA[R]]></category> <category><![CDATA[scraping]]></category> <category><![CDATA[screen]]></category> <category><![CDATA[web scraper]]></category><guid 
isPermaLink="false">http://www.kai-arzheimer.com/blog/?p=287</guid> <description><![CDATA[These days, a bonanza of political information is freely available on the internet.  Sometimes this information comes in the guise of excel sheets, comma separated data or other formats which are more or less readily machine readable. But more often than not, information is presented as tables designed to be read by humans. This is [...]]]></description> <content:encoded><![CDATA[<div class="zemanta-img" style="margin: 1em; display: block;"><div class="wp-caption alignright" style="width: 205px"><a href="http://commons.wikipedia.org/wiki/Image:EAN-13-ISBN-13.svg"><img title="EAN-13 bar code of ISBN-13 in compliance with ..." src="http://upload.wikimedia.org/wikipedia/commons/thumb/2/28/EAN-13-ISBN-13.svg/195px-EAN-13-ISBN-13.svg.png" alt="195px EAN 13 ISBN 13.svg Web scraping made easy: outwit" width="195" height="124" /></a><p class="wp-caption-text">Image via Wikipedia</p></div></div><p>These days, a bonanza of political information is freely available on the internet.  Sometimes this information comes in the guise of excel sheets, comma separated data or other formats which are more or less readily machine readable. But more often than not, information is presented as tables designed to be read by humans. This is where the gentle art of screen scraping, <a class="zem_slink" title="Web scraping" rel="wikipedia" href="http://en.wikipedia.org/wiki/Web_scraping">web scraping</a> or spidering comes in. In the past, I have used kludgy <a class="zem_slink" title="Perl" rel="homepage" href="http://www.perl.org/">Perl</a> scripts to get electoral results at the district level off sites maintained by the French ministry of the interior or by universities (very interesting if you do not really speak/read French). 
A slightly more elegant approach might be to <a href="http://polmeth.wustl.edu/tpm/tpm_v14_n2.pdf" target="_blank">use R&#8217;s builtin Perl-like capabilities for doing the job, as demonstrated by Simon Jackman</a>. Finally, <a href="http://kops.ub.uni-konstanz.de/volltexte/2009/7652/pdf/doering_2008.pdf" target="_blank">Python is gaining ground in the political science community</a>,  which has some very decent libraries for screen/web scraping &#8211; see this <a href="http://www.drewconway.com/zia/?p=585" target="_blank">elaborate post on Drew Conway&#8217;s Zero Intelligence Agents blog</a>. But, let&#8217;s face it: I am lazy. I want to spend time analysing the data, not scraping them. And so I was very pleased when I came across outwit, a massive plugin for the firefox browser (Linux, Mac and Windows versions available) that acts as a <a href="http://www.outwit.com/" target="_blank">point-and-click scraper</a>.</p><p><span id="more-287"></span></p><div id="attachment_288" class="wp-caption alignright" style="width: 310px"><a href="http://www.kai-arzheimer.com/blog/wp-content/uploads/2009/08/outwit-1.png"><img class="size-medium wp-image-288" title="outwit-1" src="http://www.kai-arzheimer.com/blog/wp-content/uploads/2009/08/outwit-1-300x178.png" alt="outwit 1 300x178 Web scraping made easy: outwit" width="300" height="178" /></a><p class="wp-caption-text">French Départements (from Wikipedia)</p></div><p>Say you need a dataset with the names and Insee numbers for all the French Départements. The (hopefully trustworthy) <a href="http://en.wikipedia.org/wiki/Departments_of_France" target="_blank">Wikipedia page</a> has a neat table, complete with information on the Prefecture and many tiny coats of arms which are of absolutely no use at all. 
We could either key in the relevant data (doable, but a nuisance), or we could try to copy and paste the table into a word processor, hoping that we do not lose accents and other funny characters, and that WinWord or whatever we use converts the <a class="zem_slink" title="HTML element" rel="wikipedia" href="http://en.wikipedia.org/wiki/HTML_element">HTML table</a> into something that we can edit to extract the information we really need.</p><p>Or we could use outwit. One push of the button loads the page</p><div id="attachment_291" class="wp-caption alignright" style="width: 310px"><a href="http://www.kai-arzheimer.com/blog/wp-content/uploads/2009/08/outwit-22.png"><img class="size-medium wp-image-291" title="outwit-2" src="http://www.kai-arzheimer.com/blog/wp-content/uploads/2009/08/outwit-22-300x178.png" alt="outwit 22 300x178 Web scraping made easy: outwit" width="300" height="178" /></a><p class="wp-caption-text">Scraping a table with outwit</p></div><p>into a sub-window, a second push (data-&gt;tables) extracts the HTML tables on the page. Now, we can either mark the lines we are interested in by hand (often the quickest option) or use a filter to select them. One final click, and they are exported as a <a href="http://www.kai-arzheimer.com/blog/wp-content/uploads/2009/08/Departments_Of_France.csv">CSV</a> file that can be read into R, OpenOffice, or Stata for post-processing and analysis.</p><p>While I&#8217;m all in favour of scriptable and open-source tools like Perl, Python and R, outwit has a lot going for it if all you need is a quick hack. Outwit also has functions to mass-download files (say PDFs) from a page and give them unique names. If the job is complex, there is even more functionality under the hood, and you can use the point-and-click interface to program your own scraper, though I would tend to use a real programming language for these cases. 
At any rate, outwit is a useful and free tool for the lazy data analyst.</p><div class="zemanta-pixie" style="margin-top: 10px; height: 15px;"><a class="zemanta-pixie-a" title="Reblog this post [with Zemanta]" href="http://reblog.zemanta.com/zemified/ff2f2488-87b5-4e7d-b17a-9b44a263ee57/"><img class="zemanta-pixie-img" style="border: medium none; float: right;" src="http://img.zemanta.com/reblog_e.png?x-id=ff2f2488-87b5-4e7d-b17a-9b44a263ee57" alt=" Web scraping made easy: outwit"  title="Web scraping made easy: outwit photo" /></a><span class="zem-script more-related pretty-attribution"><script src="http://static.zemanta.com/readside/loader.js" type="text/javascript"></script></span></div><div class="su-linkbox" id="post-287-linkbox"><div class="su-linkbox-label">Link to this post!</div><div class="su-linkbox-field"><input type="text" value="&lt;a href=&quot;http://www.kai-arzheimer.com/blog/screenscraping-made-easy-outwit/&quot;&gt;Web-scraping made easy: outwit&lt;/a&gt;" onclick="javascript:this.select()" readonly="readonly" style="width: 100%;" /></div></div>]]></content:encoded> <wfw:commentRss>http://www.kai-arzheimer.com/blog/screenscraping-made-easy-outwit/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> <item><title>Software for Social Network Analysis: Pajek and Friends</title><link>http://www.kai-arzheimer.com/blog/software-for-social-network-analysis-pajek-and-friends/</link> <comments>http://www.kai-arzheimer.com/blog/software-for-social-network-analysis-pajek-and-friends/#comments</comments> <pubDate>Tue, 08 Jul 2008 21:27:06 +0000</pubDate> <dc:creator>kai</dc:creator> <category><![CDATA[Data and Methods]]></category> <category><![CDATA[Political Science]]></category> <category><![CDATA[analysis]]></category> <category><![CDATA[bibliometrics]]></category> <category><![CDATA[citation]]></category> <category><![CDATA[network]]></category> <category><![CDATA[networks]]></category> <category><![CDATA[perl]]></category> 
<category><![CDATA[R]]></category> <category><![CDATA[science]]></category> <category><![CDATA[sna]]></category> <category><![CDATA[social]]></category> <category><![CDATA[social networks]]></category> <category><![CDATA[software]]></category> <category><![CDATA[stata]]></category><guid isPermaLink="false">http://polsci.wordpress.com/2008/07/08/software-for-social-network-analysis-pajek-and-friends/</guid> <description><![CDATA[After trying a lot of other programs, we have chosen Pajek for doing the analyses and producing those intriguing graphs of cliques and inner circles in Political Science. Pajek is free for non-commercial use and runs on Windows or (via wine) Linux. It is very fast, can (unlike many other programs) easily handle very large networks, produces decent graphs and does many standard analyses.]]></description> <content:encoded><![CDATA[<p>Our project on <a href="http://www.kai-arzheimer.com/social-networks-in-political-science.html" target="_blank">social (citation and collaboration) networks</a> in <a href="http://polsci.wordpress.com/2008/06/30/social-networks-in-british-political-science/" target="_blank">British</a> and <a href="http://polsci.wordpress.com/2008/06/26/social-networks-in-political-science/" target="_blank">German political science</a> involves networks with <a href="http://www.kai-arzheimer.com/networkpics/ukcoauthorbigger.png" target="_blank">hundreds and thousands of nodes</a> (scientists and articles). At the moment, our data come from the <a class="zem_slink" title="Social Sciences Citation Index" rel="wikipedia" href="http://en.wikipedia.org/wiki/Social_Sciences_Citation_Index">Social Science Citation Index</a> (part of the <a class="zem_slink" title="ISI Web of Knowledge" rel="wikipedia" href="http://en.wikipedia.org/wiki/ISI_Web_of_Knowledge">ISI web of knowledge</a>), and we use a bundle of rather eclectic (erratic?) 
scripts written in Perl to convert the ISI records into something that programs like Pajek or Stata can read. Some canned solutions (<a href="http://vlado.fmf.uni-lj.si/pub/networks/pajek/WoS2Pajek/">Wos2pajek</a>, <a href="http://nwb.slis.indiana.edu/index.html" target="_blank">network workbench</a>, <a href="http://www.umu.se/inforsk/Bibexcel/" target="_blank">bibexcel</a>) are available for free, but I was not aware of them when I started this project, did not manage to install them properly, or was not happy with the results. Perl is the Swiss Army Chainsaw (TM) for data pre-processing, incredibly powerful (my scripts are typically less than 50 lines, and I am not an efficient programmer), and every time I want to do something in a slightly different way (i.e. I spot a bug), all I have to do is to change a few lines in the scripts.<br /> After trying a lot of other programs available on the internet, we have chosen Pajek for doing the analyses and producing those intriguing graphs of <a href="http://www.kai-arzheimer.com/networkpics/pvscoauthors.png" target="_blank">cliques and inner circles in Political Science</a>. <a href="http://pajek.imfm.si/doku.php" target="_blank">Pajek</a> is closed source but free for non-commercial use and runs on Windows or (via wine) Linux. It is very fast, can (unlike many other programs) easily handle very large networks, produces decent graphs and does many standard analyses. 
Its user interface may be slightly less than straightforward, but I got used to it rather quickly, and it even has basic scripting capacities.</p><div id="attachment_39" class="wp-caption alignleft" style="width: 190px"><a href="http://www.amazon.com/gp/product/0521602629?ie=UTF8&amp;tag=polscipolblo-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=047206"><img class="size-medium wp-image-39" src="http://polsci.files.wordpress.com/2008/07/pajek1.jpg?w=180" alt=" Software for Social Network Analysis: Pajek and Friends" width="180" height="254" title="Software for Social Network Analysis: Pajek and Friends photo" /></a><p class="wp-caption-text">The Missing Manual</p></div><p><span id="more-37"></span></p><p>The only thing that is missing is a proper manual, but even this is not really a problem since Pajek&#8217;s creators have written a very accessible introduction to social network analysis that doubles as documentation for the program (order from <a href="http://www.amazon.co.uk/gp/product/0521602629?ie=UTF8&amp;tag=polscipolblo-21&amp;linkCode=as2&amp;camp=1634&amp;creative=6738&amp;creativeASIN=0472069691" target="_blank">amazon.co.uk</a>, <a href="http://www.amazon.com/gp/product/0521602629?ie=UTF8&amp;tag=polscipolblo-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=047206" target="_blank">amazon.com</a>, <a href="http://www.amazon.de/exec/obidos/ASIN/0521602629/diridisarzhei-21" target="_blank">amazon.de</a>). However, Pajek has been under constant development since the 1990s (!) and has acquired a lot of new features since the book was published. Some of them are documented in an <a href="http://vlado.fmf.uni-lj.si/pub/networks/book/appendix4.pdf" target="_blank">appendix</a>; others are simply listed in the very short document that is the official <a href="http://vlado.fmf.uni-lj.si/pub/networks/pajek/doc/pajekman.pdf" target="_blank">manual for Pajek</a>. 
You will want to go through the many presentations which are available via the Pajek wiki.</p><p>Of course, there is much more software available, often at no cost. If you program in Java or Python (I don&#8217;t), there are several libraries available that look very promising. Amongst the stand-alone programs, <a href="http://visone.info/" target="_blank">visone</a> stands out because it can easily produce very attractive-looking graphs of small networks. Even more software has been developed in the context of other sciences that have an interest in networks (chemistry, biology, engineering etc.).<br /> Here is a rather messy collection of <a href="http://del.icio.us/kai17/sna+software" target="_blank">links to sna software</a>. Generally, you will want something that is more systematic and informative. Ines Mergel has recently launched a <a href="http://inesmergel.wordpress.com/2008/02/24/tapping-on-the-wisdom-of-the-crowd-social-network-analysis-software-tools-on-wikipedia/" target="_blank">bid for creating a comprehensive software list</a> on Wikipedia. 
The resulting <a href="http://en.wikipedia.org/wiki/Social_network_analysis_software" target="_blank">page on social network analysis software</a> is obviously work in progress but provides very valuable guidance.</p><p>Technorati-Tags: <a class="performancingtags" rel="tag" href="http://technorati.com/tag/sna">sna</a>, <a class="performancingtags" rel="tag" href="http://technorati.com/tag/software">software</a>, <a class="performancingtags" rel="tag" href="http://technorati.com/tag/political%20science">political science</a>, <a class="performancingtags" rel="tag" href="http://technorati.com/tag/network">network</a>, <a class="performancingtags" rel="tag" href="http://technorati.com/tag/analysis">analysis</a>, <a class="performancingtags" rel="tag" href="http://technorati.com/tag/perl">perl</a>, <a class="performancingtags" rel="tag" href="http://technorati.com/tag/citation">citation</a>, <a class="performancingtags" rel="tag" href="http://technorati.com/tag/bibliometrics">bibliometrics</a>, <a class="performancingtags" rel="tag" href="http://technorati.com/tag/networks">networks</a>, <a class="performancingtags" rel="tag" href="http://technorati.com/tag/social">social</a>, <a class="performancingtags" rel="tag" href="http://technorati.com/tag/social%20networks">social networks</a></p><div class="zemanta-pixie" style="margin-top: 10px; height: 15px;"><a class="zemanta-pixie-a" title="Zemified by Zemanta" href="http://reblog.zemanta.com/zemified/a3986ca0-45a1-4839-9f0d-d468fa4350c7/"><img class="zemanta-pixie-img" style="border: medium none; float: right;" src="http://img.zemanta.com/reblog_e.png?x-id=a3986ca0-45a1-4839-9f0d-d468fa4350c7" alt=" Software for Social Network Analysis: Pajek and Friends"  title="Software for Social Network Analysis: Pajek and Friends photo" /></a><span class="zem-script more-related"><script src="http://static.zemanta.com/readside/loader.js" type="text/javascript"></script></span></div><div class="su-linkbox" id="post-37-linkbox"><div 
class="su-linkbox-label">Link to this post!</div><div class="su-linkbox-field"><input type="text" value="&lt;a href=&quot;http://www.kai-arzheimer.com/blog/software-for-social-network-analysis-pajek-and-friends/&quot;&gt;Software for Social Network Analysis: Pajek and Friends&lt;/a&gt;" onclick="javascript:this.select()" readonly="readonly" style="width: 100%;" /></div></div>]]></content:encoded> <wfw:commentRss>http://www.kai-arzheimer.com/blog/software-for-social-network-analysis-pajek-and-friends/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item> </channel> </rss>
