Oct 29 2012

Like social networks, multilevel data structures are everywhere once you start thinking about them. People live in neighbourhoods, neighbourhoods are nested in municipalities, which make up provinces – well, you get the picture. Even if we have no substantive interest in their effects, it often makes sense to control for such structures in our data to get more realistic standard errors.
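To put a rough number on "more realistic", here is a minimal sketch based on the textbook design-effect approximation for equal-sized clusters, deff = 1 + (m - 1) * ICC. The function names and all figures are made up for illustration, and strictly speaking the formula applies to a sample mean, so treat it as a rule of thumb for regression coefficients.

```python
import math

def design_effect(cluster_size, icc):
    """Kish's approximation for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1) * icc

def corrected_se(naive_se, cluster_size, icc):
    """Inflate a standard error that was computed as if observations were independent."""
    return naive_se * math.sqrt(design_effect(cluster_size, icc))

# Hypothetical numbers: 50 respondents per region, intraclass correlation of 0.05
deff = design_effect(50, 0.05)
se = corrected_se(0.10, 50, 0.05)
print(f"design effect: {deff:.2f}")
print(f"a naive SE of 0.10 becomes {se:.3f}")
```

Even a modest intraclass correlation inflates the standard error noticeably once clusters are large, which is the whole point of taking the nesting seriously.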

Now the good folks over at the European Social Survey have reacted and spent the Descartes Prize money on compiling multilevel information and merging it with their own data. So far, the selection is a little bit disappointing in some respects. Homicide rates, for instance, are reported on the national level only. But there are some pleasant surprises (I guess due to Eurostat, who collect such things): We get unemployment, GDP growth and even student numbers at the NUTS-3 level. Since you asked, NUTS is the Nomenclature of Territorial Units for Statistics, and level 3 is the lowest level for which comparative data are normally published.

Regrettably, the size and number of level 3 units are not necessarily comparable across countries: For Germany, level 3 corresponds to about 400 local government districts, while the European part of France is divided into 96 départements. But if you need to combine top-notch survey data with small(ish) regional data, it’s a start, and not a bad one.
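The ESS does the merging for you, but if you want to attach your own regional indicators, the logic is a simple left join on the region code. The following is only a sketch: the variable names (`nuts3`, `trust`, `unemployment`), the helper `attach_context` and all values are invented and do not reflect the actual ESS file layout.

```python
# Hypothetical individual-level records; "nuts3" is an assumed column name.
respondents = [
    {"id": 1, "nuts3": "DE212", "trust": 6},
    {"id": 2, "nuts3": "DE212", "trust": 4},
    {"id": 3, "nuts3": "FR101", "trust": 7},
    {"id": 4, "nuts3": "XX999", "trust": 5},  # region without context data
]

# Hypothetical regional context data keyed by NUTS-3 code (values invented).
regions = {
    "DE212": {"unemployment": 4.1},
    "FR101": {"unemployment": 8.3},
}

def attach_context(people, context, key="nuts3"):
    """Left-join regional variables onto each person; unmatched regions get None."""
    variables = sorted({name for row in context.values() for name in row})
    merged = []
    for person in people:
        region = context.get(person[key], {})
        merged.append({**person, **{v: region.get(v) for v in variables}})
    return merged

merged = attach_context(respondents, regions)
for row in merged:
    print(row)
```

Keeping unmatched respondents (with a missing value) rather than dropping them makes it easy to see how many cases lack regional information before you estimate anything.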

Jan 21 2012

In the past, I did a lot of multi-level modelling with MLwiN 2.02, which I quickly learned to loathe. Back in the late 1990s, MLwiN was perhaps the first multi-level software that had a somewhat intuitive interface, i.e. it allowed one to build a model by pointing and clicking. Moreover, it printed updated estimates on the screen while cycling merrily through the parameter space. That was sort of cool, as it could take minutes to reach convergence, and without the updating, one would never have been sure that the program had not crashed yet. Which it did quite often, even for simple models.

Worse than the bugs was the lack of proper scriptability. Pointing and clicking loses its appeal when you need to run the same model on 12 different datasets, or when you are looking at three variants of the same model and 10 recodes of the same variable. Throw in the desire to semi-automatically re-compile the findings from these exercises into two nice tables for inclusion in LaTeX again and again after finding yet another problem with a model, and you will agree that any piece of software that is not scriptable is pretty useless for scientists.
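To illustrate what scriptability buys you, here is a toy sketch of that workflow: loop over datasets, fit the same "model", and regenerate a LaTeX table from the results. Everything in it is hypothetical. The `fit` function is a stand-in that merely computes a mean and its standard error, and the data are invented; a real script would call the actual estimation routine and read the data from files.

```python
from math import sqrt
from statistics import mean, stdev

def fit(values):
    """Hypothetical stand-in for 'the same model': a mean with its standard error."""
    return {"est": mean(values), "se": stdev(values) / sqrt(len(values)), "n": len(values)}

# Invented toy data; in practice each entry would be loaded from disk.
datasets = {
    "Wave 1": [4, 6, 5, 7, 5],
    "Wave 2": [3, 5, 4, 6, 4],
}

def latex_table(results):
    """Turn a dict of fit results into the source of a LaTeX tabular."""
    lines = [r"\begin{tabular}{lrrr}", r"Dataset & Estimate & SE & N \\", r"\hline"]
    for name, r in results.items():
        lines.append(f"{name} & {r['est']:.2f} & {r['se']:.2f} & {r['n']} \\\\")
    lines.append(r"\end{tabular}")
    return "\n".join(lines)

results = {name: fit(values) for name, values in datasets.items()}
table = latex_table(results)
print(table)
```

Once the table is generated by code, fixing a recode or adding a thirteenth dataset means re-running the script instead of clicking through the whole exercise again.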

MLwiN’s command language was unreliable and woefully underdocumented, and everything was a pain. So I embraced xtmixed when it came along with Stata 9/10, which solved all of these problems.

runmlwin presentation (pdf)

But xtmixed is slow with large datasets/complex models. It relies on quadrature, which is accurate but computationally intensive. MLwiN works with approximations of the likelihood function (quick and dirty) or MCMC (strictly speaking a Bayesian approach, but people don’t ask too many questions because it tends to be faster than quadrature). Moreover, MLwiN can run a lot of fancy models that xtmixed cannot, because it is a highly specialised program that has been around for a very long time.

Enter the good people over at the Centre for Multilevel Modelling at Bristol, who have come up with runmlwin, an ado that essentially makes the functionality of MLwiN available as a Stata command, postestimation analysis and all. Can’t wait to see if this works with Linux, wine and my ancient binaries, too.

Mar 21 2009

MLwiN is one of the granddaddies of multi-level modelling software (the other being HLM). Essentially, it is a 1990s-ish looking and sometimes quirky GUI wrapped around an old DOS program (MLn). The one feature that set MLwiN apart in the late 1990s is its point-and-click interface, which allows you to build the equations for a multi-level model in a stepwise fashion. The underlying command language is still slightly confusing and less than well documented, and some of the modern features (such as modelling categorical dependent variables) are implemented as external macros. That need not concern you unless something goes horribly wrong, which happens occasionally.

That said, MLwiN is reasonably fast, now incorporates modern MCMC estimators, has an interface with WinBUGS and can be convinced to do most things you would possibly want to do with it. I bought version 1.10 ca. 1998, and received free upgrades to 2.02 and good support well until 2004/2005 or so. These days, Stata, R and Mplus can all estimate multi-level models, but working with MLwiN may still be worthwhile for you (by the way, you can download the free stata2mlwin addon from UCLA academic technology to export your variables from Stata to MLwiN).

Rather amazingly, MLwiN is now freely available for anyone working in UK universities: just enter your details including your ac.uk email address, and a few days later, they will send you a download link.