Jan 14 2012

I’m currently working on an analysis of the latest state election in Rhineland-Palatinate using aggregate data alone, i.e. electoral returns and structural information, which are available at the level of the state’s roughly 2300 municipalities. The state’s Green party (historically very weak) has roughly tripled its share of the vote since the last election in 2006, and I want to know where all these additional votes come from. And yes, I’m treading very carefully around the very large potential ecological fallacy that lurks at the centre of my analysis, regressing Green gains on factors such as tax receipts and distance from the nearest university town, but never claiming that the rich or the students or both turned to the Greens.

One common problem with this type of analysis is that not all municipalities are created equal. There is a surprisingly large number of flyspeck villages with only a few dozen voters, whereas the state’s capital boasts more than 140,000 registered voters. Most places are somewhere in between. Having many small municipalities in the regression feels wrong for at least two reasons. First, small-scale changes of political preferences in tiny electorates will result in relatively large percentage changes. Second, the behaviour of the many voters who happen to live in a small number of relatively large municipalities will be grossly underrepresented, i.e. the countryside will drive the results.

My PhD supervisor, who did a lot of this stuff in his time, used to weight municipalities by the size of their electorates to deal with these problems. But this would lead to pretty extreme weights in my case. Moreover, while voters bring about electoral results, I really don’t want to introduce claims about individual behaviour through the back door.

My next idea was to weight municipalities by the square root of the size of their electorates. Why? In a sense, the observed behaviour is like a sample from the underlying distribution of preferences, and the reliability of this estimate is proportional to the square root of the number of people in a given community. But even taking the square root left me with weights that were quite extreme, and the concern regarding the level of analysis still applied.
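To give a feel for how extreme these weights get, here is a back-of-the-envelope sketch in Python. The village size of 50 is a made-up stand-in for "a few dozen voters"; the 140,000 is the capital’s rough electorate mentioned above.

```python
import math

# Illustrative electorate sizes: the village figure is an assumption,
# the capital's figure is the rough number quoted in the text.
electorates = {"village": 50, "capital": 140_000}

def weight_ratio(sizes, transform):
    """Ratio of the largest to the smallest weight after applying `transform`."""
    weights = [transform(n) for n in sizes.values()]
    return max(weights) / min(weights)

ratio_size = weight_ratio(electorates, lambda n: n)   # weight = electorate size
ratio_sqrt = weight_ratio(electorates, math.sqrt)     # weight = sqrt(electorate size)

print(f"weighting by size:        {ratio_size:.0f} : 1")
print(f"weighting by square root: {ratio_sqrt:.1f} : 1")
```

Under these assumptions the square root shrinks the ratio between the largest and the smallest weight considerably, but the capital still counts dozens of times as much as the village.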

Then I realised that instead of weighting by size, I could simply include the size of the electorate as an additional independent variable to correct for potential bias. But this still left me exposed to the danger of extreme outliers (think small, poor, rural communities where the number of Green voters goes up from one to four, a whopping 300 per cent increase) playing havoc with my analysis. So I began reading up on robust regression and its various implementations in Stata.
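The arithmetic behind that outlier is trivial, but it shows why tiny electorates are dangerous. In this small sketch the village numbers come from the example above, while the town’s figures are purely hypothetical.

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, in per cent."""
    return 100.0 * (after - before) / before

village = pct_change(1, 4)        # three additional Green voters
town = pct_change(1000, 1300)     # hypothetical town: 300 additional Green voters

print(f"village: {village:.0f} per cent, town: {town:.0f} per cent")
```

Three extra voters in the village produce a relative change ten times larger than 300 extra voters in the hypothetical town.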

The basic idea of robust regression is that real data are more likely than not a mixture of (at least) two mechanisms: the “true model” whose coefficients we want to estimate on the one hand, and some other process(es) that contaminate the data on the other. If these contaminating data points are far away from the multivariate mean of the x-variables (outliers) and deviate substantially from the true regression line, they will bias the estimates.

Robust regression estimators are able to deal with a high degree of contamination, i.e. they can recover the true parameters even if there are many outliers amongst the data points. The downside is that the older generation of robust estimators also has low efficiency (the estimates are unbiased but have a much higher variance than regular OLS estimates).

A number of newer (post-1980) estimators, however, are less affected by this problem. One particularly promising approach is the MM estimator, which has been implemented in Stata ados by Verardi/Croux (MMregress) and by Ben Jann (robreg mm). Jann’s ado seems to be faster and plays nicely with his esttab/estout package, so I went with that.

The MM estimator basically works by identifying outliers and downweighting them, so it amounts to a particularly sophisticated case of weighted least squares. Using the defaults, MM claims to have 85 per cent of the efficiency of OLS while being able to deal with up to 50 per cent contamination. As you can see in the table, the MM estimates deviate somewhat from their OLS counterparts. The difference is most pronounced for the effect of tax receipts (hekst).

robreg mm has an option to store the optimal weights. I ran OLS again using these weights (column 3), thereby recovering the MM estimates and demonstrating that MM is really just weighted least squares. The standard errors (which are not very relevant here) differ because robreg uses the robust variance estimator. This is fascinating stuff, and I’m looking forward to a forthcoming book by Jann and Verardi on robust regression in Stata (to be published by Stata Press in 2012).

                     OLS              MM            WLS

greenpct2006        0.193***        0.329***        0.329***
                 (0.0349)        (0.0592)        (0.0278)

hekst               0.311***        0.634***        0.634***
                 (0.0894)         (0.124)        (0.0688)

senioren          -0.0744***       -0.100***       -0.100***
                 (0.0131)        (0.0149)       (0.00994)

kregvoters11      -0.0125        -0.00844        -0.00844
                 (0.0146)       (0.00669)       (0.00982)

kbevdichte         -0.433        -0.00750        -0.00750
                  (0.464)         (0.330)         (0.326)

uni                 1.258           0.816           0.816
                  (1.695)         (0.765)         (1.137)

lnunidist          -0.418**        -0.372**        -0.372***
                  (0.127)         (0.113)        (0.0918)

_cons               8.232***        7.078***        7.078***
                  (0.627)         (0.663)         (0.461)
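For readers without Stata, the "robust regression is just weighted least squares" point can be illustrated with a toy example in Python. To be clear about the assumptions: this is not robreg mm and not a full MM estimator (which starts from a high-breakdown S-estimate). It is a simplified M-estimator with Tukey bisquare weights, started from OLS, with the scale fixed at a MAD-type estimate from the OLS residuals. The data are made up (a line with intercept 2 and slope 0.5 plus two gross outliers at the right edge), and the tuning constant 4.685 is a conventional choice, not necessarily the default of any Stata command.

```python
import statistics

def wls(x, y, w):
    """Weighted least squares for y = a + b*x with one regressor; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return my - b * mx, b

def residuals(x, y, a, b):
    return [yi - a - b * xi for xi, yi in zip(x, y)]

def bisquare_weights(resid, scale, c=4.685):
    """Tukey bisquare weights: close to 1 near the line, 0 beyond c * scale."""
    weights = []
    for r in resid:
        u = r / (c * scale)
        weights.append((1 - u * u) ** 2 if abs(u) < 1 else 0.0)
    return weights

def robust_line(x, y, tol=1e-12, max_iter=200):
    """Iteratively reweighted least squares with a fixed scale (simplified M-estimator)."""
    a, b = wls(x, y, [1.0] * len(x))                      # OLS starting values
    scale = statistics.median([abs(r) for r in residuals(x, y, a, b)]) / 0.6745
    for _ in range(max_iter):
        w = bisquare_weights(residuals(x, y, a, b), scale)
        a_new, b_new = wls(x, y, w)
        converged = abs(a_new - a) + abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    # Weights implied by the final coefficients (the analogue of stored weights).
    return a, b, bisquare_weights(residuals(x, y, a, b), scale)

# Made-up data: y = 2 + 0.5 x with small alternating noise and two outliers.
x = list(range(20))
y = [2 + 0.5 * xi + 0.1 * (-1) ** xi for xi in x]
y[18] += 8
y[19] += 8

a_ols, b_ols = wls(x, y, [1.0] * len(x))
a_rob, b_rob, weights = robust_line(x, y)
a_wls, b_wls = wls(x, y, weights)       # plain WLS with the final robust weights

print(f"OLS:    intercept {a_ols:.3f}, slope {b_ols:.3f}")
print(f"robust: intercept {a_rob:.3f}, slope {b_rob:.3f}")
print(f"WLS:    intercept {a_wls:.3f}, slope {b_wls:.3f}")
```

The last two lines of output should agree up to numerical tolerance, which mirrors columns 2 and 3 of the table: once the weights are known, an ordinary weighted regression reproduces the robust point estimates.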
Dec 08 2008

Finally, the call for papers for the ECPR’s 5th conference (at Potsdam, September 10-12 2009) is out. Our section on the Radical Right will consist of the following nine panels:

  • The Radical Right in Central and Eastern Europe
  • The Internationalisation of the Radical Right
  • Will Fascism return?
  • On the Borderline Between Protest and Violence: Political Movements of the New Radical Right
  • Consequences of the surge of anti-immigration parties
  • The Radical Right in Western Europe
  • Inside the Radical Right: An Internalist Perspective
  • Party-based Euroscepticism in Western and Eastern Europe
  • Neighbourhood Effects Revisited: the Visualisation of Immigrants and Radical Right-Wing Voting

Each panel can have up to five paper givers, so the section offers us a chance to bring together cutting edge research on the Populist/Extreme/Radical Right from various subfields (parties, voters, rational choice, normative theory – you name it). Please submit your abstract via the electronic submission system to the appropriate panel(s).

Aug 29 2008

Everyone just seems to know that the voters of the Extreme Right hate foreigners in general and immigrants in particular, but robust comparative evidence for the alleged xenophobia – Radical Right vote link is scarce. Moreover, many of the published analyses are based on somewhat outdated (i.e. 1990s) data, and alternative accounts of the extreme right vote (the “unpolitical” protest hypothesis and the hypothesis that the Far Right in Western Europe attracts people with “neo-liberal” economic preferences, championed by Betz and Kitschelt in the 1990s) do exist. Just a few days ago, a journal accepted a paper of mine in which I test these three competing hypotheses using (relatively) recent data from the European Social Survey and a little Structural Equation Modelling. As it turns out, protest and neo-liberalism have no statistically significant impact on the Extreme Right vote whatsoever. Anti-immigrant sentiment, however, plays a crucial role for the Extreme Right in all countries but Italy. Its effects are moderated by party identification and general ideological preferences. I conclude that comparative electoral research should focus on the circumstances under which immigration is politicised. Wasn’t it blindingly obvious?

May 28 2008

It’s almost unbelievable: after some six months of communication problems with the publishers, my recent book on the extreme right vote in Western Europe since the 1980s is finally out and ready for you to order and read (qualification: if you read German). If you don’t read German, you might still be interested in a brief English summary of my findings on the Extreme Right vote, including various presentations and other goodies.