## Eight new German polls

Over the last fortnight, eight new polls have been published: two by Insa, two by Forsa, and one apiece by Dimap, Emnid, GMS, and FGW. For GMS, it is only the second poll conducted since the beginning of this year (a third one was published early in January, with fieldwork partly carried out in late December). As usual, there is quite a bit of variation in the data (hence the pooling), although there is no such shocker as the (unweighted) FGW poll done in mid-February, which put the SPD at 42 per cent, twice what they had been polling in early January.

## The SPD and the Christian Democrats are in a dead heat, but …?

Probably no “but” so far. Back in late January, the SPD’s fortunes began to rise rapidly thanks to their new candidate Martin Schulz (aka St Martin), and by early February, support for the two major parties had become statistically indistinguishable. For the last ten days or so, the model has put the CDU about a percentage point ahead of the SPD, but that does not mean a thing: the credible intervals are still more or less identical. The SPD’s rise and rise in the polls has stopped in March, but the Schulz effect is still very much a thing.

## The Left and the AfD

Since January, the AfD has lost a bit of steam (i.e. about 2 points) in the polls, but it is still the strongest opposition party at just under 10 per cent. The model-based line in the graph suggests that over the last couple of weeks, the party has recovered a wee bit, but that movement is negligible (have a look at the scale), just a teensy-weensy wiggle well within a wide credible band. Similarly, the Left is basically where it was two (and four, and six) weeks ago at 7.5 per cent.

## The Greens and the FDP

The Greens had a bad start to the campaign. Just when they had selected two Spitzenkandidaten from the more conservative wing of the party (signalling that they might enter a CDU/CSU/Green coalition), the SPD pulled a Schulz on them by choosing a leader who would attract leftist voters and would at least ponder the prospects of an SPD/Left/Green coalition. The Greens lost a couple of points in January and early February but have been perfectly stable since. Conversely, the FDP may or may not have won a point since February. As with the AfD, the wiggling is not impressive, given the width of the credible intervals. But importantly, this interval has not touched the electoral threshold of five per cent during the last three months, though the threshold is never that far away.

## Conclusion

The polls are as noisy as ever, but the model, which tries to account for house effects and random errors, suggests that the noise is just that, and that so far there has been no real movement in political support in March. This picture is at least plausible: apart from the AfD, all parties have selected their respective Spitzenkandidaten by now, it’s still six months to go, and there have been no major domestic events.

At current levels of support, the Bundestag would have two (relatively) major parties that could continue the Grand Coalition, although it’s not quite clear which one would hold the chancellorship, and four almost identically sized minor parties, commanding between six and ten per cent of the vote. In such a relatively fragmented parliament, the chances (in a purely statistical sense) of forming an SPD/Left/Green coalition are slim: only in 15 per cent of all draws from the (simulated) joint distribution of support is there a sufficient red-red-green majority. At 12 per cent, the chances of a numerical majority for a “Jamaica” coalition (Christian Democrats, FDP, Greens) are even lower. In none of the simulations is either an SPD/Greens or a Christian Democrats/FDP coalition feasible. It would be Grand Coalition, or new elections, or minority cabinet rule (a first). But remember the mantra: it’s six months to go, we’re very much talking about political mood, not firm intentions, and this is just a model that tries to average over many different noisy and potentially biased polls.
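For those who like to see the arithmetic, this kind of coalition calculation boils down to: drop parties under five per cent, renormalise the rest into seat shares, and count the draws in which a bloc holds more than half the seats. A minimal sketch in Python, with synthetic draws and a naive proportional seat rule standing in for the model’s actual posterior (all numbers are illustrative, not the model’s):

```python
import random

random.seed(7)

PARTIES = ["cdu", "spd", "left", "green", "fdp", "afd"]
MEANS = [0.33, 0.32, 0.075, 0.07, 0.06, 0.095]  # illustrative only

def draw_support():
    """One synthetic draw standing in for the model's joint posterior."""
    return {p: max(random.gauss(m, 0.015), 0.0)
            for p, m in zip(PARTIES, MEANS)}

def seat_shares(support, threshold=0.05):
    """Parties below the threshold get no seats; the rest split them
    proportionally (ignoring overhang seats and other Bundestag subtleties)."""
    inside = {p: v for p, v in support.items() if v >= threshold}
    total = sum(inside.values())
    return {p: v / total for p, v in inside.items()}

def majority_prob(bloc, n_draws=10_000):
    """Share of draws in which the bloc holds more than half the seats."""
    hits = 0
    for _ in range(n_draws):
        seats = seat_shares(draw_support())
        if sum(seats.get(p, 0.0) for p in bloc) > 0.5:
            hits += 1
    return hits / n_draws

p_r2g = majority_prob(["spd", "left", "green"])
p_jamaica = majority_prob(["cdu", "fdp", "green"])
```

With the real posterior draws in place of `draw_support()`, `p_r2g` and `p_jamaica` are exactly the 15 and 12 per cent figures quoted above.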

With just under seven months to go until the German federal election, I have recently begun once more to pool the pre-election polls from seven major survey firms. Since January, when the date for the election was set and the Spitzenkandidaten were selected, results from 35 polls with a median sample size of about 1900 have been published: nine apiece by Emnid and Forsa, five by Dimap, five by Insa, four by FGW, two by Allensbach, and a single one by GMS.

Easily the most exciting event in the (long) campaign so far has been the #Schulzzug: the mostly unexpected leak/announcement on January 24 that Sigmar Gabriel would be replaced as party leader and (presumptive) candidate by Martin Schulz, the former president of the European Parliament. Support for the SPD in the polls had hovered at historically low levels of just over 20 per cent for months, but the Schulz candidacy re-energised party members and has resulted in lots of (mostly positive) media coverage. Subsequently, support for the party leaped up in the polls, even overtaking support for the Christian Democrats in some of them.

But most movement in the polls is noise, and so we would like to know if the Schulz bounce is real. The data basically say: yes.


The figure shows that support for the SPD begins to rise a couple of days before Schulz’s candidacy was announced, but this is probably an artefact. The model assumes that true support normally changes very little from one day to the next, but these are unusual circumstances, and so the ascent was probably steeper than the graph suggests. At any rate, the estimated level of support for the SPD in February was somewhere between 30 and 35 per cent, whereas it was between 20 and 24 per cent early in January. The model’s priors may play a role here (though they should be quickly overwhelmed by the data), but it is obvious that there was a gap of at least 10 percentage points between the two major parties in January that has essentially closed now. Support for the CDU and the SPD is virtually indistinguishable, and the Christian Democrats are rightfully worried.

What this means for the election is a different question. Estimated levels of support for both parties have been essentially constant for the last four weeks or so. The SPD has unexpectedly closed the gap, but it has stopped gaining. The Christian Democrats are not doing much worse than at the same point in the cycle four years ago. And once voters learn more about Schulz (who is a known unknown in Germany), the Schulz effect may wear off.

It’s that time of the electoral cycle again: with just under seven months to go until the federal election in September, I feel the urge to pool the German pre-election polls. I burnt my fingers four years ago when I was pretty (though not 100%) sure that the FDP would clear the five per cent threshold (they failed for the first time in more than six decades), but hey – what better motivation to try it again?

## Why bother with poll-pooling?

Lately (see Trump, Brexit), pre-election polls have been getting a bad rap. But there is good evidence that, by and large, polls have not become worse in recent years. Polls, especially when taken long before the election, should not be understood as predictions, because people will change their minds about how to vote, or will not yet have made them up – in many cases, wild fluctuations in the polls eventually settle into an equilibrium that could have been predicted many months in advance. Rather, polls reflect the current state of public opinion, which is affected by campaign and real-world events.

Per se, there is nothing wrong with wanting to track this development. The problem of horse-race journalism/politics is largely a problem of over-interpreting the result of a single poll. A survey of 1000 likely voters that measures support for a single party at 40% would have a sampling error of +/- 3 percentage points if it were based on a simple random probability sample. In reality, polling firms rely on complicated multi-stage sampling designs, which result in even larger sampling errors. Then there is systematic non-response: some groups are more difficult to contact than others. Polling firms therefore apply weights, which will hopefully reduce the resulting bias but will further increase standard errors. And then there are house effects: some quirk in its sampling frame or weighting scheme may cause polling firm A to consistently overreport support for party Z. So in general, if support for a party or candidate rises or drops by some three or four percentage points, this may create a flurry of comments and excitement. But more often than not, true support in the public may be absolutely stable or even moving in the opposite direction.
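The back-of-the-envelope arithmetic behind the +/- 3 points is worth making explicit (a minimal Python check, assuming a simple random sample):

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of the 95% interval for a proportion p from a
    simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# A party polling at 40 per cent in a simple random sample of 1000:
moe = margin_of_error(0.40, 1000)   # about 0.030, i.e. +/- 3 points
```

Everything listed above – multi-stage designs, non-response, weighting – only pushes the real-world error above this textbook floor.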

Creating a poll of polls can alleviate these problems somewhat. By pooling information from many adjacent polls, more accurate estimates are possible. Moreover, house effects may cancel each other out. However, a poll of polls will not help with systematic bias that stems from social desirability. If voters are generally reluctant to voice support for a party that is perceived as extremist or otherwise unpopular, that will affect all polls in much the same way.
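The simplest version of this idea is a precision-weighted average: weight each poll by the inverse of its (assumed) sampling variance, and the pooled standard error shrinks as polls accumulate. A sketch in Python, under a simple-random-sampling assumption that understates real-world error (proper pooling models, including the one used here, are more elaborate):

```python
import math

def pool_polls(estimates, sizes):
    """Inverse-variance weighted average of several poll estimates
    for one party, plus the pooled standard error."""
    variances = [p * (1 - p) / n for p, n in zip(estimates, sizes)]
    weights = [1 / v for v in variances]
    pooled = sum(w * p for w, p in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se

# Three hypothetical polls taken in the same week:
est, se = pool_polls([0.33, 0.31, 0.35], [1000, 1500, 900])
```

The pooled standard error is smaller than that of any single poll, which is the whole point; what this naive version cannot do is separate house effects from sampling noise – hence the Bayesian machinery below.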

Moreover, poll-pooling raises a number of questions to which there are no obvious answers: How long should polls be retained in the pool over which one wants to average? How can we deal with the fact that there are sometimes long spells with no new polls, whereas at other times, several polls are published within a day or two? How do we factor in that a change in polling that is reflected across multiple companies is more likely to reflect a true shift in allegiances?

## The method: Bayesian poll-pooling

Bayesian poll-pooling provides a principled solution to these (and other) issues. It was pioneered by Simon Jackman in his 2005 article on “Pooling the polls over an election campaign”. In Jackman’s model, true support for any given party $P$ is latent and follows a random walk: support for $P$ today is identical to what it was yesterday, plus (or minus) some tiny random shock. The true level of support is only ever observed on election day, but polls provide a glimpse into the current state of affairs. Unfortunately, that glimpse is biased by house effects, but if one is willing to assume that house effects average out across pollsters, they can be estimated and subsequently factored into the estimates of true support for any given day.
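The data-generating process the model assumes can be sketched as a toy simulation: a latent random walk, observed through polls that add a firm-specific house effect and sampling noise (Python, all numbers invented):

```python
import random

random.seed(1)

DAYS = 100
true_support = [0.32]                  # latent support for party P, day 0
for _ in range(DAYS - 1):              # random walk: tiny daily shocks
    true_support.append(true_support[-1] + random.gauss(0, 0.001))

# House effects shift every poll by a given firm; assumed to roughly
# cancel out across firms.
house_effects = {"A": 0.015, "B": -0.010, "C": -0.005}

def simulate_poll(day, firm, n=1000):
    """One published poll: latent support + house effect + sampling noise."""
    p = true_support[day] + house_effects[firm]
    se = (p * (1 - p) / n) ** 0.5
    return p + random.gauss(0, se)

# Weekly polls from three firms; the estimation task is to run this
# backwards and recover true_support (and the house effects).
polls = [simulate_poll(d, f) for d in range(0, DAYS, 7) for f in "ABC"]
```

Estimation inverts this: given the published polls, the sampler infers the latent daily series and the house effects jointly.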

The Bayesian paradigm is particularly attractive here because it is flexible and because it easily incorporates the idea that we use polls to continuously update our prior beliefs about the state of the political play. It’s also easy to derive other quantities of interest from the distribution of the main estimates, such as the probability that there is currently enough support for the FDP to enter parliament, and, conditional on this event, that a centre-right coalition would beat a leftist alliance.
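With posterior draws in hand, such probabilities are one-liners: count the draws in which the event occurs. A Python sketch, with made-up normal draws standing in for the real posterior (the 40-per-cent event is purely illustrative):

```python
import random

random.seed(42)

# Stand-ins for posterior draws of (FDP, CDU/CSU) vote shares.
draws = [(random.gauss(0.055, 0.01), random.gauss(0.34, 0.015))
         for _ in range(10_000)]

# Probability that the FDP clears the five per cent threshold:
cleared = [d for d in draws if d[0] >= 0.05]
p_fdp_in = len(cleared) / len(draws)

# Conditional on that event, the probability of some further event,
# e.g. a combined CDU/CSU + FDP share above 40 per cent:
p_cond = sum(1 for fdp, union in cleared
             if fdp + union > 0.40) / len(cleared)
```

Because the draws come from the joint distribution, correlations between parties are handled automatically.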

In my previous misguided attempt to pool the German polls, I departed from Jackman’s model in two ways. First, I added a “drift” parameter to the random walk to account for long-term trends in party support. That was not such a stupid idea as such (I think), but it made the model too inflexible to pick up that voters were ditching the FDP in the last two weeks before the election (presumably CDU supporters who had nursed the idea of a strategic vote for the FDP). Secondly, whereas Jackman’s model has a normal distribution for each party, I fiddled with a multinomial distribution, because Germany has a multi-party system and because vote shares must sum to unity.

The idea of moving to a Dirichlet distribution crossed my mind, but I lacked the mathematical firepower/Bugs prowess to actually specify such a model. Thankfully, I came across this blog, whose author has just done what I had (vaguely) in mind. By the way, it also provides a much better overview of the idea of Bayesian poll aggregation. My own model is basically his latent primary voting intention model (minus the discontinuity).

The one thing I’m not 100% sure about is the “tightness” factor. Like Jackman (and everyone else), the author assumes that most movement in the polls is noise and that true day-to-day changes are almost infinitesimally small. This is reflected in the tightness factor, which he arbitrarily sets to 50000 after looking at his data. Smaller numbers make for more wiggly lines and wider credible intervals, because more of the variability in the data is attributed to true change. Unfortunately, this number does not translate into a readily interpretable quantity of interest (say, a true day-to-day change of 0.5 per cent).
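A rough translation is possible if one assumes the daily innovation is Dirichlet with concentration equal to tightness times yesterday’s shares – an assumption of mine, not necessarily that model’s exact parameterisation. Under it, the implied daily standard deviation of a share $p$ is $\sqrt{p(1-p)/(\text{tightness}+1)}$:

```python
import math

def implied_daily_sd(p, tightness):
    """Std. dev. of one component of a Dirichlet(tightness * shares) draw,
    evaluated at share p -- an assumed parameterisation, for intuition only."""
    return math.sqrt(p * (1 - p) / (tightness + 1))

# At 35 per cent support, tightness = 50000 allows daily moves of
# roughly 0.2 points; tightness = 5000 allows roughly 0.7 points.
sd_tight = implied_daily_sd(0.35, 50_000)
sd_loose = implied_daily_sd(0.35, 5_000)
```

So, if this reading is right, the gap between 50000 and my estimated 3500–10000 is the difference between allowing about a fifth of a point and about two-thirds of a point of true movement per day.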

After playing with smaller and even larger values, I came up with a cunning plan and made “tightness” a parameter in the model. For the first six weeks of polling data, the estimate for tightness is about an order of magnitude lower, in a range between 3500 and 10000. Whether it is a good idea to ask my poor little stock of data for yet another parameter estimate is anyone’s guess, and I will have to watch how this estimate changes, and whether I’m better off fixing it again.

## The data

Data come from seven major polling companies: Allensbach, Emnid, Forsa, FGW, GMS, Infratest Dimap, and INSA. The surveys are commissioned by various major newspapers, magazines, and TV channels. As far as I know, Allensbach is the only company that does face-to-face interviews, and INSA is the only company that relies on an internet access panel. Everyone else is doing telephone interviews. The headline margins are compiled and republished by the incredibly useful wahlrecht.de website: http://www.wahlrecht.de/umfragen/index.htm, which I scrape with a little help from the rvest package.
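The scraping itself is done with rvest in R; for illustration, here is a stdlib-only Python sketch of the table-extraction step, run on a toy table (wahlrecht.de’s real markup and column layout are more involved and change over time):

```python
from html.parser import HTMLParser

class PollTableParser(HTMLParser):
    """Minimal extractor for party/percentage cells of a polls table --
    a toy stand-in for the rvest scraping step."""
    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.rows, self.current = [], []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True
        elif tag == "tr":
            self.current = []          # start a fresh row

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False
        elif tag == "tr" and self.current:
            self.rows.append(self.current)

    def handle_data(self, data):
        if self.in_cell and data.strip():
            self.current.append(data.strip())

html = """<table>
<tr><td>CDU/CSU</td><td>34 %</td></tr>
<tr><td>SPD</td><td>32 %</td></tr>
</table>"""

parser = PollTableParser()
parser.feed(html)
shares = {party: float(pct.replace(" %", "")) / 100
          for party, pct in parser.rows}
```

In practice one would fetch each firm’s page, parse all poll rows, and attach fieldwork dates and sample sizes before feeding the lot into the pooling model.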

## Replication (Updated)

With the vote mostly counted in the US, PS have posted a useful summary of the Political Science Forecasting Models for that infamous election.

By and large, and in neat contrast to the current fad for self-flagellation, the augurs of the discipline have done well. Eight of the ten predictions that were published in PS got the winner of the popular vote right. Not that it would make a difference. Somewhat ironically, Norpoth’s Primary Model, which I had (incorrectly) credited on that gloomy Wednesday morning with predicting a Trump victory, performed worst. But in fairness to HN, his model has by far the longest lead.

Apparently, I said something funny the other day 😉

Which publishers are the most relevant for Radical Right research? Good question.

## Radical Right research by type of publication

Currently, most of the items in The Eclectic, Erratic Bibliography on the Extreme Right in Western Europe (TM) are journal articles. The books/chapters/articles ratios have shifted somewhat over the years, reflecting both general trends in publishing and my changing reading habits, and by now the dominance of journal articles is rather striking.

## The most important journals for Radical Right research (add pinch of salt as required)

One in three of these articles has been published in one of the four apparent top journals for Radical Right research: the European Journal of Political Research, West European Politics, Party Politics, and Acta Politica. I say ‘apparent’ here because this result may be a function of my (Western) Eurocentrism and my primary interest in Political Science and Sociology. Other Social Sciences are underrepresented, and literature from national journals that publish in languages other than English is virtually absent.

But hey: Laying all scruples aside, here is a table of the ten most important journals for Radical Right research:

| Journal | No. of articles |
| --- | --- |
| European Journal of Political Research | 38 |
| West European Politics | 35 |
| Party Politics | 24 |
| Acta Politica | 22 |
| Electoral Studies | 15 |
| Parliamentary Affairs | 13 |
| Patterns of Prejudice | 12 |
| Comparative European Politics | 10 |
| Comparative Political Studies | 10 |
| Government and Opposition | 9 |

Neat, isn’t it?

I did a similar analysis nearly two years ago. Government and Opposition as well as Comparative European Politics are new additions to the top ten (replacing Österreichische Zeitschrift für Politikwissenschaft and Osteuropa), but otherwise, the picture is much the same. So if you publish on the Radical Right and want your research to be noticed, you should probably aim for these journals.

For the past 15 years or so, I have maintained an extensive collection of references on the Radical/Extreme/Populist/New/Whatever Right in Western Europe. Because I love TeX and other command line tools of destruction, these references live in a large BibTeX file. BibTeX is a well-documented format for bibliographic text files that has been around for decades and can be written and read by a large number of reference managers.

Because BibTeX is so venerable, it’s unsurprising that there is even an R package (RefManageR) that can read and write BibTeX files, effectively turning bibliographic data into a dataset that can be analysed, graphed and otherwise mangled to one’s heart’s desire. And so my totally unscientific analysis of the Radical Right literature (as reflected in my personal preferences and interests) is just three lines of code away:

```
library("RefManageR")
ex <- ReadBib("radical-right.bib", check = FALSE)  # file name illustrative
tail(sort(table(unlist(ex$year))), 5)
```
| Year | Publications |
| --- | --- |
| 2014 | 34 |
| 2012 | 38 |
| 2000 | 42 |
| 2002 | 54 |
| 2015 | 57 |

So 2012, 2014 and 2015(!) saw a lot of publications that ended up on my list, but 2000 and particularly 2002 (the year Jean-Marie Le Pen made it into the second round of the French presidential election) were not bad either. 2013 and 2003 (not listed) were also relatively strong years, with 33 publications each.

To get a more complete overview, it’s best to plot the whole time series (ignoring some very old titles):

There is a distinct upwards trend all through the 1990s, a post-millennial decline in the mid-noughties (perhaps due to the fact that I completed a book manuscript then and became temporarily negligent in my collector’s duties, but I don’t think so), and then a new peak during the last five years, undoubtedly driven by recent political events and countless eager postdocs and PhD students. I’m just beginning to understand the structure of the data objects that RefManageR creates from my bibliography, but I think it’s time for some league tables next.

## An update on the state of the Handbook of Electoral Behaviour

The forthcoming Sage Handbook of Electoral Behaviour has just “moved into production”. That is certainly a good thing, but no, I don’t know what that entails exactly either. Editing such a tome is great fun if you observe a small set of simple rules:

1. Pick great authors whose work doesn’t need editing in the first place.
2. Work with great colleagues who do the remaining bits of heavy lifting, and
3. try not to get in their way.

Thanks to my following these golden rules, the book should be out in late 2016.

## Draft chapter: Electoral Research and Technology – free for now

My own contribution has been rather modest: I’ve penned (and finally revised) a chapter on electoral research and technology. That again was a fun exercise, as I go on and on about the highly seductive structure of multi-level and other complex data, the joy of social network analysis, the temptation of spatial regression, and even (in passing) the adventures of Bayesian statistics. The cool thing about being one’s own editor is that there is not much editorial interference.

Now that the book is in “production” (see above), it should be out by the end of the year, but you can read the draft of “Psephology and Technology, or: The Rise and Rise of the Script-Kiddie” here. Heck, there is even a Psephology and Technology PDF available for download.

Taking a walk whilst running two variants of a slightly dodgy LTA in parallel on 64 of this baby’s 35,000-odd cores (to please a grumpy reviewer). Feeling like a proper scientist for a change. Needless to say, the whole thing shut down 35 minutes into the world’s fastest Mplus run, because the wardrobe-sized cooling unit broke down. Never happened on my desktop.

Mogon. Image Credit: ZDV JGU Mainz

On a balmy evening in August, I was lounging in the garden with a dead-tree copy of Perspectives on Politics (as you do), and stumbled across a rather spirited, well-written editorial attack by Jeffrey C. Isaac on the Data Access & Research Transparency (DA-RT) manifesto. So far, I had had only the vaguest awareness of the DA-RT movement (which took off five years ago), presumably because I’m a happy little data junkie for whom most of the demands and ideas make intuitive sense. Nonetheless, I can see the merit of some of the counter-arguments, and Isaac provides some interesting context. Although I went to APSA a couple of weeks later, I did not attend any DA-RT-related panels, and basically forgot about it.

## What is DA-RT, and why should you care?

And now, three months later, controversy is all over what remains of the Political Science blogosphere. Over at the Plot, Isaac repeats his key arguments against DA-RT and comments on some recent developments. If you are not interested in the context provided by his original article, that is the place to look. At the Duck, Jarrod Hayes is also not very fond of DA-RT, stressing the problems it would create for researchers doing qualitative interviews. On the other hand, Nicole Janz, who is on a worthy mission to promote reproducibility in Political Science with her replication blog, mocks a recent “petition”, so far signed by more than 600 colleagues (linked from the article), as “Political Scientists Trying to Delay Research Transparency”. John Patty, who has signed up to DA-RT, gives the petition another beating: “Responding To A Petition To Nobody (Or Everybody)”. There is even a Twitter handle, @DARTsupporters, that posts the odd congratulatory message when another journal editor signs up to DA-RT (follow at your own peril, as this implies supporting DA-RT, according to the bio). I predict much fun and merriment in Political Science in the months to come.