The Polls Are Noisy
Just over five weeks before the Bundestag election, there is much merriment about the current state of play. Support for the Liberals has been consistently below the electoral threshold of five per cent for months, which implies that Merkel’s coalition would not be able to continue after September. Consequently, everyone is very excited about a more recent series of polls, which put the party at exactly five per cent. But even with n=2000, an exact confidence interval would range from 4 to 6 per cent. Add multistage sampling, house effects, and the fact that people do not necessarily know how they will vote in September, and you end up with a lot of noise.
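For readers who want to check the "4 to 6 per cent" figure, here is a minimal sketch of an exact (Clopper-Pearson) interval using only the Python standard library. It assumes a simple random sample in which 100 of 2000 respondents name the party. Real polls are weighted and clustered, so the true uncertainty is larger. The function names are mine, not from any polling toolkit.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(total, 1.0)

def solve(f, target, lo, hi, increasing, iters=60):
    """Bisection for f(p) == target on [lo, hi], where f is monotone in p."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x, n, level=0.95):
    """Exact two-sided interval for a binomial proportion with x successes in n trials."""
    alpha = 1 - level
    eps = 1e-12
    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else solve(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2, eps, 1 - eps, True)
    # Upper bound: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else solve(
        lambda p: binom_cdf(x, n, p), alpha / 2, eps, 1 - eps, False)
    return lower, upper

low, high = clopper_pearson(100, 2000)
print(f"{100 * low:.1f}% to {100 * high:.1f}%")
```

The printed bounds should land close to the 4 and 6 per cent quoted above, which is already enough to straddle the threshold before any of the other sources of noise are added.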
How much noise exactly? The good folks at wahlrecht.de publish marginal distributions for six major companies that regularly conduct random polls. I wrote a little program to collect everything published since January (the exact election date was officially determined on February 8 but was negotiated between the parties in January). Here are the results.
Most polls are in the field for two to seven days, so I anchored them at their midpoints. My current data set spans 31 weeks, with just under 4 polls conducted each week.
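Anchoring a poll at its midpoint is simple date arithmetic. The sketch below is an illustration, not my actual scraper: the function name and the example dates are made up, and rounding down for fieldwork periods with an even number of days is an arbitrary choice.

```python
from datetime import date, timedelta

def fieldwork_midpoint(start, end):
    """Date halfway between the first and last day in the field (rounded down)."""
    return start + timedelta(days=(end - start).days // 2)

# Hypothetical poll that was in the field from Monday to Friday.
anchor = fieldwork_midpoint(date(2013, 8, 5), date(2013, 8, 9))
print(anchor)
```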
Polling the German Election Polls
It’s obvious that there is a lot of variation in these 120 data points, which makes claims that this party has declined while another one surges rather dubious (though they still make excellent headlines). Poll aggregation is one possible and increasingly popular way out of this conundrum, so I decided to get my hands dirty, install (r)jags and bite the Bayesian bullet (something I have meant to do for years).
My model is rather simplistic (I’ll post the code once it has stabilised). I assume that reported voting intentions are distributed multinomial, and that they depend on a) latent party support and b) house effects. I further make the rather heroic assumption that house effects are random with a mean of zero. Latent party support, on the other hand, follows a random walk, possibly with a drift: This week’s support is last week’s support plus some random change due to political events, plus some constant that accounts for steady up- or downward trends.
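To make the assumptions concrete, here is a rough forward simulation of that data-generating process in plain Python. It is not the JAGS model itself. The log scale with a softmax link, the standard deviations, the starting shares, and the placeholder house names are all assumptions chosen for illustration. It only shows how latent support, drift, house effects, and multinomial sampling fit together.

```python
import math
import random

def softmax(xs):
    """Map unconstrained values to positive shares that sum to one."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def simulate_polls(start_shares, weeks, houses, n=2000,
                   walk_sd=0.02, drift=None, house_sd=0.03, seed=1):
    """Simulate weekly polls as (week, house, counts) tuples."""
    rng = random.Random(seed)
    k = len(start_shares)
    drift = drift or [0.0] * k
    # Latent support lives on the log scale (an assumption of this sketch).
    latent = [math.log(s) for s in start_shares]
    # House effects: one zero-mean random offset per pollster and party.
    house_fx = {h: [rng.gauss(0.0, house_sd) for _ in range(k)] for h in houses}
    rows = []
    for week in range(weeks):
        # Random walk with drift: last week's value + constant + random shock.
        latent = [x + d + rng.gauss(0.0, walk_sd) for x, d in zip(latent, drift)]
        for h in houses:
            probs = softmax([x + e for x, e in zip(latent, house_fx[h])])
            # Multinomial draw: n respondents each pick one party.
            draws = rng.choices(range(k), weights=probs, k=n)
            rows.append((week, h, [draws.count(j) for j in range(k)]))
    return rows

# Illustrative starting shares: CDU/CSU, SPD, Greens, Left, FDP, others.
shares = [0.42, 0.26, 0.12, 0.07, 0.05, 0.08]
polls = simulate_polls(shares, weeks=4, houses=["House A", "House B"])
print(polls[0])
```

Fitting the model runs this logic in reverse: given the observed counts, it infers the latent path, the drift, and the house effects.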
The Bayesian framework seems particularly appropriate here as it is technically and conceptually easy to come up with predictions for September 22, but I refrain from incorporating any prior beliefs and put vague distributions on all parameters. As far as I can tell (and that does not mean a lot), the model seems to converge without problems.
Somewhat surprisingly, the 95 per cent credible interval (the shaded area around the trend line) is rather narrow for both the CDU and the SPD, implying that we can learn a lot from pooling many noisy polls. Support for Merkel’s Christian Democrats was largely stable over the last seven months at about 42 per cent. This would make them the strongest party by far, although they are far away from the lofty 50 per cent they reached in some polls in April and June. According to the model, support for the Social Democrats is similarly stable, though at a much lower level of 26 per cent.
The predictions (to the right of the vertical line that marks the beginning of the last week included in the model) are less precise than the estimates, and become more vague as they extend towards election day, but it seems almost certain that the CDU will be the strongest party in the new parliament by a fair margin.
The model is also very confident about levels of support for the smaller parties. Green support peaked in March, but current and predicted levels are still above 12 per cent, which would be an improvement on the 2009 result. But since the SPD is so weak, the probability of a “Red-Green” majority in the next Bundestag is estimated to be (much) less than one per cent.
Support for the Left is estimated at about seven per cent, well above the threshold (the dashed line), but also well below their very strong result in 2009.
Finally, the FDP has shown an upward trend over the last 10 weeks or so and is projected to cross the threshold just in time for the election. The model estimates the probability of the FDP returning to parliament at 67 per cent.
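A probability like this is simply the share of simulated election-day outcomes in which the event occurs. The helper below is a minimal sketch of that counting step. The name `share_of_draws` is mine, and the eight FDP values are invented placeholders, not output from the model.

```python
def share_of_draws(draws, event):
    """Fraction of simulated outcomes for which `event` holds."""
    draws = list(draws)
    return sum(1 for d in draws if event(d)) / len(draws)

# Invented placeholder draws of the FDP's election-day share (per cent).
fdp_draws = [4.2, 4.8, 5.1, 5.6, 4.9, 5.3, 6.0, 5.2]
p_fdp_in = share_of_draws(fdp_draws, lambda share: share >= 5.0)
print(p_fdp_in)
```

With thousands of draws from the posterior predictive distribution instead of eight made-up numbers, the same count gives the figure reported above. Passing a different `event` function covers statements about coalitions in the same way.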
Predicting the Inevitable
In reality, a sufficient number of potential CDU voters might support the FDP for tactical reasons, pushing that number up towards certainty. But the coalition could come to an end even if that manoeuvre succeeds: The odds that the coalition garners more votes than the three left parties together are only slightly better than even at 58 per cent.
A “Red-Red-Green” coalition (or rather a Red-Green government tolerated by the Left), however, seems politically infeasible, suggesting a return to a Grand Coalition led by Angela M. with a subjective probability of at least 90 per cent.
If (if!) I take these estimates seriously for just one moment, that means that the probability of Ms Merkel retaining her office is roughly 96 per cent. Let’s see how the next batch of polls plays havoc with this figure, shall we?
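One way to reproduce that number from the figures above, under some assumptions: Merkel stays if the current coalition wins outright (58 per cent), and if it does not, she stays with a subjective probability of 90 per cent at the head of a Grand Coalition. Combining the two by the law of total probability is a simplification that ignores every other scenario. The variable names are just labels for this sketch.

```python
# Inputs taken from the text above.
p_coalition_majority = 0.58         # coalition beats the three left parties
p_grand_coalition_otherwise = 0.90  # subjective probability if it does not

# Law of total probability over the two scenarios.
p_merkel_stays = (p_coalition_majority
                  + (1 - p_coalition_majority) * p_grand_coalition_otherwise)
print(round(p_merkel_stays, 3))  # 0.958
```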