An Imperial Duck. Because.

My wonderful PhD students are running a series of short online surveys, two of them with a slightly unusual and rather intriguing format. If you read German, do them a favour and click on the link. You should be done in 10 minutes or less. And while you’re about it, share the link with your networks to make this a bit less of a convenience sample.

When I drove home from work a couple of days ago, I noticed a policeman flagging down precisely every tenth car in the other lane and directing the drivers towards a lay-by. He was in uniform, wearing hi-vis gear and his government-issued Walther, so non-compliance was clearly not an issue. The scene was completed by a large billboard, stating that this was no ordinary vehicle spot check but rather a road use survey. I badly want these guys on our team.

Those old enough to remember that Bill Murray had a career before Lost in Translation (or to remember Bill Murray) will instantly recognise this scene: Punxsutawney Phil is predicting six more electoral cycles of political misery for Germany’s Liberal Democrats. Granted that the animal is a bit on the small side, but first, This is not America, and second, the choice of rodent is rather apt: Aren’t we all guinea pigs when it comes to policy making?

Punxsutawney Phil predicting six more cycles of electoral Misery

The hopeful candidate molesting the furry bugger promises that he will listen, not ignore (whom?). He might change his mind once the beast sinks its front teeth into that yummy finger.

The Pirates are running a rather cheap electoral campaign: No faces (models or not) but only drawings in their trademark orange/blue tones. Their stinginess even extends to the meaning of their slogans. I was a bit thrown off by “Borders are so 80”, then discovered the small “er”, so borders are so 1980s, apparently. Well, yes, I get the implication for Europe. But why is there a “Herzschlag” (heartbeat? or heart attack???) between fear and courage, and why would that make me vote for the Pirates? I have a feeling that Literal Campaign Video Clips might become a thing very soon.

Pirates Posters: Say what?

The local Liberal Democrats never fail to amaze me. Just when I thought it could not get any better, I found another gem for my ever-growing collection.

Local Campaigns: The Hour of Amateurs

“Höhenflug” is the act of (figuratively) ascending to some higher plane (not an imminent danger here) but losing touch in the process. “Bodenhaftung” is literally grip (get one, please!) or traction, so best illustrated by sitting on a tractor. What better way to show that you are down to earth (pun intended) and in no way out of touch than riding this nifty little machine in your best dark suit, as any local farmer would? Bonus points for gratuitous use of “frischer Wind” (a breath of fresh air), quite possibly the most overused phrase in German politics and code for not being incumbent.

## The Problem: Assessing Bias without the Data Set

While the interwebs are awash with headline findings from countless surveys, commercial companies (and even some academics) are reluctant to make their raw data available for secondary analysis. But fear not: Quite often, media outlets and aggregator sites publish survey margins, and that is all the information you need.

## The Solution: surveybiasi

After installing our surveybias add-on for Stata, you will have access to surveybiasi. surveybiasi is an “immediate command” (Stata parlance) that compares the distribution of a categorical variable in a survey to its true distribution in the population. Both distributions need to be specified via the popvalues() and samplevalues() options, respectively. The elements of these two lists may be specified in terms of counts, of percentages, or of relative frequencies, as each list is internally rescaled so that its elements sum up to unity. surveybiasi will happily report $k$ $A'_i$s, $B$ and $B_w$ (check out our paper for more information on these multinomial measures of bias) for variables with 2 to 12 discrete categories.
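The rescaling step is easy to picture outside Stata. The following Python sketch is not the package's code; `rescale` is a hypothetical helper that simply divides every element by the sum of the list, which is all that "sum up to unity" requires.

```python
def rescale(values):
    """Rescale a list of non-negative numbers so that it sums to one."""
    total = sum(values)
    if total <= 0:
        raise ValueError("values must have a positive sum")
    return [v / total for v in values]


# The same margins given as rounded percentages (which sum to 99)
# and as relative frequencies (which sum to 0.99).
percentages = rescale([46, 48, 5])
proportions = rescale([0.46, 0.48, 0.05])

print([round(s, 4) for s in percentages])
print([round(s, 4) for s in proportions])
```

Both calls yield the same shares, which is why the unit in which the margins are reported does not matter.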

## Bias in a 2012 CBS/NYT Poll

A week before the 2012 election for the US House of Representatives, 563 likely voters were polled for CBS/The New York Times. 46 per cent said they would vote for the Republican candidate in their district, 48 per cent said they would vote for the Democratic candidate. Three per cent said it would depend, and another two per cent said they were unsure, or refused to answer the question. In the example these five per cent are treated as “other”. Due to rounding error, the numbers do not exactly add up to 100, but surveybiasi takes care of the necessary rescaling.

In the actual election, the Republicans won 47.6 and the Democrats 48.8 per cent of the popular vote, with the rest going to third-party candidates. To see if these differences are significant, run surveybiasi like this:


. surveybiasi , popvalues(47.6 48.8 3.6) samplevalues(46 48 5) n(563)
------------------------------------------------------------------------------
catvar |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
A'           |
1 |  -.0426919   .0844929    -0.51   0.613     -.208295    .1229111
2 |  -.0123999   .0843284    -0.15   0.883    -.1776805    .1528807
3 |   .3375101   .1938645     1.74   0.082    -.0424573    .7174776
-------------+----------------------------------------------------------------
B            |
B |   .1308673   .0768722     1.70   0.089    -.0197994    .2815341
B_w |   .0385229   .0247117     1.56   0.119    -.0099112    .0869569
------------------------------------------------------------------------------

Ho: no bias
Degrees of freedom: 2
Chi-square (Pearson) = 3.0945337
Pr (Pearson) = .21282887
Chi-square (LR) = 2.7789278
Pr (LR) = .24920887




Given the small sample size and the close match between survey and electoral counts, it is not surprising that there is no evidence for statistically or substantively significant bias in this poll.
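The point estimates and test statistics in the output above can be checked by hand. The following Python sketch is not part of surveybias. It assumes that A' for each category is the log of the ratio between the odds in the sample and the odds in the population, that B is the plain mean and B_w the population-weighted mean of the absolute A' values, and that the implied respondent counts are rounded to whole numbers. These assumptions are inferred from the output and the description of the measures, `bias_measures` is a hypothetical helper, and standard errors are not attempted.

```python
import math


def shares(values):
    """Rescale a list so that it sums to one."""
    total = sum(values)
    return [v / total for v in values]


def bias_measures(popvalues, samplevalues, n):
    """Return A' per category, B, B_w and the Pearson and LR chi-squares."""
    pop = shares(popvalues)
    # Assumption: respondent counts implied by the margins are rounded.
    counts = [round(s * n) for s in shares(samplevalues)]
    n_eff = sum(counts)
    samp = [c / n_eff for c in counts]

    # A'_i: log of (odds in the sample / odds in the population).
    a = [math.log((p / (1 - p)) / (v / (1 - v))) for p, v in zip(samp, pop)]
    b = sum(abs(x) for x in a) / len(a)              # unweighted mean
    b_w = sum(v * abs(x) for v, x in zip(pop, a))    # weighted by true shares

    expected = [n_eff * v for v in pop]
    pearson = sum((c - e) ** 2 / e for c, e in zip(counts, expected))
    lr = 2 * sum(c * math.log(c / e) for c, e in zip(counts, expected) if c > 0)
    return a, b, b_w, pearson, lr


a, b, b_w, pearson, lr = bias_measures([47.6, 48.8, 3.6], [46, 48, 5], 563)
print([round(x, 4) for x in a], round(b, 4), round(b_w, 4))
print(round(pearson, 3), round(lr, 3))
```

Under these assumptions the figures should agree with the table to several decimal places. The p-values additionally require the chi-square distribution function, which is left out to keep the sketch free of dependencies.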

An alternative approach is to follow Martin, Traugott and Kennedy (2005) and ignore third-party voters, undecided respondents, and refusals. This requires minimal adjustments: $n$ is now 535 as the analytical sample size is reduced by five per cent, while the figures representing the “other” category can simply be dropped. Again, surveybiasi internally rescales the values accordingly:


. surveybiasi , popvalues(47.6 48.8) samplevalues(46 48) n(535)
------------------------------------------------------------------------------
catvar |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
A'           |
1 |  -.0162297   .0864858    -0.19   0.851    -.1857388    .1532794
2 |   .0162297   .0864858     0.19   0.851    -.1532794    .1857388
-------------+----------------------------------------------------------------
B            |
B |   .0162297   .0864858     0.19   0.851    -.1532794    .1857388
B_w |   .0162297   .0864858     0.19   0.851    -.1532794    .1857388
------------------------------------------------------------------------------

Ho: no bias
Degrees of freedom: 1
Chi-square (Pearson) = .03521623
Pr (Pearson) = .85114329
Chi-square (LR) = .03521898
Pr (LR) = .85113753



Under this two-party scenario, $A'_1$ is identical to Martin, Traugott, and Kennedy’s original $A$ (and all other estimates are identical to $A$’s absolute value). Its negative sign points to the (tiny) anti-Republican bias in this poll, which is of course even less significant than in the previous example.
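The equivalence is easy to verify numerically. The sketch below assumes that Martin, Traugott, and Kennedy's A is the log of the Republican/Democratic ratio in the poll divided by the same ratio in the election, and that the implied respondent counts are rounded to whole numbers. Both are assumptions of this illustration, not code taken from surveybias.

```python
import math

n = 535
poll = [46, 48]          # Republican, Democratic (per cent of all respondents)
election = [47.6, 48.8]  # Republican, Democratic (per cent of the popular vote)

# Assumption: respondent counts implied by the margins are rounded.
rep, dem = (round(p / sum(poll) * n) for p in poll)

# Martin, Traugott and Kennedy's A: log of the ratio of two R/D ratios.
mtk_a = math.log((rep / dem) / (election[0] / election[1]))

# A' for the first category: log of the ratio of two odds.
p = rep / (rep + dem)
v = election[0] / sum(election)
a_prime_1 = math.log((p / (1 - p)) / (v / (1 - v)))

print(round(mtk_a, 7), round(a_prime_1, 7))
```

With only two categories, the odds of one party are simply its ratio to the other party, so the two expressions coincide.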

There may be a European election on, but around here, the big one is the local elections. In the plural: At my last count, I will have to vote for town mayor, town council, municipal mayor, municipal council, district council and perhaps even leader of the district council, though I’m not 100 per cent sure re the last one.

Important as they may be, local elections are the domain of the amateurs, as the old saying goes amongst German political scientists. To make things slightly worse, councillors are elected under an open list system (with no threshold), so there are some incentives to cultivate a personal vote, and quite some margin for error. So far, I have spotted few real howlers but then the Liberal Democrats (FDP), wiped out in the last Bundestag election and poised to do badly in the EP2014, decided to go for this year’s Bad Pun Award.

Another Campaign Poster from Hell

So the guy on the poster is literally fishing (or at least holding a rod while wearing a suit) in clear water (im Klaren, which, if you push it, could be read as a pun-within-the-pun on alcohol), as opposed to fishing in murky waters (im Trüben fischen). The latter used to mean “cheating” but has also acquired connotations of being lost. Say what?

But there is more. The candidate is also “ortsnah” (local, in a technical sense that never, ever applies to persons), as opposed to “weltfremd” (unworldly, stuck inside an ivory tower). One might argue that, on some level loosely attached to logic, “ortsnah” and “weltfremd” are not exactly opposites but rather awkwardly related concepts. But quite possibly someone sensed a tension between “ort” (the local place) and “welt” (world) and decided that nothing says “local guy” quite like a misguided rhetorical flourish. With PR guys like this, who needs political enemies?


In a recent publication (Arzheimer & Evans 2014), we propose a new multinomial measure $B$ for bias in opinion surveys. We also supply a suite of ado files for Stata, surveybias, which plugs into Stata’s framework for estimation programs and provides estimates for this and other measures along with their standard errors. This is the first instalment in a mini-series of posts that show how our commands can be used with real-world data. Here, we analyse the quality of a single French pre-election poll.

## Installing surveybias for Stata

You can install surveybias directly from this website (`net from http://www.kai-arzheimer.com/stata`), but it may be more convenient to install from SSC: `ssc install surveybias`

## Assessing Bias in Presidential Pre-Election Surveys

```
. use onefrenchsurvey
```

The French presidential campaign of 2012 attracted considerable political interest. Accordingly, numerous surveys were fielded. onefrenchsurvey.dta (included in our package) contains data from one of them, taken a couple of weeks before the actual election. The command I will discuss in this post is called (*drumroll*) surveybias and is the main workhorse in our package. surveybias needs exactly one variable as a mandatory argument: the voting intention as measured in the survey, which is appropriately called “vote” in this example. Moreover, surveybias requires an option through which you must submit the true distribution of this variable. Absolute or relative frequencies will do just as well as percentages, since surveybias will automatically rescale any of them.

Ten candidates stood in the first round of the French presidential election in 2012, but only two of them would progress to the run-off. While surveybias can handle variables with up to twelve categories, requesting estimates for very small parties increases the computational burden, may lead to numerically unstable estimates and is often of little substantive interest. In onefrenchsurvey.dta, support for the two lowest-ranking candidates has therefore been recoded to a generic “other” category. The first-round results, which serve as a yardstick for the accuracy of the poll, are submitted in popvalues(). For other options, have a look at the documentation.


. surveybias vote, popvalues(28.6 27.18 17.9 9.13 11.1 2.31 1.15 1.79 0.8)
______________ ________________________________________________________________
vote       Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
______________ ________________________________________________________________
A´
Hollande   -.0757639   .0697397    -1.09   0.277    -.2124512    .0609233
Sarkozy    .0477294   .0689193     0.69   0.489    -.0873499    .1828087
LePen   -.0559812   .0823209    -0.68   0.496    -.2173271    .1053648
Bayrou    .3057213   .0953504     3.21   0.001     .1188379    .4926047
Melenchon   -.0058251   .0988715    -0.06   0.953    -.1996096    .1879594
Joly   -.0913924   .2154899    -0.42   0.671    -.5137449      .33096
Poutou   -.8802476   .4482915    -1.96   0.050    -1.758883   -.0016125
DupontAigna   -.5349338   .3031171    -1.76   0.078    -1.129032    .0591648
other    .1841789   .3177577     0.58   0.562    -.4386147    .8069724
______________ ________________________________________________________________
B
B    .2424193   .0767485     3.16   0.002     .0919949    .3928437
B_w    .0965423    .039022     2.47   0.013     .0200605    .1730241
______________ ________________________________________________________________

Ho: no bias
Degrees of freedom: 8
Chi-square (Pearson) = 18.695468
Pr (Pearson) = .01657592
Chi-square (LR) = 19.540804
Pr (LR) = .01222022



The top panel lists the $A'_i$ for the first eight candidates plus the “other” category alongside their standard errors, z- and p-values, and confidence intervals. $A'_i$ is a party-specific, multi-party version of Martin, Traugott, and Kennedy’s measure $A$ and reflects bias for/against any specific party. By conventional standards ($p < 0.05$), only two of these values are significantly different from zero: Support for François Bayrou was overestimated ($A'_4 = 0.31$) while support for Philippe Poutou was underestimated ($A'_7 = -0.88$).

Poutou was the little-known candidate for the tiny “New Anticapitalist Party”. While he received more than twice the predicted number of votes ($1/\exp(-0.88) \approx 2.4$), the case of Bayrou is more interesting. Bayrou, a centre-right candidate, stood in the previous 2007 election and came third with a very respectable result of almost 19 per cent, taking many political observers by surprise. In 2012, when he stood for a new party that he had founded immediately after the 2007 election, his vote effectively halved. But this is not fully reflected in the poll, which overestimates his support by roughly a third ($\exp(0.31) \approx 1.36$). This could be due to (misguided) bandwagon effects, sampling bias, or political weighting of the poll by the company.
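The two back-of-the-envelope conversions can be reproduced by exponentiating the coefficients from the table. This is only a sketch of the arithmetic. Strictly speaking, the A' values sit on a log odds-ratio scale, so reading the results as ratios of vote shares is an approximation that works best for small parties.

```python
import math

a_bayrou = 0.3057213   # A' for Bayrou, copied from the output
a_poutou = -0.8802476  # A' for Poutou, copied from the output

# Factor by which the poll overstates Bayrou's odds.
over_bayrou = math.exp(a_bayrou)

# Factor by which Poutou's actual odds exceed the polled odds.
under_poutou = 1 / math.exp(a_poutou)

print(round(over_bayrou, 2), round(under_poutou, 2))
```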

The lower panel of the output lists $B$ and $B_w$, a weighted version of our measure. $B$, the unweighted average of the $A'_i$s’ absolute values, is much higher than $B_w$. This is because the estimates for all the major candidates with the exception of Bayrou were reasonably good. While support for Poutou and also for Dupont-Aignan was underestimated by large factors, $B_w$ heavily discounts these differences, because they are of little practical relevance unless one is interested specifically in splinter parties.
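Both summary measures can be recomputed from the top panel of the output. The sketch below copies the A' values from the table and takes the rescaled popvalues() as weights. That the weights are the true vote shares is an assumption of this illustration, although it does reproduce the reported figures.

```python
# A' values copied from the top panel, in the order of the output.
a_prime = [-0.0757639, 0.0477294, -0.0559812, 0.3057213, -0.0058251,
           -0.0913924, -0.8802476, -0.5349338, 0.1841789]

# First-round results as passed to popvalues(); they sum to 99.96.
popvalues = [28.6, 27.18, 17.9, 9.13, 11.1, 2.31, 1.15, 1.79, 0.8]

# B: unweighted mean of the absolute A' values.
b = sum(abs(a) for a in a_prime) / len(a_prime)

# B_w: mean of the absolute A' values, weighted by the true vote shares.
total = sum(popvalues)
b_w = sum(v / total * abs(a) for v, a in zip(popvalues, a_prime))

print(round(b, 4), round(b_w, 4))
```

The large absolute values for Poutou and Dupont-Aignan each count for one ninth of $B$ but receive weights of less than two per cent in $B_w$, which accounts for most of the gap between the two.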

As outlined in the article in which we derive $B$, the sampling distribution of $B$ (and of $B_w$) is non-normal, rendering the p-value of 0.002 somewhat dubious. surveybias therefore performs additional $\chi^2$ tests based on the Pearson and the likelihood-ratio formulae, whose results are listed below the main table. In this case, however, both tests agree that the null hypothesis of no bias can indeed be rejected.

While their p-values are clearly higher than the one resulting from the inappropriate z-test on $B$, they are close to the p-value for $B_w$. This is to be expected, because the upward bias and the non-normality become less severe as the number of categories increases, and because the weighting reduces the impact of differences that are small in absolute numbers but associated with large values on the log-ratio scale.

surveybias leaves the full variance-covariance matrix behind for your edification. Parameter estimates, chi-square values and probabilities are available, too, so that you can easily test all sorts of interesting hypotheses about bias in this poll.