Something old, something new, something borrowed, something true? A comment on Lister’s ‘Institutions, Inequality and Social Norms: Explaining Variations in Participation’

Kai Arzheimer*

Department of Government, University of Essex, Wivenhoe Park, Colchester CO4 3SQ

Abstract: Michael Lister makes a useful contribution to the discussion on aggregate variables that foster or depress turnout by drawing attention to societal factors, but his analysis is fraught with methodological problems. While his article builds on an interesting theoretical argument about the impact of institutions on attitudes, his claims about causal relationships are not backed by data. There is no rationale for the selection of countries, and most explanatory variables are actually constant within countries. The specification of the model is problematic in many ways. A careful re-analysis shows that the t-values reported in Lister’s article are far too large, while the estimates are unstable and dependent on the selection of observations. Moreover, the effects are trivial in terms of their political implications. There is no robust evidence for a universal, politically relevant relationship between inequality and turnout.

Keywords: (1) Turnout, (2) Time-Series-Cross-Sectional data, (3) Inequality, (4) Re-Analysis

Preprint. Definitive version will appear in the British Journal of Politics and International Relations (BJPIR) 2008, pp. 689-697, doi: 10.1111/j.1467-856x.2008.00336.x.


Replication data for ‘Something old, something new, something borrowed, something true? …’: hdl:1902.1/10558 UNF:3:2UNq+CMPvmjb7Aat9NvpKw==

1 Introduction

During the last 15 years, the (aggregate) analysis of electoral turnout in liberal democracies has become a minor industry. A recent survey of the relevant literature (Geys 2006) lists no fewer than 83 empirical studies that relate turnout to a plethora of institutional, political and social factors. Amongst these, population size, the closeness of the respective contest, and (a rather less surprising finding) compulsory voting emerge as the most important independent variables. Michael Lister’s (2007) recent article in this journal is a valuable addition to this discussion because, by focusing on social inequality, he draws our attention to a whole host of societal factors that have by and large been neglected so far. Moreover, Lister’s contribution is one of the few studies that analyse turnout over time and in a cross-national perspective, whereas the majority of analyses look at subnational units, often in a cross-sectional perspective.

There are, however, a number of methodological and substantive issues with Lister’s analysis that call the validity of his findings into question. First, Lister’s account of causal relationships is highly problematic; second, the methodology is not appropriate given the low and unequal number of elections per year; third, most variables in the model are constant or near-constant within countries; and fourth, even if there is a statistically significant relationship between inequality and turnout, it is trivial. In what follows, I will address these points in turn.

2 Is there an effect of inequality on turnout?

2.1 Causality


[Figure 1 diagram. Macro level: welfare state institutions. Micro level: internalised norms/expectations, electoral participation. Paths are labelled (a) to (f); path (e) is annotated ‘+/- ?’.] Squares represent observed variables, ovals represent variables for which there are no data. The solid line is the single observed relationship, dashed lines represent hypothesised relationships, and the dotted line represents the confounding effects of inequality on attitudes.

Figure 1: Observed and unobserved variables and relationships in the causal chain

Lister’s central argument is that the institutions of the welfare state shape citizens’ expectations (or norms) and thereby their political behaviour (Lister 2007, 25). More specifically, he argues that welfare state institutions which are based on universalist principles provide ‘more support for norms of solidarity’ (Lister 2007, 25). These norms encourage electoral participation both directly and indirectly. Means-tested welfare programs, on the other hand, have opposite effects (Lister 2007, 25). Building on Coleman’s (1994, 7) framework for sociological explanation, his argument can be reconstructed in a slightly simplified way by employing three causal statements (see Figure 1):

  (a) Features of the welfare state (a macro-level variable) affect internalised norms and expectations, i.e. individual attitudes.
  (b) These individual attitudes have an impact on an individual’s decision to participate in a national election.
  (c) These individual decisions constitute turnout, another macro-level variable.

Amongst these, only the third statement is unproblematic since it involves a purely mechanical aggregation (given that in liberal democracies, people are normally not prevented from voting in any systematic way). Statements (a) and (b), on the other hand, are rather bold claims about the consequences and antecedents of individual variables which can never be proven right or wrong on the basis of macro-data. Ever since Robinson (1950) published his famous paper on ecological correlation, social scientists have struggled with the problem of ecological fallacies, i.e. the impossibility of deriving valid conclusions about individual behaviour from aggregate measures.
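The limits of aggregate data can be made concrete with the simple method of bounds known from the ecological inference literature. The following is a minimal sketch for a hypothetical polity: the figures (half of the electorate holding the norm, 80 per cent turnout) and the function name are illustrative assumptions, not values taken from Lister’s data.

```python
def turnout_bounds(share_with_norm, turnout):
    """Logical bounds on turnout among citizens who hold the norm.

    Only the two aggregate margins are known; every individual-level rate
    inside the returned interval is compatible with them.
    """
    if not 0 < share_with_norm <= 1 or not 0 <= turnout <= 1:
        raise ValueError("shares must lie in (0, 1] and [0, 1]")
    # Lowest rate: as many voters as possible come from the other group.
    lower = max(0.0, (turnout - (1 - share_with_norm)) / share_with_norm)
    # Highest rate: as many voters as possible are norm holders.
    upper = min(1.0, turnout / share_with_norm)
    return lower, upper


# Hypothetical polity: 50 per cent norm holders, 80 per cent turnout.
low, high = turnout_bounds(share_with_norm=0.5, turnout=0.8)
print(f"turnout among norm holders lies between {low:.2f} and {high:.2f}")
```

Even when both margins are known, the aggregate figures leave a wide range of individual-level behaviour open. When the share of norm holders is itself unobserved, not even these bounds can be computed.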

Even the most advanced statistical techniques in the field that aim at making probabilistic statements about the likely strength of relationships between micro-level variables (say, race and voting in a two-party competition; see King 1997) rely on information about the distribution of micro-level variables on the macro-level. In the absence of such information on the distribution of individual norms and expectations, nothing can be said about the validity of statements (a) and (b).

Moreover, the analysis presented by Lister relies on another unobserved relationship, namely the causal connection between the institutions of the welfare state and inequality between households (d). While the nature of the welfare state’s institutions at any given point t (‘universalistic’ vs. ‘liberal’ or ‘minimal’ arrangements) will arguably have a substantial effect on inequality between households at t, it will hardly determine the current level of inequality completely. Rather, a whole host of other factors, including the global and the national economy, the system and level of taxation, the previous level of inequality at points t - 1, t - 2, as well as the previous nature of welfare state institutions and all sorts of unintended consequences and side-effects of previous policy, will affect the current level of inequality, making this a rather crude measure of welfare state arrangements.

Finally, over and above serving as an indicator for welfare state arrangements, inequality in itself can easily have a positive or a negative impact on one’s internalised norms and attitudes, thereby either masking or exaggerating the importance of causal effects that work along path (a). On the one hand, very low and falling levels of inequality (which are presumably associated with very high tax rates) could encourage the parties of the centre and the right to mobilise the middle classes, which would ceteris paribus lead to an increase in turnout. On the other hand, a high level of inequality would provide the working class with an incentive to vote in order to achieve a more comprehensive welfare state: this is the logic of the ‘democratic class struggle’ (Anderson and Davidson 1943).¹

To summarise, while Lister’s article builds on a complex framework involving three aggregate and two micro-level variables, claims about three causal relationships which are crucial for the argument and a fourth which can potentially distort the results are not and cannot be backed by data. Therefore, any conclusions from the analysis are confined to claims about the relationship between inequality and turnout on the aggregate level (i.e. ‘polities which face a high level of inequality will ceteris paribus experience a higher/lower turnout than those with a more egalitarian distribution of resources’).

2.2 Data and Modelling

2.2.1 Data

The analysis presented by Lister relies chiefly on a single source which is in the public domain: the ‘Comparative Welfare States Data Set’ (Huber et al. 2004) that compiles information for 18 countries from a variety of sources, covering the period from 1960 to 2000. This data set provides information on turnout (VTURN, drawn from Mackie and Rose 1982 and the reports in the European Journal of Political Research) as well as on a number of control variables such as a (chain) index of GDP per capita (RGDPC), the strength of bicameralism (STRBIC), the presence and strength of federalism (FED),² the proportionality of the electoral system (SINGMEMD), and whether the respective country has a presidential system (PRES). These are merged with information on the focal independent variable income inequality (INEQ), which comes from the University of Texas Inequality Project (2004), and a report on compulsory voting (COMPVOTE) which was compiled by IDEA (Gratschew 2001).

While Lister dismisses welfare state spending data very quickly, comparing income inequality across time and countries is full of pitfalls (Atkinson and Brandolini 2000). There is no discussion of the quality and particular features of the data from the University of Texas Inequality Project whatsoever, and alternative data sources such as the Luxembourg Income Study (Atkinson 2004) that might well be better suited for the research problem at hand are not even considered.

More generally, relying on data sets in the public domain has clear advantages in terms of availability and replication, yet it imposes equally clear restrictions on the selection of observations and the time-frame (1963-93), which are not addressed in the text. This notwithstanding, it would have been helpful to discuss the rationale for not including Norway and New Zealand in the analysis, although these countries are covered by the sources, since the inclusion/exclusion of a single country can substantially affect the results of the regression model (see section 2.2.3 below).³


Variable     β              x (mean)      β × x
C            -2.449         1.000         -2.449
COMPVOTE     0.216          0.225         0.077
SINGMEMD     -0.004         0.574         -0.002
FED          -0.031         0.527         -0.017
PRES         -0.126         0.155         -0.020
STRBIC       0.115          0.698         0.081
RGDPC        -4.060×10^-6   1.559×10^4    -0.063
INEQ         -0.015         33.373        -0.517
VTURNLAG     0.055          83.054        4.551
Sum                                       1.613

Table 1: Mean turnout rate implied by Lister’s model

To replicate Lister’s findings as closely as possible, a data set (available from hdl:1902.1/10558 UNF:3:2UNq+CMPvmjb7Aat9NvpKw==) was constructed in the following way: from the Comparative Welfare States Data Set, the 15 countries under study were selected. For those, all election years between 1963 and 1993 were retained (averaging over the 1974/1982 elections for the UK and Ireland), resulting in 136 observations with non-missing values for turnout, federalism, presidentialism, bicameralism, and GDP per capita.

Best efforts notwithstanding, it proved impossible to reproduce Lister’s findings exactly, although the differences are fairly small.⁴ More troubling is the fact that at first glance, the coefficients reported by Lister do not seem to add up. If one plugs in the mean values⁵ for all the independent variables, the expected turnout is a staggering 161 per cent (see Table 1). However, the magnitude of the coefficients and the constant reveals that the author has converted turnout⁶ from its original percentage scale to a relative frequency scale and then applied a logit transformation to the new variable to account for the fact that the dependent variable is bound to the interval [0, 100].⁷
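The arithmetic behind this reading can be retraced with a short sketch. The coefficients and means are copied from Table 1; the dictionary and function names are illustrative, and because the printed coefficients are rounded to three decimals, the recomputed sum deviates slightly from the 1.613 shown in the table.

```python
import math

# (coefficient, sample mean) pairs as printed in Table 1.
TABLE_1 = {
    "C": (-2.449, 1.000),
    "COMPVOTE": (0.216, 0.225),
    "SINGMEMD": (-0.004, 0.574),
    "FED": (-0.031, 0.527),
    "PRES": (-0.126, 0.155),
    "STRBIC": (0.115, 0.698),
    "RGDPC": (-4.060e-6, 1.559e4),
    "INEQ": (-0.015, 33.373),
    "VTURNLAG": (0.055, 83.054),
}


def inverse_logit(z):
    """Map a value on the logit scale back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Linear predictor evaluated at the means of all independent variables.
linear_predictor = sum(beta * mean for beta, mean in TABLE_1.values())

# Misreading the predictor as a proportion yields a turnout above 100 per cent.
naive_percent = 100 * linear_predictor

# Reading it as a logit yields a plausible turnout rate.
implied_percent = 100 * inverse_logit(linear_predictor)

print(f"linear predictor: {linear_predictor:.3f}")
print(f"read as a proportion: {naive_percent:.0f} per cent")
print(f"read as a logit: {implied_percent:.1f} per cent")
```

Read as a logit, the predictor translates into a turnout rate in the low eighties, which is in line with the mean of lagged turnout reported in Table 1.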

Accordingly, lagged turnout (VTURNLAG) was constructed by taking the turnout from the previous election for all election years as outlined in footnote 4, while LOGITVTURN was constructed as

$$\mathrm{LOGITVTURN} = \ln\left(\frac{\mathrm{TURNOUT}/100}{1 - \mathrm{TURNOUT}/100}\right).$$

Then, a variable was created that reflects Bingham Powell’s assessment of the proportionality of each electoral system.⁸ Compulsory voting is coded as a dummy variable which takes the value ‘1’ for Australia, Belgium, and Italy and ‘0’ for all other countries. Finally, information on inequality from the University of Texas Inequality Project (2004) was matched to the data set. Since this information is missing for four French, one Italian and two British election years, the number of observations is further reduced to 129.

2.2.2 Inappropriate Methodology

The data constitute a Time-Series-Cross-Sectional (TSCS) or panel arrangement: (very short) time series from n = 15 countries are pooled and analysed jointly. In political science, this design became extremely popular after Beck and Katz (1995, 1996) suggested that the familiar Ordinary Least Squares (OLS) estimator can be applied to this specific data structure as long as the standard errors are ‘panel corrected’ (PCSE) to account for the dependencies amongst observations.

Following Beck and Katz (1995, 636), a generic TSCS model can be written as

$$y_{i,t} = x_{i,t}\beta + \varepsilon_{i,t} \quad \text{with } i = 1, \ldots, N;\; t = 1, \ldots, T \tag{1}$$

where y is the dependent variable, x is a vector of independent variables (including the constant), ε is a random error term, and observations are indexed by unit/country (i) and time (t). With TSCS data, the standard assumptions of regression analysis are likely to be violated as one would expect the ε to be ‘non-spherical’, that is, contemporaneously correlated, heteroscedastic, and serially correlated (Beck and Katz 1995, 636), rendering ‘raw’ standard errors invalid and thereby giving rise to confidence intervals that are too narrow and significance tests that are too generous.⁹

(Positive) contemporaneous correlation, which is a consequence of unobserved factors increasing or decreasing turnout in several countries at the same time, cannot be ruled out completely but is unlikely to pose huge problems for the analysis since the unit of observation is the national election.¹⁰ On the other hand, heteroscedasticity (the variance of ε is greater in country i than in country i + 1) is very likely to occur in the case of turnout: in countries where voting is compulsory, the variance of ε is bound to be smaller than in other polities.

Finally, the presence of (positive) serial correlation (the impact of some unobserved factor that affects turnout in country i at time t will still be felt at t + 1 and possibly t + 2, t + 3, …) can be taken for granted. Like most analysts of TSCS data, Lister follows the suggestions by Beck and Katz and accounts for this problem by including the lagged dependent variable (LDV).

This leads, however, to an intricate complication. The length of the election period varies both over time and across countries, e.g. it is fixed at four years in the US, comes empirically very close to the same value in the UK and in Italy, and varies between one year and five years in Canada (where there is a moderate upward trend in the duration of the election period). As a consequence, the autocorrelation of ε will also vary across countries and over time. The approach chosen by Lister does not deal with this, and even if it did, estimating a multitude of autocorrelations poses obvious problems, especially given the low N and T. While one could hope that the inclusion of the LDV somehow ameliorates the situation, exactly the same problems apply to the coefficient for the LDV which should again vary with the length of the election period. Therefore, any findings should be interpreted with extra caution.
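Why unequal election periods matter can be illustrated with a small sketch. It assumes, purely for illustration, that the unobserved factors follow a first-order autoregressive process on an annual scale with a hypothetical persistence of 0.8; neither the process nor the value is estimated from the turnout data.

```python
ANNUAL_RHO = 0.8  # hypothetical annual persistence, chosen for illustration only


def implied_autocorrelation(years_between_elections, annual_rho=ANNUAL_RHO):
    """Error correlation between consecutive elections under an annual AR(1)."""
    if years_between_elections < 1:
        raise ValueError("elections must be at least one year apart")
    # Under an annual AR(1), the correlation decays geometrically with the gap.
    return annual_rho ** years_between_elections


# Election periods of one to five years, the range observed in Canada.
for gap in range(1, 6):
    print(gap, round(implied_autocorrelation(gap), 3))
```

Under this assumption, the correlation between consecutive elections after a five-year period is far weaker than after a one-year period, so a single pooled coefficient for the LDV averages over quite different degrees of persistence.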


Variable     (1) PCSE     (2) Bootstrap   (3) GEE
             LOGITVTURN   LOGITVTURN      LOGITVTURN
COMPVOTE     0.427***     0.427*          0.360*
             (3.498)      (2.223)         (2.505)
SINGMEMD     -0.005       -0.005          -0.021
             (-0.156)     (-0.109)        (-0.545)
FED          0.036        0.036           0.022
             (0.721)      (0.631)         (0.429)
PRES         -0.122       -0.122          -0.095
             (-1.488)     (-1.208)        (-1.268)
STRBIC       0.062        0.062           0.070
             (1.557)      (1.079)         (1.586)
RGDPC        -0.000       -0.000          -0.000
             (-1.387)     (-1.058)        (-0.547)
INEQ         -0.024**     -0.024          -0.021
             (-3.207)     (-1.872)        (-1.599)
VTURNLAG     0.056***     0.056***        0.059***
             (10.142)     (7.145)         (7.709)
Constant     -2.130***    -2.130*         -2.516**
             (-3.326)     (-2.262)        (-2.674)
R2           0.882        0.882
n            129          129             129

* p<0.05, ** p<0.01, *** p<0.001.
Instead of standard errors, t-values are given in brackets to maximise the comparability with Lister’s findings. PCSE were estimated using the xtpcse procedure in Stata 10 with the casewise option for the computation of the covariance matrix. The number of replications for the bootstrap is 200. GEE estimates assume a first-order autoregressive process for the errors. GEE standard errors are based on the ‘robust’ (Huber-White) estimate for the variance.

Table 2: A replication of Lister’s model with Panel Corrected and Bootstrapped standard errors

Given the likely presence of heteroscedasticity and autocorrelation, applying the corrections outlined by Beck and Katz seems to be a sensible strategy at first glance. However, the approach by Beck and Katz was developed for balanced panels consisting of, say, 10 to 40 time periods (Beck and Katz 1995, 640-642; Beck 2007, 97). In the data set compiled for this analysis, T ≥ 10 in only six countries, with a maximum of 13 in Denmark, while the number of panel waves is just 7 to 9 in five other countries and extremely low (4 to 6) in the remaining four. Under these conditions, PCSEs are not guaranteed to perform well (see Shor et al. 2007 for a review of the associated problems).

As a simple safeguard, a non-parametric bootstrapping procedure (Efron and Tibshirani 1993) was applied, that is, 200 samples of n = 129 were drawn from the original data set (with replacement), and the analysis was repeated for each of these samples, thereby simulating the process that generated the data. Since each of these samples is slightly different from the others, the parameter estimates will vary, too. This variation generally provides a realistic approximation for the standard error in circumstances where the distributional assumptions might not hold. The results are shown in column 2 of Table 2. Compared with the first column, the t-values are substantially reduced, rendering all independent variables except compulsory voting and the LDV insignificant.
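The logic of the resampling procedure can be sketched in a few lines. The example below is a deliberately simplified illustration, not the Stata routine used for Table 2: it uses a single regressor, synthetic stand-in data with an assumed true slope of -0.02, and hypothetical function names. Only the number of observations (129) and replications (200) mirror the text.

```python
import random
import statistics


def ols_slope(xs, ys):
    """Slope of a bivariate least-squares regression of y on x."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def bootstrap_se(xs, ys, replications=200, seed=1):
    """Standard deviation of the slope across resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(replications):
        # Draw n observations (rows) with replacement and re-estimate.
        idx = [rng.randrange(n) for _ in range(n)]
        slopes.append(ols_slope([xs[i] for i in idx], [ys[i] for i in idx]))
    return statistics.stdev(slopes)


# Synthetic stand-in data: 129 observations, assumed true slope of -0.02.
data_rng = random.Random(42)
x = [data_rng.uniform(25, 45) for _ in range(129)]
y = [2.5 - 0.02 * xi + data_rng.gauss(0, 0.5) for xi in x]

slope = ols_slope(x, y)
se = bootstrap_se(x, y)
print(f"slope={slope:.3f}, bootstrap SE={se:.3f}, t={slope / se:.2f}")
```

The t-value is simply the point estimate from the original sample divided by the standard deviation of the estimates across the resamples.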

However, the unequal and rather low T suggests an alternative robustness check with an estimator that does not rely on the time-series nature of the data. Amongst the host of estimators available for panel data, Generalised Estimating Equation Models (GEE) have recently gained prominence in political science because they can accommodate complex structures for the correlation of ε and are fairly robust against misspecification (Zorn 2001).¹¹ As it turns out, this method yields almost identical point estimates, and again, compulsory voting and the LDV emerge as the only variables with statistically significant effects (column 3). The upshot is that the calculation of PCSE in Lister’s original analysis of the turnout data is not appropriate and leads him to overconfident conclusions.

2.2.3 Unit Effects and Lack of Variation over Time

But there are even more fundamental issues with this analysis of turnout and inequality. First, one must be sure that the units (countries) can be pooled, i.e. that (roughly) the same slope coefficient(s) prevail(s) in all countries. In the turnout data set, a rigorous check of this assumption that would involve the estimation of country-specific models is impossible because the institutional control variables are constant or almost constant within countries.¹² Second, one must check for the presence of unit effects, i.e. for country-specific intercepts.¹³ If units are pooled and unit effects are not accounted for, massive bias can result. For instance, if some variable x has a moderately positive effect on y within two countries A and B, and the average value of x is higher in country B while the overall level of y is higher in country A, a coefficient with a negative sign might be estimated unless country-specific intercepts are included in the model.¹⁴ Unfortunately, it is not possible to test for unit effects since the institutional variables do not vary within countries and are therefore perfectly collinear with the country-specific intercepts. Moreover, there are non-trivial linear dependencies between the independent variables: federalism and bicameralism correlate at r = 0.41,¹⁵ the correlation between federalism and proportionality is -0.48,¹⁶ and even inequality and proportionality correlate at -0.44.
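The two-country example can be reproduced with a handful of invented data points. In the sketch below, all numbers and names are hypothetical: within each country y rises by 0.5 per unit of x, country B has the higher values of x, and country A has the higher level of y. Subtracting country means is used here as a stand-in for including country-specific intercepts.

```python
import statistics


def ols_slope(xs, ys):
    """Slope of a bivariate least-squares regression of y on x."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def demean(points):
    """Subtract country means; equivalent to a country-specific intercept."""
    mean_x = statistics.fmean([x for x, _ in points])
    mean_y = statistics.fmean([y for _, y in points])
    return [(x - mean_x, y - mean_y) for x, y in points]


# Hypothetical data: same positive within-country slope, different intercepts.
country_a = [(x, 10.0 + 0.5 * x) for x in (1, 2, 3, 4)]  # low x, high y
country_b = [(x, 2.0 + 0.5 * x) for x in (6, 7, 8, 9)]   # high x, low y

pooled = country_a + country_b
within = demean(country_a) + demean(country_b)

pooled_slope = ols_slope([x for x, _ in pooled], [y for _, y in pooled])
within_slope = ols_slope([x for x, _ in within], [y for _, y in within])

print(f"pooled slope without unit effects: {pooled_slope:.3f}")
print(f"slope with country-specific intercepts: {within_slope:.3f}")
```

The pooled estimate has the wrong sign because it is driven entirely by the difference between the two countries, whereas the within-country estimate recovers the positive effect that was built into the data.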



Figure 2: Inequality over time

To make things worse, the focal independent variable is ‘sluggish’ (Beck 2001; Wilson and Butler 2007; Plümper and Troeger 2007), i.e. inequality varies a lot between countries but does not vary much within most countries (see Figure 2). While there are marked upward trends in Australia, Belgium, and the UK, and some apparently random variation in the Netherlands, inequality is largely stable elsewhere. Therefore, roughly 80 per cent of the total variation occurs between countries.

In a similar fashion, the variation of turnout within most countries is very moderate compared with the variation between countries (see Figure 3). Turnout is consistently close to 100 per cent in Australia and Belgium (where compulsory voting is enforced) and still very high (i.e. above 80 per cent) in Austria, Denmark, Italy, and Sweden,¹⁷ whereas the figure for Japan hovers consistently around 70 per cent, and turnout in the US is permanently below or just above 60 per cent. Across the whole sample, only about 10 per cent of the total variation in turnout (or its transformation) occurs within countries while between-country differences account for the lion’s share of the variation.
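Statements about the share of variation that occurs between countries rest on a decomposition of the total variation. The text does not spell out the exact computation, so the following sketch assumes one common approach, a split of the total sum of squares into a between-country and a within-country part. The three turnout series are invented for illustration.

```python
import statistics


def between_share(groups):
    """Share of the total sum of squares that lies between group means."""
    values = [v for series in groups.values() for v in series]
    grand_mean = statistics.fmean(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    # Each group contributes its size times the squared deviation of its mean.
    between_ss = sum(
        len(series) * (statistics.fmean(series) - grand_mean) ** 2
        for series in groups.values()
    )
    return between_ss / total_ss


# Hypothetical turnout series (per cent) for three polities.
turnout = {
    "A": [95.0, 94.0, 96.0, 95.0],  # compulsory voting, little movement
    "B": [70.0, 72.0, 69.0, 71.0],
    "C": [58.0, 60.0, 57.0, 61.0],
}

share = between_share(turnout)
print(f"between-country share: {share:.3f}")
print(f"within-country share: {1 - share:.3f}")
```

When the series are this flat within countries, almost all of the information in the data concerns differences between countries rather than change over time.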



Figure 3: Turnout over time

There are other issues here. Although the inclusion of the LDV was championed by Beck and Katz, the LDV is likely to cause problems. Estimates will be biased even if the errors are uncorrelated, and inconsistent in the presence of correlation amongst the errors (Ostrom 1990, 62-65).¹⁸ There is a whole host of alternative dynamic specifications (Wilson and Butler 2007, 106), and, as Wilson and Butler demonstrate, these can give wildly different estimates in many cases.

Yet, the most fundamental problem of the analysis at hand is this: like many (if not most) other comparative data sets, the turnout data are plagued by collinearity and a lack of intra-unit variation and are therefore not very informative (Western and Jackman 1994).¹⁹ With most of the variation in both the dependent and the independent variables occurring between countries, one can be quite sure that polity-level factors have an effect on turnout, but it is not possible to disentangle the relative effects of the various variables that are constant (like federalism), do not vary much (like inequality) or are constant but not included in the model (unit effects). There is something about the US that depresses turnout while there is something about Australia that drives turnout close to its theoretical maximum, but while registration procedures and compulsory voting are highly likely suspects, it is not possible to prove that these factors are decisive.

No methodology, however advanced, can overcome this basic lack of independent pieces of information. Given this fundamental problem, it is not surprising that the estimates for the effect of inequality (and the other independent variables) are rather unstable and depend on the inclusion/exclusion of certain observations. This can be most easily demonstrated by removing all observations from a given year or country from the sample. For instance, the coefficient for inequality shrinks in magnitude from -0.024 to -0.018 if the four elections in 1971 are excluded. By contrast, it grows in magnitude to -0.028 if the four observations in 1970 are excluded. The impact of excluding a single country is even more dramatic: if Austria (eight observations), a country with average inequality but high turnout rates, is removed from the sample, the coefficient grows in magnitude to -0.038. Excluding Sweden (ten observations), a country with low inequality and high turnout, reduces the estimate to -0.016. Even excluding single observations can have a discernible impact on the estimates: without the Australian general election of 1993, the estimate for the coefficient is -0.028, while excluding the Dutch general election of 1971 brings it down to -0.017. In other words, removing a single observation from the sample can result in a change of the estimate that is roughly equivalent to one standard error.
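This kind of sensitivity check is straightforward to automate. The sketch below is a simplified illustration with a single regressor and invented data for three hypothetical countries; the function names are assumptions, and the figures bear no relation to the estimates reported above.

```python
import statistics


def ols_slope(xs, ys):
    """Slope of a bivariate least-squares regression of y on x."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def leave_one_group_out(rows):
    """Re-estimate the slope after dropping each group (country or year) in turn."""
    results = {}
    for group in sorted({g for g, _, _ in rows}):
        kept = [(x, y) for g, x, y in rows if g != group]
        results[group] = ols_slope([x for x, _ in kept], [y for _, y in kept])
    return results


# Hypothetical rows: (country, inequality, logit of turnout).
rows = [
    ("A", 30.0, 2.0), ("A", 31.0, 2.1), ("A", 32.0, 1.9),
    ("B", 35.0, 1.6), ("B", 36.0, 1.5), ("B", 37.0, 1.7),
    ("C", 40.0, 0.6), ("C", 41.0, 0.5), ("C", 42.0, 0.4),
]

full = ols_slope([x for _, x, _ in rows], [y for _, _, y in rows])
dropped = leave_one_group_out(rows)

print(f"all countries: {full:.3f}")
for country, estimate in dropped.items():
    print(f"without {country}: {estimate:.3f}")
```

If the estimate moves substantially whenever one country is dropped, as it does in this constructed example, the pooled coefficient mainly reflects which countries happen to be in the sample.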



Figure 4: Turnout and inequality

So is there anything at all that can be said about the relationship between inequality and turnout? The short answer turns out to be ‘no, not really’. One very basic approach is to ignore the institutional control variables as well as the potential impact of the GDP and to analyse the bivariate relationship on a per-country basis (see Fisher 2007 for a related bivariate analysis of turnout and the left vote).²⁰ Figure 4 shows the respective scatter plots, with country-specific linear regression lines overlaid. This figure is quite revealing. Leaving aside the very low variation along both the x- and the y-axis in most countries, only five polities (Austria, France, West Germany, the Netherlands, and Sweden) display a clearly negative relationship between inequality and turnout, and even this statement requires qualification. Fitting any sort of trend to four data points (France) is obviously risky, and the variation of inequality is extremely low in Sweden, Austria, and West Germany. Moreover, the clear-cut negative trend in Austria and West Germany hinges in each case on a single outlying election, which in the German case happens to be the rather unusual first election immediately after unification. This leaves the Netherlands as the only real example of the negative relationship between inequality and turnout. In all other countries, the relationship is weakly positive or close to nil.


Estimator: PCSE. Dependent variable: LOGITVTURN in models (1)-(4), VTURN in models (5)-(8).
Variable (1) (2) (3) (4) (5) (6) (7) (8)
INEQ 0.022 0.022 0.025 0.020 0.203 0.203 0.234 0.183
(1.615) (1.585) (1.870) (1.315) (1.302) (1.299) (1.628) (1.050)
RGDPC -0.000*** -0.000*** -0.000*** -0.000*** -0.000*** -0.000*** -0.000*** -0.000***
(-3.376) (-3.437) (-6.835) (-5.757) (-4.268) (-4.270) (-7.987) (-6.402)
VTURNLAG 0.029** 0.027* 0.321* 0.319*
(2.642) (2.446) (2.308) (2.288)
AUS -0.579*** -0.587*** -0.679*** -0.692*** -4.055*** -4.065*** -5.165*** -5.244***
(-9.056) (-9.008) (-14.626) (-12.045) (-5.489) (-5.491) (-9.793) (-8.380)
BEL -0.407*** -0.414*** -0.500*** -0.499*** -2.674** -2.681** -3.698*** -3.599***
(-3.486) (-3.461) (-4.361) (-3.693) (-2.854) (-2.854) (-4.516) (-3.740)
CAN -1.343*** -1.376*** -1.903*** -1.897*** -14.497*** -14.539*** -20.689*** -20.585***
(-5.854) (-5.907) (-33.006) (-25.188) (-4.877) (-4.881) (-21.366) (-15.623)
DEN -0.760*** -0.775*** -0.974*** -0.998*** -4.950*** -4.968*** -7.315*** -7.534***
(-8.805) (-8.789) (-23.430) (-18.640) (-4.652) (-4.658) (-17.463) (-14.211)
FIN -1.293*** -1.323*** -1.783*** -1.793*** -13.641*** -13.679*** -19.051*** -19.126***
(-6.747) (-6.785) (-29.264) (-24.846) (-5.493) (-5.497) (-19.386) (-16.668)
FRG -0.714*** -0.726*** -0.893*** -0.909*** -4.707*** -4.722*** -6.687*** -6.811***
(-6.346) (-6.390) (-11.516) (-10.368) (-3.830) (-3.837) (-8.307) (-7.454)
IRE -1.568*** -1.609*** -2.247*** -2.241*** -17.468*** -17.520*** -24.972*** -24.890***
(-5.539) (-5.601) (-28.574) (-23.998) (-4.991) (-4.996) (-24.652) (-20.513)