Terminology matters for science. If people use different words for the same thing, or, even worse, the same word for different things, scientific communication turns into a dialogue of the deaf. European Radical Right Studies is a field where this is potentially a big problem: we use labels like “New”, “Populist”, “Radical”, “Extreme” or even “Extremist” with abandon.

But how bad is it really? In a recent chapter (author’s version, not paywalled), I argue that communication in Radical Right studies still works. Texts using all 50 shades of “Right” are still cited together, indicating that later scholars realised they were all talking about (more or less) the same thing.

I have written a number of short blogs about the change in terminology over time, the extraction of the co-citation network, and the interpretation of the findings. But sometimes, all this reading is getting a bit much, and so I tried something different: using some newfangled software for noobs, I turned my findings into a short video. Have a look for yourself and tell me what you think.

Topic modelling does not work well for (my) research paper abstracts

The Radical Right Research Robot is a fun side project whose life began exactly one year ago. The Robot exists to promote the very large body of knowledge on Radical Right parties and their voters that social scientists have accumulated over decades. At its core is a loop that randomly selects one of the more than 800 titles on my online bibliography on the Extreme/Radical Right every few hours and spits it out on twitter.

Yet the little android’s aim was always for some sort of serendipity, and so it tries to extract meaning from the abstracts (where available), sometimes with rather funny consequences. The robot’s first idea was to make use of (structural) topic modelling. There are some implementations available in R, and the first results looked promising, but in the end, topic modelling did not find meaningful clusters of papers that could easily be labelled with a common theme. One possible reason is that the abstracts are short, and that there are relatively few (fewer than 400) of them. And so the Robot reverted to using a small and fairly arbitrary set of keywords for identifying topics.

This approach produced some embarrassing howlers like this one:

Or this one (clearly the robot has a thing for media studies – who doesn’t?):

There are two problems here: first, even a single instance of a keyword in a given abstract is enough to trigger a classification, and second, the bot’s pedestrian implementation would classify an abstract using the last keyword that it detected, even if it was the most peripheral of several hits. Not good enough for world domination, obviously.
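For illustration, here is a hypothetical Python sketch of the two failure modes (the bot’s actual code is not shown in this post, and the keyword lists below are made up): storing only the last matching label silently discards earlier hits, while collecting all matches at least preserves them.

```python
# Hypothetical keyword dictionary; the bot's real list is larger
KEYWORDS = {
    "media": ["newspaper", "tv"],
    "religion & culture": ["islam", "muslim"],
}

def classify_last_match(abstract):
    """Buggy: the label of the last keyword found overwrites all earlier hits."""
    topic = None
    text = abstract.lower()
    for label, words in KEYWORDS.items():
        if any(w in text for w in words):
            topic = label  # clobbers any previous match
    return topic

def classify_all_matches(abstract):
    """Safer: return every topic whose keywords appear, however peripherally."""
    text = abstract.lower()
    return [label for label, words in KEYWORDS.items()
            if any(w in text for w in words)]

abstract = "Muslim voters and TV coverage of the radical right"
print(classify_last_match(abstract))   # 'religion & culture' -- the media hit is lost
print(classify_all_matches(abstract))  # ['media', 'religion & culture']
```

Note that a single occurrence of any keyword is still enough to trigger a label here, which is the first problem; only the clobbering is fixed.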

Newsmap works reasonably well for classifying topics in research paper abstracts

Looking for an alternative solution, the robot came across newsmap (now also available within quanteda), a geographical news classifier developed by Kohei Watanabe. Newsmap is semi-supervised: it starts with a dictionary of proper nouns and adjectives that all refer to geographical entities, say

'France': [Paris, France, French*]
'Germany': [German*, Berlin]
...


But newsmap is able to pick up additional words that also help to identify the respective country with high probability, e.g. “Macron”, “Merkel”, “Marseille”, “Hamburg”, or even “Lederhosen”. In a (limited) sense, it learns to identify geographical context even when the country in question is not mentioned explicitly.
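Newsmap’s actual estimator is considerably more sophisticated, but the core idea can be sketched in a few lines of Python; everything below, from the seed sets to the documents, is made up for illustration. Documents are first labelled by their seed words, then the non-seed words that co-occur most often with each label are picked up as new indicators.

```python
from collections import Counter

# Hypothetical seed dictionary (cf. the France/Germany example above)
SEEDS = {
    "France": {"paris", "france", "french"},
    "Germany": {"german", "berlin"},
}

def label_by_seeds(tokens):
    """Label a tokenised document with the first country whose seed words it contains."""
    for country, seeds in SEEDS.items():
        if seeds & set(tokens):
            return country
    return None

def learn_associates(docs, top_n=1):
    """For each country, find the non-seed words that occur most often
    in documents labelled with that country's seed words."""
    counts = {country: Counter() for country in SEEDS}
    for tokens in docs:
        label = label_by_seeds(tokens)
        if label is not None:
            counts[label].update(t for t in tokens if t not in SEEDS[label])
    return {country: [w for w, _ in c.most_common(top_n)]
            for country, c in counts.items()}

docs = [
    ["macron", "visits", "paris"],
    ["macron", "speech", "french"],
    ["merkel", "speaks", "berlin"],
    ["merkel", "berlin", "rally"],
]
print(learn_associates(docs))  # {'France': ['macron'], 'Germany': ['merkel']}
```

This toy version just counts raw co-occurrences; the real algorithm weights words by how well they discriminate between classes, which is what keeps it from learning ubiquitous words like “election”.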

But the algorithm is not restricted to geographical entities. It can also identify topics from a list. And so these days, the robot starts with a dictionary of seed words that is still a work in progress but looks mostly like this at the moment:

'religion & culture': [muslim*, islam*, relig*, cultur*]
'media': [TV, newspaper*, journalis*]
'group conflict': [group*, contact, prejudice, stereotyp*, competition]
...


Results are not perfect, but at least they are less embarrassing than those from the simple keyword approach. One remaining problem is that newsmap tags each abstract with (at most) one topic. In reality, any given article will refer to two or more themes in the literature. Topic models are much more attractive in this respect, because they treat each text as a mixture of topics, and so the robot may have to revisit them in the future.

Reprise: The co-citation network in European Radical Right studies

In the last post, I tried to reconstruct the co-citation network in European Radical Right studies and ended up with this neat graph.

Co-citations within top 20 titles in Extreme / Radical Right studies

The titles are arranged in groups, with the “Extreme Right” camp on the right, the “Radical Right” group in the lower-left corner, and a small number of publications that are committed to neither in the upper-left corner. The width of the lines represents the number of co-citations connecting the titles.

What does the pattern look like? The articles by Knigge (1998) and Bale et al. (2010) are both in the “nothing in particular” group, but are never cited together, at least not in the data that I extracted. One potential reason is that they are twelve years apart and address quite different research questions.


Apart from this gap, the network is complete, i.e. everyone is cited together with everyone else in the top 20. This is already rather compelling evidence against the idea of a split into two incompatible strands. Intriguingly, there are even some strong ties that bridge alleged intellectual cleavages, e.g. between Kitschelt’s monograph and the article by Golder, or between Lubbers, Gijsberts and Scheepers on the one hand and Norris and Kitschelt on the other.

While the use of identical terminology seems to play a minor role, the picture also suggests that co-citations are chiefly driven by the general prominence of the titles involved. However, network graphs can be notoriously misleading.

Modelling the number of co-citations in European Radical Right studies

Modelling the number of co-citations provides a more formal test for this intuition. There are $\frac{20\times 19}{2}=190$ counts of co-citations amongst the top 20 titles, ranging from 0 to 5476, with a mean count of 695 and a variance of 651,143. Because the variance is so much bigger than the mean, a regression model that assumes a negative binomial distribution, which can accommodate such overdispersion, is more adequate than one built around a Poisson distribution. “General prominence” is operationalised as the sum of external co-citations of the two titles involved. Here are the results.

| Variable | Coefficient | S.E. | p |
|---|---|---|---|
| external co-citations | 0.0004 | 0.00002 | <0.05 |
| same terminology | 0.424 | 0.120 | <0.05 |
| Constant | 2.852 | 0.219 | <0.05 |

The findings show that, controlling for general prominence (operationalised as the sum of co-citations outside the top 20), using the same terminology (coded as “extreme” / “radical” / “unspecific or other”) does have a positive effect on the expected number of co-citations. But what do the numbers mean?

The model is additive in the logs. To recover the counts (and transform the model into its multiplicative form), one needs to exponentiate the coefficients. Accordingly, the effect of using the same terminology translates into a factor of exp(0.424) = 1.53.
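Both steps of this argument are easy to check numerically. Here is a quick back-of-the-envelope sketch in Python, using only the figures quoted in the text (not the actual model objects):

```python
import math

# Overdispersion check: under a Poisson model the variance equals the mean;
# here the reported variance exceeds the mean by almost three orders of magnitude
mean_count = 695
variance = 651_143
print(round(variance / mean_count))  # ~937, far above 1

# The model is additive in the logs, so exponentiating a coefficient
# gives the multiplicative effect on the expected count
b_same_terminology = 0.424
print(round(math.exp(b_same_terminology), 2))  # 1.53
```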

What do these numbers mean?

But how relevant is this in practical terms? Because the model is non-linear, it’s best to plot the expected counts for equal/unequal terminology, together with their areas of confidence, against a plausible range of external co-citations.

Effect of external co-citations and use of terminology on predicted number of co-citations within top 20

As it turns out, terminology has only a small effect on the expected number of co-citations for works that have between 6,000 and 8,000 external co-citations. From this point on, the expected number of co-citations grows somewhat more quickly for dyads that share the same terminology. However, over the whole range of 6,000 to 12,000 external co-citations, the confidence intervals overlap and so this difference is not statistically significant.

Unless two titles have a very high number of external co-citations, the probability of them being both cited in a third work does not depend on the terminology they use. Even for the (few) heavily cited works, the evidence is insufficient to reject the null hypothesis that terminology makes no difference.

While the analysis is confined to the relationships between just 20 titles, these titles matter most, because they form the core of ERRS. If we cannot find separation here, that does not necessarily mean that it does not happen elsewhere, but if it happens elsewhere, it is much less relevant. So: no two schools. Everyone is citing the same prominent stuff, whether the respective authors prefer “Radical” or “Extreme”. Communication happens, which seems good to me.

Are you surprised?

• Arzheimer, Kai. “Conceptual Confusion is not Always a Bad Thing: The Curious Case of European Radical Right Studies.” Demokratie und Entscheidung. Eds. Marker, Karl, Michael Roseneck, Annette Schmitt, and Jürgen Sirsch. Wiesbaden: Springer, 2018. 23-40. doi:10.1007/978-3-658-24529-0_3
@InCollection{arzheimer-2018,
  author    = {Arzheimer, Kai},
  title     = {Conceptual Confusion is not Always a Bad Thing: The Curious Case of European Radical Right Studies},
  booktitle = {Demokratie und Entscheidung},
  editor    = {Marker, Karl and Roseneck, Michael and Schmitt, Annette and Sirsch, Jürgen},
  publisher = {Springer},
  address   = {Wiesbaden},
  year      = 2018,
  pages     = {23-40},
  doi       = {10.1007/978-3-658-24529-0_3},
}

Research question

For a long time, people working in the field of European Radical Right Studies could not even agree on a common name for the thing that they were researching. Should it be the Extreme Right, the Radical Right, or what? Utterly unimpressed by this fact, I argue in an in-press contribution that this sorry state has not seriously hindered communication amongst authors. Do I have any evidence to back up this claim? Hell yeah! Fasten your seatbelts and watch me turning innocent publications into tortured data, or more specifically, a Radical Right network of co-citations. Or was it the Extreme Right?


How to turn citations into data

Short of training a hypercomplex and computationally expensive neural network (i.e. a grad student) to look at the actual content of the texts, analysing citation patterns is the most straightforward way to address the research question. Because I needed citation information, I harvested the Social Science Citation Index (SSCI) instead of my own bibliography. The Web of Science interface to the SSCI lets you save records as plain text files, which is all that was required. The key advantage of the SSCI data is that all the sources that each item cites are recorded, too, and can be exported with the title. This includes (most) items that are themselves not covered by the SSCI, opening up the wonderful world of monographs and chapters. To identify the two literatures, I simply ran queries for the phrases “Extreme Right” and “Radical Right” for the 1980-2017 period. I used the “TS” operator to search in titles, abstracts, and keywords. These queries returned 596 and 551 hits, respectively. Easy.

This is the second in a series of three posts. Click here for the first part of this mini series

But how far separated are the two strands of the literature? To find out, I first looked at the overlap between the two. By overlap, I mean items that use both phrases. This applies to 132 pieces, or just under 12 per cent of the whole stash. This is not a state of zilch communication, yet by this criterion alone, it would seem that there are indeed two relatively distinct literatures. But what I’m really interested in are (co-)citation patterns. How could I beat two long plain-text lists of articles and the sources they cite into a usable data set?

When you are asking this kind of question, usually “there is an R package for that”™, unless the question is too silly. In my case, the magic bullet for turning information from the SSCI into crunchable data is the wonderful bibliometrix package. Bibliometrix reads saved records from Web of Science/SSCI (in bibtex format) and converts them into data frames. It also provides functions for extracting bibliometric information from the data. Before I move on to co-citations, here’s the gist of the code that reads the data and generates a handy list of the 10 most-cited titles:

library(bibliometrix)
# Read the plain-text records saved from Web of Science
# (the file name here is a placeholder)
D <- readFiles("savedrecs.bib")
M <- convert2df(D, dbsource = "isi", format = "bibtex")
# remove some obviously unrelated items
M <- M[-c(65,94,96,97,104,105,159,177,199,457,459,497,578,579,684,685,719,723),]
M <- M[-c(659,707),]
M <- M[-c(622),]

results <- biblioAnalysis(M, sep = ";")
S <- summary(object = results, k = 10, pause = FALSE)
#Citations
CR <- citations(M, field = "article", sep = ".  ")
CR$Cited[1:10]

So what are the most cited titles in Extreme/Radical Right studies?

The ten most cited sources in 726 SSCI items

| Source | Number of times cited |
|---|---|
| Mudde (2007) | 160 |
| Kitschelt (1995) | 147 |
| Betz (1994) | 123 |
| Lubbers et al. (2002) | 97 |
| Norris (2005) | 90 |
| Golder (2003) | 86 |
| R.W. Jackman & Volpert (1996) | 77 |
| Carter (2005) | 66 |
| Arzheimer & Carter (2006) | 65 |
| Brug et al. (2005) | 65 |

Importantly, this top ten contains (in very prominent positions) a number of monographs. The SSCI itself only lists articles in (some) peer-reviewed journals. Without the citation data, we would have no idea which non-peer-reviewed-journal items are important. Having said that, the situation is still far from perfect: we only observe co-citation patterns through the lens of the 1,000-odd SSCI publications. But that’s still better than nothing, right?

What about the substantive results of this exercise? The table clearly shows the impact that Cas Mudde’s 2007 (“Populist Radical Right”) book had on the field. It is the most cited and at the same time the youngest item on the list, surpassing the much older monographs by Betz (“Radical Right Wing Populism”) and Kitschelt (“Radical Right”). Two other monographs by Carter (“Extreme Right”) and Norris (“Radical Right”) are also frequently cited but appreciably less popular than the books by Betz, Kitschelt, and Mudde. The five other items are journal articles with a primarily empirical outlook and mostly without conceptual ambitions. Taken together, this suggests that the “Extreme Right” label lacked a strong proponent whose conceptual work was widely accepted in the literature. Once someone presented a clear rationale for using the “Radical Right” label instead, many scholars were willing to jump ship.

Getting to the co-citation network: are the Extreme/Radical Right literatures separated from each other? If my claim that communication still works is correct, the literature should display a low degree of separation between users of both labels.
Looking for co-citation patterns is a straightforward operationalisation for (lack of) separation. A co-citation occurs when two publications are both cited by some later source. By definition, co-citations reflect a view of the older literature as it is expressed in a newer publication. When two titles from the “Extreme Right” and “Radical Right” literatures are co-cited, this is a small piece of evidence that the literature has not split into two isolated streams.

The SSCI aims at recording every source that is cited, even if the source itself is not in the SSCI. This makes for a very large number of publications that could be candidates for co-citations (18,255), even if most of them are peripheral to European Radical Right studies, and a whopping 743,032 actual co-citations. To get a handle on this, I extracted the 20 publications with the biggest total number of co-citations and their interconnections. They represent something like the backbone of the literature.

How did I reconstruct this network from textual data? Once more, R and its packages came to the rescue and helped me to produce a reasonably nice plot (after some additional cleaning up):

library(igraph)
NetMatrix <- biblioNetwork(M, analysis = "co-citation", network = "references", sep = ".  ")
# Careful: we are not interested in loops, nor in separate connections
# between nodes. We convert the latter to weights
g <- graph.adjacency(NetMatrix, mode = "max", diag = FALSE)
# Extract the top 20 most co-cited items
f <- induced_subgraph(g, degree(g) > quantile(degree(g), probs = (1 - 20/length(V(g)))))
# Now build a vector of relevant terms (requires knowledge of these titles)
# 1: extreme, 2: radical, 3: none/other
# Show all names
V(f)$name
term <- c(3,2,1,1,2,1,1,2,1,2,3,2,2,2,3,1,1,1,1,1)
library(RColorBrewer)
mycolours <- brewer.pal(3, "Greys")
V(f)$term <- term
V(f)$color <- mycolours[term]


Co-citation analysis: results

So, what are the results? First, here is the top 20 of co-cited items in the field of Extreme/Radical Right studies:

The twenty most co-cited sources in 726 SSCI items
| Source | Co-citations within top 20 | Total co-citations |
|---|---|---|
| Kitschelt (1995) | 745 | 7700 |
| Mudde (2007) | 740 | 8864 |
| Lubbers et al. (2002) | 600 | 5212 |
| Norris (2005) | 568 | 5077 |
| Golder (2003) | 564 | 4687 |
| Betz (1994) | 542 | 6151 |
| R.W. Jackman & Volpert (1996) | 477 | 4497 |
| Brug et al. (2005) | 462 | 3523 |
| Arzheimer & Carter (2006) | 460 | 3551 |
| Knigge (1998) | 445 | 3487 |
| Carter (2005) | 389 | 3291 |
| Arzheimer (2009) | 376 | 3301 |
| Ignazi (2003) | 344 | 2876 |
| Ivarsflaten (2008) | 334 | 3221 |
| Ignazi (1992) | 331 | 3230 |
| Rydgren (2007) | 300 | 3353 |
| Bale (2003) | 297 | 3199 |
| Brug et al. (2000) | 276 | 2602 |
| Meguid (2005) | 246 | 2600 |
| Bale et al. (2010) | 134 | 2449 |

Many of these titles are familiar, because they also appear in the top ten of most cited titles and are classics to boot. And here is another nugget: for each title, a substantial share of about 10 per cent of all co-citations happen within this top twenty. This is exactly the (sub)network of co-citations I’m interested in. So here is the plot I promised:

Co-citations within top 20 titles in Extreme / Radical Right studies

But what does it all mean? Read the second part of this mini series, or go to the full article (author’s version, no paywall):

• Arzheimer, Kai. “Conceptual Confusion is not Always a Bad Thing: The Curious Case of European Radical Right Studies.” Demokratie und Entscheidung. Eds. Marker, Karl, Michael Roseneck, Annette Schmitt, and Jürgen Sirsch. Wiesbaden: Springer, 2018. 23-40. doi:10.1007/978-3-658-24529-0_3

This is me, about once per year, when I bemoan my lack of R-coolness whilst simultaneously enjoying my Stata-efficiency.

The autumn/winter edition of the ever more Eclectic, ridiculously Erratic Bibliography on the Extreme Right in Western Europe is overdue but well on its way, and it’s gonna be YUGE! Make it even YUGEr by sending me your candidates (books, chapters, journal articles) for inclusion. The geographical focus remains on (Western) Europe, but I am also interested in general (e.g. conceptual, methodological, psychological etc.) right-wing stuff. Self-nominations are welcome. Obviously, no guarantee for inclusion whatsoever. If you have a DOI and/or a well-formatted bibtex entry, that’s spiffy, but as long as the reference is complete, I’m not too fussed about the format. Put your reference(s) in a comment right here, send me an email (kai.arzheimer AT gmail.com), DM me, or leave a comment on the Facebook page.

So: three more surveys. No kidding

Finality finally got a bit more final: just to annoy me (now here is a narcissist), three further surveys were published today (already yesterday in Germany). One of them is only new-ish: Emnid was in the field from September 14-21, so I take their data as a snapshot of the world as it was on September 17 (last Sunday). Forsa interviewed from September 18 to September 21, resulting in a mid-point of September 19 (Tuesday), while Insa did all their fieldwork on Thursday/Friday. But does this new information in any way alter the expectations? The short answer is:

It makes no difference

Here is a comparison of the overall estimates. They are virtually identical. The CDU/CSU is up by one point, but that is due to different rounding. The probability of the AfD coming third is now up at 99.6 per cent (from 96 per cent) and the point estimate for their lead over the Left is up, too, but again, that is due to rounding – the credible interval is much the same.

| | yesterday | | today | |
|---|---|---|---|---|
| | Median | 95% HDI | Median | 95% HDI |
| CDU/CSU | 35 | [34-37] | 36 | [34-37] |
| CDU/CSU lead | 14 | [12-16] | 14 | [12-16] |
| SPD | 22 | [21-23] | 22 | [21-23] |
| FDP | 9 | [9-10] | 9 | [9-10] |
| Greens | 8 | [7-9] | 8 | [7-9] |
| Left | 10 | [9-10] | 10 | [9-10] |
| AfD | 11 | [10-12] | 11 | [10-12] |
| AfD lead | 1 | [0-2.4] | 2 | [0.4-2.7] |

No more graphs, because they would look the same. Coalition options do not change. If the polls are right on average and the poll aggregation works, the Grand Coalition and Jamaica are the only mathematical possibilities. To be honest, in six per cent of my simulations a coalition of the Christian Democrats and the AfD would achieve a majority, but that is inconceivable.
That’s it. Move on. Nothing to see here until Sunday evening, which happens to be Sunday noon on my personal timescale.

I just suppressed the urge to insert the word ’countdown’ into the headline. See what I’m doing here? We have four more polls by Allensbach and Forsa (published on Tuesday), and by FGW and GMS (published on Thursday), and presumably, these are the last that we are going to see before election day. Do they change the story?

First, let’s note that FGW has the very latest data: they interviewed on Wednesday and Thursday and published the results immediately. A very short fieldwork period raises issues of representativeness, but they have been in the business for about 40 years now, so let us assume they know what they are doing, shall we? Second, unlike most pollsters, FGW always publishes both raw (but presumably weighted) data (what they call the political mood) and estimates that take into account party identification and other long-term factors (their ’projection’). So far, I have always used the former, but we have reached the point where the forecast becomes the nowcast, and so the only thing we get this time is their projection, which I treat as if it were raw data, using last week’s numbers of undecided and non-voters (neither very realistic, I suppose).

GMS was in the field from Thursday last week until Wednesday, but because I peg every poll to the mid-point of its fieldwork, their data are three days older than FGW’s for modelling purposes. Things get a bit confusing then: Forsa were in the field from September 11 to September 15, and Allensbach even from September 6 to September 14, but then sat on their data. So their findings came out on Tuesday but are less recent than the Insa poll I talked about last time round. In other words: by putting this information into the model, I’m adjusting our estimate of where public opinion was a week ago, which then feeds into my guess of where it is right now (or rather where it was two days ago). It’s a good thing that this is almost over.

Countdown

Ok, I succumbed. Couldn’t resist. etc.

Support for the Christian Democrats has further declined. The last estimate is 35 per cent [34-37], which would be six points less than in 2013. But the Social Democrats are down, too. The estimate for their current level of support is 22 per cent [21-23], so the CDU/CSU’s lead is still 14 points [12-16].

The FDP bounce remains elusive, and the Greens are weak

If there is a last-minute rush towards the FDP, it’s not reflected in the polls. But the party (not currently in parliament) is doing well, and much better than a few months ago, when it was far from certain that they would return to federal politics. Estimated support for them is 9 per cent [9-10], which puts them ever so slightly ahead of the Greens (8 per cent [7-9]).

Is the AfD finally pulling ahead of the Left?

After I went to great lengths to explain why the race for 3rd place is irrelevant and how the Left is better positioned to win it anyway, the AfD is finally pulling (or rather inching) ahead. The final estimate for their current support is 11 per cent [10-12] (which would be a far cry from the levels of support they enjoyed in 2016), while the Left is put at 10 per cent [9-10]. With the four new polls factored in, the chance of the AfD coming third is now a whopping 96 per cent. The size of their likely lead is a single point [0-2.4].

Overall estimates and coalitions

I (and the pollsters) have been embarrassingly wrong before, but it seems almost impossible that we are not heading for a six-party parliament. It’s also quite clear that there will be no SPD-led coalition government (unless the SPD could somehow persuade the Greens, the FDP, and the Left to work with them, and even that might not be sufficient). Unless there is a last-minute bounce for the FDP or the Christian Democrats that does not affect the other party (i.e. a shift from the radical to the moderate right), there will be no centre-right government.

The two most likely outcomes remain a continuation of the Grand Coalition (not necessarily in the best interest of the SPD), or a Jamaica coalition (if the FDP and the CSU and the Greens can work together). Interesting times ahead.

German Elections: Three more polls

We Anoraks are all getting a little jittery here. It’s 134 hours until closing time and there will be only a small handful of polls coming in in the next couple of days, so is there anything new that may be divined from the latest crop, published today (Insa), on Saturday (Emnid), and on Friday (FGW)? Not really. First, the Emnid poll is not new, but new-ish: fieldwork began on September 7, almost a week before Infratest’s (alleged) shock poll. Second, the three polls mostly agree:

| | Emnid | FGW | INSA |
|---|---|---|---|
| CDU/CSU | 36 | 36 | 36 |
| SPD | 22 | 23 | 22 |
| Greens | 8 | 8 | 7 |
| FDP | 9 | 10 | 9 |
| Left | 10 | 9 | 11 |
| AfD | 11 | 10 | 11 |

Third, they are broadly in line with the last (Friday) set of estimates. Of course, that does not mean that the pollsters have it right. It just means that public opinion as measured by the various survey houses seems to be rather stable at the moment.

The Christian Democrats are still leading

Support for the Christian Democrats has been flagging recently, but they still have a solid lead of about 14 points over the Social Democrats. The credible interval for the gap is 13-16 per cent. The current estimate for the Christian Democrats is 37 per cent [36-38], which would make them the strongest party by far but would also imply a substantial loss compared to their result in the 2013 election (41.5%). The estimate for the SPD is 23 per cent [21-24], which is virtually identical to their worst ever result (in 2009).

The FDP and the Greens seem to be safely in

Speaking of virtual, it seems virtually impossible that these two minor parties will not clear the electoral hurdle. Then again, look at what happened in 2013. Right now, the FDP is ever so slightly ahead of the Greens, but the enormous attention they are currently getting from the chattering classes is not (yet?) reflected in the polls. Either way, their likely return from the electoral dead would be a significant event in German politics.

The Left and the AfD remain tied

Even the Wall Street Journal is very excited about the idea of the AfD becoming Germany’s “third” party (technically, the CSU is competing for that title, too, but that is a different story). According to the model, however, the chances of the AfD ending up in this position are just 28%. Although predictions of support are almost identical – 9.5% [8.7-10.3] vs 9.7% [8.9-10.5] – the model gives the Left a much better chance (53%) of coming out on top. This is neatly illustrated here:

However, the relevant information (in my view) is still this: we are heading for a six- or seven-party parliament, with four minor parties of almost equal strength.

Coalitions …

After factoring in the three latest polls, the options remain essentially the same: In all simulations there is a majority for both a Grand Coalition and a Jamaica arrangement. There is also tiny (0.5%) chance of a centre-right (CDU/CSU + FDP) coalition. If the polls are correct, nothing else will work. As I said before: Move on. Not much to see here.