#indyref on Wikipedia

My colleague Taha Yasseri and I are currently working on a Fell Fund project on social media data and election prediction, looking especially at data from Google and Wikipedia (first paper out soon; will also be presenting on that at IPP 2014 which should be great). As part of that we thought we’d have a bit of fun looking at Scotland’s independence referendum on Wikipedia.

For election prediction the method is relatively straightforward: examine readership stats on the party Wikipedia pages of the country in question, and see which page is read the most (of course that doesn’t correspond straight away to election results – would that life were so simple – and the idea of the project is to see what corrections and biases need to be accounted for to make it work). It isn’t quite so clear how to do that for Scotland, but (just for fun really) we compared the following pages:


First we look at the UK and Scotland. It’s interesting how Scotland has leapfrogged the UK in the last days of the independence campaign. Does this point to a yes victory?


In terms of flags, though, the Union Jack is well ahead of the Saltire, peaking in the last few days. Is it a last-minute outbreak of unionism?


In terms of national dishes, meanwhile, Haggis has been dominating Fish and Chips for the full period of the campaign, with interest in Haggis especially spiking in the last couple of days.

Well, one of these graphs will predict the winner of the referendum: we just don’t know which one ;-) More seriously, I think it’s interesting how most of these terms are spiking in the days before the vote, showing again how the social web really responds to political events.

UPDATE: Taha has passed me the comparison of the Yes and No campaign pages, as below. Yes for a narrow win following months of No dominance – you heard it here first.


Creating transnational political links with euandi & Facebook

Jonathan Bright

euandi front

Just wanted to put up a quick plug for the euandi voting advice application [VAA] which has recently been launched by the European University Institute. I was one of the 100 or so political scientists across Europe who got together to produce the application. Fill in a short questionnaire and it will tell you the extent to which other parties share your values (both in your country and across Europe), as well as telling you which other areas in Europe contain like-minded individuals.

My political europe

There are lots of VAAs around at the moment; the novelty with this one is that there is the option to then go on to connect to people with similar political views through Facebook, with the aim of (for example) getting a European Citizens’ Initiative started. The overall aim is to promote more transnational politics in the European Parliament elections, which are at the moment almost overwhelmingly dominated by national political concerns. Pretty neat stuff and it will be interesting to see what comes of it over the next month or so.

Media effect or media replacement?

Taha Yasseri, Jonathan Bright

Online political information seeking, at least in the data we’ve gathered so far, happens in short, concentrated bursts. When we began the project, I (JB) was hoping that these bursts would tell us something about how people inform themselves about contemporary democratic politics. However we quickly saw in our first post that the peak of information seeking activity falls after the election itself takes place.

How can this be explained? So far we’ve been toying with two theories, developed out of the observations below. One: this behaviour is driven by news media coverage. People see the elections reported on TV or in the papers, then look them up online to find out more. If the peak in news coverage coincides with the day of the election, and its aftermath, then it’s logical that the peak of information seeking would occur shortly after that.

The second is that this behaviour is instead kind of replacing news coverage. People want to know the result of the election, especially if they participated, but for whatever reason the news media does an ineffective job of informing them of the result, so they look online instead.

One way of trying to distinguish between these two theories is by looking at information seeking activity during the European Parliament elections in countries with different election dates. The European elections in 2009 ran from the 4th to the 7th of June, but the final results were only announced on the 7th. Countries voting on the 4th, 5th and 6th would therefore have had a kind of information gap, whereby voters couldn’t find out the precise result of the elections. If information seeking is driven by a media effect, we might expect these countries to peak on the 8th (when the results are reported). If it is driven by a media replacement strategy, we would expect it to come the day after the relevant country’s election. Right?
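The two predictions can be written down as a small check. This is only a sketch: the function names are ours, the page view counts are invented for illustration, and it assumes that "the peak" simply means the single busiest day in the series.

```python
from datetime import date, timedelta

# Final results were announced on the 7th of June 2009 (see the paragraph above).
RESULTS_DAY = date(2009, 6, 7)


def expected_peaks(election_day, results_day=RESULTS_DAY):
    """Return the peak date that each theory would predict."""
    one_day = timedelta(days=1)
    return {
        "media effect": results_day + one_day,        # day after results are reported
        "media replacement": election_day + one_day,  # day after the country voted
    }


def classify_peak(daily_views, election_day, results_day=RESULTS_DAY):
    """Find the busiest day and list the theories whose prediction matches it."""
    peak_day = max(daily_views, key=daily_views.get)
    predictions = expected_peaks(election_day, results_day)
    matches = [name for name, day in predictions.items() if day == peak_day]
    return peak_day, matches


# Invented counts for a hypothetical country that voted on the 4th of June.
example_views = {
    date(2009, 6, 3): 40,
    date(2009, 6, 4): 90,
    date(2009, 6, 5): 310,
    date(2009, 6, 6): 120,
    date(2009, 6, 7): 150,
    date(2009, 6, 8): 260,
}
peak_day, matches = classify_peak(example_views, date(2009, 6, 4))
print(peak_day, matches)
```

Note that for a country voting on the 7th both theories predict the 8th, so only the early voting countries can separate them.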

Below are the info seeking graphs for the Netherlands and the Czech Republic. As usual we are looking at page views of the Wikipedia page for the 2009 European Parliament elections in the language of the country of interest (so the Dutch and Czech versions). The Netherlands voted on the 4th of June, while the Czech Republic voted on the 5th.




Somehow, these graphs offer support for both theories, because they contain two peaks, one the day after the elections, and one the day after the results were reported. Well, none of this is perfect of course. Just because they can’t report the final result doesn’t mean the media can’t report (I believe) regional results within their country, and I think exit polls are also allowed. Finally they may just cover the elections on the day, even without reporting the results in detail. So media effects could still be the driving force. More importantly perhaps, the two theories aren’t really mutually exclusive.
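For anyone wanting to look for this kind of double peak in other countries, here is a rough sketch of how it could be picked out automatically. The counts are made up, the function name is ours, and the rule used (a day counts as a peak if it beats both neighbours and reaches at least half of the series maximum) is an arbitrary assumption rather than anything we have tested.

```python
def local_peaks(counts, min_share=0.5):
    """Return indices of days that beat both neighbours and reach
    at least min_share of the overall maximum."""
    if not counts:
        return []
    top = max(counts)
    peaks = []
    # The first and last days have only one neighbour, so they are skipped.
    for i in range(1, len(counts) - 1):
        higher_than_neighbours = counts[i] > counts[i - 1] and counts[i] > counts[i + 1]
        if higher_than_neighbours and counts[i] >= min_share * top:
            peaks.append(i)
    return peaks


# Invented daily views for the 3rd to the 9th of June.
example_counts = [40, 90, 310, 120, 150, 260, 100]
twin = local_peaks(example_counts)
print(twin)
```

With these invented numbers the function flags the 5th and the 8th of June (positions 2 and 5), which is the twin peak shape described above.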

In future work we are going to look at other Wikipedia pages which are more specific to the country in question. This will allow us to look at other early voting countries which don’t have a unique language (Austria, the UK, Ireland and perhaps Cyprus if the stats are high enough).

Outliers on the electoral information cycle

Taha Yasseri, Jonathan Bright

In the last post we looked at patterns of access to the Wikipedia article on the European Parliament election, 2009 and identified an electoral information cycle which consists of a build up period, a peak of information seeking, and a period of decline. In 14 of the 19 language groups we looked at, the dimensions of this pattern were very similar: that is, build up periods featured little information seeking until the few days before the election, peaks were short in duration and occurred just after the election, and periods of decline were very rapid.
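The dimensions of the cycle can be boiled down to a few numbers per language. The sketch below is one possible way of doing it, not the procedure behind our graphs: the function name and the example counts are invented, and measuring the speed of decline as the number of days until views fall to half the peak is our own assumption.

```python
def cycle_summary(counts, election_index):
    """Summarise a daily page view series.

    counts: list of daily views.
    election_index: position in counts of the (last) voting day.
    """
    peak_index = max(range(len(counts)), key=counts.__getitem__)
    peak = counts[peak_index]
    # Count the days after the peak until views first drop to half the peak or less.
    days_to_half = None
    for offset, value in enumerate(counts[peak_index + 1:], start=1):
        if value <= peak / 2:
            days_to_half = offset
            break
    return {
        "peak_offset": peak_index - election_index,  # positive means after the election
        "peak_views": peak,
        "days_to_half": days_to_half,                # None if the series never decays that far
    }


# Invented series: quiet build up, election on day 5, peak the day after, fast decline.
example_counts = [20, 25, 22, 30, 80, 200, 500, 180, 60, 30]
summary = cycle_summary(example_counts, election_index=5)
print(summary)
```

A language that fits the common pattern would show a small positive peak_offset and a small days_to_half; the outliers discussed in this post would stand out on one or both.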

In this post we want to focus on the 5 languages which didn’t fit into this trend, which were Estonian, Italian, Maltese, Norwegian and Dutch. Conveniently, each one of these languages overlaps almost perfectly with a single country, meaning that we can engage in some speculation about how the characteristics of different national systems affect patterns of electoral information seeking.

Both Malta and Norway featured very low absolute numbers of views to the web page in question. This presumably relates to the very low absolute population of Maltese speakers, the fact that Norwegian citizens don’t vote in EU Parliamentary elections (Norway does adopt lots of EU legislation through the EEA agreement, so a very politically motivated Norwegian might still get interested to an extent), and of course the fact that English is widely spoken in both countries, meaning that readers there would have access to the English version of Wikipedia.

Of more interest to us are the patterns emerging from the Estonian, Dutch and Italian cases. Each one is distinct so we’ll look at each of them in turn.


Above is the situation in the Netherlands, which is unusual in that it presents two peaks of activity. The first falls the day after the Dutch portion of the EU elections took place (on the 4th of June), the second on the day after the last day of the election period defined by the EU as a whole. Both events are likely to provoke media coverage and hence drive searching, consistent with the media effects thesis we set out in the last post. However, it is also worth noting that the European Commission prohibits distribution of country-specific election results until all results are in from all countries in Europe. This means that Dutch voters would have had to wait several days to find out the results of the election they participated in. It could be, in other words, that instead of being generated by a media effect, this searching is generated by people who want to know the election result but haven’t been able to find it reported in the media.


Above is the situation in Italy. The country is unusual in that the build up period features a relatively large amount of activity (with a clearly visible weekly cyclical pattern), and also in that the peak of information seeking is sustained over several days. A couple of thoughts emerge. Firstly, Italy is known for having an electoral silence law which prevents opinion polling in the 15 days before the election, and any type of campaigning in the day or so before it. It could be that this relatively high build up period is a result of a dearth of information in the media about the elections.

Secondly, the Italian part of the European Parliament elections took place over two days (Saturday afternoon and Sunday). This may explain the longer peak, as people who participated on Saturday might search for the results on Sunday, whilst others might look on Monday.



Finally, we present the view from Estonia. The absolute numbers in this country are very small meaning that we shouldn’t read too much into it. However the pattern is highly unusual, with a peak falling well in advance of the election, meaning that we wanted to say something about it.

The Estonian part of the 2009 European Parliament elections was unusual for two reasons. Firstly, it featured a dramatic 17-point rise in turnout. Secondly, against a backdrop of frustration with the major parties, two independent candidates gained a significant proportion of the votes. How does this relate to an early peak? Honestly it’s not quite clear: perhaps more would need to be known about the specifics of the campaign. Nevertheless it feels relevant that this unusual pattern of information seeking coincided with an unusual electoral result.

Conclusions: Of course this is all speculation at the moment. However, what emerges from this is an alternative thesis to the “media effect” discussed in the last post. Rather than seeking electoral information in response to reporting in the news media, voters may seek it because of a lack of such reporting (either because of an electoral silence law or a delay in the publication of the results). Such effects may be sharpened in an environment with new or unusual candidates. This is important for our overall question because it suggests that those who are seeking the information are likely to be those who participated in the election.

Next steps: see if any of these ideas can find support in the results from other countries. Did other countries which voted on the 4th of June have a twin peak style pattern? Are there any other electoral silence laws? What about victories for independent candidates?



The electoral information cycle

Taha Yasseri, Jonathan Bright

When do people start getting interested in elections, and how does this differ in different countries? In this post we try to get a handle on this question looking at data drawn from Wikipedia. Wikipedia is a useful resource because it has editions in a huge variety of languages, even if the absolute penetration of each varies.

We pull out information on daily readership statistics for the Wikipedia article on the European Parliament election, 2009 for May and June. The election itself was held between 4 and 7 June in the 27 member states of the EU. We look at 19 different language versions of the article, all of which are either official EU languages or at least widely spoken in one region. As most of the languages we look at are roughly unique to one European country (with some caveats), this gives us an idea of how public attention to the issue of the election builds up and then decays past election time in each country.

se-wikipedia-euelections article

pl-wikipedia-euelections article

en-wikipedia-euelections article

The figures above show the case of Swedish, Polish and English Wikipedias by way of example. The overall picture can be broken down into three periods: pre-election, the moment of the election itself, and post-election. Several things are worth highlighting. First, although there is clearly some activity before election time, there isn’t really a “build up” until just a few days before the election. Second, the peak in activity broadly coincides with the election itself, but actually falls just after it. Third, attention decays very quickly after the peak back towards the base level. However, note that the height of the peak can vary significantly from one language to another.

Out of the 19 language editions that we have studied, 13 + English follow this three stage pattern very closely. The figure above shows a curve collapse for these 13 language editions after normalisation to the maximum daily page views of each language (in the next post we will focus on the 5 remaining outlier language editions).
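The normalisation step is simple enough to sketch. In the snippet below the function name, the language codes used as keys and the page view counts are all invented for illustration; the only thing taken from the text is the idea of dividing each language's daily views by that language's own maximum, so that every curve peaks at 1 and the shapes can be compared directly.

```python
def normalise_to_peak(series_by_language):
    """Divide each language's daily views by that language's maximum."""
    normalised = {}
    for lang, counts in series_by_language.items():
        top = max(counts, default=0)
        if top:
            normalised[lang] = [c / top for c in counts]
        else:
            # Empty or all-zero series: nothing to scale, so return zeros.
            normalised[lang] = [0.0] * len(counts)
    return normalised


# Invented counts: very different absolute levels, same shape.
example_series = {
    "sv": [10, 20, 100, 30],
    "pl": [50, 100, 500, 150],
}
collapsed = normalise_to_peak(example_series)
print(collapsed)
```

In this made-up case the two curves become identical after normalisation, which is what a perfect curve collapse would look like; with real data the curves would only overlap approximately.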

We speculate that this graph shows that the majority of electoral information seeking occurs in response to publicity generated by media events, and hence reflects the continued importance of the mainstream media for the functioning of democracy. Somewhat paradoxically, however, these media seem to stimulate major public interest in the European Parliament elections only after they take place, rather than pushing people to inform themselves before participating.

For our purposes, the next question is of course whether the height of the peaks, which represents several thousand individuals going to the page at election time, corresponds to any electoral outcomes (e.g. turnout). We will tackle this in a future post.