Big Data: Lessons learned from the UK election polls debacle

Author

ACAPODIC

May 25, 2015

Earlier this month the UK Conservative party won the British general election outright, securing a majority of 331 of 650 parliamentary seats to Labour's 232. None of the major pollsters predicted this outcome. Their research generally pointed to a hung parliament with a slight advantage for the Conservatives over Labour.

Unsurprisingly, this divergence sparked a heated debate about how the pollsters could get it so wrong (here from the BBC) and how such mistakes can be avoided in the future (here from the British Polling Council).

Why is this relevant to big data projects?

Big Data is a key technology for democratising and accelerating access to multiform data and making better decisions (here for the Capgemini point of view).

The similarities between a poll and a big data project may not be apparent; however, a poll is actually a "small" big data project. It ingests multiform data (telephone and web surveys, political, socio-demographic and geographic information), distils it (produces a forecast within a confidence band) and enables decision making (e.g. how to tweak a political campaign).
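To make that comparison concrete, here is a minimal sketch (in Python, with made-up numbers) of the "distil" step: turning raw stated preferences into a vote-share estimate with a confidence band. It uses a simple normal approximation and none of the weighting or turnout modelling that real pollsters apply.

```python
# Minimal sketch of the "distil" step of a poll: estimate a party's vote
# share and a 95% confidence band from survey responses. Illustrative only;
# real pollsters apply weighting, turnout models and house adjustments.
import math

def vote_share_with_ci(responses, party, z=1.96):
    """Return (share, lower, upper) for `party` from a list of stated preferences."""
    n = len(responses)
    share = sum(1 for r in responses if r == party) / n
    margin = z * math.sqrt(share * (1 - share) / n)   # normal approximation
    return share, share - margin, share + margin

# Hypothetical example: 1,000 simulated respondents
sample = ["Conservative"] * 340 + ["Labour"] * 330 + ["Other"] * 330
print(vote_share_with_ci(sample, "Conservative"))
# -> roughly (0.34, 0.31, 0.37): a confidence band, not a point prediction
```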

In fact, there are some important lessons to be learned from the polls debacle…

Don’t fall in love with black boxes

Make sure you clearly understand which data are being used to inform decisions and how they are "distilled". Excessive trust in advanced analytics algorithms and machine learning can lead to unpleasant surprises.

For example, not knowing exactly what bias is induced by "web polls", in which the responding population is self-selected, may have significantly affected the accuracy of the forecasts, even when sophisticated methodologies were used to correct for that bias.
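As a hedged illustration of what such a correction involves, the sketch below applies post-stratification weighting, one common way to re-balance a self-selected sample against known population shares. The age groups, shares and responses are entirely hypothetical; the point is that the corrected estimate is only as trustworthy as the assumption that weighted respondents behave like the voters they are meant to represent.

```python
# Sketch of post-stratification weighting for a self-selected web panel.
# Groups, population shares and answers below are hypothetical.
from collections import Counter

def poststratify(respondents, population_shares, party):
    """respondents: list of (group, answer); population_shares: {group: share}."""
    n = len(respondents)
    sample_counts = Counter(g for g, _ in respondents)
    # Weight each group so its weighted share matches its population share
    weights = {g: population_shares[g] / (c / n) for g, c in sample_counts.items()}
    weighted_for_party = sum(weights[g] for g, a in respondents if a == party)
    total_weight = sum(weights[g] for g, _ in respondents)
    return weighted_for_party / total_weight

# Hypothetical example: under-65s are over-represented in the web sample
respondents = (
    [("under65", "Labour")] * 70 + [("under65", "Conservative")] * 30
    + [("over65", "Conservative")] * 15 + [("over65", "Labour")] * 5
)
print(poststratify(respondents, {"under65": 0.7, "over65": 0.3}, "Conservative"))
# Raw sample gives 37.5%; re-weighting shifts it to about 43.5% -
# but only if the weighting assumptions actually hold.
```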

Don’t oversell

Be realistic in setting the scope and expectations of the insights that your projects can create, and make sure all stakeholders understand them. The possibilities of Big Data are enormous, but the constraints of actual systems and data are real.

The more realistic the expectations, the less likely someone will end up having to eat their hat.

Fail fast – succeed faster

Sometimes, even the most meticulously prepared projects go wrong because of sudden changes or other unexpected events. 

For example, there is wide consensus that a large portion of voters made up their minds at the last minute and that turnout was lower than expected. This uncertainty could not be integrated into the models and contributed to the inaccuracy.

What to do to avoid these mistakes? Well, nothing…

Mistakes will happen. Be prepared to accept them, quickly understand the root cause of the issue and adapt your infrastructure to improve the output. Let the “Fail fast – succeed faster” mantra become part of your company’s culture.

And remember, your users are not expecting perfection – they are expecting you to get it right ASAP.

And if you don’t believe me – I can organize a poll to demonstrate it. 

Note: This is the personal view of the author and does not reflect the views of Capgemini or its affiliates. Check out the original post here.

