What is the future of big data? It will be all about predictions! Predictions based on data have come into our world, and often we do not even notice. In many US cities, for example, it is no longer a coincidence when you meet a police officer: officers are dispatched based on models from George Mohler, a seismologist who has helped to predict where the next crime is about to happen (read more here). When you get a flyer in your mail, it might be because your next-door retailer is trying to predict what you need. Sometimes they do this too well: Target once made the news (read more here) because it knew that an underage girl was pregnant long before her own father did.
But let’s not get carried away by the big data world. Predictions are nothing new. Magicians like the famous Alexander Seer were already promising at the beginning of the last century to “know, see and tell” it all. Although the idea is not new, predictions based on data are the most difficult data product we have. Technically, the difference between predictions and recommendation engines (read here about them) is small: most recommendations could be rephrased as predictions. The real difference lies in our own free will.
Data products differ in the amount of support they need before they turn data into “actionable insights”. Benchmarking, the most basic data product, needs an “analyst” to make sense of it. Recommendations need the “user” to decide what to do next. And predictions? Predictions need no one. Read that sentence again! Predictions know the answer: there is no further need to investigate or choose.
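To make the distinction concrete, here is a minimal sketch in Python. It assumes some model has already produced a score for each option; the function names `recommend` and `predict` and the numbers are made up for illustration and do not come from any real system. The same scores feed both data products. The only difference is whether a person still gets to choose.

```python
from typing import Dict, List


def recommend(scores: Dict[str, float], k: int = 3) -> List[str]:
    """Return the k highest-scoring options; the user makes the final choice."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


def predict(scores: Dict[str, float]) -> str:
    """Return only the single highest-scoring option; nobody is asked."""
    return max(scores, key=scores.get)


# Hypothetical scores for what a shopper might need next (made-up numbers).
scores = {"diapers": 0.62, "coffee": 0.21, "batteries": 0.12, "umbrella": 0.05}

print(recommend(scores, k=3))  # a shortlist for the user to pick from
print(predict(scores))         # one answer, no choice left
```

Seen this way, a prediction is just a recommendation list cut down to its first entry, which is why a wrong top score matters so much more once nobody is left to overrule it.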
The idea was that the more data we have, the better our recommendation engines will be, until they eventually become predictions. This view was best summarized in the idea of the “end of theory”. Chris Anders (@chrisanders) argued that in the future we will have sufficient data to predict anything, and thus there will be no need for theoretical models anymore.
But often it is not the amount of data that makes a good prediction. For example, the Incas predicted the best time to plant crops. Their dataset might have been as small as 3650 data points (= 10 years), which is nothing in our big data world. 500 years later we have companies like Google that measure a lot about our online behavior. But despite all this data, predictions are not necessarily easy. For example, New York Times bestselling business author Carol Roth once complained on her blog that Google inferred she was a man over 65, when in fact she is a woman decades younger.
Why is this? Because not all of the data Google has aggregated is really helpful for the specific prediction it is trying to make. That not all data is useful became most obvious with the onset of social media. Suddenly there was a massive amount of data, and many of us thought it could predict amazing things. For example, we saw many companies claiming that they could predict stock price movements from social media content. Most of them (if not all) have vanished by now, since it turns out that social media chatter is just too “noisy” and thus cannot really help with the prediction.
(Taken and adapted from Pieterjan Vandaele under the creative commons license)
Allow me to make a prediction about data products as such: predictive algorithms will become more and more a part of our lives, and will probably change our society more than the Internet has. The Internet enabled us to do things faster and more conveniently. However, predictions based on our data trails aim even further, because they enable us to forecast human behavior in a way we never could before.
The biggest danger to the success of predictions is us, the “users”, not yet understanding that a prediction is just the output of a trained algorithm that might be wrong. Even if the right data set was used, the crowd whose wisdom powered the algorithm might not be the right crowd for us. Think about the student who is required to change majors after going “off track” for too long, because the algorithm assumes a low likelihood of success (read more here). Such strict rules might be the end of “out-of-the-box” thinking.
Our world is full of wrong predictions, even ones based on data, and a wrong prediction might easily destroy our future. But when we learn as consumers to take predictions for what they are, likelihoods that advise (not dictate) our lives, predictions based on data will benefit all of us.