The goal of big data: Making the unusual usual


Steve Jones

October 11, 2016

“Speak to Lana, she saw something like that last year,” or “Ask Louis to do it, he’s seen that a million times.” We’ve all heard similar phrases within business — phrases that are uttered when something unusual happens, linking people together to help them get more experience in order to solve the challenge of those unusual circumstances. Imagine an 18-year-old being thrown directly into Major League Baseball, facing the best pitcher in the league: Everything that he is about to see is unusual, and the odds of failing are massive. The unusual is what trips us up and can cause us to fail.

While some folks still talk about the three V’s — velocity, variety and volume — with regard to big data, I think we are well beyond worrying about what makes data big. Instead the focus is “Why bother with big data?” For me, it comes down to a very simple statement:

“Making the unusual usual”

What does that mean? Well, any system can handle the happy path: it understands how to react in the circumstances that normally occur, and it can even handle the normal exceptions. But if an event happens only once a year, the odds are that there will be no coded exception path and the information won't be available. If that once-a-year event can occur in any one of 50 countries, then it's possible that a single country won't see it happen for decades. How can you learn and react to something you've never seen? A single system can't. A single person can't. But the "hive mind" of an organization can see the event and find who has seen it before, and the strength of the response depends on the strength of those personal networks.

Big data, however, can use techniques such as machine learning to include all the events from every country. So while that event might only occur once a year, the big data system can see 10 such events over a 10-year period and be able to recognize the causes and build a standard response plan to resolve the situation. Thus when it occurs in a new country, it can be automatically detected and resolved, potentially even without a person knowing it occurred. By taking the “big” view you are actually able to solve much smaller problems that occur less often, but which can cause disproportionate disruption as a result.
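The article stays at the level of the idea, but a toy sketch can make the pooling point concrete. The sketch below is my own illustration, not anything from the piece: it simulates a failure pattern that shows up roughly once a decade per country, shows that no single country collects enough examples to learn from, and fits a simple scikit-learn classifier on the pooled records so the pattern can be flagged the next time it appears somewhere new. All feature names and numbers are assumptions.

```python
# Minimal sketch (illustrative assumptions throughout): per country a rare
# failure mode appears about once a decade, but pooled across 50 countries
# there are enough labelled examples to train a simple classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
N_COUNTRIES = 50

records, labels, country_ids = [], [], []
for country in range(N_COUNTRIES):
    n_normal = 1000
    normal = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(n_normal, 2))
    n_rare = rng.poisson(1.0)          # about one unusual event in ten years
    rare = rng.normal(loc=[4.0, 4.0], scale=0.5, size=(n_rare, 2))
    records.append(np.vstack([normal, rare]))
    labels.append(np.r_[np.zeros(n_normal), np.ones(n_rare)])
    country_ids.append(np.full(n_normal + n_rare, country))

X = np.vstack(records)
y = np.concatenate(labels)
countries = np.concatenate(country_ids)

# A single country rarely has enough positive examples to learn from...
print("rare events seen by country 0:", int(y[countries == 0].sum()))
# ...but the pooled dataset does.
print("rare events pooled across all countries:", int(y.sum()))

model = LogisticRegression(class_weight="balanced").fit(X, y)
new_event = np.array([[3.8, 4.2]])     # an event this country has never seen
print("flagged as the rare pattern:", bool(model.predict(new_event)[0]))
```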

The ability to identify the unusual is critical in areas such as cybersecurity, with attacks becoming ever more sophisticated. By being able to understand what is “normal” and then identifying what is unusual, it is possible to more quickly identify threats and build plans to resolve them. Solutions that only look at one network and one environment are going to be less able to learn and identify the signs that indicate a new type of threat.
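To ground the "learn normal, then flag the unusual" point, here is another tiny sketch of my own; the article names no tools, so the isolation-forest approach, the feature columns (bytes sent, duration, port entropy) and all values are invented for illustration.

```python
# Minimal anomaly-detection sketch: fit on a pooled baseline of normal traffic,
# then score a connection pattern this particular network has never seen.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)

# Baseline of "normal" behaviour pooled from several environments.
normal_traffic = rng.normal(loc=[500.0, 2.0, 0.3],
                            scale=[100.0, 0.5, 0.05],
                            size=(5000, 3))

detector = IsolationForest(contamination=0.01, random_state=1)
detector.fit(normal_traffic)

suspicious = np.array([[50_000.0, 0.1, 0.9]])   # far outside the baseline
print("unusual:", detector.predict(suspicious)[0] == -1)  # -1 means anomaly
```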

The same is true in medical science: Being able to target drugs at specific genetic indicators is critical. Some indicators might be rare, but with a world population of 7 billion people, a few percentage points is a lot of people — it means being able to cure more diseases more effectively. While the challenge is unusual in the population as a whole, it’s something that becomes usual when looked at for an identified subset, but a subset that can only be effectively found by looking at the whole picture.

In manufacturing, being able to identify the unusual could mean looking at all the robots in a company and at the weather conditions to understand that when there’s a monsoon in Arizona, the Phoenix plant should start working more like the factory in Malaysia for a day or so, because they really know how to handle humidity. What is unusual in one place is the norm in another.

Big data technologies, therefore, are not simply about doing what we did before, reporting, on a bigger scale. They are about automatically identifying and reacting to more conditions and producing more accurate forecasts: not a simple linear prediction, but a complex model that factors in huge numbers of variables and reacts to the reality of what is happening.

So now imagine that 18-year-old ballplayer again, automatically filled with all the information from every baseball player who has ever played: every pitch and every swing stored directly in his muscle memory, every unusual event already in his head. He draws on a rapidly updating model as he sees the ball, reads the pitch, starts the swing…

That’s the goal of big data — to make “going yard” the norm.

This article was written by Steve Jones from CIO and was legally licensed through the NewsCred publisher network.
