Big data is sexy. Data scientists are the unicorns of the job market right now. Some days, it feels as though we are living right on the edge of some science fiction utopian future.
But unicorns and sci-fi aside, for businesses, implementing something like a big data strategy has to be more than sexy: it has to be practical.
In my book, Big Data in Practice, I outline 45 different practical use cases in which companies have successfully used analytics to deliver extraordinary results.
These are some of my favorites.
How big data is used to drive supermarket performance
Walmart is the largest retailer in the world and the world’s largest company by revenue, with more than two million employees and 20,000 stores in 28 countries.
With operations on this scale it’s no surprise that they have long seen the value in data analytics. In 2004, when Hurricane Frances hit the US, they found that unexpected insights could come to light when data was studied as a whole, rather than as isolated individual sets.
Attempting to forecast demand for emergency supplies in the face of the approaching Hurricane Frances, CIO Linda Dillman turned up some surprising statistics. As well as flashlights and emergency equipment, expected bad weather had led to an upsurge in sales of strawberry Pop-Tarts in several other locations. Extra supplies of these were dispatched to stores in Hurricane Frances’s path, and sold extremely well.
Timely analysis of real-time data is seen as key to driving business performance. As Walmart Senior Statistical Analyst Naveen Peddamail, who runs Walmart’s Data Café, tells me: “If you can’t get insights until you’ve analysed your sales for a week or a month, then you’ve lost sales within that time. Our goal is always to get information to our business partners as fast as we can, so they can take action and cut down the turnaround time. It is proactive and reactive analytics.”
Peddamail gives an example of a grocery team struggling to understand why sales of a particular produce item were unexpectedly declining. Once their data was in the hands of the Café analysts, it was established very quickly that the decline was directly attributable to a pricing error. The error was immediately rectified and sales recovered within days.
Sales across different stores in different geographical areas can also be monitored in real-time. One Halloween, Peddamail recalls, sales figures of novelty cookies were being monitored, when analysts saw that there were several locations where they weren’t selling at all. This enabled them to trigger an alert to the merchandising teams responsible for those stores, who quickly realized that the products hadn’t even been put on the shelves. Not exactly a complex algorithm, but it wouldn’t have been possible without real-time analytics.
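The Halloween check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Walmart’s actual system: the function name, the store IDs and the sample transactions are all hypothetical.

```python
from collections import defaultdict


def stores_with_no_sales(transactions, all_stores, product):
    """Return a sorted list of stores that sold zero units of `product`."""
    units_by_store = defaultdict(int)
    for store, item, units in transactions:
        if item == product:
            units_by_store[store] += units
    # A store with no matching transactions defaults to 0 units sold.
    return sorted(s for s in all_stores if units_by_store[s] == 0)


# Hypothetical sample data: (store_id, product, units_sold)
transactions = [
    ("store_a", "novelty_cookies", 12),
    ("store_b", "flashlights", 3),
    ("store_c", "novelty_cookies", 7),
]
all_stores = {"store_a", "store_b", "store_c", "store_d"}

alerts = stores_with_no_sales(transactions, all_stores, "novelty_cookies")
print(alerts)  # ['store_b', 'store_d']
```

The logic is deliberately simple. As the example in the text suggests, the value comes from running a check like this while the sales window is still open.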
Walmart tell me that the Data Café system has cut the time between a problem being spotted in the numbers and a solution being proposed from an average of two to three weeks to around 20 minutes.
How big data drives success in manufacturing
Rolls-Royce manufactures enormous engines that are used by 500 airlines and more than 150 armed forces. These engines generate huge amounts of power, and it’s no surprise that a company used to dealing with big numbers have wholeheartedly embraced Big Data.
In high-stakes manufacturing, failures and mistakes can cost billions – and human lives. It’s therefore crucial the company are able to monitor the health of their products to spot potential problems before they occur.
Rolls-Royce put Big Data processes to use in three key areas of their operations: design, manufacture and after-sales support.
Paul Stein, the company’s chief scientific officer, says: “We have huge clusters of high-power computing which are used in the design process. We generate tens of terabytes of data on each simulation of one of our jet engines. We then have to use some pretty sophisticated computer techniques to look into that massive dataset and visualize whether that particular product we’ve designed is good or bad.”
The company’s manufacturing systems are increasingly becoming networked and communicate with each other in the drive towards a networked, Internet of Things (IoT) industrial environment. “We’ve just opened two world-class factories in the UK, in Rotherham and Sunderland, making discs for jet engines and turbine blades,” says Stein. “We are moving very rapidly towards an Internet of Things-based solution.”
In terms of after-sales support, Rolls-Royce engines and propulsion systems are all fitted with hundreds of sensors that record every tiny detail about their operation and report any changes in data in real time to engineers, who then decide the best course of action. Rolls-Royce have operational service centres around the world in which expert engineers analyse the data being fed back from their engines.
They can amalgamate the data from their engines to highlight factors and conditions under which engines may need maintenance. In some situations, humans will then intervene to avoid or mitigate whatever is likely to cause a problem. Increasingly, Rolls-Royce expect that computers will carry out the intervention themselves.
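As a rough illustration of how sensor readings might be screened before an engineer looks at them, the sketch below flags readings that stray far from a baseline of normal operation. It is not Rolls-Royce’s method, which is not described in detail here. The three-standard-deviation threshold, the function name and the sample values are all assumptions made for the example.

```python
from statistics import mean, stdev


def flag_anomalies(readings, baseline, threshold=3.0):
    """Return indices of readings more than `threshold` standard
    deviations away from the mean of the baseline readings."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    return [i for i, x in enumerate(readings) if abs(x - mu) > threshold * sigma]


# Hypothetical temperature readings from normal operation
baseline = [600, 602, 598, 601, 599]
# Hypothetical new readings reported by the same sensor
readings = [600, 603, 611, 599]

print(flag_anomalies(readings, baseline))  # [2]
```

A real monitoring system would combine many sensors and operating conditions, but the principle is the same: define what normal looks like, then surface the exceptions for a human or a machine to act on.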
Although they don’t give precise figures, the company say that adopting this Big Data-driven approach to diagnosing faults, correcting them and preventing them from occurring again has “significantly” reduced costs.
It has also resulted in a new business model for the company. Obtaining this level of insight into the operation of their products means that Rolls-Royce have been able to offer a new service model to clients, which they call Total Care, where customers are charged per hour for the use of their engines, with all of the servicing costs underwritten by Rolls-Royce. “That innovation in service delivery was a game-changer, and we are very proud to have led that particular move in the industry,” says Stein. “Outside of retail, it’s one of the most sophisticated uses of Big Data I’m aware of.”
How big data is transforming health care
California-based cognitive computing firm Apixio were founded in 2009 with the vision of uncovering and making accessible clinical knowledge from digitized medical records, in order to improve healthcare decision making.
A staggering 80% of medical and clinical information about patients is formed of unstructured data, such as written physician notes. As Apixio CEO Darren Schulte explains, “If we want to learn how to better care for individuals and understand more about the health of the population as a whole, we need to be able to mine unstructured data for insights.”
Thus, the problem in healthcare is not lack of data, but the unstructured nature of its data: the many, many different formats and templates that healthcare providers use, and the numerous different systems that house this information. To tackle this problem, Apixio devised a way to access and make sense of that clinical information.
Apixio work with the data using a variety of machine learning-based methodologies and algorithms with natural language processing capabilities. The data can be analysed at an individual level to create a patient data model, and it can also be aggregated across the population in order to derive larger insights about disease prevalence, treatment patterns and so on.
The first product to come from Apixio’s technology platform is called the HCC Profiler. The customers for this product fall into two groups: insurance plans and healthcare delivery networks (including hospitals and clinics). Medicare forms a big part of their business, especially those individuals within Medicare who have opted into health maintenance organization (HMO) style plans (called Medicare Advantage Plans), which accounted for nearly 17 million individuals in the US in 2015. Health plans and physician organizations have an incentive to manage the total cost of care for these individuals. To do this, these organizations need to know much more about each individual: What are the diseases being actively treated? What is the severity of their illness? What are various treatments provided to these individuals?
This is much easier to understand when you can access and mine that 80% of medical data previously unavailable for analysis, in addition to coded data found in the electronic record and in billing or administrative datasets.
Traditionally, in order to understand such patient information, experts trained in reading charts and coding the information (“coders”) would have to read the entire patient chart searching for documentation related to diseases and treatment. This is a laborious and expensive way of extracting information from patient records, and one that is fraught with human error. Apixio have demonstrated that computers can enable coders to read two or three times as many charts per hour as manual review alone.
An additional benefit is the computer’s ability to find gaps in patient documentation, defined as a physician notation of a chronic disease in the patient history without a recent assessment or plan. Gaps like this can lead to an inaccurate picture of disease prevalence and treatment, which can negatively affect the coordination and management of patient care.
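The gap definition above translates fairly directly into code once the unstructured notes have been turned into structured data. The sketch below assumes that extraction step has already happened and is not Apixio’s implementation. The one-year window for “recent”, the function name and the sample patient record are all hypothetical.

```python
from datetime import date, timedelta


def documentation_gaps(history_conditions, last_assessed, today, window_days=365):
    """Return chronic conditions noted in the patient history that have
    no assessment or plan dated within the last `window_days` days."""
    cutoff = today - timedelta(days=window_days)
    gaps = []
    for condition in history_conditions:
        last = last_assessed.get(condition)
        if last is None or last < cutoff:
            gaps.append(condition)
    return gaps


# Hypothetical patient record
history = ["diabetes", "copd", "hypertension"]
last_assessed = {
    "diabetes": date(2015, 11, 1),  # assessed recently
    "copd": date(2013, 5, 20),      # assessment is out of date
    # hypertension has never been assessed
}

print(documentation_gaps(history, last_assessed, today=date(2016, 1, 15)))
# ['copd', 'hypertension']
```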
Schulte explains: “If you don’t get that information right, how can the system coordinate and manage care for the individual? If you don’t know what it is you’re treating and who’s afflicted with what, you don’t know how to coordinate [care] across the population and manage it to reduce costs and improve the outcomes for individuals.”
This article was written by Bernard Marr from Forbes and was legally licensed through the NewsCred publisher network.