Forget Big Data — Small Data Is Driving The Internet Of Things


Mike Kavis, Contributor

February 26, 2015

When people talk about the Internet of Things (IoT), they tend to think about big data technologies like Hadoop, where petabyte-size datasets are stored and analyzed for both known and unknown patterns. What many people don’t realize is that many IoT use cases require only small datasets. What is small data, you ask? Small data is a dataset that contains very specific attributes. It is used to determine current states and conditions, or it may be generated by analyzing larger datasets. When we talk about smart devices being deployed on wind turbines, small packages, valves and pipes, or drones, we are talking about collecting small datasets. Small data tells us about location, temperature, wetness, pressure, vibration, or even whether an item has been opened. Sensors give us small datasets in real time, which we then ingest into big datasets that provide a historical view.

So why is small data important? Small data can trigger events based on what is happening now. Those events can be merged with behavioral or trending information derived from machine learning algorithms run against big datasets. Here are some examples:

Examples of Small and Big Data

A wind turbine has a variety of sensors mounted on it to determine wind direction, velocity, temperature, vibration, and other relevant attributes. The turbine’s blades can be programmed to automatically adjust to changing wind conditions based on the information instantly provided by small data. These small datasets are also ingested into a large data lake, where machine-learning algorithms begin to understand patterns. These patterns can reveal how certain mechanisms perform given their historical maintenance records, how wind and weather conditions affect wear and tear on various components, and what the life expectancy of a particular part is.

Another example is the use of smart labels on medicine bottles. Small data can be used to determine where the medicine is located, its remaining shelf life, whether the seal of the bottle has been broken, and the current temperature conditions, all in an effort to prevent spoilage. Big data can be used to look at this information over time and perform root cause analysis on why drugs are expiring or spoiling. Is it due to a certain shipping company or a certain retailer? Are there recurring patterns that point to problems in the supply chain and can help determine how to minimize these events?
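To make the "big data" half of the medicine example concrete, here is a minimal sketch of the kind of historical question described above: does spoilage cluster around a particular shipping company? The record layout, the carrier names, and the sample values are all hypothetical and invented for illustration; a real analysis would run over far more records and more attributes.

```python
from collections import Counter

# Hypothetical historical records, one per shipment: (carrier, arrived_spoiled).
# These values are made up for illustration only.
history = [
    ("carrier_a", False),
    ("carrier_a", True),
    ("carrier_b", False),
    ("carrier_b", False),
    ("carrier_a", True),
    ("carrier_b", True),
]


def spoilage_rate_by_carrier(records):
    """Return the fraction of shipments that arrived spoiled, per carrier."""
    shipped = Counter()
    spoiled = Counter()
    for carrier, was_spoiled in records:
        shipped[carrier] += 1
        if was_spoiled:
            spoiled[carrier] += 1
    return {carrier: spoiled[carrier] / shipped[carrier] for carrier in shipped}


rates = spoilage_rate_by_carrier(history)
for carrier, rate in sorted(rates.items()):
    print(f"{carrier}: {rate:.0%} of shipments spoiled")
```

Each record on its own is small data. The pattern only appears once many of them are accumulated and compared.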

Do You Need Big or Small Data?

Despite what some may think, big data is not a requirement for all IoT use cases. In many instances, knowing the current state of a handful of attributes is all that is required to trigger a desired event. Are the patient’s blood sugar levels too high? Are the containers in the refrigerated truck at the optimal temperature? Does the soil have the right mixture of nutrients? Is the valve leaking?
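As a rough illustration of how little data such a trigger needs, the sketch below checks a single sensor reading against an acceptable range and raises an alert when the reading falls outside it. The attribute names, the limits, and the `Reading` type are assumptions made up for this example, not values taken from any real deployment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reading:
    sensor_id: str
    attribute: str
    value: float


# Hypothetical acceptable ranges (low, high) per attribute; illustrative only.
LIMITS = {
    "container_temp_c": (2.0, 8.0),
    "valve_pressure_kpa": (150.0, 400.0),
}


def check(reading: Reading) -> Optional[str]:
    """Return an alert message if the reading is out of range, else None."""
    limits = LIMITS.get(reading.attribute)
    if limits is None:
        return None  # no rule defined for this attribute
    low, high = limits
    if reading.value < low:
        return f"{reading.sensor_id}: {reading.attribute} too low ({reading.value})"
    if reading.value > high:
        return f"{reading.sensor_id}: {reading.attribute} too high ({reading.value})"
    return None


alert = check(Reading("truck-7", "container_temp_c", 9.5))
if alert is not None:
    print(alert)
```

No history, data lake, or machine learning is involved here: one current value and one rule are enough to trigger the event.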

Optimizing these business processes can save companies millions of dollars through the analysis of relatively small datasets. Small data knows what a tracked object is doing. If you want to understand why the object is doing that, then big data is what you seek. So, the next time someone tells you they are embarking on an IoT initiative, don’t assume that they are also embarking on a big data project.

This article was written by Mike Kavis from Forbes and was legally licensed through the NewsCred publisher network.
