A common practice among purveyors of enterprise IT storage is to begin a product pitch with the following admonition: “Stored data volumes are growing at explosive rates!” Therefore, Mr. Customer, you need our product to manage this explosive growth, or else.
The latest buzzword driving the explosive growth of stored data is the Internet of Things (IoT). Check out this infographic and imagine the zettabytes of data and the billions of dollars’ worth of storage enterprises will need to buy to warehouse all this stuff.
If you are an IT planner wondering how your enterprise will cope with this imagined data tsunami that’s now appearing on the horizon, I suggest a quick reality check:
1. The Internet of Things isn’t new, but the buzzword is.
The idea that you can take any electrical device, give it some rudimentary intelligence and an IP address, and then connect it to the internet isn’t new. People began talking about connecting household items like refrigerators, toasters, and thermostats to networks back in the days of Internet 1.0. Scott McNealy once took the concept to a somewhat absurd level when he offered up the idea of smart paint—tiny computational devices running Java and injected into cans of paint to be layered on room walls. His vision at the time (1998) was Java everywhere and connected to a network—wired or wireless. Note that it’s taken IT years just to get to an understanding of the potential value of IoT. Instrumentation will be altogether another matter.
2. Don’t assume you will have to store all IoT data.
The IoT concept is closely connected to its big brother, Big Data. The idea is that you can use the new, infinitely scalable, cheap and deep Big Data analytics platforms like Hadoop or MongoDB to actually do something with all the bits these Internet-connected things will emit, these billions of tiny data broadcasters running 24/7. However, the assumption that you will store all this data first and then run analytics against it is seriously flawed. It’s more likely that you will want to extract value from this data in real time. For that, you’ll want to look closely at stream processing technologies that analyze data “in motion,” meaning that data is analyzed as it is ingested by the system. Here, data is stored after it’s processed rather than before, and even then these systems usually store only a small subset of it. Examples of available platforms are IBM’s InfoSphere Streams, SQLstream, and TIBCO StreamBase. A related computing concept is Complex Event Processing (CEP).
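To make "store after processing" concrete, here is a minimal Python sketch of the idea, not the API of any of the platforms named above. It assumes readings arrive in timestamp order and uses hypothetical names (`Reading`, `WindowSummary`, `summarize_stream`) invented for this illustration. Each raw reading is examined once as it arrives; only a per-window summary survives to be stored.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Iterable, Iterator, List, Optional


@dataclass
class Reading:
    """One raw sensor reading (hypothetical shape, for illustration only)."""
    device_id: str
    timestamp: float  # seconds since some epoch
    value: float


@dataclass
class WindowSummary:
    """The small subset that actually gets stored."""
    window_start: float
    count: int
    mean_value: float
    max_value: float


def summarize_stream(
    readings: Iterable[Reading], window_seconds: float = 60.0
) -> Iterator[WindowSummary]:
    """Yield one summary per fixed, non-overlapping time window.

    Assumes readings arrive in timestamp order. Raw readings are held
    only until their window closes, then discarded.
    """
    window_start: Optional[float] = None
    values: List[float] = []
    for reading in readings:
        start = reading.timestamp - (reading.timestamp % window_seconds)
        if window_start is None:
            window_start = start
        if start != window_start:
            # The previous window is complete: emit its summary, drop raw data.
            yield WindowSummary(window_start, len(values), mean(values), max(values))
            window_start, values = start, []
        values.append(reading.value)
    if values:
        # Flush the final, possibly partial, window.
        yield WindowSummary(window_start, len(values), mean(values), max(values))
```

Five raw readings spread over two minutes would leave just two summary records behind. A real stream processor adds much more (out-of-order handling, parallelism, fault tolerance), but the storage consequence is the same: what lands on disk is a fraction of what came over the wire.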
3. Don’t assume that all IoT data has value.
In fact, most of the data emitted by Internet-connected things has little to no value in the context of real-time analysis, alerting, and event reporting. Consider a system that converges data from a number of different Internet-connected sources to perform identity theft detection in real time. The objective of the system is to stop an event in progress. To do that, you only need to know when data is present that indicates abnormal behavior. Data relating to normal behavior is of no value in this context, so why save it?
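As a rough illustration of keeping only the abnormal data, consider the Python sketch below. The rule it applies (a login from a country not previously seen for that account) is a deliberately simplistic assumption made for this example, and `LoginEvent` and `abnormal_events` are hypothetical names; real identity theft detection relies on far richer signals.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Set


@dataclass(frozen=True)
class LoginEvent:
    """A single login attempt (hypothetical shape, for illustration only)."""
    account: str
    country: str


def abnormal_events(
    events: Iterable[LoginEvent], known_countries: Dict[str, Set[str]]
) -> Iterator[LoginEvent]:
    """Yield only events that deviate from an account's usual behavior.

    Assumed rule: a login is abnormal if it comes from a country not in
    the account's known set. Accounts with no profile are treated as
    abnormal. Normal events are simply dropped, never stored.
    """
    for event in events:
        usual = known_countries.get(event.account, set())
        if event.country not in usual:
            yield event
```

Only the events this filter yields would need to be alerted on and retained; everything else passes through and is gone.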
4. Don’t assume that you’ll have to own IoT data in order to derive value from it.
Cloud services providers are now aggregating and selling the use of many different types of data, social media data being a prime example. It is now safe to assume that similar services will grow up around IoT data. The enterprise may only need to rent or purchase access to this data in order to derive value from it. And not just from cloud services providers, either. Automobile manufacturers, for example, are busy instrumenting their new models with sensors that can stream data from vehicles wirelessly, and they could sell this data as well.
For sure, the old adage that data will be stored by someone, somewhere applies to the Internet of Things. But in the case of enterprise users, it doesn’t have to land in the data center. Even when it does, it doesn’t have to be treated like transactional database data, saved and protected for an extended period of time. The exceptions will be cases where the captured data has value for ongoing or future research and analysis. For example, data emanating from sensors attached to the human body in a healthcare setting has value in real time when analysis of the sensor data indicates a change in the condition of the patient. But this data could also have value for future medical research. Therefore, it should all be stored, maybe forever, as is often the case with imaging studies done by teaching hospitals. As one researcher once said, “In the future, we may know the right questions to ask of the data that we don’t know now.”
Yes, the Internet of Things has finally arrived. Your refrigerator will communicate with your bathroom scale and will threaten to lock you out when you get too heavy. And while IoT will generate the expected data tsunami somewhere, there will be ways and time available for enterprise IT to shoulder a burden that may not be as heavy as advertised.