Collecting Big Data From IoT

Author

Robert Vamosi, Contributor

January 23, 2015

If the IoT estimates are correct, we will be living in a world with 50 billion interconnected devices within the next 5 to 10 years. No one, however, seems to be talking about the data from all these devices, sensors, and monitors that will need to be processed. Except one company.

“We think the data collection issue is one of the under-appreciated challenges of IoT,” said Hannah Smalltree, Director of Marketing Operations at Treasure Data. “We make a Cloud service that makes it very easy for people to collect and analyze their data without worrying about managing a complex infrastructure.”

Since launching in 2012, the company has grown to more than 100 customers. Last week Treasure Data announced $15 million in Series B financing, led by Scale Venture Partners with AME Cloud Ventures. The company says the funding will allow it to expand its services and operations into the Internet of Things.

Treasure Data’s CTO and co-founder, Kazuki Ohta, told Forbes he’s seeing three main areas of interest within IoT. They are wearables, automotive telematics, and windmills.

Windmills?

Windmills are nothing more than big sensors, Ohta explained, and the data collected from them is used for operational maintenance. Utility analysts know that when, for example, five particular things happen, the unit fails. If a windmill goes offline suddenly, the cost of repairs is much higher than if preventive maintenance had been carried out. In general, he said, operational maintenance at industrial sites is a growing IoT field.

With automotive telematics, Treasure Data is working with Pioneer, the Japanese electronics company, to collect data from devices that end users attach to their car’s On Board Diagnostic port. These devices, Ohta explained, typically have 3G Wi-Fi and an SD card to buffer data from the more than 100 sensors in a typical car today. The data is first buffered onto the local SD card and then uploaded into the Cloud every minute. “Because they are using 3G Wi-Fi,” he said, “sometimes the network is not available, so our buffering and compression encryption really matters.”
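The buffer-then-upload pattern Ohta describes can be sketched roughly as follows. This is a minimal illustration, not Treasure Data's or Pioneer's actual implementation: the class name `StoreAndForwardBuffer`, the `upload` callback, and the JSON-plus-zlib payload format are all assumptions made for the example, an in-memory list stands in for the SD card, and encryption is left out entirely.

```python
import json
import zlib
from typing import Callable, Dict, List


class StoreAndForwardBuffer:
    """Hypothetical local buffer; a list stands in for the device's SD card."""

    def __init__(self) -> None:
        self._pending: List[Dict] = []

    def record(self, reading: Dict) -> None:
        # Always write locally first, whether or not the network is up.
        self._pending.append(reading)

    def pending_count(self) -> int:
        return len(self._pending)

    def flush(self, upload: Callable[[bytes], None]) -> bool:
        """Try to send everything buffered so far as one compressed batch.

        Returns True if the batch was sent (or there was nothing to send),
        False if the network was unavailable and the readings were kept.
        """
        if not self._pending:
            return True
        batch = list(self._pending)
        # Assumed wire format: JSON compressed with zlib.
        payload = zlib.compress(json.dumps(batch).encode("utf-8"))
        try:
            upload(payload)
        except ConnectionError:
            # Network down: keep the readings for the next attempt.
            return False
        # Drop only the readings that were actually sent.
        del self._pending[: len(batch)]
        return True


def decode_batch(payload: bytes) -> List[Dict]:
    """Inverse of the compression step, as a receiving service might apply it."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))
```

In a device like the one described, `flush` would be called on a timer (once a minute, per Ohta), and a failed attempt simply leaves the readings in place so the next call sends a larger batch.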

“What’s interesting about utilities and telematics is that they have been producing this data for years,” Smalltree said, adding that utilities and telematics companies have been managing Big Data before the term was even coined. “So for them [Treasure Data] is a more efficient way to manage it, whereas wearables are coming clean to this.”

Smalltree said the real interest from wearable device vendors is in making the products better, for example by learning how a firmware update affects battery life or the product’s behavior. There’s also value in knowing whether someone is wearing the watch 24/7, and what trends there might be among the early, die-hard users.

“In the smartwatch case,” Ohta said, “they are collecting the data through mobile phones, via Bluetooth LE, so we provide them our SDK for Android and iOS.” Sometimes an end user doesn’t sync the wearable device with the mobile application, so the device keeps buffering the data. This may last 30 to 90 days before the data is transferred to the mobile device and then uploaded to the Cloud. “That’s the big difference between wearables and cars.”

As for the security of that data, Ohta said the company relies on the underlying infrastructure of the two service providers it uses. One is Amazon Web Services and the other is IDCF (IDC Frontier), a subsidiary of Yahoo Japan, which is affiliated with SoftBank. In addition, he said, the company hires a third-party security consultant to check its network every quarter. Internally, the company also has a security team to monitor the environment.

“That being said,” Ohta said, “our customers are storing mostly less-sensitive data, like load data, customer data, and they don’t store email or any sensitive data. Sometimes they even remove the geolocation data.”

Ohta said the company’s initial customer base consisted of Internet companies, including one customer that has more than three thousand servers collecting data before it is streamed up into the Treasure Data Cloud. The technical work Treasure Data did for that company and for Internet gaming companies is now being applied to IoT devices. He said his company can scale up to millions of devices and reliably transfer that data into the Cloud.

“We’re now ingesting 400K records per second,” Ohta said. “That means one trillion records per month. That’s what we have right now. I want to see the world with 50 billion devices sending the data to us. However, I don’t think that will happen. But if it does, we have the scalability already.”
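Ohta's conversion from 400K records per second to roughly one trillion per month can be checked with a few lines of arithmetic. The sketch below assumes a 30-day month and a constant ingest rate; the function name `records_per_month` is just an illustrative label.

```python
RECORDS_PER_SECOND = 400_000      # rate quoted by Ohta
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400
DAYS_PER_MONTH = 30               # assumption: a 30-day month


def records_per_month(rate_per_second: int, days: int = DAYS_PER_MONTH) -> int:
    """Total records ingested in a month at a constant per-second rate."""
    return rate_per_second * SECONDS_PER_DAY * days


monthly = records_per_month(RECORDS_PER_SECOND)
print(f"{monthly:,} records per month")  # 1,036,800,000,000
```

Under those assumptions the figure comes out slightly above one trillion, which matches the round number in the quote.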

This article was written by Robert Vamosi from Forbes and was legally licensed through the NewsCred publisher network.
