Data, Like Oil, Is Pretty Useless In Its Raw State

Author

Tom Groenfeldt

September 28, 2016

“Data is the new oil” is a popular cliché among big data enthusiasts. The comparison is true on a deeper level than they usually discuss.

Oil sitting thousands of feet below the surface of the Gulf of Mexico has no value until it is pumped out and refined. Similarly, data is everywhere in a large organization, but it has to be stored and analyzed before it generates any value.

And that’s a challenge for CIOs.

“CIOs are in trouble right now, and it is getting worse,” said Stephen Brobst, the chief technology officer at Teradata, a leader in data storage and analysis. “We’ve seen exponential growth in data. If I drop data on the floor and lose it, I am a bad CIO but if my budget grows exponentially to handle it, I am also a bad CIO.”

“The business comes to the CIO and says it wants to do this or that with the company’s data and the CIO says ‘Sorry, we aren’t keeping that data,’” he added.

“But the business is becoming savvy about data as the oil of the 21st century. The business figures the data has value and they want to extract that value. Budgets are flat and data growth is exponential, which is why you see emergence of alt approaches in how to capture, store and exploit data.”

Large organizations like tier-one banks have too much data, and too many diverse types of data, to store on one system, but managing multiple systems is hard, Brobst explained. So Teradata is working with Facebook on an open source platform called Presto, a distributed SQL query engine optimized for ad-hoc analysis at interactive speed and at petabyte scale.

Teradata is developing tools and features that help make Presto valuable to large enterprises in areas such as security, documentation, and ODBC and JDBC connectors to link it to commercial software. All the Teradata tools for Presto are entirely open source and can be downloaded from Teradata or GitHub.

“Ten years ago if you had asked if I believed we would be working with Facebook to create open source software that we would give away free, [I would have said no,] but that is happening,” Brobst said. “We are in joint development with Facebook Apache Open Source.”

Brobst said that Presto is being used by the smart kids in Silicon Valley, including Airbnb, Dropbox, Gree, Groupon, and Netflix. For Teradata, Presto is a good fit with some of its own big data analytics technology.

Dropbox uses it to analyze how people are using the freemium site and determine when their use has become sophisticated enough to offer them a paid subscription.

The company said Presto complements Teradata QueryGrid software and fits within the Teradata Unified Data Architecture vision.

“Presto provides users the ability to originate queries directly from their Hadoop platform, while Teradata QueryGrid allows queries to be initiated from the Teradata Database and the Teradata Aster Database all through a common SQL protocol.”

CIOs can choose the tools that fit a task. Voice recordings from a call center wouldn’t fit the Teradata database, which is SQL oriented, Brobst explained.

“I would put the recordings into a Hadoop cluster and transform them to text, then put that into a Hadoop Distributed File System and apply natural language processing to do fact extraction and sentiment scoring. Those are very structured and can be loaded into Teradata and accessed through Presto.

“Because Presto operates in memory it doesn’t require the I/O processing of Hive,” he added.

“Presto is Hive 2.0.”

Banks are looking for these capabilities, said Byron Vielehr, group president of depository business at Fiserv. Fresh from a user conference, he said bankers he talked to are trying to figure out how to take all this data that they have and make it into something worthwhile and important.

He asked an EVP from a large retail bank what happens to direct deposit when someone leaves her job.

It goes away, the banker responded.

Yes, said Vielehr, but first the direct deposit jumps as back pay, vacation pay and severance are deposited. That may lead to an opportunity to restructure a loan or create a wealth management opportunity, but only if the bank is capturing and analyzing that data and acting on it.
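The signal Vielehr describes, a one-time jump in direct deposit before the deposits stop, can be illustrated with a minimal sketch. It assumes a simple list of per-period deposit amounts; the function name, the six-period baseline and the 1.5x threshold are hypothetical choices made for illustration, not anything Fiserv or the banks have described.

```python
from statistics import median


def flag_departure_spike(deposits, spike_ratio=1.5, baseline_window=6):
    """Return the index of the first deposit that is unusually large
    compared with the median of the preceding periods, or None.

    All parameters are illustrative assumptions, not a real bank's rules.
    """
    for i in range(baseline_window, len(deposits)):
        # Median of the previous `baseline_window` deposits is the "normal" paycheck.
        baseline = median(deposits[i - baseline_window:i])
        if baseline > 0 and deposits[i] >= spike_ratio * baseline:
            return i
    return None


# Hypothetical history: six regular paychecks, then a payout that bundles
# back pay, vacation pay and severance, then no further deposits.
history = [2000, 2000, 2000, 2000, 2000, 2000, 5200, 0, 0]
spike_at = flag_departure_spike(history)
if spike_at is not None:
    print(f"Unusual deposit in period {spike_at}: consider outreach")
```

The point of flagging the jump rather than the later drop to zero is timing: the bank can act while the customer still has the payout in the account.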

“With Amazon, every time you log in they put something in front of you,” Vielehr said. “Bankers haven’t been at the leading edge of thinking about data analytics and merging that into a real-time world. People in banks have a lot of data that would let them see what are the next most likely products for a customer.”

As banking becomes increasingly digital, banks will have to use data and digital channels to connect with customers who are no longer walking into a bank lobby where they can interact with tellers and loan officers in person. Banks also need to adapt to new channels, such as cars that can provide real-time stock quotes.

“Five years ago no one thought about automobiles with voice and touchscreen interfaces as a distribution point.”

Fiserv is developing a new platform called Notifi that will embed eventing engines in many of Fiserv’s solutions to provide fraud alerts and generate marketing offers. If a customer carrying a mobile phone walks into an auto dealer on a Saturday morning, the system could push out an auto loan offer, already approved.

Vielehr thinks a customer will be attracted to a loan from the bank where she already has a relationship.

“You just have to get the offer in front of them.”

 

This article was written by Tom Groenfeldt from Forbes and was legally licensed through the NewsCred publisher network.
