Data science remains the surest ticket to Big Salary nirvana. If only it paid equally large dividends for the companies paying those salaries. Part of the problem stems from getting enough qualified people into a market desperate for answers.
But part of the problem stems from asking the wrong questions in the first place.
Big Love For Big Data
Given the furor over Big Data, it’s not surprising to see companies clamoring to hire people with data science expertise. As LinkedIn’s annual analysis of 330 million user profiles shows, statistical analysis and data mining skills top the charts as the hottest of hot skills in 2014, with Big Data-related talents accounting for a third of the top-15 hottest skills.
Given the law of supply and demand, it’s not surprising that data scientists make so much money. How much? Over $123,000 per year, on average, with that number sharply rising each year for the past several years.
Students are hoping to satiate that demand, with record numbers of MBA and engineering candidates rushing to become certified data scientists. (Which, as Mitchell Sanders writes, is somewhat silly, given the unique blend of skills needed to do data science well.)
Over time, as Gartner analyst Alan Duncan posits, salaries will even out. For now, however, the tools and knowledge needed to master modern data are so arcane that companies need to put out big money to have any hope of getting value from Big Data.
Big Delays For Big Data
Which may be one reason so many companies are sitting on the sidelines.
Given this influx of Big Data talent, it would be reasonable to assume more Big Data projects are being launched and delivering real value. This assumption, however, is wrong.
See also: Why Data Scientists Get Paid So Much
While not “hard” data, Gartner analysts Merv Adrian and Nick Huedecker have been tracking Hadoop deployments through surveys of webinar participants over the last year. As Adrian writes, this survey data indicates that Hadoop deployments, the poster child for Big Data, have been “slow to grow so far.”
While he concedes a growing number of pilots as organizations experiment with Hadoop, he points to “no dramatic growth in substantial projects undertaken so far, or substantial additional projects being added to the same cluster and driving growth.”
In other words, Hadoop and, by extension, Big Data, is lumbering along.
This back-of-the-envelope analysis finds support in various other studies that suggest that roughly 70% of organizations still aren’t doing much with Big Data. Part of this derives from a talent shortage.
But part also stems from confusion as to which tools to use, and how. As Bloomberg project lead Matt Hunt insists: “At Bloomberg that we don’t have a big data problem. What we have is a ‘medium data’ problem—and so does everyone else.” Hunt suggests that tools like Hadoop are the wrong tool for the right job: They assume scale that most companies don’t have.
As such, it’s not clear that hiring a rock star data scientist will solve anything. Except, perhaps, if they come with enough training to know when to not use popular but improper tools.
The Big Data Cog
Regardless of tools, as Pam Baker highlights, Big Data often depends on little people like you or I to input it into systems correctly. As she describes, this is wishful thinking at best. And when the data going in is garbage, the resulting analysis will be, too:
If the small data is wrong or missing, the big data analysis is off the mark, too. And small data today is a complete and utter disaster nearly everywhere I look.
Even if we assume perfect data input, it’s still the case that we analyze the data through the lens of our own biases, something that is simply impossible to avoid. Throwing more data at our biases doesn’t remove the problem. It amplifies it.
We are the ones asking questions of our data. We are the ones deciding which data to keep, and which to query. We, in other words, are Big Data’s biggest hurdle.
Maybe paying data scientists more money will solve this. Maybe training a new generation of data-savvy students will help, too.
But it’s also time that we took a deep breath and acknowledged that while data can and should influence more of our decision-making, it’s not an infallible god that will always steer us correctly. Because the data is ultimately all about us, and our own abilities to master it.
Lead image courtesy of Shutterstock