Beyond Excel: What analysts — not data scientists — need

Author

Alteryx and Dean Stoecker

July 22, 2016

This sponsored post is produced by Alteryx.


Harvard Business School (HBS) is filled with nearly identical rooms, with one exception: Aldrich 108. In that room there is an odd plaque on the wall. Why? Aldrich 108 was the birthplace of VisiCalc. And HBS knows that the first spreadsheet, which came out of that room, changed business forever.

The spreadsheet enabled bankers, consultants, and accountants to focus on the logical structure of analysis instead of calculation. With the addition of charts, Visual Basic support, and new functions, Excel (VisiCalc’s spiritual successor) only increased its role in a businessperson’s life.

But the era of Excel’s singular prominence is coming to an end. Data volumes are massive. Sources of data maintain very different structures. And the data itself is constantly changing. Spreadsheets maintain enormous value, but they can’t handle this data deluge alone.

Most conversations around “big data” problems concentrate on the technologies that allow us to capture and search web-scale data. Countless articles in tech blogs and scores of lectures at conferences have touched on things like the Hadoop stack or the potential of Spark. People assume taking advantage of our data is simple if we can only overcome technical limitations.

It’s not.

The next wave of the analytics revolution will require more than just platform technology. The real challenge requires completing the stack and helping business users replicate what they did in Excel sheets. We need to go beyond Excel.

Breaking the spreadsheet

For most of the computing era, Excel has been our de facto analytics tool. Teradata managed central data storage. Traditional business intelligence companies like MicroStrategy and Business Objects maintained the definitions of metrics. And Excel enabled a very long last mile of tasks in the world of analytics.

Obviously, Teradata can’t manage web-scale data. And MS Charts aren’t enough to visualize millions of rows of information. These failings have given rise to innovators in the analytics ecosystem like Cloudera, Tableau, and Looker. But even as we seemingly standardize on the Hadoop Distributed File System (HDFS) for web-scale data storage, the stack is far from complete.

Today, what’s missing is in the middle. We’ve got the infrastructure down, and we can visualize the output. But we haven’t touched the unsexy but necessary stuff in the middle: things like data transformation, data governance, and data lineage. Instead, companies invest tens of millions of dollars in their data infrastructure, and then trust their analytic workloads solely to the kludgy (and fragile) Excel sheets created by their analysts. When managers forecast performance or budget for the next year, they’re left hoping that some analyst down the aisle didn’t forget to fix a broken formula or load the most recent raw data.

This is where innovation is so badly needed.

What real analysts need

Real analysts, not formally trained data scientists, need simple tools that can give them access to the insights locked inside a business’s servers.

When data was small enough to be passed around easily on a USB key, Excel was a fine general-purpose tool. The good news is that open APIs are making it possible for solutions that already exist to take the place of one tool “to rule them all.” Companies like Alteryx, Looker, Zoomdata, Interana, and more can each tackle specific problems for end users. Analysts just need to choose the right tools to get the job done.

And while there are a lot of tools out there, we’re fairly certain that the tools that emerge will address three needs of real-world analysts:

  1. They’ll blend
  2. They’ll abstract
  3. They’ll automate

1. Blending (VLOOKUPs at scale)

It used to be that all your data existed in one place. The world was simple then. Now, new sources of data keep being added to the enterprise ecosystem. Before an analyst generates any sort of insight in Excel, she needs to synthesize all that data. What results is an unwieldy set of VLOOKUP and HLOOKUP formulas that consume all the memory in an analyst’s computer.

While Excel can do some of this data preparation (albeit poorly), analysts need to be able to blend data on the fly. Alteryx, for example, enables this process by allowing users to outline the relationships between vastly different data sets and manipulate their data into usable formats within an easy-to-follow workflow. Other companies assist in this process by simplifying data extraction through SQL interfaces. Regardless of how you approach the problem, the first step in putting this web-scale data to work is blending it together. The days of one data source holding all the pieces of an analyst’s puzzle are over.
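To make the idea of blending concrete, here is a minimal sketch in plain Python of joining two differently structured sources on a shared key, which is roughly what a column of VLOOKUP formulas does by hand. The record layouts, field names, and the `blend` helper are all hypothetical and chosen only for illustration; this is not how Alteryx or any other product mentioned here implements blending.

```python
# Hypothetical records from two differently structured sources.
crm_customers = [
    {"customer_id": "C1", "name": "Acme", "region": "West"},
    {"customer_id": "C2", "name": "Globex", "region": "East"},
]
web_orders = [
    {"order_id": 101, "cust": "C1", "amount": 250.0},
    {"order_id": 102, "cust": "C2", "amount": 75.5},
    {"order_id": 103, "cust": "C9", "amount": 40.0},  # no matching customer
]


def blend(orders, customers):
    """Left-join orders to customers on the customer key (illustrative helper)."""
    # Index the lookup table once so each order is matched
    # with a single dictionary lookup.
    by_id = {c["customer_id"]: c for c in customers}
    blended = []
    for order in orders:
        match = by_id.get(order["cust"])
        blended.append({
            "order_id": order["order_id"],
            "amount": order["amount"],
            # Keep unmatched orders, with empty customer fields.
            "name": match["name"] if match else None,
            "region": match["region"] if match else None,
        })
    return blended


blended_rows = blend(web_orders, crm_customers)
```

Orders with no matching customer are kept with empty fields rather than dropped, so the analyst can see where the sources disagree.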

2. Abstracting (MS Charts at scale)

Making sense of big data sets is hard. Showing rows and columns is virtually valueless when dealing with millions or billions of data points. But helping analysts get a sense of what’s hidden inside a data source is critical.

This problem wasn’t pronounced for early spreadsheet users. Visual tools like Qlik and Tableau were some of the first solutions to this problem of making sense of data. Today, businesses use algorithms and conversational AI to help take that abstraction even further. BeyondCore, for instance, suggests observations from the data directly. Others are adding natural language processing to their BI capabilities, enabling analysts to just ask the software what the relationships between two data types are within their data sets.

It’s not surprising that humans can’t make sense of web-scale data. But innovators and CIOs alike need to invest in helping analysts make the best sense of all the data they can. Otherwise, all the money we spend on storage and infrastructure is just going to waste.
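One simple form of the abstraction described in this section is collapsing many rows into a handful of per-group figures an analyst can actually read. The sketch below uses only the Python standard library; the `summarize` helper, the field names, and the sample rows are assumptions made for illustration, and real tools go much further than this.

```python
from collections import defaultdict
from statistics import mean


def summarize(rows, group_key, value_key):
    """Reduce raw rows to count, total, mean, and max per group (illustrative helper)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {
        group: {
            "count": len(values),
            "total": sum(values),
            "mean": mean(values),
            "max": max(values),
        }
        for group, values in groups.items()
    }


# Hypothetical sample; in practice this could be millions of rows.
sales = [
    {"region": "West", "amount": 100.0},
    {"region": "West", "amount": 300.0},
    {"region": "East", "amount": 50.0},
]
summary = summarize(sales, "region", "amount")
```

The output has one entry per region regardless of how many rows went in, which is the point: the analyst looks at the summary, not the raw table.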

3. Automating (Macros at scale)

Finally, automation is critical in the new world of analytics. It might have been easy to engage in operational planning once a quarter inside of Excel. After running a calculation, an analyst would hit a button in a macro-enabled spreadsheet to send a report to relevant stakeholders. But forecasts in Excel are next to impossible to build if you’re trying to account for pricing changes in real time.

And the best companies are doing just that.

Unless an analyst can take her insight and use it to enrich a business process or existing applications, she’s going to be a step behind her peers. Companies like Anaplan are filling in some of the gaps that macro-laden workbooks left behind. By building the data manipulation and subsequent business workflows right into the app itself, these companies are helping businesses take action in real time. Other companies’ native integrations with Slack or SharePoint automate the act of spreading the word. Still others can push algorithms directly into production, enabling a business analyst to improve a product without ever involving IT or developer operations.

Where the spreadsheet used VBA scripts and basic macros to work around automation, truly acting at today’s speed requires far more thoughtful automation.

For years, people have paid attention to the bottom and top of the analytics stack: the Hadoop architectures that caught all our web-scale data and the sexy applications that spit out pretty visuals. But the real impact is going to be made when we start tackling the problems in the middle.

Once we figure out how to help everyday analysts blend, abstract, and automate their analytic workloads, we’ll see an enormous change in business. Until then, we’ll stay stuck waiting on data science teams to catch up with all the unnecessary requests people are making of them, when a more self-service model for today’s analyst would benefit every business.

Dean Stoecker is CEO at Alteryx. 

Max Wessel is Vice President at Sapphire Ventures.



This article was written by Alteryx and Dean Stoecker from VentureBeat and was legally licensed through the NewsCred publisher network.
