Thriving On Data #4 – Data Apart Together
What if there were no single source of truth in corporate data after all? Many enterprises are now adapting to a ‘federated’ business reality in which many different sources of data exist, along with even more views on how to use them. Dealing with this, and then thriving on it, requires the smart use of the next generation of Master Data Management and Business Process Management tools, combined with all the goodness that comes from the wave of Big Data technologies. But much more than that, it is a matter of corporate governance, in which collaboration to bring the right data together is always key. Data that is apart together thus becomes part of a powerful Digital Platform that lets the enterprise take any current or future direction, enabled by data.
As Capgemini’s research with MIT Sloan, The Digital Advantage, found, companies that become digital outperform their peers. Most tellingly, however, it was the companies that took the conservative route to digitization that followed the most managed path towards becoming a digital enterprise. The challenge for any organization looking to become digital is how to leverage all of its data and how to enable the business to combine that data. The fashionista approach is to look towards technology silos for point solutions; the conservative approach is to look towards governance and a consistent way for every part of the business to combine information according to its individual needs.
This view of information is what underpins the Business Data Lake, which Capgemini co-innovated with Pivotal. Data Apart Together is not simply about how you combine data; it’s about how you enable different business units to combine data in different ways. This is where governance, in particular Master Data Management (MDM) and Reference Data Management (RDM), delivers huge benefits to businesses. The role of governance here is not to constrain the business by imposing a single view on it, but to concentrate on how the business can collaborate around information. This view of governance is essential when thinking about how business users actually leverage information.
For many years the approach has been to focus on the data schema and to try to create a single consistent view for all parts of a company. The problem is that this doesn’t reflect how people actually use information in their jobs. People look to create personal views that reflect the individual challenges that they and their teams face. Thus the marketing lead for an airline puts customers at the center of their view, while the maintenance department puts aircraft at the center. Bringing disparate data sources together for the business is therefore about enabling people to create the right insight for their problems or, to put it another way, it’s about insight at the point of action.
Governance therefore needs to focus not so much on schemas, or even on data quality, but on how data sets can be combined, and therefore on the identifiers that can be used to link those data sets consistently. Data quality thus becomes a side effect of governance rather than its goal. This approach to governance is essential when looking at Big Data solutions. It is ridiculous to think that you can possibly create a single schema that includes all of the internal and external data that a company uses: information from Facebook and other social media feeds is ever changing, information available from open government sources is continually being added to, and unstructured information sets such as email and documents defy any sort of traditional approach.
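The idea of governing identifiers rather than schemas can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how the Business Data Lake is implemented: the record layouts, the `xref` cross-reference table, the `PARTY-n` shared identifiers and the helper names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records from two systems; each keeps its own schema and local keys.
crm_customers = [
    {"crm_id": "C-001", "name": "A. Smith", "segment": "frequent flyer"},
    {"crm_id": "C-002", "name": "B. Jones", "segment": "occasional"},
]
booking_passengers = [
    {"pax_ref": "P-9001", "flight": "XX123", "seat": "12A"},
    {"pax_ref": "P-9002", "flight": "XX456", "seat": "3C"},
    {"pax_ref": "P-9003", "flight": "XX789", "seat": "14F"},
    {"pax_ref": "P-9004", "flight": "XX456", "seat": "7B"},  # not yet cross-referenced
]

# The governed artifact (assumed): (source system, local id) -> shared identifier.
xref = {
    ("crm", "C-001"): "PARTY-1",
    ("crm", "C-002"): "PARTY-2",
    ("booking", "P-9001"): "PARTY-1",
    ("booking", "P-9003"): "PARTY-1",
    ("booking", "P-9002"): "PARTY-2",
}


def link(source, records, key_field):
    """Group records by shared identifier; return unmapped records separately."""
    linked = defaultdict(list)
    unmapped = []
    for record in records:
        shared_id = xref.get((source, record[key_field]))
        if shared_id is None:
            unmapped.append(record)
        else:
            linked[shared_id].append(record)
    return linked, unmapped


def customer_view():
    """A local, customer-centered view built without changing either schema."""
    customers, _ = link("crm", crm_customers, "crm_id")
    bookings, unmapped = link("booking", booking_passengers, "pax_ref")
    view = {}
    for shared_id, recs in customers.items():
        view[shared_id] = {
            "name": recs[0]["name"],
            "flights": sorted(b["flight"] for b in bookings.get(shared_id, [])),
        }
    return view, unmapped


view, unmapped = customer_view()
```

Neither source schema is altered; only the cross-reference is shared. The `unmapped` list is where gaps in linkage surface, which hints at where further governance effort might be worth spending.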
In the Business Data Lake, therefore, we have concentrated on governance from a business perspective rather than from a technical IT schema perspective. This approach focuses on enabling collaboration: it allows the business to combine the various data sets within the lake to create its own local views, and from there to see where more governance and data quality are required, rather than creating a central plan that turns out to be wrong. This focus on identification and cross-referencing means that both transactional systems and post-transactional analytics can leverage the full range of information in an organization, in a conservative and managed way that aligns with the business model and value. It thus delivers on the promise of digitization, and delivers its benefits earlier than technology-centric approaches do.
Data Apart Together is a key trend for businesses, and a key trend for IT in recognizing how the market has changed. It’s about creating the platform that helps the business bring fragmented data together for its local purposes, not about IT trying to impose a single view of information that constrains the agility of the business.
Contribution by Steve Jones
Part of Capgemini’s TechnoVision 2015 update series. See the overview here.