Data Governance can be defined as the set of policies, processes, and technologies needed to manage data, maintain its integrity, control access to it and, increasingly, ensure good data quality.
Traditionally, data governance was perceived as a matter of security and regulatory compliance. This remains a fundamental aspect, above all in specific industries such as finance or pharmaceuticals. However, data governance should also cover the ability to extract value from the main asset available to companies: information. In short, data governance is the necessary foundation for accelerating your data-driven transformation.
Data Governance: technology, organization, involvement
The issue of data governance, as you may have already guessed, is very complex. Although there is a lot of talk about the opportunities of Analytics and Artificial Intelligence, companies, especially smaller ones, struggle to obtain well-structured and usable databases. Yet these databases are fundamental and must be made available to the people who are expected to extract relevant insights from them.
Not surprisingly, commonly cited international surveys suggest that Data Scientists still spend most of their time (about 80%) on data cleaning and preparation, activities that should be completed before the actual analysis begins.
To achieve good Data Governance, it is therefore necessary to find the right synergies between technological, organizational and change management issues, involving all the stakeholders in the company.
Technology obviously plays a crucial role: it is not possible to manage large amounts of data with traditional tools such as spreadsheets. The evolution towards Big Data involves renewing legacy architectures such as low-performing data warehouses or classical ETL (Extract, Transform, Load) systems. However, this technological evolution must be accompanied by constant attention to data governance and quality. Not surprisingly, more and more Analytics tools address data governance in their commercial proposition. In this context, choosing Cloud Computing can be a winning choice: with Cloud services, you gain access to state-of-the-art features and can quickly overhaul your storage and data processing systems.
From an organizational point of view, more mature companies are formally defining roles dedicated to data governance. Internationally, the Chief Data Officer is becoming established: an executive role responsible for managing data at a corporate level. For now, this role is not very widespread in Italy. At a more operational level, it may be useful to identify data stewards, that is, people responsible for the reliability and quality of a specific data source.
Finally, Data Governance is also a matter of communication and engagement of all stakeholders. Everyone in the company is responsible for the actions they perform on the data, and this is even more true as access to data and analysis becomes ever wider.
Why is data quality becoming even more crucial?
As part of Data Governance goals, as mentioned, it is becoming increasingly important to ensure data quality. Quality means compliance of information with expectations, management (and, if possible, elimination) of missing data, correctness of the data format, management of duplicates and so on.
In a Big Data world, guaranteeing these aspects is a real challenge, and they are also difficult to monitor and evaluate over time. However, companies are now becoming aware of the need to invest time and resources in data governance issues for two main reasons:
- Single Source of Truth: organizations often struggle with the need to build trust in data. If there is trust, people will start making data-driven decisions.
- Towards Augmented Analytics: only data quality can guarantee the effectiveness of, and confidence in, forecasting models (Machine Learning models in particular) that are complex and difficult to interpret.
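To make the quality dimensions mentioned above more concrete (missing data, correctness of the data format, duplicates), here is a minimal sketch in Python. The sample records, the field names and the `quality_report` function are all hypothetical and only serve as an illustration; a real project would probably rely on a dedicated data quality tool.

```python
import re
from collections import Counter

# Hypothetical sample records; field names are illustrative assumptions.
records = [
    {"customer_id": "C001", "email": "anna@example.com", "signup_date": "2023-01-15"},
    {"customer_id": "C002", "email": None, "signup_date": "2023-02-01"},
    {"customer_id": "C003", "email": "marco@example.com", "signup_date": "15/03/2023"},
    {"customer_id": "C001", "email": "anna@example.com", "signup_date": "2023-01-15"},
]

# Expected date format (assumption): ISO style, YYYY-MM-DD.
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def quality_report(rows, key, date_field):
    """Count missing values, badly formatted dates and duplicated keys."""
    # Missing data: fields that are None or empty strings.
    missing = Counter()
    for row in rows:
        for field, value in row.items():
            if value is None or value == "":
                missing[field] += 1

    # Format correctness: non-empty dates that do not match the expected pattern.
    bad_dates = sum(
        1 for row in rows
        if row.get(date_field) and not ISO_DATE.match(row[date_field])
    )

    # Duplicates: every occurrence of a key beyond the first one.
    key_counts = Counter(row[key] for row in rows)
    duplicates = sum(count - 1 for count in key_counts.values() if count > 1)

    return {
        "rows": len(rows),
        "missing": dict(missing),
        "bad_date_format": bad_dates,
        "duplicate_keys": duplicates,
    }


report = quality_report(records, key="customer_id", date_field="signup_date")
print(report)
```

Running a report like this on a schedule, and keeping the results, is one simple way to monitor quality over time rather than checking it only once.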
Four steps to achieve data quality
To conclude, let us put ourselves in the shoes of a company that decides to invest in Analytics and outline four basic steps to achieve well-governed, usable and good quality data:
- Invest in technological infrastructure: if you start from a traditional context, investing in tools such as next-generation data warehouses or data lakes is unavoidable.
- Adopt Data Governance by design: governance is not a function that can be added in a later phase. Right from the migration of data to new systems, it is necessary to restructure the company's information assets.
- Create structured documentation: it must be possible to keep track of any changes to the data, the use of new sources and the data that are used, for example, in corporate reporting.
- Make the business feel involved: business users must be well aware of their role as guarantors of good quality data and of their responsibility for the information within their remit.
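As an illustration of the structured documentation step above, here is a minimal sketch of an append-only change log in Python. The `ChangeEntry` and `ChangeLog` names, the dataset names and the e-mail addresses are hypothetical; in practice this role is usually covered by a data catalog or a similar tool.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class ChangeEntry:
    """One documented change to a dataset (hypothetical structure)."""
    dataset: str
    change: str
    author: str
    used_in: tuple = ()  # e.g. corporate reports that depend on this dataset
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class ChangeLog:
    """Append-only log: entries are added, never edited or removed."""

    def __init__(self):
        self._entries = []

    def record(self, dataset, change, author, used_in=()):
        entry = ChangeEntry(dataset, change, author, tuple(used_in))
        self._entries.append(entry)
        return entry

    def history(self, dataset):
        # All recorded changes for one dataset, in insertion order.
        return [e for e in self._entries if e.dataset == dataset]

    def to_json(self):
        # Export the whole log, e.g. to store it next to the data.
        return json.dumps([asdict(e) for e in self._entries], indent=2)


log = ChangeLog()
log.record("sales_orders", "Added column 'channel'",
           "data.steward@example.com", used_in=["Monthly revenue report"])
log.record("sales_orders", "New source: e-commerce platform export",
           "data.steward@example.com")
log.record("customers", "Deduplicated on customer_id", "analyst@example.com")

for entry in log.history("sales_orders"):
    print(entry.timestamp, entry.change)
```

Recording the author of each change also makes the responsibility of data stewards and business users visible, which supports the last step of the list.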