
Improving BI data quality

Data quality in business intelligence is not often recognised as an issue in itself, yet its impact on the reliability of reports and KPIs can be significant. This guidance note offers an approach for improving your CARBON score for BI by one or two points.

Improving your BI data quality from levels 2/3 to level 4

Business intelligence has evolved into a hands-on, everyday tool for decision making, rather than a specialist activity whose outputs are created only by an elite team of practitioners and delivered to a privileged group of executives. Data on business-critical actions – from customers to sales, manufacturing to logistics – has become highly accessible and increasingly distributed through self-service BI tools. The term “democratisation of data” describes this expansion of BI’s footprint within the organisation.

The benefits of releasing data and provisioning its users with self-service BI analytics can be widespread. Typical upsides include:

Better customer metrics
Reduced time-to-market
Enhanced profitability
Clearer employee metrics
Deeper insight into revenues/profitability
Faster cycle times on key business decisions (eg, pricing, ranging, assortment)
Live logistics and stock indicators

According to research carried out by Forbes among 437 global BI practitioners in 2016, 81% reported positive benefits from BI, with 45% saying the benefits are very significant and 36% significant. UK companies are below average in gaining these benefits, however, with 69% reporting positive impacts from their BI.

The issue

Whether it is the Monday morning huddle, Friday afternoon wrap-up or any number of business meetings in between, a common problem plagues all decision makers who rely on business intelligence – reconciling conflicting figures. Time wasted on discussing and deciding which version is the truth can delay and even defer critical decisions, which may lead to missed opportunities or increased costs.

In the Business Application Research Center (BARC) annual research study, BI Trend Monitor 2018, master data and data quality management emerged as the number one trend among over 2,700 BI practitioners it surveyed. Data quality was scored 6.9 out of ten for importance, putting it in top spot. Self-service BI was scored 6.4. This indicates that organisations have realised the need to take one step back in the process and address underlying data issues if their moves to democratise data are to deliver value.

According to BARC’s CEO, Carsten Bange: “A properly-resourced, well-organised and continuous program of data integration and data quality is a must for any data-driven organisation. If you don’t have trust in the underlying data, your company’s reporting systems, and ultimately decision-making, are effectively based on quicksand.”

This problem was also recognised in the Forbes research, in which only 41% of UK organisations said they were taking full advantage of the business opportunities afforded by BI, compared to a global average of 48%. Issues discovered in the survey included:

Only 46% of organisations have empowered lines of business with unencumbered data access
56% have data silos that are not sharing data
53% say BI is delivering inconsistent or unreliable conclusions, or multiple versions of the truth.

Assessing the scale of the problem

An issue with the data visualisation tools used to deliver widespread access to business intelligence is that they do not resolve the underlying data quality issues and often do not even make them visible. Best practice is to ensure data governance is in place before data is provisioned to these tools (this is the subject of a separate whitepaper). It is also best practice to give ownership of key performance indicators to a central BI function which can assess the validity of these KPIs, identify the relevant data sources, apply remediation to that data where necessary and support output in consistent, agreed formats (this is also the subject of a separate whitepaper).

Before data quality within business intelligence can be addressed, the scale of the problem first needs to be identified. In addition, the customers of BI need to be helped to understand that there is an underlying data quality issue and also that they have a role in resolving it.

One direct method is the “Friday afternoon measurement” proposed by Thomas C. Redman. This involves running a workshop which brings together a group of business executives who rely on BI and asks them to provide a set of BI data. Redman’s methodology takes the following steps:

Assemble a data set – the group is asked to provide a set of sample data, typically 100 records, which are representative of the data they rely on when building reports. Each of these records should contain at least ten variables.
Involve data experts – the workshop will need to include individuals who are familiar with the records in question and are comfortable “eyeballing” them for errors.
Highlight data errors – mistakes or omissions in the sample data should be highlighted. Colour coding or grading can be used to identify the severity of the error.
Capture the error rate – by counting the number of errors and the degree of severity, the workshop will show what level of data quality problems exist, extrapolating from the sample data.
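
The counting in the final step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Redman’s own tooling: it assumes each reviewed record has been reduced to a list of severity grades for the errors highlighted on it, and the names ReviewedRecord, SEVERITY_GRADES and summarise are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical severity grades; a workshop would agree its own scheme.
SEVERITY_GRADES = ("minor", "major", "critical")


@dataclass
class ReviewedRecord:
    record_id: str
    # One entry per error the experts highlighted, e.g. ["minor", "critical"].
    errors: list[str] = field(default_factory=list)


def summarise(records: list[ReviewedRecord]) -> dict:
    """Count error-free records and tally highlighted errors by severity."""
    total = len(records)
    if total == 0:
        raise ValueError("no records reviewed")
    clean = sum(1 for r in records if not r.errors)
    by_severity = {grade: 0 for grade in SEVERITY_GRADES}
    for r in records:
        for grade in r.errors:
            if grade not in by_severity:
                raise ValueError(f"unknown severity grade: {grade!r}")
            by_severity[grade] += 1
    return {
        "records": total,
        "clean_records": clean,
        "clean_rate": clean / total,  # share of records with no errors
        "errors_by_severity": by_severity,
    }
```

Calling summarise on the 100 reviewed records gives the share that are error-free, which can be compared directly with the accuracy level the participants say they expect.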

In exercises carried out by Redman and colleagues, participants were asked to state what level of data quality they deem acceptable. In most cases, the demand was for 90+% accuracy and completeness. Yet across 75 workshops reported on, only 3% of the sample data sets used were at this level.

The advantage of this method is that it uses live data which is “owned” and relied on by business executives. Because they are experienced users of that data, they will find the identified error rate easy to accept and its implications clear. Redman also offers a simple cost calculation for the impact of poor BI data quality, which reinforces the point.

The disadvantage of the method is that it relies on subjective assessment of both data accuracy and completeness. It also does not allow for “unknown unknowns”, for example, where a business name is incorrect but not recognised as such by the participants (especially if it is very familiar to them).

Fixing the problem of BI data quality

Data quality broadly is an “evergreen” activity because data itself is constantly in motion – individuals marry, move, die, change their names; businesses merge, acquire, rename themselves, change their trading addresses; products change their specification; accounting practices are adjusted, etc. As a result, BI data quality is a subset of organisational data quality and master data management. But it is one capable of direct, specific remediation. The following steps will serve to improve the underlying quality of the BI which lines of business rely on.

Centralise master data management – allowing each function to apply its own definition of key data variables (customer, product, sale, etc) is what leads to multiple versions of the truth. Establishing a central MDM repository avoids this issue, provided the standards it creates are federated into all core operating systems.
Centralise reporting standards – to support self-service analytics and democratisation of data, common standards need to be agreed for how BI reports are generated and presented. This should range from standards for date ranges or banding, for example, right down to specifications for visualisations (colours, chart types). Ideally, these should be embedded into BI tools to avoid conflicting outputs.
Centralise KPI setting – the core goal of business intelligence is to provide consistent insight into the performance of functions. Allowing each one to “mark their own homework” by setting their own key performance indicators creates the potential for conflict and confusion. Instead, a central BI team should evaluate all metrics currently in use, identify those of genuine value, eliminate the remainder, propagate surviving KPIs to their appropriate audience and control access to data to prevent “rogue” KPIs being created.
Identify data owners – with agreed data masters and reporting standards, ownership of data that enters operating systems needs to be assigned to specific individuals as part of their role, with direct metrics for how close to the target data quality they are.
Create a process for new BI and data – given the dynamism of the business environment, new reports and KPIs are in constant demand. A process needs to be introduced to evaluate proposed metrics, ensure they conform to agreed standards, provision data when appropriate, and monitor for ongoing conformity.
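
The data ownership step calls for a direct metric showing how close each owner is to the target data quality. One way to express that is sketched below, assuming completeness is the chosen measure. The function names, field names and target values are all hypothetical and would be replaced by whatever the organisation agrees.

```python
def completeness(rows: list[dict], field_name: str) -> float:
    """Share of rows in which field_name is present and non-empty."""
    if not rows:
        raise ValueError("no rows to measure")
    filled = sum(1 for row in rows if row.get(field_name) not in (None, ""))
    return filled / len(rows)


def gap_to_target(rows: list[dict], owned_fields: dict[str, float]) -> dict[str, float]:
    """For each owned field, target minus measured completeness.

    A positive value is a shortfall against the target; zero or negative
    means the target is met.
    """
    return {
        name: round(target - completeness(rows, name), 4)
        for name, target in owned_fields.items()
    }


# Illustrative customer rows; one postcode is missing.
customers = [
    {"email": "a@example.com", "postcode": "AB1 2CD"},
    {"email": "b@example.com", "postcode": ""},
    {"email": "c@example.com", "postcode": "EF3 4GH"},
    {"email": "d@example.com", "postcode": "IJ5 6KL"},
]

# Hypothetical targets assigned to the owner of these two fields.
gaps = gap_to_target(customers, {"email": 0.95, "postcode": 0.90})
```

Here the postcode field is 75% complete against a 90% target, so it shows a shortfall, while the email field exceeds its target. Reporting the gap rather than the raw rate keeps attention on the distance to the agreed standard.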

Conclusion

Democratisation of data and self-service BI and analytics have created the sense that everybody in the organisation can have access to the information they need. What have not followed are the associated obligations – the data being accessed needs to be properly governed and owned, otherwise the result will be even more versions of the truth being created.

In the maturing of data and analytics within an organisation, the centralisation of business intelligence is often a key first step. It provides early and visible benefits that not only justify the investment in BI, but can also form the basis for expanding data and analytics even further. These include: