Data science is not research and development
Large organisations often have long-standing R&D teams, especially those developing high capital expenditure items or where long-term factory retooling will be involved. These functions work to a horizon of 10 or more years and therefore have both less pressure to demonstrate short-term outputs and a higher tolerance of failure. While this does not guarantee the success of any products that ultimately emerge – the Boeing 737 Max could be seen as an example of this – R&D in this context is seen as a sequence of small hops towards a large, single end goal.
Data science functions are created outside of this environment and are usually aligned to a specific business function, such as marketing, or to cross-enterprise functions, such as finance, data or business intelligence. This immediately puts them into a context where objectives are likely to be quarterly and annual, rather than decennial. The pace may be even quicker depending on the complexity of the projects being tackled.
Where data science and R&D do have commonality is in the potential to earn tax credits for what they develop (see below).
Keeping a rich mixture of targets
A desirable characteristic in data scientists is that of curiosity – seeking out new problems to solve and new techniques to apply. Having the opportunity to explore complex issues with longer term solutions alongside short-term quick wins is therefore important to the enthusiasm, commitment and retention of these practitioners.
For that reason, a careful eye should be kept on the blend of targets and measures being used for the data science function. If they are predominantly quick wins, such as optimising digital marketing activities, the work can become wearisome for data scientists. Equally, if the goals are too long-term, disillusion can set in because the final outcome is too far off.
Some organisations approach this by applying a points system to data science projects during the prioritisation phase. While data scientists are free to pick up projects that most appeal to them, they also need to demonstrate a balanced portfolio over a given period of time which combines short-term gains and longer-term outcomes. The function leader needs to monitor, reward or sanction practitioners as appropriate based on their points performance.
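The points system described above might be sketched as follows. This is a minimal illustration only: the point values, the "short"/"long" horizon labels and the 30–70% balance band are invented assumptions, not figures drawn from any organisation's scheme.

```python
from dataclasses import dataclass

# Hypothetical sketch of a points-based portfolio balance check.
# All point values and thresholds here are illustrative assumptions.

@dataclass
class Project:
    name: str
    horizon: str  # "short" for quick wins, "long" for strategic work
    points: int

def portfolio_balance(projects):
    """Return the share of points earned from short-term projects."""
    total = sum(p.points for p in projects) or 1
    short = sum(p.points for p in projects if p.horizon == "short")
    return short / total

portfolio = [
    Project("Email targeting uplift", "short", 2),
    Project("Churn model rebuild", "long", 5),
    Project("Landing-page test", "short", 2),
]

share = portfolio_balance(portfolio)
# Flag a practitioner whose mix drifts too far either way,
# e.g. outside an agreed 30-70% band of short-term work.
balanced = 0.3 <= share <= 0.7
```

A function leader could run such a check each quarter to decide where rewards or sanctions apply, as the text suggests.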
Three core types of data science projects
Specific uses for data science will depend on the organisation, its marketplace, product type and mix, level of data maturity and ability to operationalise the outputs. Typically, there are three types of activity that can be pursued by all types and scales of organisation. These are:
1. Cash-drivers: All companies need to drive revenue into the business, but this can be particularly urgent for any that are in turnaround (such as some car manufacturers or retailers) or where competition has intensified from digital-first rivals (such as insurance). Giving priority to projects that will generate cash in the short-term is vital in this environment.
2. Efficiency-drivers: Every process in an organisation, from HR to marketing, supply chain to warehousing, has the potential to be improved or optimised, leading to efficiency gains for the business. While these may be quantified in non-cash terms (eg, reduction in cycle times or staff hours involved), they ultimately impact on the bottom line.
3. Priority-drivers: Demand for data science – or urgency within the organisation for improvements – will always exceed capacity. An ongoing project for the function is therefore using its skills to determine the right order in which projects should be undertaken and/or providing the data-driven evidence for why projects are chosen or refused.
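The priority-driver activity above amounts to scoring and ranking candidate projects. A minimal sketch of one such scheme follows; the criteria (cash benefit, urgency, effort) and the weights are illustrative assumptions, since the text does not prescribe a particular scoring model.

```python
# Illustrative prioritisation sketch: rank candidate projects by a simple
# weighted score. Criteria and weights are assumptions for illustration.

def priority_score(cash_benefit, effort_weeks, urgency,
                   w_cash=0.5, w_urgency=0.3, w_effort=0.2):
    """Higher is better: reward cash benefit and urgency, penalise effort."""
    return w_cash * cash_benefit + w_urgency * urgency - w_effort * effort_weeks

candidates = {
    "Optimise ad spend":   priority_score(cash_benefit=8, effort_weeks=2, urgency=9),
    "Warehouse routing":   priority_score(cash_benefit=6, effort_weeks=10, urgency=4),
    "NPS driver analysis": priority_score(cash_benefit=3, effort_weeks=4, urgency=6),
}

# Order in which projects would be undertaken, highest score first
ranked = sorted(candidates, key=candidates.get, reverse=True)
```

Publishing the scores alongside the ranking also provides the data-driven evidence for why a project was chosen or refused, which the text identifies as part of the function's role.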
Setting the right metrics
Metrics do not just happen – they need to be real, clearly-defined, ideally repeatable, visible in data sets, agreed by key stakeholders and finance, and preferably associated with rewards or sanctions if they are not met. Among the most advanced users of data science, such as one national logistics business, an accountancy-trained manager – called a benefits realisation manager – is embedded in the function to enable it to demonstrate financial deliverables to a standard acceptable to finance. While this may not be an option for all, it does demonstrate that this degree of specificity and accuracy is possible.
Just as individual data scientists should maintain a balanced portfolio of projects, so should the function as a whole have a blend of metrics with both short-term and long-term outcomes in view. As one practitioner noted, some projects will return 50 times their cost, while others may be net negative (yet still desirable for the organisation).
It is also the case that capacity is limited compared to demand. By rigorously exploring metrics, the function can start to show the opportunity value it has discovered against the value delivered and thereby track its impact on the organisation. When attaching metrics to projects, attention should therefore be paid to deliverability against these potential options:
Hard cash measures (project-based): Every project picked up by the data science function should ideally have a specific cash benefit attached. While this may be relatively easy to identify, such as a reduction in marketing spend through optimisation of targeting or an increase in sales conversion through website improvements, the harder aspect is getting agreement from the business stakeholder to acknowledge the role of data science in delivering this improvement.
Hard cash measures (company-level): As noted above in relation to projects, this may be a two-step process, ie, converting a reduction in headcount into pounds saved for the business. Where the impact is at a corporate level, it can be harder to draw a direct line between the inputs from data science and the eventual cash benefits, so reaching agreement at the outset on how this will be accounted for is essential.
Soft measures: Some of the goals for data science projects will be softer in nature with no direct financial benefit, such as looking to improve employee satisfaction or customer net promoter score as a result of new, improved processes. These clearly have multiple inputs and rely on processes being executed according to the recommendations from the data science function, but should be accounted for proportionally. In regulated industries, such as financial services, organisations specifically have to demonstrate soft benefits for customers, such as through Treating Customers Fairly rules or showing transactional net promoter scores.
Recognised, attributable measures: Given the complexity of what data science may recommend and the multiple elements involved in making it operational, drawing a direct line to financial or even soft measures may be hard or impossible. One way to step around this obstacle is to agree with stakeholders from the outset a percentage of any incremental gains or cost-savings that will be attributed to the input from this function. Even though it may not seem scientific, it is better to accept an estimated figure than to fail to agree and miss out on the eventual benefit being seen as stemming from data science.
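The agreed-attribution approach above reduces to simple arithmetic once the percentage is fixed. A minimal sketch, in which the 20% attribution share and the revenue figures are invented for illustration:

```python
# Sketch of the agreed-attribution approach: a fixed, pre-agreed share of
# any incremental gain is credited to data science. The 20% share and the
# figures below are illustrative assumptions, not standard values.

def attributed_benefit(baseline, observed, attribution_share=0.20):
    """Credit an agreed share of the incremental gain to data science."""
    incremental = observed - baseline
    return max(incremental, 0) * attribution_share

# Example: conversion revenue rises from £1.0m to £1.3m after the project,
# so £300k incremental gain x 20% = £60k attributed to data science.
credit = attributed_benefit(baseline=1_000_000, observed=1_300_000)
```

The key point is that the share is agreed with stakeholders before the project starts, so the eventual figure cannot be disputed after the fact.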
Additional financial measures
Data science is expensive to maintain and run, not least because of the high salaries commanded by its practitioners. As a result, organisations will be looking for every possible positive financial return.
One option is to leverage the R&D tax credit offered by HM Revenue and Customs. The scheme was specifically designed for conventional research leading to the release of innovative new products or services. This can make it challenging for data science to prove it has delivered something similar, especially if its output is absorbed into a process, but it can be done – the national logistics business has been able to gain tax credits for multiple projects as a result of the work done by its benefits realisation manager.
Committing this level of resource may not be possible for all; however, it is worth understanding the possibilities, especially if the data science function is of a considerable size. Details of the scheme can be found on the HM Revenue and Customs website.
Conclusion
The complex nature of data science and the dependency on downstream delivery through operations to realise its outputs makes establishing metrics a significant challenge. Where analytics is more focused on building simple models that can be put to work in clearly-defined environments, thereby making benefits recognition more straightforward, data science is often focused on more strategic, multi-stakeholder processes.
Difficult as this may be, it is nonetheless essential, or data science risks being exposed as a cost centre which does not appear to be contributing to the bottom line. Few functions are able to survive that type of scrutiny for long without facing downward pressure. Metrics may be hard, but they are easier than cuts.