1. Context
Public generative AI platforms, such as ChatGPT and Bard, have captured attention and engagement unlike any other digital platform since the early days of search engines. From the C-suite to the frontline, few individuals have not trialled these new solutions in some form.
All of this has been done for free, despite the enormous development costs involved for the providers. And it has happened at tremendous pace – a Google Trends report on searches for “ChatGPT” shows linear growth from an index of 1 on 5th December 2022 to an index of 100 on 20th May 2023, making it the fastest-growing consumer app ever.
Referencing Steve Jobs’s description of computers as “bicycles for the mind”, Microsoft chairman and CEO Satya Nadella told the Microsoft Build conference on 23rd May that, “in November 2022, ChatGPT moved us from the bicycle to the steam engine.” The same event heard that generative AI will be built into every app in Windows and across the entire Microsoft stack.
For data functions and data offices, this growth of interest has had conflicting effects. On the one hand, there is an assumption that they hold the key to unlocking this new tool’s benefits because of its inherent roots in data and data science. On the other hand, much of the experimentation and application is happening in functions outside of the data domain, such as marketing and customer experience.
This whitepaper explores the fundamental issues that arise from generative AI becoming part of the new digital suite within a business. Its focus is specifically on large language models (LLMs), which have been the primary engines of this spike in interest, and on the role these might play in the work of the data function.
2. Evidence base
The main inputs to this whitepaper come from a set of three roundtables held with DataIQ members during May 2023. At each of these, seven to eight senior data leaders discussed the key use cases, challenges and opportunities arising from the adoption of generative AI by their organisations.
In addition, members of the DataIQ advisory board contributed their views and experience during quarterly meetings in May 2023.
3. Assessment method
For each of the key dimensions considered in the adoption of generative AI across this whitepaper, an assessment has been made of three critical factors and a score decided along a sliding scale from low to high:
Benefits – even free services have to pay their way, since working time and supporting resources are involved. Based on the early-stage use cases discussed, a low score is applied if the outputs are likely to be very limited or difficult to scale; moderate where positive impacts, such as efficiencies or incremental gains, can be expected but only within a limited domain; and high if the scope for deploying a generative AI model into the operational environment is significant.
Complexity – a key driver of the current interest in generative AI has been the ease of use of the interface at the front end. At this level, a score of low is applied since there are few barriers to experimentation. If a model is deployed using an API link, the score rises to moderate as it brings corporate IT into play. A high score is given if the application will require significant input from data and data science to tune the model and engineer data in and out.
Risk – much is currently unknown about generative AI providers that are in start-up or scale-up mode, whereas the major providers (Microsoft, Google) have highly familiar infrastructures and globally aligned regulatory practices. A low score is applied where simple experiments are run using such platforms, rising to moderate when specific data is being input or alternative providers are in play. A high score is given if there is significant exposure to the business through its use of data, infrastructure or ungoverned use across generative AI platforms.
4. Free, PAYG or API?
Free access is clearly part of the appeal of leading generative AI platforms, allowing any online user to experiment with the technology, practising the skills of prompt engineering and response tuning.
Despite the apparently unlimited usage on offer, free access is in practice constrained through tokenisation (caps on the volume of tokens a user can process), although this can be worked around through the use of multiple devices and email addresses. Waiting lists have been a feature of early-stage generative AI platforms as they test market demand. Both constrained access and queuing are likely to feature in the scale-up stage of these platforms, given the overhead of supporting high volumes of simultaneous online users.
Early maturity in organisational use is likely to push towards the pay-as-you-go model, since this both eases the constraints outlined above and introduces a small element of governance – users are more cautious with services they pay for, whereas free services can encourage a laissez-faire attitude.
Generative AI that passes a proof of concept or becomes central to a process will likely migrate into an API access model or one that is fully deployed within the organisational technology stack. This significantly improves governance as any API call must be controlled through core IT procedures to pass in and out of the corporate firewall. Either approach should be the target end state for any generative AI model likely to be used with a high frequency across company-specific inputs, even where data and IP risks are relatively low (such as marketing email generation).
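As an illustration of that API end state, the sketch below shows a minimal, governable call pattern. It assumes the openai Python package’s pre-1.0 interface and an API key held in a corporate secrets store; the model name and prompt are illustrative only and will vary by provider.

```python
# Minimal sketch of API-based access (openai Python package, v0.x interface).
# The key is read from the environment rather than hard-coded, so access can
# be issued, rotated and revoked through core IT procedures.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]  # managed via the corporate secrets store

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You answer questions about internal style guidelines."},
        {"role": "user", "content": "Summarise the three key rules for product naming."},
    ],
    temperature=0.2,  # lower temperature for more repeatable, auditable outputs
)
print(response.choices[0].message.content)
```

Because every call passes through a single controlled route, usage can be logged, metered and shut off centrally – the governance gain described above.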
COMPLEXITY – Free: very low; PAYG: low; API: low/moderate
RISK – Free: moderate/high; PAYG: moderate; API: low
5. Native generative AI
Generative AI is not just being provisioned via standalone services that are clearly badged. The direction of travel for many core business technologies is to embed it natively within solutions, just as Microsoft has applied GPT-4 to its Bing search engine and Google has done the same with Bard.
Common tools with native AI will undoubtedly improve efficiency in business-as-usual tasks by either automating sub-processes or providing intensive user support. At the same time, this risks turning generative AI into a black box whose outputs cannot be explained or, where necessary, overruled.
Where a user is putting sensitive or commercial information into one of these solutions, ethical and governance considerations become more difficult to apply, not least because of the risk that an offered insight is actually a “hallucination”, or that data is transferred for processing to an undesirable location.
Any organisation that already has grey IT operating outside of corporate frameworks – which is to say, all organisations – will face this issue. The use of a basic tool like MS Excel to capture, manipulate, analyse and communicate data is one common example.
As cloud computing stacks similarly embed generative AI into their core propositions, it may become difficult to distinguish a data source assembled by the data function from a data product created automatically via AI.
BENEFIT – moderate/high
COMPLEXITY – low
RISK – moderate/high
6. Generic v domain-specific
The focus of large language models on offer via public platforms has chiefly been the provision of near-human chat capabilities. As such, they have been trained to respond to prompts in a conversational manner that reflects the most likely next response learned from huge training data sets.
One consequence of this approach is that generative AI is generic in the outputs it provides. Sequenced prompts can enrich this homogeneous chat to produce more specific outputs, for example by steering towards certain language, age or gender groups, as the sketch below illustrates.
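The following is a hedged illustration of that sequencing, assuming the openai Python package’s v0.x chat interface; the prompts and target group are hypothetical.

```python
# Illustrative only: steering generic chat output towards a specific
# audience by sequencing prompts within one conversation.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

messages = [
    {"role": "system", "content": "You write UK English copy for a retail brand."},
    {"role": "user", "content": "Describe our loyalty scheme in one paragraph."},
]
first = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
messages.append({"role": "assistant", "content": first.choices[0].message.content})

# The follow-up prompt narrows the generic output to a target group.
messages.append({"role": "user", "content": "Rewrite that for 18-24 year olds in an informal tone."})
second = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
print(second.choices[0].message.content)
```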
In the next stage of generative AI’s evolution, domain-specific solutions will undoubtedly begin to emerge that have been tuned to territories, industry sectors or processes. Such heterogeneous solutions will likely not operate in the free or PAYG model, but will instead be premium offerings. However, a key benefit will remain the breadth of domain training built into these models.
BENEFIT – Generic: low; Domain-specific: moderate
COMPLEXITY – Generic: low; Domain-specific: low/moderate
RISK – Generic: low/moderate; Domain-specific: low/moderate
7. Pre-tuned v self-tuned
An important benefit from third-party LLMs is the deep training data used to build their models. Versions of each model can be accessed via API and then tuned on the user organisation’s specific data or against set processes and objectives.
This self-tuning will likely be an increasing area of focus for data science teams, rather than developing and building in-house models, since the scale of the pre-training that built the model is far beyond anything individual organisations could undertake.
If off-the-shelf generative AI models operate on the Pareto principle of an 80:20 ratio of pre-tuning to self-tuning, it is within this self-tuning gap that breakthrough impacts are likely to be achieved, not least because this is where models move from being generic towards being highly specific.
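In practice, self-tuning typically means supplying organisation-specific prompt/completion pairs to the provider’s fine-tuning endpoint. The sketch below assumes the openai Python package’s v0.x fine-tune API; the file name and base model are placeholders, and details vary by provider and library version.

```python
# Sketch of "self-tuning": fine-tuning a provider's pre-trained model on
# organisation-specific examples (openai v0.x fine-tune endpoint assumed).
import openai

# Each line of the JSONL file is one {"prompt": ..., "completion": ...} pair
# drawn from the organisation's own, governed data.
upload = openai.File.create(file=open("org_examples.jsonl", "rb"), purpose="fine-tune")

job = openai.FineTune.create(
    training_file=upload.id,
    model="davinci",  # base model built by the provider's large-scale pre-training
)
print(job.id, job.status)  # the tuned model id is returned once the job completes
```

The heavy pre-training (the “80”) is inherited from the provider; the organisation supplies only the “20” of domain examples.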
As generative AI platforms mature, however, an important benefit will be the availability of pre-tuned models. Just as the underlying LLM will have been trained on huge data volumes, so these pre-tuned models will benefit from multi-organisational development at a scale that will reduce the need for self-tuning in order to close the effectiveness gap.
Such models are already being offered in a beta state for a wide range of industry-specific requirements, such as media planning or customer classification. At the point where these are broadly adopted and accepted, the current state of benefit and complexity will flip with pre-tuned models offering higher benefits at lower complexity. Risks will remain significant, however.
BENEFIT – Pre-tuned: moderate; Self-tuned: high
COMPLEXITY – Pre-tuned: moderate; Self-tuned: high
RISK – Pre-tuned: moderate/high; Self-tuned: moderate
8. IP and data risks
At this stage of the development and adoption of generative AI via third-party platforms, considerations of data governance and risk are being overshadowed by the urgency of capturing benefits in the short term. But there will likely be a rapid recognition that controls and guardrails are necessary as proofs of concept emerge and need to be scaled up operationally.
A primary question is to what extent any generative AI model learns directly from the data to which it is applied. For the major, global platforms, this does not currently appear to be an issue. (For clarity, the specifics of the data are not captured, although models self-evidently learn from the tuning process, just as most technologies currently feed user data back to their developers.)
Microsoft Copilot does not capture any of the commercial data to which it is applied. Similarly, while GPT-4 was trained on internet data up to 2021, it does not ingest live data. The model is, however, trained on the prompts used, and the next iteration will have the advantage of a feedback loop from a world that has discovered and adopted generative AI.
The consequences of this are difficult to predict, although they are likely to go beyond simple model effectiveness – just as search engines now routinely autofill queries, generative AI will likely apply contextual understanding via the user’s digital footprint.
An emerging consideration relates to any intellectual property which is developed using an underlying generative AI model. This extends to the monetisation of data, for example, where an organisation has direct business partners (suppliers to a supermarket, partner airlines in a loyalty scheme) to which data products and insights are supplied. These clients will self-evidently want the benefits generative AI can provide, yet if these are developed via a third-party platform, the question of ownership, rights and royalties becomes complicated.
A mid- to long-term risk that must be considered is of regulatory intervention which could prevent the ongoing availability of generative AI either globally, within specific territories or for certain processes.
BENEFIT – moderate/high
COMPLEXITY – high
RISK – high
9. Generative AI policy or SaaS policy?
Mitigating the risks outlined with the use of generative AI via third-party platforms will require the creation of clear policies and governance processes. Some organisations, albeit a small minority, already have data and AI policies in place that can be extended to reflect the specific issues raised by generative AI. For the rest, consideration needs to be given to how to put the necessary guardrails in place. Not introducing them is simply not an option, since use of generative AI is already happening, whether formally or informally.
What needs to be considered is how such policies differ from those already applied to the use of other software-as-a-service platforms. Similar principles apply, from caution over entering any sensitive or commercial data through to due diligence around where data processing takes place.
If the sudden emergence of generative AI is understood as just a specific use case within the wider context of SaaS tooling, then the challenge of putting appropriate policies in place becomes less onerous. Where significant reliance is being placed on the use of these services, however, a more specific suite of policies will need to be created and embedded.
BENEFIT – moderate/high
COMPLEXITY – high
RISK – moderate/high
10. Example use cases
Marketing email creation with ChatGPT
A constraint on all marketing activity, no matter how data-driven, has always been the capacity for message generation at scale. While martech has given the marketing function extensive capabilities, generative AI now expands what is possible through skilled prompt engineering. As a platform, ChatGPT is primed for conversational outputs that are likely to have greater resonance with their targets. Multivariate testing becomes quicker and larger in scale as a result, with micro-segmentation capable of being operationalised because message content can be rapidly developed and deployed. Little governance is necessary during this phase unless the marketing content involved is moving towards very specific requirements, in which case it should be swiftly moved off the public platform.
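A hedged sketch of how this might be operationalised via the API rather than the public platform follows; the segment names, briefs and the openai v0.x interface are all illustrative assumptions.

```python
# Illustrative sketch: generating candidate email copy per micro-segment
# for multivariate testing. Segments and briefs are hypothetical.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

segments = {
    "lapsed_customers": "warm win-back tone, mention free delivery",
    "frequent_buyers": "loyalty thank-you tone, early access to the sale",
}

for segment, brief in segments.items():
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user",
                   "content": f"Write a 40-word marketing email for the {segment} segment: {brief}."}],
        n=3,  # three candidate variants per segment for testing
    )
    for i, choice in enumerate(response.choices):
        print(segment, i, choice.message.content)
```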
BENEFIT – low/moderate
COMPLEXITY – low
RISK – low/moderate
Data categorisation with Microsoft Azure Copilot
As part of its latest stack update, Microsoft has fully embedded OpenAI’s ChatGPT across its Azure cloud computing stack, meaning machine learning models can be built using an AI supercomputing infrastructure via tools like Jupyter Notebooks and Visual Studio Code, and open-source frameworks like TensorFlow and PyTorch. Microsoft has also embedded its responsible AI principles into this solution. Categorisation is one of the fundamental building blocks of LLMs, meaning they are very effective at identifying and classifying target data sets. Many data science functions have historically been tasked with this process and can now expand this work into large-scale model training and inference. An important benefit is that every step of this process can be undertaken within a single, unified data architecture, removing data engineering blockages or constraints.
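A minimal sketch of LLM-based categorisation via Azure OpenAI, keeping the workload inside the corporate tenancy, is shown below; the deployment name, endpoint, taxonomy and records are placeholders, and the openai v0.x Azure interface is assumed.

```python
# Sketch of zero-shot data categorisation via Azure OpenAI. Deployment
# name, endpoint, taxonomy and example records are all placeholders.
import os
import openai

openai.api_type = "azure"
openai.api_base = os.environ["AZURE_OPENAI_ENDPOINT"]  # e.g. https://<resource>.openai.azure.com/
openai.api_version = "2023-05-15"
openai.api_key = os.environ["AZURE_OPENAI_KEY"]

taxonomy = ["billing", "account access", "delivery", "other"]
records = ["Refund not received after 14 days", "How do I reset my password?"]

for record in records:
    response = openai.ChatCompletion.create(
        engine="my-gpt4-deployment",  # Azure uses a deployment name rather than a model name
        messages=[{"role": "user",
                   "content": f"Classify this ticket into one of {taxonomy}. "
                              f"Reply with the label only: '{record}'"}],
        temperature=0,  # deterministic labels for repeatable categorisation
    )
    print(record, "->", response.choices[0].message.content)
```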
BENEFIT – moderate/high
COMPLEXITY – high
RISK – moderate/high
Data insights with Tableau AI+
Data visualisation and analytics vendor Tableau has embedded generative AI into its platform, allowing users to understand how predictions and insights are generated and why they are relevant. This means data consumers can ask business questions in natural language and uncover results or explore the “why” behind insights with dynamic visualisations. For users with domain expertise, governed no-code AI can be applied to tasks like predictions, what-if scenario planning and guided model building, including by business stakeholders themselves.
Data science models can be scaled and custom code from R, Python, Einstein Discovery, MATLAB and other extensions can also be integrated.
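As an illustration of the Python integration route, the sketch below deploys a custom scoring function to TabPy, Tableau’s Python service; the endpoint name and the placeholder scoring logic are hypothetical.

```python
# Sketch of extending Tableau with custom Python via TabPy. The endpoint
# name and the stand-in scoring logic are illustrative only.
from tabpy.tabpy_tools.client import Client

def churn_score(tenure_months, monthly_spend):
    # Stand-in logic for a real model deployed behind the endpoint.
    return [0.9 if t < 6 and s < 20 else 0.2
            for t, s in zip(tenure_months, monthly_spend)]

client = Client("http://localhost:9004/")  # local TabPy server
client.deploy("churn_score", churn_score,
              "Scores churn risk from tenure and spend", override=True)

# In Tableau, the endpoint is then called from a calculated field, e.g.:
# SCRIPT_REAL("return tabpy.query('churn_score', _arg1, _arg2)['response']",
#             SUM([Tenure Months]), SUM([Monthly Spend]))
```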
BENEFIT – moderate/high
COMPLEXITY – moderate
RISK – moderate
Data engineering with GitHub Copilot
GitHub Copilot uses the OpenAI Codex to suggest code and entire functions in real time. Trained on billions of lines of code, it turns natural language prompts into autocomplete-style coding suggestions across dozens of languages, acting as an AI pair programmer. GitHub reports that Copilot has already increased flow and productivity for its user base by 54%. Use and benefits are specific to those with coding tasks, and quality assurance will still be an essential step before deployment.
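An illustration of the comment-driven pattern Copilot supports: the developer writes the natural-language comment and function signature, and the body shown is the kind of completion Copilot might suggest – which, as noted, still needs quality assurance before deployment.

```python
# Illustrative only: the body below is the kind of suggestion Copilot
# might offer from the comment and signature; it is not a recorded output.
from datetime import datetime

# Parse an ISO-8601 date string (YYYY-MM-DD) and return the weekday name.
def day_of_week(iso_date: str) -> str:
    return datetime.strptime(iso_date, "%Y-%m-%d").strftime("%A")
```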
BENEFIT – moderate
COMPLEXITY – moderate
RISK – moderate
Legacy migration using ChatGPT
Probably the most leading-edge use case revealed during the DataIQ AI roundtables related to a migration project. The organisation is heavily reliant on the legacy analytics solution SAS, which still runs many business-critical models. Its data office is using generative AI in a private instance to interrogate the underlying code and explain what each model does, then write code in a modern programming language to replicate the same actions. This process will eventually allow a full migration into a new cloud data architecture with no loss of IP from models or interruption of business processes.
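A hedged sketch of that two-step pattern (explain, then translate) follows, assuming a private instance reachable through the openai v0.x interface; the SAS snippet and model name are stand-ins.

```python
# Sketch of the explain-then-translate migration pattern. Run against a
# private instance; the SAS snippet and model name are placeholders.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

sas_code = """
data high_value;
  set customers;
  if spend > 1000 then output;
run;
"""

explain = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user",
               "content": f"Explain step by step what this SAS code does:\n{sas_code}"}],
)
translate = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user",
               "content": f"Rewrite this SAS step as equivalent Python (pandas) code:\n{sas_code}"}],
)
print(explain.choices[0].message.content)
print(translate.choices[0].message.content)
```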
BENEFIT – high
COMPLEXITY – high
RISK – moderate/high
11. DataIQ recommendations
- Plot and review usage of third-party generative AI across the enterprise, either via an “amnesty” for self-reporting or via log analysis (see the sketch after this list).
- Assemble an AI governance working party to review usage, compare this against existing governance and controls, and map critical areas of exposure.
- Rapidly build use cases that can generate moderate to high benefits with low to moderate risk.
- Develop an AI awareness programme for the organisation to build an understanding of the potential, risks and skills required to ensure the new generation tools are appropriately and effectively deployed.
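For the first recommendation, a minimal sketch of the log-analysis route is shown below; the log format, field positions and domain list are assumptions to adapt to the local proxy.

```python
# Minimal sketch of surfacing generative AI usage from web proxy logs.
# Field positions and the domain list are assumptions; adapt locally.
from collections import Counter

GENAI_DOMAINS = ("chat.openai.com", "api.openai.com", "bard.google.com")

hits = Counter()
with open("proxy.log") as log:             # one request per line, space-separated
    for line in log:
        fields = line.split()
        if len(fields) < 7:
            continue
        user, url = fields[2], fields[6]   # positions depend on the proxy format
        if any(domain in url for domain in GENAI_DOMAINS):
            hits[user] += 1

for user, count in hits.most_common(20):   # heaviest users first
    print(user, count)
```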
Take the DataIQ Generative AI assessment to see how prepared your organisation is for the latest AI tools.