{"id":9870,"date":"2024-03-18T11:30:26","date_gmt":"2024-03-18T11:30:26","guid":{"rendered":"https:\/\/members.dataiq.global\/?post_type=article&#038;p=9870"},"modified":"2024-03-18T11:40:43","modified_gmt":"2024-03-18T11:40:43","slug":"3-reasons-why-nobody-is-talking-about-the-cost-of-generative-ai","status":"publish","type":"article","link":"https:\/\/www.dataiq.global\/devstage\/articles\/3-reasons-why-nobody-is-talking-about-the-cost-of-generative-ai\/","title":{"rendered":"3 reasons why nobody is talking about the cost of generative AI"},"content":{"rendered":"<p><span data-contrast=\"auto\">All of these were prime examples of generative AI\u2019s (genAI\u2019s) ability to categorise extensive data sets and provide insights in natural language, responding to queries in real time, improving customer experience along the way, and removing a great deal of manual effort from human staff.<\/span><\/p>\n<p><span data-contrast=\"auto\">Each of these represented exactly the type of potentially transformational solution that has seen organisations piling onto genAI platforms. Certainly, the early indications from these proofs of concept (PoCs) were positive, and the teams involved seemed rightly proud of their work.<\/span><span data-ccp-props=\"{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:240}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">But here\u2019s the funny thing \u2013 nobody wanted to talk about how much it had cost to develop these pilots, let alone what expenditure would be required to roll them out and whether there would be a positive return on that investment. 
The most telling comment came from the media company\u2019s in-house team, who replied to my enquiry, \u201cwe\u2019re Data Scientists \u2013 we don\u2019t worry about the cost.\u201d<\/span><\/p>\n<p><span data-contrast=\"auto\">Well, somebody needs to, and there are growing rumbles of disquiet about the bills that are rolling in behind experiments, tests, and pilots. What is beginning to dawn on Chief Data Officers, Chief Information Officers and Chief Technology Officers is that genAI represents a triple threat to their budgets. Here are three reasons why talking about costs is hard.<\/span><span data-ccp-props=\"{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:240}\">\u00a0<\/span><\/p>\n<p>&nbsp;<\/p>\n<h4><b><span data-contrast=\"auto\">Tokens feel like micro-payments, rather than IT budget-busters<\/span><\/b><\/h4>\n<figure id=\"attachment_9871\" aria-describedby=\"caption-attachment-9871\" style=\"width: 300px\" class=\"wp-caption alignright\"><img loading=\"lazy\" decoding=\"async\" class=\"size-medium wp-image-9871\" src=\"https:\/\/www.dataiq.global\/devstage\/wp-content\/uploads\/DIQ-100Reveal-162HiRes-300x200.jpg\" alt=\"David Reed addressing a crowd about generative AI and the DataIQ 100\" width=\"300\" height=\"200\" title=\"\"><figcaption id=\"caption-attachment-9871\" class=\"wp-caption-text\">David Reed, DataIQ&#8217;s Chief Knowledge Officer and Evangelist, addressing a crowd of data leaders about generative AI and the DataIQ 100.<\/figcaption><\/figure>\n<p><span data-contrast=\"auto\">Tokens are the topline expense that most business-side uses of large language models (LLMs) will quickly encounter. In text-based models, a token corresponds to roughly three-quarters of an English word, so a word costs slightly more than one token on average. That does not sound like much, and in the world of consumer use of genAI, few will reach the point where they will need to buy extra tokens. 
If they do, it will be via a series of micro-payments, just like buying tracks on iTunes.<\/span><span data-ccp-props=\"{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:240}\">\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Once you delve into B2B use of genAI, for example when querying the call centre product library, you soon realise that these models involve running the entire product data corpus each time they receive a prompt (barring a level of classification and edge-querying on routine terms). Text scales fast and \u2013 when multiplied by a user base in the hundreds \u2013 will soon burn through budget. <\/span><\/p>\n<p><span data-contrast=\"auto\">The video analysis example is even more eye-watering. Instead of words, the unit becomes frames of video. Even at a conservative one token per frame, a typical frame rate of 24 frames per second means one minute of video would burn at least 1,440 tokens \u2013 and in practice, multimodal models encode each frame as many tokens, so the real figure is far higher. Imagine a single insurance claim involving the upload of an entire journey\u2019s dashcam footage, during which an accident occurred, and you realise the scale of the challenge.<\/span><\/p>\n<p>&nbsp;<\/p>\n<h4><b><span data-contrast=\"auto\">Cloud costs are a hidden expense for generative AI<\/span><\/b><\/h4>\n<p><span data-contrast=\"auto\">Most serious adoptions of genAI in B2B are taking place within private cloud tech stacks in order to prevent leakage of commercially sensitive or private information. These require a licence for the LLM involved (unless one has been built in-house), which then operates inside the corporate cloud.<\/span><\/p>\n<p><span data-contrast=\"auto\">To avoid causing the lights to dim across the rest of the organisation\u2019s critical tech infrastructure, genAI will need its own cloud space \u2013 and LLMs involve massive data volumes and huge compute power. 
During the PoC phase, these parameters will be relatively constrained and the user base well defined.<\/span><\/p>\n<p><span data-contrast=\"auto\">Once a model goes into operation, however, and especially if it is made available enterprise-wide, costs scale rapidly alongside adoption. Demand for access to these tools within business processes will surge, which means IT budgets could quickly take a hammering.<\/span><span data-ccp-props=\"{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:240}\">\u00a0<\/span><\/p>\n<p>&nbsp;<\/p>\n<h4><b><span data-contrast=\"auto\">Users do not optimise their prompts<\/span><\/b><\/h4>\n<p><span data-contrast=\"auto\">The whole point of genAI can be found in the name of one of the best-known examples \u2013 ChatGPT. Interaction should feel like a casual conversation which does not require any coding, technical knowledge, or structure \u2013 you can just throw in a question, and it will try to provide the best possible answer.<\/span><\/p>\n<p><span data-contrast=\"auto\">One consequence of this is that little effort gets made to shorten the prompt-to-outcome process. Instead of one well-crafted query that returns the best answer, multiple iterations are involved. Each time, the model runs in full, and tokens get burned.<\/span><\/p>\n<p><span data-contrast=\"auto\">So why is nobody talking about these costs? Well, they may not be talking, but they are muttering in private. At a recent DataIQ Member event, one healthcare insurance provider admitted that it had turned off its private LLM specifically because of the cost. It is certainly not the only organisation to have been forced into that decision, nor will it be the last.<\/span><\/p>\n<p><span data-contrast=\"auto\">No doubt there will be emerging practices and solutions to help control costs and support full deployments without destroying IT\u2019s bank balance. 
For the moment, in their absence, it seems likely that genAI could stall because of the funding gap between a proof of concept and full-scale deployment.<\/span><span data-ccp-props=\"{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:240}\">\u00a0<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>David Reed dives into the financial side of generative AI, where costs can spiral rapidly if left unmonitored and tools need to be used with precision.<\/p>\n","protected":false},"author":15,"featured_media":9872,"menu_order":0,"comment_status":"open","ping_status":"closed","template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","_searchwp_excluded":"","footnotes":""},"categories":[129,398],"tags":[217,464,465,411,345,231,367,466],"pillar":[197],"class_list":["post-9870","article","type-article","status-publish","format-standard","has-post-thumbnail","hentry","category-editorial","category-public","tag-ai","tag-budgets","tag-cost","tag-finance","tag-genai","tag-generative-ai","tag-it","tag-tokens","pillar-technology"],"acf":[],"publishpress_future_action":{"enabled":false,"date":"2026-05-21 
00:33:18","action":"change-status","newStatus":"draft","terms":[],"taxonomy":"category","extraData":[]},"publishpress_future_workflow_manual_trigger":{"enabledWorkflows":[]},"_links":{"self":[{"href":"https:\/\/www.dataiq.global\/devstage\/wp-json\/wp\/v2\/article\/9870","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.dataiq.global\/devstage\/wp-json\/wp\/v2\/article"}],"about":[{"href":"https:\/\/www.dataiq.global\/devstage\/wp-json\/wp\/v2\/types\/article"}],"author":[{"embeddable":true,"href":"https:\/\/www.dataiq.global\/devstage\/wp-json\/wp\/v2\/users\/15"}],"replies":[{"embeddable":true,"href":"https:\/\/www.dataiq.global\/devstage\/wp-json\/wp\/v2\/comments?post=9870"}],"version-history":[{"count":0,"href":"https:\/\/www.dataiq.global\/devstage\/wp-json\/wp\/v2\/article\/9870\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.dataiq.global\/devstage\/wp-json\/wp\/v2\/media\/9872"}],"wp:attachment":[{"href":"https:\/\/www.dataiq.global\/devstage\/wp-json\/wp\/v2\/media?parent=9870"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.dataiq.global\/devstage\/wp-json\/wp\/v2\/categories?post=9870"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.dataiq.global\/devstage\/wp-json\/wp\/v2\/tags?post=9870"},{"taxonomy":"pillar","embeddable":true,"href":"https:\/\/www.dataiq.global\/devstage\/wp-json\/wp\/v2\/pillar?post=9870"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}