Why messy data will make your company’s AI bill much higher than expected


For all the talk about AI infrastructure, chips, and the enormous amounts of electricity now required for large-scale model training and inference, a quieter part of the story rarely receives the same attention inside companies: the state of the data on which these systems actually operate.
The International Energy Agency projects that electricity generation to power data centers will increase from 460 TWh in 2024 to more than 1,000 TWh in 2030 and 1,300 TWh by 2035 in its base case scenario, highlighting how quickly energy demand around AI is growing.
Founder and CEO of Coalescence Cloud.
In the United States, the pressure is already visible. The U.S. Department of Energy says data centers consumed about 4.4% of total U.S. electricity in 2023 and are expected to consume between 6.7% and 12% by 2028.
These numbers are important, but they can also make the problem seem distant, almost as if AI sustainability is something that only happens at the hyperscaler level.
In reality, a significant portion of the costs and waste associated with AI starts much closer to home, inside the CRM, the PSA platform, the financial system, the spreadsheet someone still keeps aside because they don’t trust the main dashboard, and the duplicate records that no one took the time to clean up.
Digital load
Far fewer companies are asking a more immediate question about their own environment, namely how much unnecessary digital burden they are creating simply because their data is messy.
AI doesn’t arrive inside an enterprise and start operating on an idealized set of perfectly structured information. It inherits everything that already exists.
If the same customer record exists in five places, if revenue is defined slightly differently by sales and finance, if project data is incomplete, if teams still rely on manual workarounds because the systems don’t connect, then AI will operate within that reality.
The technology alone will not correct these weaknesses. More often than not, it will make them more visible and more expensive, as each unnecessary workflow, each redundant query, each round of human rechecking, and each additional cycle spent validating an output consumes more storage, more processing, and more employee time.
Clean data is not simply data that happens to be clean on a given day. It is data that is understood, governed, curated, and aligned across the enterprise so that users can trust it.
IBM’s recent work on poor data quality highlights the business side clearly: 43% of operations managers cite data quality as their biggest problem, and more than a quarter of organizations estimate they lose more than $5 million a year due to poor data quality.
Enterprise technology leaders need, more than ever, to understand what happens when these same data problems are fed into already compute-intensive AI environments.
Poor data quality has always been costly. What’s changing with AI is the speed and scale with which these expenses are increasing.
A broken process that once frustrated a team now has the potential to create repeated burden across multiple systems and models, while eroding confidence in the results meant to make work easier.
Sustainability
The sustainability debate around AI must become more operational. It cannot live on energy supply, carbon targets, or infrastructure investment alone. Those questions matter, but so do the day-to-day realities of what businesses ask their systems to do.
If a company uses AI tools on top of fragmented records, disconnected workflows, and unreliable reporting, then some of the environmental burden of that AI is self-inflicted. The organization spends more IT resources to get answers that should have been easier to obtain in the first place.
Behind all of this is also a governance problem, as a surprising number of organizations are moving even faster on deployment than on ownership and accountability.
Data reliability continues to emerge as one of the biggest barriers to meaningful AI adoption, and for good reason.
If no one can clearly explain where key data comes from, who owns it, how it is managed, or why definitions differ across systems, then the business has already created the conditions for unnecessary waste before the model even goes into production.
In this environment, AI becomes another layer of complexity on an already unstable foundation.
Organizations that achieve better results are usually those that take the more disciplined path, which often looks less exciting from the outside.
They reduce duplication. They align system logic. They decide what the metrics actually mean and make sure those definitions apply to all teams. They simplify workflows before automating them. They fix ownership before expanding access.
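To make the deduplication step concrete, here is a minimal sketch in Python. The field names (`email`, `name`, `updated`) and the "keep the most recently updated record" policy are illustrative assumptions, not taken from any particular CRM:

```python
# Minimal record-deduplication sketch. Field names and the merge policy
# (keep the newest row per normalized email) are hypothetical examples.
from datetime import date

records = [
    {"email": "Ana@Example.com ", "name": "Ana Silva", "updated": date(2024, 3, 1)},
    {"email": "ana@example.com",  "name": "A. Silva",  "updated": date(2024, 9, 9)},
    {"email": "bo@example.com",   "name": "Bo Chen",   "updated": date(2024, 5, 2)},
]

def dedupe(rows):
    """Keep the most recently updated row for each normalized email."""
    best = {}
    for row in rows:
        key = row["email"].strip().lower()  # normalize before comparing
        if key not in best or row["updated"] > best[key]["updated"]:
            best[key] = row
    return list(best.values())

clean = dedupe(records)
# Two distinct customers remain; the stale duplicate for Ana is dropped.
```

The normalization step matters as much as the comparison: without it, `"Ana@Example.com "` and `"ana@example.com"` count as two customers, and every downstream model or report pays for the duplicate.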
This type of work is rarely presented as an AI strategy, but in practice it is often what differentiates AI programs that become useful from those that quietly create more overhead than value.
Healthier underlying systems
Once the underlying systems become healthier, AI begins to do what executives initially hoped. Forecasts become more reliable because inputs are stable. Customer data becomes more actionable because teams aren’t arguing over whether it’s up to date.
Automation begins to remove work instead of generating additional review cycles. At this point, efficiency improves in a way that matters both economically and operationally and, by extension, also from a sustainability perspective, because the organization is no longer burning resources to compensate for avoidable disruptions.
For businesses trying to understand the growing cost of AI, this is the right place to start. Before considering how to power more models, it’s worth asking how much unnecessary digital burden is already created by unhealthy data.
Before considering sustainability as something outside the business stack, it’s worth recognizing that cleaner systems use resources more intelligently.
And before assuming that AI’s environmental impact is just an infrastructure problem, executives should take a hard look at the state of the data their own companies feed into it every day.
Cleaner data will not solve all the challenges associated with AI infrastructure and energy constraints, but it will make business systems more efficient, more reliable, and more sustainable in immediate and measurable ways.
This is a much better starting point than simply assuming that more compute will solve what better operational discipline could have avoided.
This article was produced as part of TechRadar Pro Insights, our channel that features the best and brightest minds in today’s tech industry.
The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you would like to contribute, find out more here: https://www.techradar.com/pro/perspectives-how-to-submit



