Way too complex: why modern tech stacks need observability


Software failures are inevitable. But they should never turn into disasters causing nationwide devastation.
Whether a failure becomes a major disruption or is immediately identified, diagnosed and corrected depends on how an organization prepares and responds.
Vice President, Portfolio and Strategy, Dynatrace.
Building and delivering robust, resilient software requires deep, end-to-end, AI-driven observability that provides a consistent, unified source of truth about the performance of software environments and the source of any issues that could compromise that performance.
Today’s enterprise software environments are complex, spanning cloud-native applications, multi-cloud deployments, third-party services, APIs, and the growing influence of AI.
These layered environments introduce significant opacity into the software supply chain, making it more difficult to manage risk, performance, and resilience at scale.
The risk of modern technology stacks
Research shows that 42% of organizations expect to experience an incident caused by one of their suppliers. Too often, teams find themselves flying blind when something goes wrong, which can be frustrating and costly.
To operate with confidence, businesses must have a holistic view of their digital supply chain, which is not possible with basic monitoring.
Unlike traditional monitoring, which often focuses on siloed metrics or alerts, observability provides a unified, real-time view across the entire technology stack, enabling faster, data-driven decisions at scale.
AI-driven, real-time observability should cover every component, from infrastructure and services to applications and user experience.
Observability is a strategic necessity
End-to-end observability is evolving beyond its current role in IT and DevOps to become a fundamental part of modern business strategy. In doing so, observability plays a critical role in managing risk, maintaining availability, and safeguarding digital trust.
Observability also allows organizations to proactively detect anomalies before they become outages, quickly identify root causes across complex distributed systems, and automate response actions to reduce mean time to resolution (MTTR).
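As a rough illustration of detecting an anomaly before it becomes an outage, the sketch below flags samples that deviate sharply from a trailing baseline. It is a minimal example using a rolling z-score, not a description of any vendor's method; the function name `find_anomalies`, the window size, the threshold, and the sample latencies are all assumptions made for illustration.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate from the mean of the
    preceding `window` samples by more than `threshold` standard deviations.

    Hypothetical helper for illustration; real platforms use far richer models.
    """
    anomalies = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Perfectly flat baseline: treat any change at all as anomalous.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up response times in milliseconds, with one obvious spike.
latencies_ms = [120, 118, 123, 121, 119, 122, 480, 120]
print(find_anomalies(latencies_ms))  # prints [6]
```

In practice, a flagged index would feed an alert or an automated response, which is where reductions in MTTR would come from.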
The result is faster, smarter and more resilient operations, giving teams the confidence to innovate without compromising system stability, a critical advantage in a world where digital resilience and speed must go hand in hand.
Resilient systems must absorb shock without breaking. This requires both cultural and technical investment, from embracing shared responsibility across teams to adopting modern deployment strategies like canary releases, blue/green deployments, and feature flagging.
Modern strategies only work if teams have clear, real-time feedback, allowing organizations to understand what’s happening, why, and what to do before customers notice a disruption.
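To make the feedback loop behind a canary release concrete, here is a minimal sketch of a promotion gate that compares the canary's error rate with the stable baseline. Everything in it is an assumption for illustration: the `ReleaseStats` type, the `canary_decision` function, the 1.5x tolerance, the 0.1% floor, and the minimum request count are hypothetical and would need tuning for a real service.

```python
from dataclasses import dataclass


@dataclass
class ReleaseStats:
    """Request and error counts observed for one version (hypothetical type)."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        # Avoid division by zero when no traffic has been observed yet.
        return self.errors / self.requests if self.requests else 0.0


def canary_decision(baseline, canary, max_ratio=1.5, min_requests=500):
    """Return "wait", "promote", or "rollback" for a canary release.

    - "wait": the canary has not seen enough traffic to judge.
    - "promote": canary error rate is within max_ratio of the baseline.
    - "rollback": canary error rate exceeds the allowed level.
    """
    if canary.requests < min_requests:
        return "wait"
    # Small absolute floor so a near-zero baseline does not block every release.
    allowed = max(baseline.error_rate * max_ratio, 0.001)
    return "promote" if canary.error_rate <= allowed else "rollback"


stable = ReleaseStats(requests=10_000, errors=50)   # 0.5% error rate
print(canary_decision(stable, ReleaseStats(1_000, 6)))   # prints promote
print(canary_decision(stable, ReleaseStats(1_000, 20)))  # prints rollback
print(canary_decision(stable, ReleaseStats(100, 0)))     # prints wait
```

The gate is only as good as the telemetry behind it: if the error counts arrive late or incomplete, the decision does too.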
Agentic AI: a new level of risk
We have entered the AI era. As organizations adopt generative and agentic AI to accelerate innovation, increase productivity and reduce costs, they also expose themselves to new types of risk.
Agentic AI can be configured to act independently, making changes, triggering workflows, or even deploying code without direct human involvement. This level of autonomy introduces serious challenges that accompany the potential benefits of AI.
For example, a misconfigured agent or malicious prompt can have far-reaching downstream consequences at machine speed, from cost overruns to anomalous behavior to outright failures.
Small ripples can become waves that spread faster and wider and are more difficult to contain. AI-powered real-time observability platforms are essential, not only for monitoring what agents do, but also for understanding how they act, how they interact with other systems, and when intervention is necessary.
Observability helps safely harness the potential of agentic AI and pave the way for autonomous operations.
Guarding against disruptions
Industry leaders must adopt new technologies, including agentic AI, to keep pace with their competitors. At the same time, they must adapt to the new security and compliance requirements that come with operating increasingly complex technology stacks.
The best way for organizations to manage this increasing complexity and pressure is to treat observability as a strategic business driver, not just an IT capability. This ensures that each layer of the technology stack is transparent, accountable and resilient by design.
By prioritizing real-time, AI-powered observability, organizations can build lasting trust, adapt quickly, and drive business growth, all without wasting time and money battling damaging outages.
This article was produced as part of TechRadarPro’s Expert Insights channel, where we feature the best and brightest minds in today’s technology industry. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you would like to contribute, find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro



