By Michael Allen, Dynatrace 25 October 2019
Every industry, and every company, is transforming itself with software to deliver new, improved digital services that capture new markets and reduce operational costs. In this quest for constant improvement, organizations are set to spend $1.2 trillion this year.
However, despite the goal of providing customers and business users with better, more seamless experiences, rarely does a week go by without performance problems causing disruption. In today’s always-on, always-connected digital world, mere milliseconds of downtime can cost millions in lost revenue, and as we become ever more reliant on software, the margin for error keeps shrinking.
To protect their organizations against the chaos that performance problems can cause, IT teams must find the root cause quickly so they can resolve it before users are impacted. However, as the software landscape evolves to drive faster innovation, enterprise applications, and the hybrid cloud environments they run in, are becoming increasingly dynamic and complex. Organizations now rely on thousands of intricately connected services, running on millions of lines of code with trillions of dependencies. A single point of failure in this complex delivery chain can be incredibly difficult to pinpoint. If this complexity goes unchecked, digital performance problems will grow in frequency and severity, creating an unacceptable risk for the business.
The flip side to agility
This escalating complexity is largely being driven by the accelerating shift towards the cloud. In modern, cloud-native IT stacks, everything is defined by software. Applications are built as microservices running in containers, networks and infrastructure are virtualized, and all resources are shared among applications. This has been a key part of many businesses’ digital transformation strategies, enabling them to drive greater agility and faster innovation. The downside, however, is that complexity has gone off the charts. To understand their apps, IT teams now need to understand the full stack, with visibility into every tier, not just the application layer. This complexity has made it impossible for humans to quickly identify where problems originate, leaving IT teams desperately trying to put out a growing number of fires, with little to no visibility into where and why they’re occurring.
As digital services and technology environments become increasingly defined by software, being unable to quickly detect and resolve performance problems will have wider ramifications for businesses and their revenues. While currently it’s frustrating when, say, an online banking website is down, glitches in the code of the driverless cars or drones that will dominate our roads and skies in the future could have catastrophic consequences. Businesses must act now if they are to relegate performance problems to the past before they have a devastating impact on our future.
Anyone call for some AI assistance?
It should come as a comfort that there is hope on the horizon for IT teams, in the form of a new breed of AI that has emerged over the past few years: AIOps. AIOps tools can automatically identify and triage problems to prevent IT teams from drowning in the deluge of alerts from their monitoring solutions. The global AIOps market is expected to reach $11bn by 2023, which demonstrates a real appetite for these capabilities. However, these solutions have their limitations, which is why we’re now seeing the emergence of more holistic, next-generation approaches to monitoring that combine AIOps capabilities with deterministic AI. This provides access to software intelligence based on performance data that’s analyzed in real time, with full-stack context that gives IT teams instant answers, so they can fix performance issues before users feel any impact. This type of 20/20 vision will help teams combat modern software complexity and gain clearer insight into their hazy cloud environments.
Taking it one step further, AI will eventually be capable of stopping performance degradations in their tracks, before they develop into a real problem. For this to become reality, AI-powered monitoring solutions will need to be fully integrated with the enterprise cloud ecosystem, with access to metrics and events from other tools in the CI/CD pipeline, such as ServiceNow and Jenkins. AI capabilities will then be able to pull all monitoring data into a single platform, analyze it in real time, and deliver instant, precise answers that trigger autonomous problem remediation without the need for human intervention, something often referred to as application self-healing.
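To make the self-healing idea concrete, here is a minimal Python sketch of the closed loop described above: a diagnosed problem event is mapped to an automated remediation, with escalation to a human when no known fix applies. The event shape, root-cause labels, and action names are illustrative assumptions, not any vendor’s actual API.

```python
from dataclasses import dataclass

@dataclass
class Problem:
    """A diagnosed problem event, as a monitoring platform might emit it.
    (Hypothetical structure for illustration.)"""
    service: str      # affected service, e.g. "checkout"
    root_cause: str   # e.g. "memory_leak", "bad_deployment"
    severity: str     # "warning" or "critical"

# Known root causes mapped to automated remediation actions
# (action names are illustrative placeholders).
REMEDIATIONS = {
    "memory_leak": "restart_container",
    "bad_deployment": "rollback_release",
    "resource_exhaustion": "scale_out",
}

def remediate(problem: Problem) -> str:
    """Pick a remediation for a diagnosed problem; fall back to human
    escalation when no automated action is known for the root cause."""
    action = REMEDIATIONS.get(problem.root_cause)
    if action is None:
        return f"escalate:{problem.service}"  # page the on-call engineer
    return f"{action}:{problem.service}"

if __name__ == "__main__":
    p = Problem(service="checkout", root_cause="bad_deployment", severity="critical")
    print(remediate(p))  # rollback_release:checkout
```

The key design point is the fallback: autonomous remediation only fires for causes the system can diagnose deterministically, while anything ambiguous still reaches a human.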
Smooth sailing into the future
It’s no secret that user experience is crucial for every company operating today. While it may sound like a pipe dream, AI is fast becoming the key to helping businesses keep these experiences seamless by relegating performance problems to the history books. In both the short and long term, AI capabilities will ultimately give companies peace of mind that performance problems will be dealt with quickly and efficiently, minimizing the impact on user experience and protecting revenues and reputations from the devastation such problems can cause.
Michael Allen is VP for Global Partners at Dynatrace, which develops an all-in-one platform for IT infrastructure management.