
AI Set to Transform Incident Management in 2024

One of the major incident management challenges organizations routinely face is that, as a crisis expands, there is inevitably a need to increase the size of the team assigned to resolve it. The issue is that it takes time to bring each new team member up to speed on what’s occurring and, naturally, the clock is always working against any team trying to contain the adverse impact of an incident.

The instant summarization capabilities that generative artificial intelligence (AI) enables promise to dramatically reduce the amount of time required to onboard new members to an incident response team. Rather than having to allocate someone on the team to bring new members up to speed, each new member of the team will use queries framed in natural language to determine for themselves the extent of the crisis.

There will always be a need to establish roles and responsibilities for managing the incident, but the amount of precious time allocated to briefing everyone involved should be sharply reduced. Atlassian, for example, is embedding generative AI capabilities into the IT service management (ITSM) platform it makes available as an extension of its Jira project management software. Similarly, PagerDuty is previewing PagerDuty Copilot, which makes use of large language models (LLMs) to enable IT teams to invoke a range of generative AI capabilities, including summarizations.

Just as importantly, these platforms will further reduce time and effort by automatically generating summarizations for any report that an IT incident management team will likely be required to file once the incident is resolved.

Of course, the primary focus, as always, should be on preventing incidents from occurring in the first place. Given how dependent organizations now are on IT, the probability that an incident will have a direct impact on revenue has never been greater. Unfortunately, given the level of complexity that exists in most IT environments, it’s all but inevitable there will be incidents. The challenge, and the opportunity, is to minimize their impact to the greatest extent possible. Achieving that goal will require organizations to employ two other types of AI models alongside generative AI: predictive and causal.

Predictive AI models use machine learning algorithms to determine the probability of specific outcomes based on data collected from past events. They are already widely used in AI for IT operations (AIOps) platforms to determine, for example, the likelihood a service might run out of capacity as the amount of data being processed continues to increase.
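To make the capacity example concrete, here is a minimal sketch of the simplest possible version of that kind of prediction: fit a straight line to past usage and extrapolate to the point where capacity is exhausted. It is an illustration only, not how any particular AIOps platform works. The function name `days_until_full`, the assumption of one sample per day, and the choice of ordinary least squares are all assumptions made for this example; commercial tools typically rely on richer models that account for seasonality and uncertainty.

```python
from typing import Optional, Sequence


def days_until_full(usage: Sequence[float], capacity: float) -> Optional[float]:
    """Estimate days until `capacity` is reached, given one usage sample per day.

    Hypothetical helper: fits an ordinary least-squares line to the samples
    and extrapolates it. Returns None when usage is flat or shrinking.
    """
    n = len(usage)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")

    # Day indices are 0, 1, ..., n-1, so their mean is (n - 1) / 2.
    mean_x = (n - 1) / 2
    mean_y = sum(usage) / n

    # Least-squares slope and intercept of usage versus day index.
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(usage))
    slope = sxy / sxx
    if slope <= 0:
        return None  # No growth trend, so no projected exhaustion date.
    intercept = mean_y - slope * mean_x

    # Day index at which the fitted line crosses capacity, measured
    # relative to the most recent sample (day n - 1).
    day_full = (capacity - intercept) / slope
    return max(0.0, day_full - (n - 1))


if __name__ == "__main__":
    # Illustrative numbers only: disk usage in GB over four days, 100 GB disk.
    print(days_until_full([10, 20, 30, 40], capacity=100))
```

A straight-line fit was chosen here purely because it is easy to follow; it will mislead whenever growth is bursty or cyclical, which is one reason production systems tend to report a probability rather than a single date.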

The second type of AI model, which is employed less frequently, uses machine learning algorithms to determine the cause of a complex series of events. Known as causal AI, its goal is to provide insights into the root cause of, for example, an IT incident so that IT teams can prevent it from happening again.

That multi-model approach to applying AI to the management of IT will enable IT organizations to automate incident management in a way that first reduces the number of incidents requiring immediate attention and, secondarily, reduces the level of stress experienced by all concerned.

It may be a while yet before AI is pervasively applied to ITSM, but the issue now is not so much whether it will be as it is determining to what extent, as part of an effort to reduce the cognitive load IT teams currently bear. Those advances will undoubtedly also have major implications for how IT teams are staffed as the need for specialists to manage specific tasks becomes less pronounced. On the plus side, however, there will also be a significant increase in the number of organizations building and deploying applications at scale, so in that sense IT will only continue to grow in strategic importance in the AI era for organizations large and small.

