How much are critical IT incidents really costing your business? New research from real-time operational intelligence vendor Splunk and analyst firm Quocirca revealed that these incidents – which slow down or halt business processes and impact user and customer experiences – cost IT departments an average of $36,326 each, plus a further $105,302 to the rest of the organization.
That’s a total cost of $141,628 to restore normal operations after just one critical incident! What’s more, the research found that the average organization logged around 1,200 incidents per month, of which five were classed as critical – making them a huge drain on business bottom lines.
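To see how quickly those numbers add up, here is a rough back-of-the-envelope sketch. The per-incident costs and the five critical incidents per month come from the report. The monthly and annual totals are a simple extrapolation that is not in the report: they assume every critical incident costs the average and that the monthly rate stays constant. The function names are purely illustrative.

```python
# Figures quoted from the Splunk/Quocirca report
IT_COST_PER_INCIDENT = 36_326         # USD, cost to the IT department
BUSINESS_COST_PER_INCIDENT = 105_302  # USD, cost to the rest of the organization
CRITICAL_INCIDENTS_PER_MONTH = 5


def total_cost_per_incident() -> int:
    """Combined cost of one critical incident."""
    return IT_COST_PER_INCIDENT + BUSINESS_COST_PER_INCIDENT


def projected_cost(months: int) -> int:
    """Extrapolated cost over a number of months.

    Assumption (not from the report): every critical incident costs
    the average and the monthly incident rate never changes.
    """
    return total_cost_per_incident() * CRITICAL_INCIDENTS_PER_MONTH * months


if __name__ == "__main__":
    print(f"Per incident: ${total_cost_per_incident():,}")
    print(f"Per month:    ${projected_cost(1):,}")
    print(f"Per year:     ${projected_cost(12):,}")
```

Under those assumptions, the average organization would be looking at roughly $708,140 a month in critical incident costs.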
The report also found that mean time to repair critical incidents (after detection) was 5.81 hours, with a further 7.23 hours spent on root cause analysis (which was only successful 65% of the time).
Turn the (incident) volume down!
Given the number of incidents and the time it takes to remediate them, it’s not surprising that organizations are struggling to cope. 66% of IT professionals surveyed reported that dealing with the volume of events was a challenge: 52% of those said they were ‘just about managing’, while 13% were ‘struggling’.
These challenges were compounded by the fact that 20% of respondents reported that they had no event management processes in place at all.
It’s clear that, with the vast number of alerts generated by monitoring tools, IT teams need efficient incident management processes to easily sort the critical incidents from false alarms and duplicate events – and then quickly prioritize remediation efforts while minimizing disruption to the business. As the report says, “All the event data generated by the IT infrastructure monitoring tools needs to be filtered to discover what is relevant, in order to troubleshoot problems and perform timely maintenance.”
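As a minimal illustration of that kind of filtering, the sketch below collapses duplicate events and drops low-severity noise before ranking what is left. Everything in it is hypothetical: the `Event` fields, the 1-to-5 severity scale and the `triage` function are assumptions made for the example, not part of any monitoring tool or of the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    host: str
    signature: str  # what the monitoring tool reported
    severity: int   # hypothetical scale: 1 (noise) to 5 (critical)


def triage(events, min_severity=3):
    """Collapse duplicate events, drop low-severity noise, rank the rest."""
    strongest = {}
    for event in events:
        key = (event.host, event.signature)
        # Keep one event per (host, signature), preferring the highest severity
        if key not in strongest or event.severity > strongest[key].severity:
            strongest[key] = event
    relevant = [e for e in strongest.values() if e.severity >= min_severity]
    # Most severe first, so remediation effort goes where it matters
    return sorted(relevant, key=lambda e: e.severity, reverse=True)


if __name__ == "__main__":
    raw = [
        Event("web01", "disk full", 4),
        Event("web01", "disk full", 4),    # duplicate alert
        Event("db01", "login failed", 1),  # likely a false alarm
        Event("db01", "cpu high", 5),
    ]
    for event in triage(raw):
        print(event)
```

A real deployment would also correlate events over time windows and across tools, but even this simple pass shows how much noise can be removed before a human looks at the queue.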
You can’t fix what you can’t see
Yet, underpinning these challenges is a lack of visibility into the organization’s overall IT infrastructure. Just 45% of respondents said they had good or excellent network visibility with the tools they use, and only 2% (!!) of respondents had full visibility across their entire infrastructure. Regardless of the level of visibility, 80% admitted they had blind spots, leading to delayed detection and investigation of incidents.
The report noted that visibility tends to be better in traditional on-premise storage, server and networking environments than across next-generation technologies such as cloud computing and virtualized environments.
Putting incidents in (business) context
The solution, as the Splunk report noted, is that “IT teams must ensure incidents are prioritized according to the impact they are likely to have on business processes. This requires equipping IT operators with the necessary visibility and analytics across all components of the IT infrastructure.”
In practical terms, this means using a SIEM solution, such as Splunk, to identify and analyze the technical aspects of an incident, such as the IP addresses of affected servers or the type of malware that has infected the network. This data is then enriched with information about the business applications impacted, something AlgoSec does through its integration with leading SIEM solutions, including Splunk.
This is how it works. Let’s imagine, for example, that several servers on the network are targeted by hackers. The raw technical detail of the incident might be: ‘this IP address has experienced an exploit’. AlgoSec enriches it with the business context of the applications affected, such as: ‘this IP address is linked to a server which is part of our payment processing system which connects to our e-commerce site’. With both in view, the IT team can immediately see which applications are impacted and which incidents need to be prioritized for remediation, helping to limit the impact, and the remediation costs, on the business.
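The idea of enriching an incident with business context can be sketched in a few lines. This is only a conceptual illustration and does not represent the Splunk or AlgoSec APIs: the IP addresses, the `APP_BY_IP` lookup table, the priority numbers and the function names are all made up for the example.

```python
# Hypothetical mapping of server IPs to business applications.
# A lower priority number means more business-critical.
APP_BY_IP = {
    "10.0.1.15": ("payment processing", 1),
    "10.0.7.42": ("internal wiki", 3),
}

UNKNOWN_APP = ("unknown", 99)


def enrich(incident: dict) -> dict:
    """Attach the affected business application to a raw technical incident."""
    application, priority = APP_BY_IP.get(incident["ip"], UNKNOWN_APP)
    return {**incident, "application": application, "priority": priority}


def prioritize(incidents: list) -> list:
    """Enrich every incident and sort so the most critical applications come first."""
    return sorted((enrich(i) for i in incidents), key=lambda i: i["priority"])


if __name__ == "__main__":
    raw_incidents = [
        {"ip": "10.0.7.42", "detail": "exploit detected"},
        {"ip": "10.0.1.15", "detail": "exploit detected"},
    ]
    for incident in prioritize(raw_incidents):
        print(incident["priority"], incident["application"], incident["ip"])
```

Two incidents that look identical at the technical level end up in a very different order once the payment processing server is distinguished from the internal wiki.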
As the report concludes, “Dealing with critical IT incidents needs to be a top IT priority. IT teams should be equipped with tools to provide end-to-end visibility of processes and the IT infrastructure that supports them. The tools are needed to enable rapid detection and investigation of IT incidents, streamline root cause analysis, and reduce the size of teams involved in fixing problems. Achieving all of this reduces IT costs and the much higher consequential cost and impact of IT incidents on businesses.”
Find out more about how AlgoSec can significantly reduce the mean time to repair of critical incidents through its integration with SIEM solutions here, and make sure to check out Prof. Wool’s informative, whiteboard video course on Advanced Cyber Threat and Incident Management, here.