Proper Incident Management | @CloudExpo

And how it can save your company millions

Why Proper Incident Management Is Key to Proper IT Management

Proper IT management requires proper incident management. Otherwise, you court Murphy's law at your peril. In the IT world, if a server can fail, a cache can overflow, or traffic can overload the network, it will. And the consequences are significant.

Many IT organizations face database, hardware, and software downtime, ranging from brief outages to failures that shut down the business for days. According to a January 2016 article in Network Computing on the high price of IT downtime, organizations face:

"an average of five downtime events each month, with each downtime event being expensive indeed: from $1 million a year for a typical midsize company to more than $60 million for a large enterprise."

[Figure: cost of downtime in IT. Companies across the IT industry incur major costs from downtime. Image courtesy of evolven.com.]

The major cause of this downtime is equipment failure, which accounts for almost 40% of downtime. The second most frequent cause is human error, which accounts for 25%. Cybersecurity incidents account for only about 10%. Yet in each of these cases, traditional workflows use email to alert those in charge of downed networks. Email alerts assume, falsely, that an email will get the attention of a data center manager. Yet data center managers face hundreds of other emails per day. Clearly, an email doesn't break through the noise and get noticed in this situation.

Best practices for effective incident management during downtime
While effective use of network monitoring tools is required to minimize the impact of downtime, relying on email for incident response assumes that the person responding is sitting at their computer or hovering over their iPhone. And what happens when the servers go down at 3 am? One hopes even the most devoted of employees is asleep at that hour.

Furthermore, traditional pagers are inadequate because they go off once and then go silent. Pagers, whether used as an alternative to email or in addition to it, don't always escalate, and they don't persistently demand the attention of the necessary individual. Instead, you need data security control tools coupled with proper incident management applications. This means that when incidents do occur, the appropriate individuals are alerted and the alerts don't stop until the requisite action happens.
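To make the idea of persistent, escalating alerts concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular incident management product works. The names `Responder`, `acks_after`, `alert_until_acknowledged`, and `repeats_per_level` are hypothetical, and the "responders" are simulated rather than paged over a real channel.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Responder:
    """A simulated person in the escalation chain (hypothetical model)."""
    name: str
    # Assumption for the simulation: how many alerts this person needs
    # before acknowledging. None means they never respond (e.g. asleep).
    acks_after: Optional[int] = None
    received: int = 0

    def notify(self, incident: str) -> bool:
        """Deliver one alert; return True if the responder acknowledges it."""
        self.received += 1
        return self.acks_after is not None and self.received >= self.acks_after


def alert_until_acknowledged(
    incident: str,
    chain: List[Responder],
    repeats_per_level: int = 3,
) -> Tuple[Optional[str], List[str]]:
    """Alert each responder repeatedly, escalating down the chain.

    Returns (name of the acknowledging responder or None, event log).
    """
    log: List[str] = []
    for responder in chain:
        for attempt in range(1, repeats_per_level + 1):
            log.append(f"alert {attempt} -> {responder.name}")
            if responder.notify(incident):
                log.append(f"acknowledged by {responder.name}")
                return responder.name, log
        # Nobody answered at this level: move on to the next person.
        log.append(f"escalating past {responder.name}")
    return None, log


if __name__ == "__main__":
    chain = [
        Responder("on-call engineer"),          # never responds
        Responder("team lead", acks_after=2),   # responds to the second alert
    ]
    who, log = alert_until_acknowledged("db-01 unreachable", chain)
    print("\n".join(log))
    print("handled by:", who)
```

Unlike a pager that sounds once, the loop keeps re-alerting the same person and then moves to the next one. A production system would presumably also wait between attempts and keep cycling through the chain rather than giving up once the list is exhausted.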

Mitigating downtime requires good workflows, human response and - most importantly - proper alarms to alert relevant individuals when things go wrong. Proper incident notification is crucial to effective management of IT downtime. And there's more at stake than cost savings. There's also the company's reputation. If a company frequently experiences downtime in its IT infrastructure, it risks a reputation for unreliability. When a company has a bad reputation, business is more difficult and costly to conduct. Much of the writing on customer service notes that it is more difficult to retain customers and important stakeholders when a company's reputation is damaged. This, in turn, makes the costs of doing business significantly higher.

Conclusion
Most important of all: while you cannot avoid every incident, you can ensure proper incident management. When trouble rears its ugly head and things go south, heads of IT need to ensure that the alerts rise above the clutter.

More Stories By OnPage Blog

OnPage is a disruptive technology and application that leverages today's technology and smartphone capabilities for priority mobile messaging. With a top notch history of ensuring uninterrupted communication for businesses and critical response organizations, OnPage is once again poised to pioneer new mobile communications methodology for business and organizational use.
