Welcome!

@CloudExpo Authors: Pat Romanski, Yeshim Deniz, Elizabeth White, Liz McMillan, Aruna Ravichandran

Related Topics: @CloudExpo, Java IoT, Linux Containers, @BigDataExpo, @ThingsExpo, @DevOpsSummit

@CloudExpo: Article

AI and Analytics | @BigDataExpo #ML #IoT #BigData #DigitalTransformation

Artificial intelligence (AI) is finally coming into its own and beginning to demonstrate significant business value

How Artificial Intelligence Is Transforming IT Operation Analytics

After many years of research, misfires and frightening Hollywood plotlines, artificial intelligence (AI) is finally coming into its own and beginning to demonstrate significant business value. The combined forces of big data, human expertise and AI are being used across industries as diverse as healthcare and manufacturing, as well as within all aspects of business. IT operations is one area that AI is beginning to contribute to enormously.

IT infrastructures are changing rapidly today, particularly hybrid cloud environments. While they are increasingly dynamic and agile, they are also extraordinarily complex. Humans are no longer able to sift through the variety, volume and velocity of Big Data streaming out of IT infrastructures in real time, making AI - especially machine learning - a powerful and necessary tool for automating analysis and decision making. By helping teams bridge the gap between Big Data and humans, and by capturing human domain knowledge, machine learning is able to provide the necessary operational intelligence to significantly relieve this burden of near real-time, informed decision-making. Industry analysts agree. In fact, Gartner named machine learning among the top 10 strategic technologies for 2016, noting "The explosion of data sources and complexity of information makes manual classification and analysis infeasible and uneconomical."

However, the current state of IT Ops is that the domain experts - typically IT administrators, IT operators for TechOps and Site Reliability Engineers (SRE) for DevOps - must manually gather this disparate information and apply their domain expertise in an attempt to make informed decisions. While these professionals are great at what they do, trying to analyze so much data from multiple tools leaves the door wide open for human error. On the other hand, analytics that are based on machine learning are quickly becoming a necessity to ensure the availability, reliability, performance and security of applications in today's digital, virtualized and hybrid-cloud network environments.

Historically, these domain experts have used multiple tools, each which monitored a specific element of the system and provided them with information about their network, virtual and physical infrastructure and application performance. While these tools provide pieces of the puzzle, they offer a narrow view of the IT infrastructure and, therefore, only one aspect of the tool chain. The other aspect is service desk tools that manage tickets and change management. Humans more often than not bridge this gap between the siloed monitoring tools of yesterday and service desk applications with their domain expertise.

What Modern Analytics Can Do
Because today's TechOps and DevOps environments are so complex, there is a need to automate, learn and make intelligent, informed decisions based on real-time analysis of Big Data arising out of the entire application infrastructure stack. Following are key analytics for IT operations:

  1. Behavior Profiling - This type of analytics understands the behavior profile of each and every metric, how that flows into the object behavior and then how the object behaviors relate to other object behaviors across the hybrid cloud environment. It is a multi-dimensional problem, and understanding and adapting to "normal" behavior is extremely important.
  2. Anomaly Detection - This is the bedrock of what is typically referred to as diagnostic analytics. Best-of-breed machine learning algorithms should be able to look at contextual, historical and sudden changes in the behavior of objects to detect anomalies. Understanding when there is a real anomaly and more importantly, when there is not, is critical to avoid generating false alarms.
  3. Topology Analysis - Topology is something every IT administrator or SRE should be aware of. This is the understanding of the hierarchal, peer-to-peer and temporal relationship between hybrid cloud elements. This type of analysis should be able to self-learn the inter-relationships of objects and the impact of their performance on one another. Learning those relationships and maintaining that understanding in order to spot trouble in time is extremely important for both TechOps and DevOps environments.
  4. Root Cause - With the ability to zero in on the cause and impact of an incident, root-cause analysis fast-tracks the resolution and reduces mean time to repair substantially.
  5. Predictive - As the name implies, analytics of this kind help operators identify early indicators and provide insights into looming problems that may eventually lead to performance degradation and outages.  Predictive analytics are also good at providing early insights into anomalies to better plan for what's ahead.
  6. Prescriptive - These analytics provide insight-driven recommendations to remediate an incident. These recommendations should capture tribal knowledge gathered over the years in the organization, best practices in the industry and may even be crowd-sourced to capture state-of-the-art knowledge. These analytics provide the opportunity to finally close the loop in automated IT Operations Management.

Real Monitoring Intelligence
The modern IT environment has gone far past the point of staff being able to effectively

react to incidents as well as trying to resolve them after they have spun out of control. Instead, AI provides technologies to help automate many of these tasks in order to handle incidents in advance. The whole notion of automating IT operational tasks, as well as preventing outages in the first place, and getting to the root cause quickly and in an automated way is the next frontier in remediating these issues.

Monitoring data is critical for identifying, predicting and preventing incidents - and it's something humans can no longer do. DevOps and TechOps teams already have so much on their plates that they cannot possibly devote the time needed to address every alert and analyze the masses of data constantly being generated. And today, they don't have to. Artificial intelligence is able to see past siloes for a deep view across the application stack to provide the analytics that help keep apps up and running at desired service levels.

More Stories By Akhil Sahai

Dr. Akhil Sahai is an accomplished management and technology leader with 25+ years of experience at large enterprises and at startups. He came to Perspica from HP Enterprise where as Sr. Director of Product Management, he envisaged, planned and managed the Solutions Program. At Dell, as Director of Products, Akhil led Product Strategy and Management of Dell’s Converged Infrastructure product line. He also led Gale Technologies, as VP of Products to its successful acquisition by Dell.

Prior to that, at Cisco he undertook business development for VCE Coalition, and at VMware, he managed global product strategy and management for vCloud Software with focus on applications, and Virtual Appliances product line.

He has published 80+ peer-reviewed articles, authored a book, edited another, and chaired multiple International IEEE/IFIP Conferences. He has filed 20 technology Patents (with 16 granted). He has a Ph.D. from INRIA France and an MBA from Wharton School.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@CloudExpo Stories
Data scientists must access high-performance computing resources across a wide-area network. To achieve cloud-based HPC visualization, researchers must transfer datasets and visualization results efficiently. HPC clusters now compute GPU-accelerated visualization in the cloud cluster. To efficiently display results remotely, a high-performance, low-latency protocol transfers the display from the cluster to a remote desktop. Further, tools to easily mount remote datasets and efficiently transfer...
Though cloud is the future of enterprise computing, a smooth transition of legacy applications and systems is critical for seamless business operations. IT professionals are eager to start leveraging the cost, scale and other benefits of cloud, but with massive investments already in place in existing infrastructure and a number of compliance and resource hurdles, it can be challenging to move to a cloud-based infrastructure.
SYS-CON Events announced today that Avere Systems, a leading provider of enterprise storage for the hybrid cloud, will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Avere delivers a more modern architectural approach to storage that doesn't require the overprovisioning of storage capacity to achieve performance, overspending on expensive storage media for inactive data or the overbui...
Containers are rapidly finding their way into enterprise data centers, but change is difficult. How do enterprises transform their architecture with technologies like containers without losing the reliable components of their current solutions? In his session at @DevOpsSummit at 21st Cloud Expo, Tony Campbell, Director, Educational Services at CoreOS, will explore the challenges organizations are facing today as they move to containers and go over how Kubernetes applications can deploy with lega...
SYS-CON Events announced today that Avere Systems, a leading provider of hybrid cloud enablement solutions, will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Avere Systems was created by file systems experts determined to reinvent storage by changing the way enterprises thought about and bought storage resources. With decades of experience behind the company’s founders, Avere got its ...
Amazon is pursuing new markets and disrupting industries at an incredible pace. Almost every industry seems to be in its crosshairs. Companies and industries that once thought they were safe are now worried about being “Amazoned.”. The new watch word should be “Be afraid. Be very afraid.” In his session 21st Cloud Expo, Chris Kocher, a co-founder of Grey Heron, will address questions such as: What new areas is Amazon disrupting? How are they doing this? Where are they likely to go? What are th...
Today most companies are adopting or evaluating container technology - Docker in particular - to speed up application deployment, drive down cost, ease management and make application delivery more flexible overall. As with most new architectures, this dream takes significant work to become a reality. Even when you do get your application componentized enough and packaged properly, there are still challenges for DevOps teams to making the shift to continuous delivery and achieving that reducti...
As hybrid cloud becomes the de-facto standard mode of operation for most enterprises, new challenges arise on how to efficiently and economically share data across environments. In his session at 21st Cloud Expo, Dr. Allon Cohen, VP of Product at Elastifile, will explore new techniques and best practices that help enterprise IT benefit from the advantages of hybrid cloud environments by enabling data availability for both legacy enterprise and cloud-native mission critical applications. By rev...
The next XaaS is CICDaaS. Why? Because CICD saves developers a huge amount of time. CD is an especially great option for projects that require multiple and frequent contributions to be integrated. But… securing CICD best practices is an emerging, essential, yet little understood practice for DevOps teams and their Cloud Service Providers. The only way to get CICD to work in a highly secure environment takes collaboration, patience and persistence. Building CICD in the cloud requires rigorous ar...
Recently, REAN Cloud built a digital concierge for a North Carolina hospital that had observed that most patient call button questions were repetitive. In addition, the paper-based process used to measure patient health metrics was laborious, not in real-time and sometimes error-prone. In their session at 21st Cloud Expo, Sean Finnerty, Executive Director, Practice Lead, Health Care & Life Science at REAN Cloud, and Dr. S.P.T. Krishnan, Principal Architect at REAN Cloud, will discuss how they b...
SYS-CON Events announced today that SkyScale will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. SkyScale is a world-class provider of cloud-based, ultra-fast multi-GPU hardware platforms for lease to customers desiring the fastest performance available as a service anywhere in the world. SkyScale builds, configures, and manages dedicated systems strategically located in maximum-security...
As you move to the cloud, your network should be efficient, secure, and easy to manage. An enterprise adopting a hybrid or public cloud needs systems and tools that provide: Agility: ability to deliver applications and services faster, even in complex hybrid environments Easier manageability: enable reliable connectivity with complete oversight as the data center network evolves Greater efficiency: eliminate wasted effort while reducing errors and optimize asset utilization Security: imple...
High-velocity engineering teams are applying not only continuous delivery processes, but also lessons in experimentation from established leaders like Amazon, Netflix, and Facebook. These companies have made experimentation a foundation for their release processes, allowing them to try out major feature releases and redesigns within smaller groups before making them broadly available. In his session at 21st Cloud Expo, Brian Lucas, Senior Staff Engineer at Optimizely, will discuss how by using...
In this strange new world where more and more power is drawn from business technology, companies are effectively straddling two paths on the road to innovation and transformation into digital enterprises. The first path is the heritage trail – with “legacy” technology forming the background. Here, extant technologies are transformed by core IT teams to provide more API-driven approaches. Legacy systems can restrict companies that are transitioning into digital enterprises. To truly become a lead...
DevOps at Cloud Expo, taking place October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 21st Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to w...
SYS-CON Events announced today that Daiya Industry will exhibit at the Japanese Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Ruby Development Inc. builds new services in short period of time and provides a continuous support of those services based on Ruby on Rails. For more information, please visit https://github.com/RubyDevInc.
When it comes to cloud computing, the ability to turn massive amounts of compute cores on and off on demand sounds attractive to IT staff, who need to manage peaks and valleys in user activity. With cloud bursting, the majority of the data can stay on premises while tapping into compute from public cloud providers, reducing risk and minimizing need to move large files. In his session at 18th Cloud Expo, Scott Jeschonek, Director of Product Management at Avere Systems, discussed the IT and busine...
As businesses evolve, they need technology that is simple to help them succeed today and flexible enough to help them build for tomorrow. Chrome is fit for the workplace of the future — providing a secure, consistent user experience across a range of devices that can be used anywhere. In her session at 21st Cloud Expo, Vidya Nagarajan, a Senior Product Manager at Google, will take a look at various options as to how ChromeOS can be leveraged to interact with people on the devices, and formats th...
First generation hyperconverged solutions have taken the data center by storm, rapidly proliferating in pockets everywhere to provide further consolidation of floor space and workloads. These first generation solutions are not without challenges, however. In his session at 21st Cloud Expo, Wes Talbert, a Principal Architect and results-driven enterprise sales leader at NetApp, will discuss how the HCI solution of tomorrow will integrate with the public cloud to deliver a quality hybrid cloud e...
Is advanced scheduling in Kubernetes achievable? Yes, however, how do you properly accommodate every real-life scenario that a Kubernetes user might encounter? How do you leverage advanced scheduling techniques to shape and describe each scenario in easy-to-use rules and configurations? In his session at @DevOpsSummit at 21st Cloud Expo, Oleg Chunikhin, CTO at Kublr, will answer these questions and demonstrate techniques for implementing advanced scheduling. For example, using spot instances ...