|By Andreas Grabner||
|October 6, 2014 04:00 AM EDT||
Web Service Monitoring 101: Identifying Bad Deployments
Have you ever deployed a change to production and thought "All went well - Systems are operating as expected!" but then you had to deal with users complaining that they keep running into errors?
When deployments fail you don't want your users to be the first to tell you about it: Sit down with the Business and Dev to define how and what to monitor
We recently moved some of our systems between two of our data centers - even moving some components to the public cloud. Everything was prepared well, system monitoring was set up and everyone gave the thumbs up to execute the move. Immediately following, our Operations dashboards continued to show green. Soon thereafter I received a complaint from a colleague who reported that he couldn't use one of the migrated services (our free dynaTrace AJAX Edition) anymore as the authentication web service seemed to fail. The questions we asked ourselves were:
- Impact: Was this a problem related to his account only or did it impact more users?
- Root Cause: What is the root cause and how was this problem introduced?
- Alerting: Why don't our Ops monitoring dashboards show any failed web service calls?
It turned out that the problem was in fact
- Caused by an outdated configuration file deployment
- It only impacted employees whose accounts were handled by a different authentication back-end service
- Didn't show up in Ops dashboards because the used SOAP Framework always return HTTP 200 transporting any success/failure information in the response body which doesn't show up in any web server log file
In this blog I give you a little more insight on how we triaged the problem and some best practices we derived from that incident in order to level-up technical implementations and production monitoring. Only if you monitor all your system components and correlate the results with deployment tasks will you be able to deploy with more confidence without disrupting your business.
Bad Monitoring: When Your End Users become your Alerting System
So - when I got a note from a colleague that he could no longer use dynaTrace AJAX Edition to analyze the web site performance of a particular web site, I launched my copy to verify this behavior. It failed with my credentials, which proved that it was not a local problem on my colleague's machine:
Business Problem: Our end users can't use our free product due to failing authentication service
Asking our Ops Team that manages and monitors these web services resulted in the following response:
"We do not see any errors on the Web Server nor do we have any reported availability problems on our authentication service. It's all green on our infrastructure dashboards as can be seen on the following screenshot:"
Infrastructure is all green: No HTTP-based errors or SLA problems based on IIS log or on any of the resources on the host
Web Server Log Monitoring Is Not Enough
As mentioned in the initial paragraph, it turned out that our SOAP Framework always returns HTTP 200 with the actual error in the response body. This is not an uncommon "Best (or worst) Practice" as you can see for instance on the following discussion on GitHub.
The problem with that approach though is that "traditional" operations monitoring based on web server log files will not detect any of these "logical/business" problems. As you don't want to wait until your users start complaining, it's time to level-up your monitoring approach. How can this be done? Those developing and those monitoring the system need to sit down and figure out a way how to monitor the usage of these services and need to talk with business to figure out which level of detail to report and alert on.
How can you find out if your current monitoring approach works? Start by looking more closely at problems reported by your users but that you don't get any automatic alerts on. Then, talk with engineers and see whether they use frameworks like mentioned here.
For further insight, and for lessons learned, click here for the full article.
DevOps tends to focus on the relationship between Dev and Ops, putting an emphasis on the ops and application infrastructure. But that’s changing with microservices architectures. In her session at DevOps Summit, Lori MacVittie, Evangelist for F5 Networks, will focus on how microservices are changing the underlying architectures needed to scale, secure and deliver applications based on highly distributed (micro) services and why that means an expansion into “the network” for DevOps.
Jul. 1, 2015 07:15 PM EDT Reads: 2,520
"We have a tagline - "Power in the API Economy." What that means is everything that is built in applications and connected applications is done through APIs," explained Roberto Medrano, Executive Vice President at Akana, in this SYS-CON.tv interview at 16th Cloud Expo, held June 9-11, 2015, at the Javits Center in New York City.
Jul. 1, 2015 05:00 PM EDT Reads: 867
Containers have changed the mind of IT in DevOps. They enable developers to work with dev, test, stage and production environments identically. Containers provide the right abstraction for microservices and many cloud platforms have integrated them into deployment pipelines. DevOps and Containers together help companies to achieve their business goals faster and more effectively. In his session at DevOps Summit, Ruslan Synytsky, CEO and Co-founder of Jelastic, reviewed the current landscape of...
Jul. 1, 2015 05:00 PM EDT Reads: 2,226
"AgilData is the next generation of dbShards. It just adds a whole bunch more functionality to improve the developer experience," noted Dan Lynn, CEO of AgilData, in this SYS-CON.tv interview at 16th Cloud Expo, held June 9-11, 2015, at the Javits Center in New York City.
Jul. 1, 2015 04:09 PM EDT Reads: 618
The Internet of Things is not only adding billions of sensors and billions of terabytes to the Internet. It is also forcing a fundamental change in the way we envision Information Technology. For the first time, more data is being created by devices at the edge of the Internet rather than from centralized systems. What does this mean for today's IT professional? In this Power Panel at @ThingsExpo, moderated by Conference Chair Roger Strukhoff, panelists addressed this very serious issue of pro...
Jul. 1, 2015 03:30 PM EDT Reads: 1,078
"We provide a web application framework for building really sophisticated web applications that run on a browser without any installation need so we get used for biotech, defense, and banking applications," noted Charles Kendrick, CTO and Chief Architect at Isomorphic Software, in this SYS-CON.tv interview at @DevOpsSummit (http://DevOpsSummit.SYS-CON.com), held June 9-11, 2015, at the Javits Center in New York
Jul. 1, 2015 02:45 PM EDT Reads: 1,086
Discussions about cloud computing are evolving into discussions about enterprise IT in general. As enterprises increasingly migrate toward their own unique clouds, new issues such as the use of containers and microservices emerge to keep things interesting. In this Power Panel at 16th Cloud Expo, moderated by Conference Chair Roger Strukhoff, panelists addressed the state of cloud computing today, and what enterprise IT professionals need to know about how the latest topics and trends affect t...
Jul. 1, 2015 02:30 PM EDT Reads: 1,198
Explosive growth in connected devices. Enormous amounts of data for collection and analysis. Critical use of data for split-second decision making and actionable information. All three are factors in making the Internet of Things a reality. Yet, any one factor would have an IT organization pondering its infrastructure strategy. How should your organization enhance its IT framework to enable an Internet of Things implementation? In his session at @ThingsExpo, James Kirkland, Red Hat's Chief Arch...
Jul. 1, 2015 02:21 PM EDT Reads: 663
In the midst of the widespread popularity and adoption of cloud computing, it seems like everything is being offered “as a Service” these days: Infrastructure? Check. Platform? You bet. Software? Absolutely. Toaster? It’s only a matter of time. With service providers positioning vastly differing offerings under a generic “cloud” umbrella, it’s all too easy to get confused about what’s actually being offered. In his session at 16th Cloud Expo, Kevin Hazard, Director of Digital Content for SoftL...
Jul. 1, 2015 01:15 PM EDT Reads: 2,172
"A lot of the enterprises that have been using our systems for many years are reaching out to the cloud - the public cloud, the private cloud and hybrid," stated Reuven Harrison, CTO and Co-Founder of Tufin, in this SYS-CON.tv interview at 16th Cloud Expo, held June 9-11, 2015, at the Javits Center in New York City.
Jul. 1, 2015 12:54 PM EDT Reads: 631
One of the hottest areas in cloud right now is DRaaS and related offerings. In his session at 16th Cloud Expo, Dale Levesque, Disaster Recovery Product Manager with Windstream's Cloud and Data Center Marketing team, will discuss the benefits of the cloud model, which far outweigh the traditional approach, and how enterprises need to ensure that their needs are properly being met.
Jul. 1, 2015 12:15 PM EDT Reads: 2,059
The time is ripe for high speed resilient software defined storage solutions with unlimited scalability. ISS has been working with the leading open source projects and developed a commercial high performance solution that is able to grow forever without performance limitations. In his session at Cloud Expo, Alex Gorbachev, President of Intelligent Systems Services Inc., shared foundation principles of Ceph architecture, as well as the design to deliver this storage to traditional SAN storage co...
Jul. 1, 2015 12:00 PM EDT Reads: 2,015
"Plutora provides release and testing environment capabilities to the enterprise," explained Dalibor Siroky, Director and Co-founder of Plutora, in this SYS-CON.tv interview at @DevOpsSummit, held June 9-11, 2015, at the Javits Center in New York City.
Jul. 1, 2015 11:45 AM EDT Reads: 1,035
SYS-CON Events announced today that JFrog, maker of Artifactory, the popular Binary Repository Manager, will exhibit at SYS-CON's @DevOpsSummit Silicon Valley, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Based in California, Israel and France, founded by longtime field-experts, JFrog, creator of Artifactory and Bintray, has provided the market with the first Binary Repository solution and a software distribution social platform.
Jul. 1, 2015 11:30 AM EDT Reads: 991
It is one thing to build single industrial IoT applications, but what will it take to build the Smart Cities and truly society-changing applications of the future? The technology won’t be the problem, it will be the number of parties that need to work together and be aligned in their motivation to succeed. In his session at @ThingsExpo, Jason Mondanaro, Director, Product Management at Metanga, discussed how you can plan to cooperate, partner, and form lasting all-star teams to change the world...
Jul. 1, 2015 11:30 AM EDT Reads: 2,256
In his session at 16th Cloud Expo, Simone Brunozzi, VP and Chief Technologist of Cloud Services at VMware, reviewed the changes that the cloud computing industry has gone through over the last five years and shared insights into what the next five will bring. He also chronicled the challenges enterprise companies are facing as they move to the public cloud. He delved into the "Hybrid Cloud" space and explained why every CIO should consider ‘hybrid cloud' as part of their future strategy to achi...
Jul. 1, 2015 11:10 AM EDT Reads: 699
"We got started as search consultants. On the services side of the business we have help organizations save time and save money when they hit issues that everyone more or less hits when their data grows," noted Otis Gospodnetić, Founder of Sematext, in this SYS-CON.tv interview at @DevOpsSummit, held June 9-11, 2015, at the Javits Center in New York City.
Jul. 1, 2015 11:00 AM EDT Reads: 966
Internet of Things is moving from being a hype to a reality. Experts estimate that internet connected cars will grow to 152 million, while over 100 million internet connected wireless light bulbs and lamps will be operational by 2020. These and many other intriguing statistics highlight the importance of Internet powered devices and how market penetration is going to multiply many times over in the next few years.
Jul. 1, 2015 10:30 AM EDT Reads: 2,146
Internet of Things (IoT) will be a hybrid ecosystem of diverse devices and sensors collaborating with operational and enterprise systems to create the next big application. In their session at @ThingsExpo, Bramh Gupta, founder and CEO of robomq.io, and Fred Yatzeck, principal architect leading product development at robomq.io, discussed how choosing the right middleware and integration strategy from the get-go will enable IoT solution developers to adapt and grow with the industry, while at th...
Jul. 1, 2015 09:45 AM EDT Reads: 1,954
The most often asked question post-DevOps introduction is: “How do I get started?” There’s plenty of information on why DevOps is valid and important, but many managers still struggle with simple basics for how to initiate a DevOps program in their business. They struggle with issues related to current organizational inertia, the lack of experience on Continuous Integration/Delivery, understanding where DevOps will affect revenue and budget, etc. In their session at DevOps Summit, JP Morgenthal...
Jul. 1, 2015 09:32 AM EDT Reads: 651