|By Andreas Grabner||
|October 6, 2014 04:00 AM EDT||
Web Service Monitoring 101: Identifying Bad Deployments
Have you ever deployed a change to production and thought "All went well - Systems are operating as expected!" but then you had to deal with users complaining that they keep running into errors?
When deployments fail you don't want your users to be the first to tell you about it: Sit down with the Business and Dev to define how and what to monitor
We recently moved some of our systems between two of our data centers - even moving some components to the public cloud. Everything was prepared well, system monitoring was set up and everyone gave the thumbs up to execute the move. Immediately following, our Operations dashboards continued to show green. Soon thereafter I received a complaint from a colleague who reported that he couldn't use one of the migrated services (our free dynaTrace AJAX Edition) anymore as the authentication web service seemed to fail. The questions we asked ourselves were:
- Impact: Was this a problem related to his account only or did it impact more users?
- Root Cause: What is the root cause and how was this problem introduced?
- Alerting: Why don't our Ops monitoring dashboards show any failed web service calls?
It turned out that the problem was in fact
- Caused by an outdated configuration file deployment
- It only impacted employees whose accounts were handled by a different authentication back-end service
- Didn't show up in Ops dashboards because the used SOAP Framework always return HTTP 200 transporting any success/failure information in the response body which doesn't show up in any web server log file
In this blog I give you a little more insight on how we triaged the problem and some best practices we derived from that incident in order to level-up technical implementations and production monitoring. Only if you monitor all your system components and correlate the results with deployment tasks will you be able to deploy with more confidence without disrupting your business.
Bad Monitoring: When Your End Users become your Alerting System
So - when I got a note from a colleague that he could no longer use dynaTrace AJAX Edition to analyze the web site performance of a particular web site, I launched my copy to verify this behavior. It failed with my credentials, which proved that it was not a local problem on my colleague's machine:
Business Problem: Our end users can't use our free product due to failing authentication service
Asking our Ops Team that manages and monitors these web services resulted in the following response:
"We do not see any errors on the Web Server nor do we have any reported availability problems on our authentication service. It's all green on our infrastructure dashboards as can be seen on the following screenshot:"
Infrastructure is all green: No HTTP-based errors or SLA problems based on IIS log or on any of the resources on the host
Web Server Log Monitoring Is Not Enough
As mentioned in the initial paragraph, it turned out that our SOAP Framework always returns HTTP 200 with the actual error in the response body. This is not an uncommon "Best (or worst) Practice" as you can see for instance on the following discussion on GitHub.
The problem with that approach though is that "traditional" operations monitoring based on web server log files will not detect any of these "logical/business" problems. As you don't want to wait until your users start complaining, it's time to level-up your monitoring approach. How can this be done? Those developing and those monitoring the system need to sit down and figure out a way how to monitor the usage of these services and need to talk with business to figure out which level of detail to report and alert on.
How can you find out if your current monitoring approach works? Start by looking more closely at problems reported by your users but that you don't get any automatic alerts on. Then, talk with engineers and see whether they use frameworks like mentioned here.
For further insight, and for lessons learned, click here for the full article.
The Internet of Things will greatly expand the opportunities for data collection and new business models driven off of that data. In her session at @ThingsExpo, Esmeralda Swartz, CMO of MetraTech, discussed how for this to be effective you not only need to have infrastructure and operational models capable of utilizing this new phenomenon, but increasingly service providers will need to convince a skeptical public to participate. Get ready to show them the money!
Jan. 30, 2015 02:30 AM EST Reads: 4,775
Scott Jenson leads a project called The Physical Web within the Chrome team at Google. Project members are working to take the scalability and openness of the web and use it to talk to the exponentially exploding range of smart devices. Nearly every company today working on the IoT comes up with the same basic solution: use my server and you'll be fine. But if we really believe there will be trillions of these devices, that just can't scale. We need a system that is open a scalable and by using ...
Jan. 30, 2015 02:00 AM EST Reads: 4,833
Code Halos - aka "digital fingerprints" - are the key organizing principle to understand a) how dumb things become smart and b) how to monetize this dynamic. In his session at @ThingsExpo, Robert Brown, AVP, Center for the Future of Work at Cognizant Technology Solutions, outlined research, analysis and recommendations from his recently published book on this phenomena on the way leading edge organizations like GE and Disney are unlocking the Internet of Things opportunity and what steps your o...
Jan. 30, 2015 02:00 AM EST Reads: 4,801
In their session at @ThingsExpo, Shyam Varan Nath, Principal Architect at GE, and Ibrahim Gokcen, who leads GE's advanced IoT analytics, focused on the Internet of Things / Industrial Internet and how to make it operational for business end-users. Learn about the challenges posed by machine and sensor data and how to marry it with enterprise data. They also discussed the tips and tricks to provide the Industrial Internet as an end-user consumable service using Big Data Analytics and Industrial C...
Jan. 30, 2015 01:00 AM EST Reads: 4,514
How do APIs and IoT relate? The answer is not as simple as merely adding an API on top of a dumb device, but rather about understanding the architectural patterns for implementing an IoT fabric. There are typically two or three trends: Exposing the device to a management framework Exposing that management framework to a business centric logic Exposing that business layer and data to end users. This last trend is the IoT stack, which involves a new shift in the separation of what stuff happe...
Jan. 30, 2015 12:30 AM EST Reads: 4,854
Log data provides the most granular view into what is happening across your systems, applications, and end users. Logs can show you where the issues are in real-time, and provide a historical trending view over time. Logs give you the whole picture. Logentries, a log management and analytics service built for the cloud, has announced a new integration with Slack, the team communication platform, to enable real-time system and application monitoring. Users of both services can now receive real-...
Jan. 29, 2015 11:45 PM EST Reads: 1,086
The 4th International DevOps Summit, co-located with16th International Cloud Expo – being held June 9-11, 2015, at the Javits Center in New York City, NY – announces that its Call for Papers is now open. Born out of proven success in agile development, cloud computing, and process automation, DevOps is a macro trend you cannot afford to miss. From showcase success stories from early adopters and web-scale businesses, DevOps is expanding to organizations of all sizes, including the world's large...
Jan. 29, 2015 10:00 PM EST Reads: 5,224
At 15th Cloud Expo, Shrikant Pattathil, Executive Vice President at Harbinger Systems, demos a video delivery platform that helps you do interactive videos. He discusses how Harbinger is accomplishing it in the cloud world, the problems they faced and the choices they made to get around these problems.
Jan. 29, 2015 09:15 PM EST Reads: 2,551
“Will Jaya is a direct source for server integration and storage solutions. If you are looking for any specific configurations for a project we can help you configure based on your needs and requirements," explained Netty Goya, CEO of Will Jaya, in this SYS-CON.tv interview at 15th Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Jan. 29, 2015 08:30 PM EST Reads: 2,499
“The year of the cloud – we have no idea when it's really happening but we think it's happening now. For those technology providers like Zentera that are helping enterprises move to the cloud - it's been fun to watch," noted Mike Loftus, VP Product Management and Marketing at Zentera Systems, in this SYS-CON.tv interview at Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Jan. 29, 2015 07:00 PM EST Reads: 3,617
“DevOps is really about the business. The business is under pressure today, competitively in the marketplace to respond to the expectations of the customer. The business is driving IT and the problem is that IT isn't responding fast enough," explained Mark Levy, Senior Product Marketing Manager at Serena Software, in this SYS-CON.tv interview at DevOps Summit, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Jan. 29, 2015 07:00 PM EST Reads: 4,090
IoT is still a vague buzzword for many people. In his session at @ThingsExpo, Mike Kavis, Vice President & Principal Cloud Architect at Cloud Technology Partners, discussed the business value of IoT that goes far beyond the general public's perception that IoT is all about wearables and home consumer services. He also discussed how IoT is perceived by investors and how venture capitalist access this space. Other topics discussed were barriers to success, what is new, what is old, and what th...
Jan. 29, 2015 06:15 PM EST Reads: 5,464
Dale Kim is the Director of Industry Solutions at MapR. His background includes a variety of technical and management roles at information technology companies. While his experience includes work with relational databases, much of his career pertains to non-relational data in the areas of search, content management, and NoSQL, and includes senior roles in technical marketing, sales engineering, and support engineering. Dale holds an MBA from Santa Clara University, and a BA in Computer Science f...
Jan. 29, 2015 06:00 PM EST Reads: 5,197
The Internet of Things (IoT) is rapidly in the process of breaking from its heretofore relatively obscure enterprise applications (such as plant floor control and supply chain management) and going mainstream into the consumer space. More and more creative folks are interconnecting everyday products such as household items, mobile devices, appliances and cars, and unleashing new and imaginative scenarios. We are seeing a lot of excitement around applications in home automation, personal fitness,...
Jan. 29, 2015 06:00 PM EST Reads: 4,945
Entuity®, a provider of enterprise-class network management solutions, today announced that it solidifies its position as a market leader through global enterprise customer acquisitions and a refined channel strategy. In 2014, Entuity increased new license revenues in EMEA by over 75 percent, and LATAM by over 125 percent as customers embraced Entuity for its highly automated solution and unified architecture. Entuity’s refined channel strategy focuses on even deeper strategic alignment with ke...
Jan. 29, 2015 05:00 PM EST Reads: 846
The 3rd International Internet of @ThingsExpo, co-located with the 16th International Cloud Expo - to be held June 9-11, 2015, at the Javits Center in New York City, NY - announces that its Call for Papers is now open. The Internet of Things (IoT) is the biggest idea since the creation of the Worldwide Web more than 20 years ago.
Jan. 29, 2015 03:30 PM EST Reads: 5,130
Cloud Technology Partners on Wednesday announced it has been recognized by the Modern Infrastructure Impact Awards as one of the Best Amazon Web Services (AWS) Consulting Partners. Selected by the editors of TechTarget's SearchDataCenter.com, and by votes from customers and strategic channel partners, the companies acknowledged by the Modern Infrastructure Impact Awards represent the top providers of cloud consulting services for AWS including application migration, application development, inf...
Jan. 29, 2015 03:00 PM EST Reads: 2,494
“We help people build clusters, in the classical sense of the cluster. We help people put a full stack on top of every single one of those machines. We do the full bare metal install," explained Greg Bruno, Vice President of Engineering and co-founder of StackIQ, in this SYS-CON.tv interview at 15th Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Jan. 29, 2015 02:45 PM EST Reads: 4,095
"People are a lot more knowledgeable about APIs now. There are two types of people who work with APIs - IT people who want to use APIs for something internal and the product managers who want to do something outside APIs for people to connect to them," explained Roberto Medrano, Executive Vice President at SOA Software, in this SYS-CON.tv interview at Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Jan. 29, 2015 02:30 PM EST Reads: 4,415
"Blue Box has been around for 10-11 years, and last year we launched Blue Box Cloud. We like the term 'Private Cloud as a Service' because we think that embodies what we are launching as a product - it's a managed hosted private cloud," explained Giles Frith, Vice President of Customer Operations at Blue Box, in this SYS-CON.tv interview at DevOps Summit, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Jan. 29, 2015 02:30 PM EST Reads: 4,134