Welcome!

Cloud Expo Authors: Carmen Gonzalez, Esmeralda Swartz, Elizabeth White, Lori MacVittie, Sandi Mappic

Related Topics: Cloud Expo, AJAX & REA

Cloud Expo: Article

Integrated Cloud-based Load Testing and Performance Management

New Integration from Keynote and dynaTrace

Load Testing has traditionally been done In-House with load-testing tools using machines in your test center to generate HTTP traffic against the application needing to be tested for high volume transactions. With agile development practices, shorter release cycles and higher number of users that will ultimately access a web application from more places around the world, in-house testing reached its limits. Maintaining a load-testing infrastructure that supports 10 or 100 thousands of users becomes costly. With rapidly changing applications, updating test scripts is also becoming a bigger challenge all the time, binding lots of test resources to just the task of maintaining test scripts. When running tests more frequently, analyzing test results becomes a task that consumes performance architects or engineers with analyzing graphs and log files in order to figure out what problems were just uncovered by the recent test.

Let’s summarize these problems/requirements:

  • It is important to run bigger loads than ever before as our apps are accessed by more users around the globe
  • Besides just running the load we want to know how end-user performance is perceived from different locations around the globe
  • It is costly to own and maintain a test environment large enough to support these loads
  • It is time consuming to constantly adapt test scripts to reflect the changes of every product iteration
  • It takes experienced performance engineers or architects too long to analyze the test results and identify the root cause of problems

Cloud-Based Load-Testing with integrated Application Performance Management
Cloud based Load Testing solves many of these problems by providing high-volume tests from around the globe at specific times at a manageable cost. But it comes with some requirements on the tested application, and services like this must meet certain requirements in order to solve the discussed problems:

  • The Application Under Test (AUT) must be accessible from the internet as the generated transactions are generated from machines around the globe and not within the local test environment. Companies usually use part of their production system at “off-hours” to host the version of the application to be tested. This allows running large scale tests without having to have a replicate of the production environment for testing
  • The load testing service must provide an easy way to create and update scripts to adapt to changes within a product’s iterations. Otherwise too much time and effort is put into setting up tests.
  • The service must integrate with performance management software that runs on the tested application. This allows correlating data shown in load testing reports (Response Times, Transaction Rates, Bandwidth Usage …) with data captured in the application infrastructure (Transaction Times, CPU, Memory, Exceptions, …)

Proof of Concept: Load Testing with Keynote integrated with Application Performance Management from dynaTrace
Together with Keynote’s Load Testing Consultants we set up the following environment showcasing the benefits of an integrated solution of Cloud-Based Load Testing and Application Performance Management.

Step 1: Deploying the application
We deployed a 4 tier (2 Java and 2 .NET Runtimes) eCommerce Travel Portal on a hosted virtual infrastructure so that it is accessible by the Cloud-based Load Testing Service. We also installed and configured dynaTrace to manage this multi-tier heterogeneous application in order to identify problems once we put load on the system.

Application Dependency between the 4 tier heterogenous application

Application Dependency between the 4 tier heterogeneous application

Step 2: Test Scripts and Keynote/dynaTrace Integration
Keynote modeled several use-case scenarios based on the testing requirements we had on our application. We ended up with use cases such as executing a specific search, accessing the last-minute offers page or purchasing a trip. dynaTrace provides an integration interface for Load-Testing and Monitoring Services that allows us to link every executed synthetic request with the transaction that dynaTrace traces on the application server when these requests are handled by the application.

Step 3: Running a test
We decided to run a test starting with increasing load to figure out where the breaking point of our application is. We started with a load of 3000 sessions per hour running for 15 minutes and increasing this load every 15 minutes to 6k, 9, and 12k sessions/hour. It turned out our application broke much faster than we anticipated :-)

Step 4: Analyzing the Load Testing Report
When I log into the Keynote Load Testing Portal I start by looking at the load testing report that shows me the executed sessions, response times, page views and errors:

Keynote Load Testing Report showing an application problem when increasing the load to 6k sessions per hour

Keynote Load Testing Report showing an application problem when increasing the load to 6k sessions per hour

It is easy to see that – once we went from Phase 1 (3k sessions) to Phase 2 (6k sessions) – our application’s response times go through the roof causing most of the simulated users to experience timeouts. A click on the Page Error graph shows that these errors are mainly timeouts or connection errors. The question now is: Is this problem an application problem or is it related to the infrastructure? Without having insight to the application these results could be interpreted in multiple different ways, e.g: our hosting company doesn’t provide enough bandwidth. That is the point when Application Performance Management helps answering these uncertainties.

Step 5: Looking at application performance data
I’ve created two dashboards that I use to analyze application performance while or after running a load test. The first one is an Infrastructure Dashboard where I display CPU and Memory Utilization of all 4 Application Runtimes that are involved:

The Dashboard shows us high memory und GC activity on our GoSpaceBackend which also leads to very high CPU Utilization

The Dashboard shows us high memory und GC activity on our GoSpaceBackend which also leads to very high CPU Utilization

The red measure in the JVM Memory Usage graph indicates GC Collection time. The red in the CPU Usage indicates the max CPU Usage of that JVM. The conclusion is therefore easy. High memory usage leads to high GC activity which maxes out our CPU.

The next Dashboard gives me insight into the application itself – with all the involved application layers and the individual transactions that dynaTrace analyzed coming from the Keynote Load Test:

Transaction Response Time on the Application Server and Breakdown into Application Layers proofs that the slow response times are application related

Transaction Response Time on the Application Server and Breakdown into Application Layers proves that the slow response times are application-related

On the left of the Dashboard I placed a transaction overview of the individual use cases Keynote executed during the load test. It is easy to spot that once the load got ramped up to 6k sessions we saw a dramatic increase in response time on our application server. That means that our first question is answered: it is not an infrastructure problem with our web hosting but an application-specific problem. With the knowledge we already have by looking at the memory and CPU measures we can already guess that this is the main contributor. The performance breakdown on the bottom right also highlights which application layers were contributing the most to the application transaction response time. A double click on that graph gives us a close-up on this data:

Our Persistence Layer, EJBs and JMS are the main contributors of the application performance

Our Persistence Layer, EJBs and JMS are the main contributors of the application performance

Step 6: Drilling deeper into the problem
dynaTrace captured every single request that was executed while running the load test. Its PurePath technology is the enabler of the dashboards we looked at earlier. The next step is to identify what is really going on in the application and where is the main impact of the increased load. The next dashboard I created gives me a better overview of the application architecture, showing me which methods are called most often and how well they execute. I am also interested in database activity as well as individual web requests that were slow:

Detailed Overview of where my application hotspots are including application layers, methods, web requests and database statements

Detailed Overview of where my application hot spots are including application layers, methods, web requests and database statements

The dashboard again shows us that the primary application layer impacted is our persistence layer. It is also very interesting that the slowest URL is a web service hosted by our back-end application server and that we have a very high number of database statements coming from only a few web requests. This information is really valuable for the application architects who need insight into application dynamics under heavy load.

Step 7: Show me the root cause of these slow-running transactions
Not only can we get an overview of which requests were slow and how many methods or database statements were executed. We can now look into individual transactions, and also compare transactions to see where the difference is between a slow-running and a fast-running transaction. dynaTrace allows me to drill down to those 718 transactions that executed the slow running web service and I can inspect each individually:

All transactions (PurePaths) available for analysis. Selecting the slowest shows me where this web service got called and where time was spent

All transactions (PurePaths) available for analysis. Selecting the slowest shows me where this web service got called and where time was spent

Looking at the duration, CPU duration and Suspension Duration (Garbage Collection) really highlights the problem that we have. Suspension Time is really high with those transactions impacting the overall execution time.

I can also pick one that ran very slow and one that ran fast, and let dynaTrace compare these two transactions for me and highlight the differences:

Comparison shows the structural and timing difference between two transactions making it easy to spot the actual differences

Comparison shows the structural and timing difference between two transactions making it easy to spot the actual differences

Not only do I see how Garbage Collection impacts execution time of individual methods and the overall transaction. It also shows me how different the same transaction executes in case of an error (such as thrown abort exception) – which brings me to one additional dashboard I like to look it. This one includes exceptions, logging messages and an overview on the Garbage Collection runs on individual methods:

Exceptions including full stack traces, log messages with the context of where the were logged and an overview of suspended methods

Exceptions including full stack traces, log messages with the context of where they were logged and an overview of suspended methods

Step 8: Hand off the data
Looking at this data was easy as I simply look at these dashboards after the load test is finished. The dashboards already helped identifying several hot spots, e.g: high memory consumption by the back-end web services causing high GC, too many SQL statements per request, many hidden exceptions that never made it to a proper log message, …

dynaTrace makes this captured data available to the engineering team in order to resolve these problems. They can either access the data by directly accessing the dynaTrace environment used to capture this information. Another way is to export individual PurePaths or maybe all of them into a dynaTrace Session file which can be exchanged via email, Instant Messenger or attached to a bug ticket.

Proved the Concept: Cloud Based Load Testing with APM is ready for Agile Development

The problems/requirements listed in the beginning of this blog are solved/met with the integrated solution from Keynote and dynaTrace:

  • Keynote runs large scale load tests by driving load from many different locations around the globe
  • The global-distributed load generation allows us to identify local content delivery problems (slow network connections, wrongly configured CDNs, …)
  • The costs are under control as you only pay for the load test but don’t pay for maintaining your own load-testing infrastructure that would sit idle most of the time
  • Keynote makes it easy to create scripts and offers services to do the scripting for you
  • dynaTrace automatically highlights the problems identified during the load test. High-level analysis through dashboards doesn’t require highly skilled performance architects. The fine-grained data captured, however, gives the performance engineers and software architects actionable data without digging through log files or manually correlating a multitude of different performance metrics
  • No change to your application is required to use this integration

Related reading:

  1. End-to-End Monitoring and Load Testing with Keynote and dynaTrace We’ve learned from recent studies that performance has a direct...
  2. VS2010 Load Testing for Distributed and Heterogeneous Applications powered by dynaTrace Visual Studio 2010 is almost here – Microsoft just released...
  3. Performance Analysis in Load Testing Collection diagnostics information in Load Testing is a challenging task....
  4. From Cloud Monitoring to Effective Cloud Management – Webinar with IntraLinks on July 15th 2010 I am hosting a Webinar with IntraLinks this Wednesday. The...
  5. Elevating Web- and Load-Testing with MicroFocus SilkPerformer Diagnostics powered by dynaTrace MicroFocus and dynaTrace recently announced “SilkPerformer Assurance” and with that...

More Stories By Andreas Grabner

Andreas Grabner has more than a decade of experience as an architect and developer in the Java and .NET space. In his current role, Andi works as a Technology Strategist for Compuware and leads the Compuware APM Center of Excellence team. In his role he influences the Compuware APM product strategy and works closely with customers in implementing performance management solutions across the entire application lifecycle. He is a frequent speaker at technology conferences on performance and architecture-related topics, and regularly authors articles offering business and technology advice for Compuware’s About:Performance blog.

@CloudExpo Stories
The Internet of Things will greatly expand the opportunities for data collection and new business models driven off of that data. In her session at Internet of @ThingsExpo, Esmeralda Swartz, CMO of MetraTech, will discuss how for this to be effective you not only need to have infrastructure and operational models capable of utilizing this new phenomenon, but increasingly service providers will need to convince a skeptical public to participate. Get ready to show them the money! Speaker Bio: ...
Compute virtualization has been transformational, yet security policy implementation and enforcement has lagged behind in agility and automation. There are a number of key considerations when implementing policy in private and hybrid clouds. In his session at 15th Cloud Expo, Holland Barry, VP of Technology at Catbird, will discuss the impact of this new paradigm and what organizations can do today to safely move to software-defined network and compute architectures, including: How normal ope...
Samsung VP Jacopo Lenzi, who headed the company's recent SmartThings acquisition under the auspices of Samsung's Open Innovaction Center (OIC), answered a few questions we had about the deal. This interview was in conjunction with our interview with SmartThings CEO Alex Hawkinson. IoT Journal: SmartThings was developed in an open, standards-agnostic platform, and will now be part of Samsung's Open Innovation Center. Can you elaborate on your commitment to keep the platform open? Jacopo Lenzi: S...
How do APIs and IoT relate? The answer is not as simple as merely adding an API on top of a dumb device, but rather about understanding the architectural patterns for implementing an IoT fabric. There are typically two or three trends: Exposing the device to a management framework Exposing that management framework to a business centric logic • Exposing that business layer and data to end users. This last trend is the IoT stack, which involves a new shift in the separation of what stuff hap...
SYS-CON Events announced today that SOA Software, an API management leader, will exhibit at SYS-CON's 15th International Cloud Expo®, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. SOA Software is a leading provider of API Management and SOA Governance products that equip business to deliver APIs and SOA together to drive their company to meet its business strategy quickly and effectively. SOA Software’s technology helps businesses to accel...
SYS-CON Events announced today that Utimaco will exhibit at SYS-CON's 15th International Cloud Expo®, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Utimaco is a leading manufacturer of hardware based security solutions that provide the root of trust to keep cryptographic keys safe, secure critical digital infrastructures and protect high value data assets. Only Utimaco delivers a general-purpose hardware security module (HSM) as a customiz...
Almost everyone sees the potential of Internet of Things but how can businesses truly unlock that potential. The key will be in the ability to discover business insight in the midst of an ocean of Big Data generated from billions of embedded devices via Systems of Discover. Businesses will also need to ensure that they can sustain that insight by leveraging the cloud for global reach, scale and elasticity.
SYS-CON Events announced today that ElasticBox is holding a Hackathon at DevOps Summit, November 6 from 12 pm -4 pm at the Santa Clara Convention Center in Santa Clara, CA. You can enter as an individual or team of up to 10 developers. A New Star Is Born Every Month! All completed ElasticBoxes will then be sent to a judging panel - 12 winners will be featured on the ElasticBox website in 2015. All entrants will receive five full enterprise licenses for one year + ElasticBox headphones + Elasti...
Once the decision has been made to move part or all of a workload to the cloud, a methodology for selecting that workload needs to be established. How do you move to the cloud? What does the discovery, assessment and planning look like? What workloads make sense? Which cloud model makes sense for each workload? What are the considerations for how to select the right cloud model? And how does that fit in with the overall IT tranformation? In his session at 15th Cloud Expo, John Hatem, head of V...
Cloud services are the newest tool in the arsenal of IT products in the market today. These cloud services integrate process and tools. In order to use these products effectively, organizations must have a good understanding of themselves and their business requirements. In his session at 15th Cloud Expo, Brian Lewis, Principal Architect at Verizon Cloud, will outline key areas of organizational focus, and how to formalize an actionable plan when migrating applications and internal services to...
SAP is delivering break-through innovation combined with fantastic user experience powered by the market-leading in-memory technology, SAP HANA. In his General Session at 15th Cloud Expo, Thorsten Leiduck, VP ISVs & Digital Commerce, SAP, will discuss how SAP and partners provide cloud and hybrid cloud solutions as well as real-time Big Data offerings that help companies of all sizes and industries run better. SAP launched an application challenge to award the most innovative SAP HANA and SAP ...
Ixia develops amazing products so its customers can connect the world. Ixia helps its customers provide an always-on user experience through fast, secure delivery of dynamic connected technologies and services. Through actionable insights that accelerate and secure application and service delivery, Ixia's customers benefit from faster time to market, optimized application performance and higher-quality deployments.
SYS-CON Events announced today that Calm.io has been named “Bronze Sponsor” of DevOps Summit Silicon Valley, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Calm.io is a cloud orchestration platform for AWS, vCenter, OpenStack, or bare metal, that runs your CL tools puppet, Chef, shell, git, Jenkins, nagios, and will soon support New Relic and Docker. It can run hosted, or on premise and provides VM automation / expiry, self-service portals,...
In her General Session at 15th Cloud Expo, Anne Plese, Senior Consultant, Cloud Product Marketing, at Verizon Enterprise, will focus on finding the right mix of renting vs. buying Oracle capacity to scale to meet business demands, and offer validated Oracle database TCO models for Oracle development and testing environments. Anne Plese is a marketing and technology enthusiast/realist with over 19+ years in high tech. At Verizon Enterprise, she focuses on driving growth for the Verizon Cloud pla...
SYS-CON Events announced today that Aria Systems, the recurring revenue expert, has been named "Bronze Sponsor" of SYS-CON's 15th International Cloud Expo®, which will take place on November 4-6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Aria Systems helps leading businesses connect their customers with the products and services they love. Industry leaders like Pitney Bowes, Experian, AAA NCNU, VMware, HootSuite and many others choose Aria to power their recurring revenue bu...
The Internet of Things (IoT) is going to require a new way of thinking and of developing software for speed, security and innovation. This requires IT leaders to balance business as usual while anticipating for the next market and technology trends. Cloud provides the right IT asset portfolio to help today’s IT leaders manage the old and prepare for the new. Today the cloud conversation is evolving from private and public to hybrid. This session will provide use cases and insights to reinforce t...
As Platform as a Service (PaaS) matures as a category, developers should have the ability to use the programming language of their choice to build applications and have access to a wide array of services. Bluemix is IBM's open cloud development platform that enables users to easily build cloud-based, creative mobile and web applications without having to spend large amounts of time and resources on configuring infrastructure and multiple software licenses. In this track, you will learn about the...
Blue Box has closed a $10 million Series B financing. The round was led by a strategic investor and included participation from prior investors including Voyager Capital and Founders Collective, as well as the Blue Box executive team. This round follows a $4.3 million Series A closed in December of 2012 and led by Voyager Capital. In May of this year, the company announced general availability of its private cloud as a service offering, Blue Box Cloud. Since that release, the company has dem...
SYS-CON Events announced today that Verizon has been named "Gold Sponsor" of SYS-CON's 15th International Cloud Expo®, which will take place on November 4-6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Verizon Enterprise Solutions creates global connections that generate growth, drive business innovation and move society forward. With industry-specific solutions and a full range of global wholesale offerings provided over the company's secure mobility, cloud, strategic network...
SimpleECM is the only platform to offer a powerful combination of enterprise content management (ECM) services, capture solutions, and third-party business services providing simplified integrations and workflow development for solution providers. SimpleECM is opening the market to businesses of all sizes by reinventing the delivery of ECM services. Our APIs make the development of ECM services simple with the use of familiar technologies for a frictionless integration directly into web applicat...