Cloud Expo: Article

Integrated Cloud-based Load Testing and Performance Management

New Integration from Keynote and dynaTrace

Load testing has traditionally been done in-house, with load-testing tools running on machines in your test center to generate HTTP traffic against the application that needs to be tested for high transaction volumes. With agile development practices, shorter release cycles and a growing number of users who will ultimately access a web application from more places around the world, in-house testing has reached its limits. Maintaining a load-testing infrastructure that supports tens or hundreds of thousands of users becomes costly. With rapidly changing applications, keeping test scripts up to date is also a growing challenge that ties up test resources in script maintenance alone. And when tests run more frequently, analyzing the results consumes performance architects and engineers, who must work through graphs and log files to figure out which problems the latest test uncovered.

Let’s summarize these problems/requirements:

  • It is important to run bigger loads than ever before as our apps are accessed by more users around the globe
  • Besides just running the load we want to know how end-user performance is perceived from different locations around the globe
  • It is costly to own and maintain a test environment large enough to support these loads
  • It is time consuming to constantly adapt test scripts to reflect the changes of every product iteration
  • It takes experienced performance engineers or architects too long to analyze the test results and identify the root cause of problems

Cloud-Based Load-Testing with integrated Application Performance Management
Cloud-based load testing solves many of these problems by providing high-volume tests from around the globe at specific times and at a manageable cost. It does, however, place requirements on both the tested application and the testing service:

  • The Application Under Test (AUT) must be accessible from the internet, because the transactions are generated by machines around the globe rather than within the local test environment. Companies usually use part of their production system during off-hours to host the version of the application to be tested. This allows running large-scale tests without having to maintain a replica of the production environment for testing
  • The load testing service must provide an easy way to create and update scripts to adapt to changes within a product’s iterations. Otherwise too much time and effort is put into setting up tests.
  • The service must integrate with performance management software that runs on the tested application. This allows correlating data shown in load testing reports (Response Times, Transaction Rates, Bandwidth Usage …) with data captured in the application infrastructure (Transaction Times, CPU, Memory, Exceptions, …)

Proof of Concept: Load Testing with Keynote integrated with Application Performance Management from dynaTrace
Together with Keynote’s Load Testing Consultants we set up the following environment showcasing the benefits of an integrated solution of Cloud-Based Load Testing and Application Performance Management.

Step 1: Deploying the application
We deployed a 4-tier (2 Java and 2 .NET runtimes) eCommerce travel portal on a hosted virtual infrastructure so that it is accessible to the cloud-based load testing service. We also installed and configured dynaTrace to manage this multi-tier heterogeneous application so that we could identify problems once we put load on the system.

Application dependencies within the 4-tier heterogeneous application

Step 2: Test Scripts and Keynote/dynaTrace Integration
Keynote modeled several use-case scenarios based on the testing requirements we had on our application. We ended up with use cases such as executing a specific search, accessing the last-minute offers page or purchasing a trip. dynaTrace provides an integration interface for Load-Testing and Monitoring Services that allows us to link every executed synthetic request with the transaction that dynaTrace traces on the application server when these requests are handled by the application.
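
As a rough illustration of how such a link can work in principle, the sketch below tags each synthetic request with a correlation ID in an HTTP header and later joins the load-test results with server-side traces on that ID. The header name `X-Test-Correlation`, the record fields and the helper names are assumptions made for this example; they are not the actual Keynote or dynaTrace interface.

```python
import uuid

# Hypothetical header name, chosen for this sketch only.
CORRELATION_HEADER = "X-Test-Correlation"


def tag_request(headers, script_name, step_name):
    """Return a copy of the headers plus a unique correlation tag."""
    request_id = uuid.uuid4().hex
    tagged = dict(headers)
    tagged[CORRELATION_HEADER] = f"{script_name};{step_name};{request_id}"
    return tagged, request_id


def join_results(client_results, server_traces):
    """Pair each client-side measurement with the server-side trace that
    carries the same correlation ID. Requests without a trace map to None."""
    traces_by_id = {trace["request_id"]: trace for trace in server_traces}
    return [(result, traces_by_id.get(result["request_id"]))
            for result in client_results]
```

With a tag like this on every request, a slow page in the load-testing report can be looked up by ID on the server side instead of being matched by timestamp.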

Step 3: Running a test
We decided to run a test with step-wise increasing load to find the breaking point of our application. We started with a load of 3,000 sessions per hour for 15 minutes and increased the load every 15 minutes to 6k, 9k, and 12k sessions per hour. It turned out our application broke much sooner than we anticipated :-)
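
To make the ramp concrete, here is a small sketch that turns the hourly rates into the number of sessions started per 15-minute phase. It assumes a constant rate within each phase; the function names are mine and not part of any Keynote tooling.

```python
PHASE_MINUTES = 15
RATES_PER_HOUR = [3000, 6000, 9000, 12000]  # rates from the test plan


def sessions_in_phase(rate_per_hour, minutes=PHASE_MINUTES):
    """Sessions started during one phase at a constant hourly rate."""
    return rate_per_hour * minutes // 60


def ramp_schedule(rates=RATES_PER_HOUR, minutes=PHASE_MINUTES):
    """Return (start_minute, rate_per_hour, sessions) for each phase."""
    return [(i * minutes, rate, sessions_in_phase(rate, minutes))
            for i, rate in enumerate(rates)]


for start, rate, sessions in ramp_schedule():
    print(f"minute {start:>2}: {rate:>5} sessions/hour -> {sessions} sessions")
```

Under that assumption the four phases would start 750, 1,500, 2,250 and 3,000 sessions, or 7,500 sessions over the full hour if the test ran to completion.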

Step 4: Analyzing the Load Testing Report
When I log into the Keynote Load Testing Portal I start by looking at the load testing report that shows me the executed sessions, response times, page views and errors:

Keynote Load Testing Report showing an application problem when increasing the load to 6k sessions per hour

It is easy to see that once we went from Phase 1 (3k sessions) to Phase 2 (6k sessions), our application’s response times went through the roof, causing most of the simulated users to experience timeouts. A click on the Page Error graph shows that these errors are mainly timeouts or connection errors. The question now is: is this an application problem, or is it related to the infrastructure? Without insight into the application, these results could be interpreted in several ways, e.g. that our hosting company doesn’t provide enough bandwidth. This is the point where Application Performance Management helps resolve these uncertainties.
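
One simple way to frame that question is to compare the response time measured by the load generator with the time the application server spent on the same request. The sketch below does this; the function names and the 0.5 threshold are assumptions for illustration, not a rule used by Keynote or dynaTrace.

```python
def server_share(client_ms, server_ms):
    """Fraction of the client-observed response time spent inside the application."""
    if client_ms <= 0:
        raise ValueError("client_ms must be positive")
    return min(server_ms / client_ms, 1.0)


def likely_bottleneck(client_ms, server_ms, threshold=0.5):
    """Rough classification; the 0.5 threshold is an arbitrary assumption."""
    if server_share(client_ms, server_ms) >= threshold:
        return "application"
    return "network/infrastructure"
```

If most of a slow request’s time is spent on the server, more bandwidth will not help; if the server time is small, the network or hosting setup deserves a closer look.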

Step 5: Looking at application performance data
I’ve created two dashboards that I use to analyze application performance during or after a load test. The first one is an Infrastructure Dashboard where I display CPU and memory utilization of all 4 application runtimes that are involved:

The dashboard shows high memory usage and GC activity on our GoSpaceBackend, which also leads to very high CPU utilization

The red measure in the JVM Memory Usage graph indicates garbage collection (GC) time. The red measure in the CPU Usage graph indicates the maximum CPU usage of that JVM. The conclusion is straightforward: high memory usage leads to high GC activity, which maxes out our CPU.

The next Dashboard gives me insight into the application itself – with all the involved application layers and the individual transactions that dynaTrace analyzed coming from the Keynote Load Test:

Transaction Response Time on the Application Server and Breakdown into Application Layers proves that the slow response times are application-related

On the left of the dashboard I placed a transaction overview of the individual use cases Keynote executed during the load test. It is easy to spot that once the load was ramped up to 6k sessions, we saw a dramatic increase in response time on our application server. That answers our first question: it is not an infrastructure problem with our web hosting but an application-specific problem. From the memory and CPU measures we looked at earlier, we can already guess that garbage collection is the main contributor. The performance breakdown on the bottom right also highlights which application layers contributed the most to the transaction response time. A double click on that graph gives us a close-up of this data:

Our persistence layer, EJBs and JMS are the main contributors to the application’s response time
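
The idea behind such a breakdown can be sketched in a few lines: sum the time spent per layer and express each layer as a share of the total. The layer names and numbers below are made up for illustration and are not measurements from this test.

```python
def layer_shares(layer_ms):
    """Return each layer's share of total time in percent, largest first."""
    total = sum(layer_ms.values())
    if total == 0:
        return []
    shares = [(layer, 100.0 * ms / total) for layer, ms in layer_ms.items()]
    return sorted(shares, key=lambda item: item[1], reverse=True)


# Made-up numbers for illustration only, not measurements from this test.
example = {"Persistence": 500, "EJB": 300, "JMS": 150, "Servlet": 50}
for layer, pct in layer_shares(example):
    print(f"{layer:<12} {pct:5.1f}%")
```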

Step 6: Drilling deeper into the problem
dynaTrace captured every single request that was executed during the load test. Its PurePath technology is what enables the dashboards we looked at earlier. The next step is to identify what is really going on in the application and where the increased load has the biggest impact. The next dashboard I created gives me a better overview of the application architecture, showing which methods are called most often and how well they execute. I am also interested in database activity and in individual web requests that were slow:

Detailed Overview of where my application hot spots are including application layers, methods, web requests and database statements

The dashboard again shows us that the primary application layer impacted is our persistence layer. It is also very interesting that the slowest URL is a web service hosted by our back-end application server and that we have a very high number of database statements coming from only a few web requests. This information is really valuable for the application architects who need insight into application dynamics under heavy load.
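
A quick way to quantify “many database statements from few web requests” is the average number of SQL statements per request for each URL. This is a minimal sketch; the input shape and the limit of 20 statements per request are assumptions for the example, not dynaTrace defaults.

```python
def statements_per_request(stats):
    """stats maps a URL to (request_count, sql_statement_count)."""
    return {url: sql / requests
            for url, (requests, sql) in stats.items() if requests > 0}


def chatty_urls(stats, limit=20):
    """URLs whose average SQL statements per request exceed the limit.
    The default limit of 20 is an arbitrary assumption for this sketch."""
    ratios = statements_per_request(stats)
    return sorted((url for url, ratio in ratios.items() if ratio > limit),
                  key=lambda url: ratios[url], reverse=True)
```

URLs that end up on this list are good candidates for a closer look at caching or query batching in the persistence layer.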

Step 7: Show me the root cause of these slow-running transactions
We not only get an overview of which requests were slow and how many methods or database statements were executed; we can also look into individual transactions and compare them to see what distinguishes a slow-running transaction from a fast-running one. dynaTrace allows me to drill down to the 718 transactions that executed the slow-running web service and inspect each one individually:

All transactions (PurePaths) available for analysis. Selecting the slowest shows me where this web service got called and where time was spent

Looking at the duration, CPU duration and suspension duration (garbage collection) really highlights the problem that we have. Suspension time is very high for these transactions, which inflates their overall execution time.
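
The relationship between these three measures can be expressed as a simple ratio: how much of a transaction’s duration was lost to suspension. The sketch below uses a hypothetical record type and an arbitrary 30% threshold; neither comes from dynaTrace.

```python
from dataclasses import dataclass


@dataclass
class TransactionTiming:
    duration_ms: float     # total execution time
    cpu_ms: float          # time spent on CPU
    suspension_ms: float   # time the runtime was suspended, e.g. by GC

    def suspension_share(self):
        """Fraction of the total duration lost to suspension."""
        if self.duration_ms <= 0:
            return 0.0
        return self.suspension_ms / self.duration_ms


def gc_bound(transactions, threshold=0.3):
    """Transactions that lost more than `threshold` of their time to suspension."""
    return [t for t in transactions if t.suspension_share() > threshold]
```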

I can also pick one that ran very slow and one that ran fast, and let dynaTrace compare these two transactions for me and highlight the differences:

Comparison shows the structural and timing difference between two transactions making it easy to spot the actual differences
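
A much simplified version of such a comparison is sketched below. It only compares flat per-method timings of two transactions, not the full call-tree structure that the dashboard shows, and the input format is an assumption for this example.

```python
def compare_timings(fast, slow):
    """fast and slow map method names to execution time in ms.
    Returns (method, fast_ms, slow_ms, delta_ms) sorted by largest slowdown.
    A method missing from one transaction is reported with None on that side."""
    rows = []
    for method in sorted(set(fast) | set(slow)):
        f, s = fast.get(method), slow.get(method)
        delta = (s or 0) - (f or 0)
        rows.append((method, f, s, delta))
    return sorted(rows, key=lambda row: row[3], reverse=True)
```

Methods that appear in only one of the two transactions are often the interesting ones, because they point at a different code path such as error handling.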

Not only do I see how garbage collection impacts the execution time of individual methods and the overall transaction; the comparison also shows me how differently the same transaction executes in case of an error (such as a thrown abort exception). That brings me to one additional dashboard I like to look at. It includes exceptions, logging messages and an overview of the garbage collection runs that affected individual methods:

Exceptions including full stack traces, log messages with the context of where they were logged and an overview of suspended methods

Step 8: Hand off the data
Looking at this data was easy, as I simply open these dashboards after the load test has finished. The dashboards already helped identify several hot spots, e.g. high memory consumption by the back-end web services causing high GC activity, too many SQL statements per request, and many hidden exceptions that never made it to a proper log message.

dynaTrace makes this captured data available to the engineering team so that they can resolve these problems. They can either access the data directly in the dynaTrace environment used to capture it, or work from individual PurePaths (or all of them) exported into a dynaTrace Session file, which can be exchanged via email or instant messenger or attached to a bug ticket.

Proved the Concept: Cloud Based Load Testing with APM is ready for Agile Development

The problems/requirements listed at the beginning of this blog are solved/met by the integrated solution from Keynote and dynaTrace:

  • Keynote runs large scale load tests by driving load from many different locations around the globe
  • The globally distributed load generation allows us to identify local content delivery problems (slow network connections, wrongly configured CDNs, …)
  • The costs are under control as you only pay for the load test but don’t pay for maintaining your own load-testing infrastructure that would sit idle most of the time
  • Keynote makes it easy to create scripts and offers services to do the scripting for you
  • dynaTrace automatically highlights the problems identified during the load test. High-level analysis through dashboards doesn’t require highly skilled performance architects. The fine-grained data captured, however, gives the performance engineers and software architects actionable data without digging through log files or manually correlating a multitude of different performance metrics
  • No change to your application is required to use this integration

More Stories By Andreas Grabner

Andreas has over a decade of experience as an architect and developer, and currently works as a senior performance architect and technology strategist for dynaTrace Software, where he influences product strategy and works closely with customers in implementing performance management solutions across the application life cycle. He is a regular speaker at software conferences, writes for a number of technology publications, and blogs at http://blog.dynatrace.com
