The Definitive Guide to Application Performance Management

by Greg Shields


If one of your servers went down today, would you know about it? Probably. But if an obscure service on a small server that is part of a much larger solution started transacting at a slower rate than normal, would you? Perhaps not. If that obscure service is responsible for a critical activity in your business, its slight reduction in performance might be a big deal that you need to know about!

The Definitive Guide to Application Performance Management offers a comprehensive look at Application Performance Management (APM), an industry-leading mechanism for monitoring business services and applications. APM looks at every facet of the IT infrastructure, from the services to the users themselves, to gain a holistic understanding of what’s going on in your business applications. Using APM, you can quickly dive into the root causes of problems within your infrastructure. You will understand the level of impact associated with those problems, and will recognize and quantify the overall health of your IT infrastructure and business services.


Chapter 1: What is Application Performance Management?

This Web site is experiencing unexpectedly high volume. Please try again later.

You’ve seen these words before. Perhaps you were buying a just-released book or video through an online Web site and it popped up in the middle of checking out. Maybe you were trying to get tickets to that important sporting event or that one-night-only concert. What about when the winter storm of the century hits your airport and thousands of people scramble at once to find a new flight or a hotel room?

Each of these scenarios is strikingly similar to the others. An IT service struggles to keep up with the load of its users, until that load finally overwhelms its capabilities. You, the end consumer, are greeted with a pleasant message that effectively tells you…nothing. You don’t know what happened. You don’t know the status of the problem or its resolution. You don’t even know when that suggested “later” may be for you to try again. So, you—and everyone else—find yourself hitting the Refresh button over and over again impatiently waiting for a better response.

Or, in extreme situations, you stop doing business with that site entirely.

These scenarios are also remarkable in how often they’re seen by the end customers of Web and other IT-based services. When they work, the IT services used by businesses are fantastically efficient in servicing customers. Yet when they don’t, the result is the online equivalent of a "Closed for Business" sign hanging on the front door.

You might have experienced situations like this in other places. Perhaps the problem isn’t in an online e-commerce system. Maybe a similar outage of service happened within an internal business application, the functionality of which is critical to getting your job done. Maybe an underlying IT infrastructure component such as name resolution or the network itself experiences a problem. The result of that low-level issue manifests itself in ways that are seemingly unrelated to the actual problem.

The central problem in all of these situations is an inability to properly manage application performance.

The goal of this first chapter, and indeed of this guide as a whole, is to help you understand the critical need for managing your applications’ performance and behaviors. Chapter 1 documents the problem with today’s traditional monitoring solutions and explains why APM is an effective solution. It also introduces APM and shows where it fits into the environment.

Chapter 2: How APM Aligns IT with the Business

There’s a central problem intrinsic to many IT organizations. This problem relates to IT’s ability to consider itself an integral part of the business, and ultimately the profitability of that business. The problem isn’t necessarily sourced from IT itself. In its relatively short history, “the people who fix the computers” have long served as a secondary function of business. For a very long period of time, the only time IT professionals were needed—or even seen—on the business floor was when something broke. Having a problem with your computer today? Call the IT Help desk line (see Figure 2.1) and someone will magically appear at your desk in a few hours.

Figure 2.1: IT is seen as “the people who fix,” a common sight in many businesses.

When the business didn’t need IT, these groups of people usually found themselves shuffled away to other parts of the building. Taking over closets and storage rooms behind locked doors, they waited there for the next problem to be fixed.

Over time, this break/fix mentality grows deeply ingrained in both the members of IT and the rest of the business who rely on them for services. When IT operates in a break/fix mode, they usually find themselves reacting to problems. A critical server is down today? Here come the IT “white knights”, riding in to work through the night and ultimately save the day.

But at the same time, the break/fix mentality’s “hero effect” actually becomes a liability to the business. IT organizations that see themselves as the heroes to be called when problems occur probably aren’t spending the right amount of time preventing those problems from occurring in the first place. If that critical server was actually reporting a problem for weeks before it finally crashed, IT is no hero in getting it running again; they’re actually the problem.

Why this disconnect between IT and the business? Other than a historical position inside the company’s locked storage closets, what are the causes behind IT’s reactive mindset? Differing responsibilities and mismatched priorities with the rest of the business, a lack of common vocabulary, and a missing vision into the business’ dollars and cents are all common factors.

In order to truly bridge the gap between business needs and IT activities, an organization must demonstrate maturity in its IT processes. Chapter 2 focuses heavily on understanding the IT maturity gains that can be achieved through the implementation of an APM solution.

Chapter 3: Understanding APM Monitoring

The previous chapter of this book spent a lot of time discussing the concepts of IT organizational maturity. Although that conversation has little to do with monitoring integrations and their technological bits and bytes, it serves to illuminate how IT organizations themselves must grow as the systems they manage grow in complexity. As an example, a Chaotic or Reactive IT organization will simply not be successful when tasked to manage a highly‐critical, customer‐focused application. The processes, the mindset, and the technology simply aren’t in place to ensure good things happen.

To that end, IT has seen a similar evolution in the approaches used for monitoring its infrastructure. IT’s early efforts towards understanding its systems’ “under the covers” behaviors have evolved in many ways similar to Gartner’s depiction of organizational maturity. Early attempts were exceptionally coarse in the data they provided, with each new approach involving richer integrations at deeper levels within the system.

IT organizations that manage complex and customer‐facing systems are under a greater level of due diligence than those who manage a simple infrastructure. As such, the tools used to watch those systems must also have a higher level of due diligence. As monitoring technologies have evolved over time, new approaches have been developed that extend the reach of monitoring, enhance data resolution, and enable rich visualizations to assist administrative and troubleshooting teams. This chapter discusses how this evolution has occurred and where monitoring is today. As you’ll find, APM aggregates the lessons learned from each previous generation to create a unified system that leverages every approach simultaneously.

Chapter 4: Integrating APM into Your Infrastructure

Integrating an Application Performance Management solution into your environment is no trivial task. Although best-in-class APM software comes equipped with predefined templates and automated deployment mechanisms that ease its connection to IT components, its widespread coverage means that the initial setup and configuration are quite a bit more than any "Next, Next, Finish."

That statement isn’t written to scare away any business from a potential APM installation. Although a solution’s installation will require the development of a project plan and coordination across multiple teams, the benefits gained in assuring quality services to customers are tremendous. Any APM solution requires the involvement of each of IT's traditional silos. Each technology domain (networks, servers, applications, clients, and mainframes) will have some involvement in the project. That involvement can span from installing an APM's agents to servers and clients to configuring SNMP and/or NetFlow settings on network hardware to integrating APM monitoring into off-the-shelf or homegrown applications.

In Chapter 4, learn more about integrating Application Performance Management into your infrastructure such that you can extract and analyze environmental data in order to assure the highest level of service quality and an overall sense of system "health".

Chapter 5: Understanding the End User's Perspective

Chapter 3 of this guide walked you through the entire history of IT monitoring as we know it. Starting with the basics of "ping" responses, through SNMP polls, agent and agentless perspectives, and concluding with application analytics and transaction gathering, the history of monitoring has evolved dramatically over time. With each evolution, the areas in which monitoring integrates with your systems grow richer while their data grows more useful to the business. As continued in Chapter 4, each successive approach adds yet another layer to the overall view into a computing environment.

Yet the discussion in Chapters 3 and 4 concluded at the very point where experience-based monitoring actually starts to get interesting. With the development of End User Experience Monitoring (EUE), automated solutions for watching your business systems get their first looks into the actual behaviors experienced by an application's users. Gathering metrics from the perspective of the users themselves brings a level of objective analysis to what has traditionally been a subjective problem. If you’ve ever dealt with the dreaded "the servers are slow today" phone call, you understand this problem.

Chapter 5 offers a discussion of End User Experience Monitoring and the most valuable perspectives to consider when targeting your EUE monitoring. This is followed by an examination of helpful performance measurements relating to the amount of time required to complete a transaction between two elements. Finally, Chapter 5 discusses ways to leverage EUE in order to improve application quality.
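
As a rough illustration of what "the amount of time required to complete a transaction between two elements" can mean in practice, the following Python sketch times a single transaction and summarizes a set of samples. It is not taken from the guide or from any APM product; the names time_transaction and summarize, and the idea of passing in a replaceable clock, are assumptions made for this example only.

```python
import time
from statistics import median


def time_transaction(transaction, clock=time.perf_counter):
    """Run one transaction and return (result, elapsed_seconds).

    `transaction` is any zero-argument callable (hypothetical stand-in for
    a request between two elements). `clock` is injectable so the timing
    logic can be checked without real delays.
    """
    start = clock()
    result = transaction()
    return result, clock() - start


def summarize(samples):
    """Summarize elapsed times (in seconds) as typical and worst case."""
    ordered = sorted(samples)
    return {
        "count": len(ordered),
        "median": median(ordered),
        "max": ordered[-1],
    }
```

Collecting such samples from the places where users actually sit, rather than only from the server room, is what turns "the servers are slow today" into a number that can be compared over time.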

Chapter 6: APM's Service-Centric Monitoring Approach

This guide has spent a lot of time talking about monitoring and monitoring integrations. It discussed the history of monitoring. It explained where and how monitoring can be integrated into your existing environment. It outlined in great detail how end user experience (EUE) monitoring layers over the top of traditional monitoring approaches. Yet in all these discussions, there has been little talk so far about how that monitoring is actually manifested into an APM solution’s end result.

In this chapter, you will learn about the process of fitting together holistic sets of data captured from the IT infrastructure in order to derive meaningful, actionable information. You will discover that the real magic in an APM solution comes through the creation and use of its Service Model.
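
One way to picture a Service Model is as a dependency tree in which a business service's health is derived from the health of everything beneath it. The Python sketch below is only an assumed simplification: the three-level health scale, the Node class, and the "worst status wins" roll-up rule are illustrative choices for this example, not the guide's definition or any vendor's implementation.

```python
from dataclasses import dataclass, field

# Assumed three-level health scale, ordered from best to worst.
SEVERITY = {"healthy": 0, "degraded": 1, "down": 2}


@dataclass
class Node:
    """A service or infrastructure component in a hypothetical service model."""
    name: str
    status: str = "healthy"  # this node's own measured status
    depends_on: list = field(default_factory=list)

    def health(self):
        """Return the worst status among this node and all its dependencies."""
        statuses = [self.status] + [dep.health() for dep in self.depends_on]
        return max(statuses, key=SEVERITY.__getitem__)


# Example: a low-level name resolution problem surfaces at the business service.
dns = Node("name resolution", status="degraded")
database = Node("database")
web = Node("web server", depends_on=[dns])
checkout = Node("checkout service", depends_on=[web, database])
```

Here checkout.health() evaluates to "degraded" even though the checkout service, web server, and database each report "healthy" on their own, which mirrors how a low-level issue can show up in seemingly unrelated places.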

Chapter 7: Developing and Building APM Visualizations

If, to you, a picture is worth more than a thousand words, this chapter is one not to miss. This guide's growing explanation of APM has introduced each new topic with an end goal in mind. That end goal, both for this guide and for APM in general, is to gather the data needed to create a set of visualizations useful to the business.

It is the word “useful” that is most important in the previous sentence. “Useful” in this context means that the visualization is providing the right data to the right person. “Useful” also means providing that data in a way that makes sense for and provides value to its consumer.

The concept of digestibility was first introduced in this book's companion, The Definitive Guide to Business Service Management. In both guides, the digestibility of data relates to the ways in which it can be usefully presented to various classes of users. For example, data that may be valuable to a developer is not likely to have the same value for Dan the COO. Dan's role might care less about the failure of an individual network component compared with how that component impacts the system's customers. Each person in the business has a role to fill, and as such, different views of data are necessary.

Chapter 8: Seeing APM in Action

You could easily think of this chapter as the "companion" to the previous chapter. In Chapter 7, you learned about the best practices in creating APM visualizations. By analyzing a sample of mocked-up dashboards, the previous chapter presented a number of ways in which APM visualizations enhance IT operations as well as business decision making.

However, it's worth saying again that APM is all about its pictures. The return provided by an APM solution comes from the data it presents to the multiple classes of users in your organization: administrators, developers, business executives, Service Desk employees, end users. As such, truly understanding the value in those visualizations needs a second look.

You've hopefully been enjoying the story that runs through this guide's chapters, following how Dan, John, and the entire cast of players have evolved along with their monitoring capabilities. You've seen how John's job has gotten easier as the notifications he and his team receive grow more useful. You've also seen how Dan gets the quality information he needs to make data-driven decisions as a business executive.

But what you really haven't experienced to this point is a complete walkthrough of the entire process. Such a walkthrough can tell the extended story of a potentially scary problem along with how the APM solution set assists. That storytelling happens in this chapter. Its goal is to add a dash of humanity to the 20,000-foot perspective that has been at the forefront thus far in this guide.

The story you'll be reading is entirely fictional but is based on the types of problems and war stories that you probably experience every day. Every IT organization in every business has its share of technology issues; they're a fundamental part of doing business with technology. This chapter's story is written to show how the presence of a fully-implemented APM solution enables every actor to do a better job supporting the needs of the company as a whole.

You'll also quickly notice that you've already seen many of the images here in previous chapters. Many are slightly-adjusted views of those seen in Chapter 7, where this guide's extended discussion on pictures and data-filled visualizations was most prevalent. Where feasible, the images have been altered to fit the storyline and its characters. This reuse is done intentionally, to bring a sense of continuity between the topics that have already been discussed and the story that ensues.

Chapter 9: APM Enables Business Service Management

The topics in Chapter 8 focus on the technologies involved in APM along with the performance and availability metrics associated with those technologies. The resulting visualizations are heavily focused on the needs of the technologist.

Missing from the previous chapter's discussion is another set of business-related metrics that convert technology behavior into usable data for business leaders. This class of data tells the tale of how a business service ultimately benefits, or takes away from, the business' bottom line. It also creates a standard by which the quality of that service's delivery can be measured. The gathering, calculation, and reporting of these business-related metrics make up the methodology known as Business Service Management.

Chapter 10: The APM Cheat Sheet

APM is obviously a big and complex topic. With monitoring integrations that span networks, servers, and end users along with all the transactions in between, a fully-realized APM solution quickly finds itself wrapped around every part of your network. Truly understanding how an APM solution adds value to your business services requires every part of a multi-chapter book. This book's definitive approach to explaining APM strategy and tactics gives you the information you need to make smart purchasing decisions. Once you have decided, its chapters help guide you toward the best ways to put the solution into place.

However, not everyone has the time or the interest to pore over a 200-page tome. Digesting this guide's 200-odd pages will consume more than an afternoon, making the topic hard to approach for the busy executive or IT director. To remedy this situation, this final chapter is published as a sort of "Cheat Sheet" for the other nine. Using excerpts from each of the previous chapters, this "shortcut" guide summarizes the information you need to know in a more bite-size format.

So, how is this chapter best utilized? Hand it out to your business leaders as a walkthrough for APM's business value. Pass it around your IT department to give them an idea of APM's technical underpinnings. Show it to your Service Desk employees as an example of the future you want to implement. Then, for those who show particular interest, clue them in on the other chapters for the full story. In the end, you'll find that APM benefits everyone. You just have to show them how.