The Definitive Guide to Scaling Out SQL Server 2005

by Don Jones


Enterprise applications have become more complex and have taken on a greater burden for managing a company’s critical data. At the same time, the amount of data managed by those applications has swelled exponentially as companies begin to track an increasingly greater amount of information—data about customers, vendors, sales, and more.

In The Definitive Guide to Scaling Out SQL Server 2005, industry expert Don Jones offers strategies for expanding a SQL Server 2005 database so that it can grow to meet the mounting and constantly changing requirements that are put upon it.


Chapter 1: An Introduction to Scaling Out

Enterprise applications have become more complex and have taken on a greater burden for managing a company’s critical data. At the same time, the amount of data managed by those applications has swelled exponentially as companies begin to track an increasingly greater amount of information—data about customers, vendors, sales, and more. In addition, the advent and popularity of data warehousing has expanded the amount of stored data by a factor of a hundred or more. In short, we’re keeping track of more data than ever before, and our database servers are starting to show signs of strain. This strain has been the catalyst for interest in scaling out from a diverse audience that ranges from database administrators to CEOs. Even your organization’s CFO will care, as scaling up is often more expensive than scaling out, especially when scaling up requires the purchase of an expensive midrange or mainframe platform.

Chapter 2: Scaling Out vs. Better Efficiency

Scaling out a database is a common technique for addressing performance issues. However, in many cases, performance can be markedly improved through better efficiency on the existing server. In addition, improving single-server efficiency will lay the path to a more successful scale-out project.

Database designs and applications that are flawed or are simply not well-tuned can have a major negative impact on a single-server application; however, they will lead to considerably worse performance in a scaled-out distributed database. A design or application problem that causes issues on a single server will most likely continue to do so on multiple servers. Thus, a welltuned database (and application) provides a much better distributed database.

This chapter will introduce you to common database design considerations as well as best practices for application design, index usage, and Transact SQL (T-SQL) fine tuning. These tips will enable you to optimize performance for single-server databases, paving the way to a better distributed database in your scale-out project.

Chapter 3: Scaling Out SQL Server

You’ve tweaked T-SQL until you’re blue in the face, upgraded your servers until they just can’t fit more memory or processors, and bought the fastest disk storage—and your database is still too slow. Whether you’re trying to support tens of thousands of users in a transactional application or quickly retrieve terabytes of data, scaling up has taken you as far as it can—it is time to scale out.

Before you start buying new servers and creating new databases, you must determine which scale-out technique is right for your environment and test the solution that you choose. You need to decide exactly how you will scale out and how to maintain a high level of fault tolerance in your new design. You will then need to perform some real-world testing to determine whether your database is ready to be scaled out and whether the scale-out decisions you’ve made result in an actual performance gain. In this chapter, we’ll explore how to select the best technique for your situation and the best methods for testing your selection to ensure it will handle the workload.

Chapter 4: Distributed Partitioned Views

As I’ve mentioned in previous chapters, distributed partitioned views can be a powerful tool in any scale-out scenario. However, they also have drawbacks, including the possibility of creating an imbalance in the amount of resources consumed by the servers that are handling the view. This chapter will go deeper into the pros and cons of distributed partitioned views, show you how to design and implement them, and offer some distributed partitioned view tips and best practices. In addition, I will provide direction for designing a performance comparison that allows you to measure the effectiveness and efficiency of distributed partitioned views in your environment.

Chapter 5: Distributed and Partitioned Databases

The phrases distributed and partitioned mean different things to different people when it comes to databases. Strictly speaking, distributed refers to any database that exists in more than one place (on more than one server), and a partitioned database is divided into multiple sections, with each section existing on a different server. In this chapter, I’ll discuss the pros and cons of these scale-out techniques and walk you through the steps necessary to implement each.

Pros and Cons

There are a number of good and bad points about partitioned and distributed databases. In most applications, the biggest drawback to partitioned and distributed databases for database administrators and developers is the level of complexity. Involving more database servers in your environment obviously increases maintenance and administrative effort; changing the way in which data is distributed across servers can create obvious difficulties for client applications that are hard-coded to look for data in specific locations or, at least, on one server.

Chapter 6: Windows Clustering

You’re considering a scale-out solution to ensure that the many users who need access to data can do so without worrying about downtime. In fact, your organization wants to avoid downtime at all costs. Enter Windows clustering. Although Windows clustering isn’t a requirement in a scale-out solution, it offers a higher level of availability than standalone servers can provide.

Chapter 7: Scale-Out and Manageability

In addition to bringing a solution to larger workloads, scale-out solutions also bring a unique set of problems to your environment, primarily in the realm of manageability. These are similar to the problems encountered by administrators building a Web farm: With so many additional servers handling the same application, how do you keep content synchronized across the farm? When changes are made to server configuration standards, how can you accurately deploy those changes across the farm? How can you efficiently monitor the health of the servers in the farm? In this chapter, I’ll focus on solutions that can help solve the manageability problems in a SQL Server scale-out solution.

Manageability Problems in a Scale-Out Environment

The manageability problems in a scale-out environment are more than just having to manage multiple servers, it’s the fact that those multiple servers all form a single application. In other words, you have to somehow manage the servers almost as a single unit, even though they’re distinct, independent units, so that you can maintain the integrity and functionality of the overall application. That’s a problem made more difficult in a SQL Server scale-out than even in a Web farm; with Web farms, taking a single server offline isn’t a big deal, because the other servers in the farm do the same thing and can pick up the slack. In most SQL Server scale-out solutions, however (such as a federated database), each individual server is a crucial element of the overall application. In the next three sections, I’ll explore some of the specific issues with manageability in a scale-out environment, so that you can clearly understand the challenges that you face.

Chapter 8: High-Performance Storage

Applications such as SQL Server depend heavily upon the availability of high-performance storage. In fact, as much research and design has gone into creating high-performance storage subsystems as has gone into SQL Server, simply because most SQL Server applications eventually become storage-bound—meaning storage imposes the upper limit on application performance. Such is particularly true on 64-bit servers, where the greatly expanded memory capacity can help prevent SQL Server from becoming memory-bound.

This chapter will discuss the storage technologies that are most appropriate for use with SQL Server and introduce you to best practices for utilizing high-performance storage systems. These systems are critical to building successful scale-out solutions—without the proper storage subsystem backing up your SQL Server applications, all the advantages of replication, federated databases, and other scale-out technologies are degraded. Obviously, most of the storage suggestions in this chapter are also appropriate for scale-up solutions, as well.

Chapter 9: Scaling Out at the Application Level

A major step in creating a SQL Server-based scale-out solution is creating the client application (or applications) that end users will utilize. These applications must be designed to accommodate your back-end scale-out technique, whatever that is, and it can be one of the most difficult parts of creating a scale-out solution. This chapter will focus on the client-side (and other application tier) design techniques that can be used to create an effective scale-out solution.

You need to keep in mind, though, the precise scale-out technique you’ve selected for SQL Server. For example, if you’ve decided to scale out by creating multiple SQL Server installations, all containing the same data and using replication to keep one another updated, then you’ll need to build your client applications in a particular way. If, however, you’re using a federated database—where each server contains only a portion of the data, and views and other techniques are used to draw it together—there is a whole different set of client-side techniques. This chapter will examine them all, but you’ll need to select the ones that are most appropriate for your particular scale-out solution.