Intelligently Reducing SharePoint Costs Through Storage Optimization

by Don Jones


Every SharePoint administrator knows the challenge of dealing with SharePoint: The business wants to put everything in SharePoint so that as much data as possible is centrally located and managed; doing so, however, bloats the SharePoint SQL database and creates an administrative nightmare. How can you find your happy medium? How can you get as much data as possible into SharePoint - searchable, version-controlled, and secured - while keeping your database as trim as possible? The answer is storage optimization, and it's the subject of this book by noted industry expert Don Jones. You'll learn how to optimize SharePoint for the inclusion of large content items, external content items (like databases, shared files, and media files), and even "dormant" content that you no longer actively need - but can't afford to get rid of.


Chapter 1: The Problem with SharePoint Storage

We've been promised a world where SharePoint, in many ways, becomes our entire intranet. At the very least, SharePoint is marketed as a means of centralizing all our shared data and collaboration efforts. Conference speakers tell us that we should migrate our shared folders into SharePoint, integrate SharePoint with back-end databases, and make SharePoint the "dashboard" for all our users' information needs.

In many regards, SharePoint can do all of that, but the price can be prohibitive. Why? That's what this chapter is all about: the problems that can arise when SharePoint becomes the centerpiece of your information sharing and collaboration. This chapter will define the major challenges and goals for SharePoint content.

Chapter 2: Optimizing SharePoint Storage for Large Content Items

One of the biggest uses of SharePoint is to store large content items. Unfortunately, those items are also among the biggest contributors to massively larger SQL Server databases, slower database performance, and other problems. One of the most important topics in today's SharePoint world is optimizing SharePoint to store these large content items.

What Is "Large Content"?

Large content, in this context, refers primarily to the file attachments stored within SharePoint. Microsoft refers to this kind of content as unstructured data, as opposed to the more structured, relational data normally stored in a database. As outlined in the previous chapter, SQL Server's default means of storing this kind of data is as a Binary Large Object (BLOB), usually stored in a column defined with the varbinary(max) type. Physically, SQL Server keeps a pointer on the row's data page and spreads the BLOB data itself across several separate pages. Figure 2.1 illustrates how the row data page provides a pointer to sequential BLOB pages.

Figure 2.1: BLOB storage in a SQL Server database.
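
The pointer-plus-pages arrangement can be pictured with a small toy model. The Python sketch below is only an illustration of the idea, not SQL Server's actual on-disk format: the ToyDatabase class, its method names, and the "report.docx" row are all hypothetical, and the model ignores page headers and other overhead. The only real-world detail it borrows is that SQL Server pages are 8 KB in size.

```python
from dataclasses import dataclass, field
from typing import Dict, List

PAGE_SIZE = 8192  # SQL Server pages are 8 KB; this toy model ignores page headers


@dataclass
class ToyDatabase:
    """Hypothetical toy model: a dict of page_id -> bytes stands in for BLOB pages."""
    blob_pages: Dict[int, bytes] = field(default_factory=dict)
    next_page_id: int = 1

    def store_blob(self, data: bytes) -> List[int]:
        """Split data into page-sized chunks; return the page ids (the 'pointer')."""
        page_ids = []
        for offset in range(0, len(data), PAGE_SIZE):
            page_id = self.next_page_id
            self.next_page_id += 1
            self.blob_pages[page_id] = data[offset:offset + PAGE_SIZE]
            page_ids.append(page_id)
        return page_ids

    def read_blob(self, page_ids: List[int]) -> bytes:
        """Follow the pointer and reassemble the BLOB from its pages."""
        return b"".join(self.blob_pages[p] for p in page_ids)


db = ToyDatabase()
attachment = b"x" * 20000  # a stand-in for a 20,000-byte file attachment

# The "row" holds only ordinary columns plus a pointer to the BLOB pages.
row = {"name": "report.docx", "blob_pointer": db.store_blob(attachment)}

print(len(row["blob_pointer"]), "BLOB pages used")
```

Even in this simplified form, the row stays small while the attachment consumes several pages of its own, which is why large attachments inflate the database so quickly.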

Chapter 3: Optimizing SharePoint Storage for External or Legacy Content

Shared folders. External databases. Even media files—audio, video, and so on. We want it all in SharePoint so that it's version-controlled, secured, and searchable—but can we afford to bloat the SharePoint database with that much content? Adding all that content will not only result in a pretty sizable SharePoint database but also take up a lot of expensive SharePoint storage. However, with the right tools and techniques, you can bring that content "into" SharePoint, while keeping it stored "outside," helping to optimize your SharePoint storage and maintain a smaller, more manageable SharePoint content database.

Chapter 4: Optimizing SharePoint Storage for Dormant and Archived Content

Do we really need to keep every version of every file in the SharePoint database forever? Probably not, but where do you draw the line? How can you maximize your version history while minimizing your storage impact? Better yet, how can you take advantage of your existing tiered storage, including tape storage, to archive content without creating a SharePoint database that just grows and grows and grows? Chapter 4 explores these considerations and offers effective ways to optimize SharePoint storage for dormant and archived data.