Debunking SharePoint Performance Myths
SharePoint is a big, complex, enterprise platform with lots of moving parts. There are multiple levels of dependencies, from the operating system to the physical network configuration, and everything has to work in harmony for SharePoint to operate at maximum efficiency. Because of all this complexity, and the dearth of authoritative information on the subject, a number of myths have taken root over the last few years regarding SharePoint performance. As with most myths, a lot of them contain some elements of truth, but many are just the result of a lack of understanding of what is happening behind the scenes.
The following is a summary of the myths I most commonly encounter when speaking with administrators, end users, developers, and other SharePoint consultants, along with some true and accurate information (some of which, admittedly, may only be my personal opinion, so if you disagree please post a comment to the contrary). It is by no means an exhaustive list but it should help to clear the air on some of the most pressing performance issues related to SharePoint.
1. SharePoint Lists Have an Upper Limit of Two Thousand Items
This is probably the most common – and demonstrably false – myth of them all. It stems from guidance provided by Microsoft that individual folders or views in SharePoint lists should not contain more than 2,000 items. People took this to mean that lists themselves are unable to scale past that limit but nothing could be further from the truth. The guidance was provided as a recommendation to maintain the performance of the list view objects in the UI which are based on legacy code (and some poorly constructed SQL queries) which does not scale well. It only applies to the built-in web parts and list views – it does not have any bearing upon the lists themselves.
Let me be clear – SharePoint lists scale up to very large levels. Microsoft states explicitly (see here) that the maximum number of supported items in a list is 5 million. That’s a big number and, naturally, it comes with some caveats. First, if you choose to implement item-level permissions, that number is reduced drastically. Second, your search crawl and index performance could suffer if you push your lists to anywhere near that limit. Finally, and most importantly, you must – must – write your own presentation logic to manage the display of list content at such high levels.
Why is this? Well, it helps to understand how list content is stored in the database in order to best decide how to go about scaling lists to the required size. Essentially, SharePoint lists are made up of two distinct objects – the list definition itself (which determines the content types, columns, views and other settings associated with the list) and the list items. In the content database, these are separated into two tables – AllLists and AllUserData. AllLists contains the list structure and associated properties; AllUserData contains all the individual items for every list on every site in the entire content database. Let me repeat that – every single list item in a content database is stored in a single table. If you have hundreds of sites across multiple site collections and they’re all in one content database then all of those list items are going to be stored in the same logical location.
Obviously, this schema has significant performance implications. Each query which is executed to retrieve list items must be run against the entire AllUserData table. Even if the harmless custom list you created only contains ten items, in order to display those items a select statement must be run that encompasses every list item in the entire content database. Simply put, the list view web parts and view mechanisms that ship out-of-the-box do not utilize the most efficient queries for this type of scenario; therefore, they start to break down after a certain level (you guessed it – somewhere around 2,000 items in a result set). In fact, the larger the content database grows the less and less performant these queries are, regardless of the size of the list they are being executed from (more information on this topic is available in the list scalability whitepaper published by Steve Peschka).
So how do we overcome these limitations? The simple answer is to write better queries but since we don’t have access to the database stored procedures we’ll need an alternative. The answer, as you may have already guessed, is to write your own query logic using the CAML syntax in the SharePoint Object Model and surface the results in a custom web part, application page, or other presentation-layer object. CAML (Collaborative Application Markup Language) gives developers the ability to execute structured queries against the SharePoint database which are more flexible and performant by design than the stock list view objects (especially the stock list view web parts).
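To make the CAML idea a little more concrete, here is a minimal sketch of what such a query looks like. It is written in Python purely to assemble the XML; the helper names and the Status and Title columns are hypothetical, not anything from a real list. The fragments are in the shape accepted by the Lists web service; in the server-side object model the same pieces would typically be handed to an SPQuery through its Query, ViewFields and RowLimit properties (where the Query property takes the contents without the outer Query wrapper). The point is simply that you ask for a narrow filter, only the columns you need, and one page of rows at a time.

```python
import xml.etree.ElementTree as ET


def build_caml_query(field, value, value_type="Text", order_by=None):
    """Return a CAML <Query> fragment that filters on one field (illustrative helper)."""
    query = ET.Element("Query")
    where = ET.SubElement(query, "Where")
    eq = ET.SubElement(where, "Eq")
    ET.SubElement(eq, "FieldRef", Name=field)
    val = ET.SubElement(eq, "Value", Type=value_type)
    val.text = value
    if order_by:
        order = ET.SubElement(query, "OrderBy")
        ET.SubElement(order, "FieldRef", Name=order_by, Ascending="TRUE")
    return ET.tostring(query, encoding="unicode")


def build_view_fields(field_names):
    """Return a CAML <ViewFields> fragment so only the named columns come back."""
    view_fields = ET.Element("ViewFields")
    for name in field_names:
        ET.SubElement(view_fields, "FieldRef", Name=name)
    return ET.tostring(view_fields, encoding="unicode")


# Hypothetical column names; substitute the internal names from your own list.
caml_query = build_caml_query("Status", "Open", order_by="Title")
caml_fields = build_view_fields(["Title", "Status"])
row_limit = 100  # fetch one page of results at a time rather than the whole list

print(caml_query)
print(caml_fields)
```

Keeping the filter, the field list and the row limit as three separate pieces mirrors how the query is usually supplied to SharePoint, and it makes it easy to tune each one independently when testing against a large list.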
In my presentations on creating scalable enterprise applications for SharePoint, I demonstrate these techniques using various methods to show how each performs against a sizeable list (10,000+ items). You can get the sample code here to see how this is done. There are also other alternatives to writing your own web parts to improve list view performance – using the MOSS Search functionality and constructing Data Form Web Parts in SharePoint Designer are both prime examples. But make no mistake – lists themselves are capable of scaling to very large levels; it’s the built-in view mechanisms that don’t scale well. Remember that and you’ll be far less frustrated when trying to architect solutions using SharePoint as a data repository.
2. SharePoint Is Just Too Slow for Common Tasks
Last year at TechEd North America I had some fun with people who walked by our booth. I would ask them if they had SharePoint (which was nearly everybody) and then I would ask them if it was slow (which was, coincidentally, almost exactly the same number as people who had SharePoint installed). The point, of course, was to push our Sonar product for isolating object-level performance issues but the results were enlightening. Regardless of whether it really is slow or not, people have the perception that it is and we all know that perception often defines reality within the enterprise.
Naturally, this raises the question of why people think SharePoint is slow. First, let’s put some things in perspective – anyone who has ever worked with enterprise web applications has come to accept a level of performance significantly lower than comparable desktop applications. Considering the technologies involved and the delivery method this is not surprising. Experience tells me that this is no more true of SharePoint than it is for WebSphere, PeopleSoft, Oracle, or a thousand other big, gnarly web apps. What amazes me is how little the people who complain about SharePoint being slow actually take the time to understand what it’s doing and how it works.
To begin with, all those moving parts that make up the monster we know as SharePoint have to be taken into account when designing an overall architecture. In addition, how the product is going to be used plays just as important a role in designing the overall architecture as what hardware it runs on and where it lives in the network. If the primary intent is for a corporate information portal then the design focuses on delivering relatively small pages to a relatively small number of users. If, on the other hand, it is intended to be used as a corporate document repository, the fact that large binary streams are going to constantly be in transit between the client and server over a geographically dispersed population must be taken into account. To further complicate matters, the propensity for SharePoint to become the Borg of the enterprise and slowly assimilate everything around it means that, quite often, the first case morphs into the second case with alarming regularity.
It should be no surprise to anyone that the two anemic web servers and pokey SQL server that they put in last year are suddenly groaning under the weight of a 100GB search index. And yet people still act amazed that SharePoint isn’t screaming along even while they are systematically starving it of physical resources. Let me be blunt for a moment (I know, as if that’s something new for me) – SharePoint is a resource hog. The more you use it, the more resources it consumes. If you want it to be fast then you need to architect for speed – big, fast SQL boxes with tons of directly-attached storage (I hate to break your heart but that big, expensive SAN you purchased is probably bottlenecking your entire operation) and boatloads of memory, fast network segments with proper traffic shaping and isolation, more load-balanced front-end servers than you ever thought you could cram into a single rack, and lots of isolated index and crawl servers. Sound expensive? Yep, you bet it is – but if you want fast then you gotta pay up.
Back in his Microsoft days, Joel Oleson compiled a bunch of links on SharePoint performance and scalability here. All the references are required reading when designing an enterprise portal deployment. I am always surprised, even though I shouldn’t be, how few people have read more than one or two items on this list. Instead, they let their budget dictate their architecture rather than the other way around. Worse, they have a “we’ll optimize later – right now we just need to get it up and running” attitude. Wrong, wrong, wrong. I have never seen that approach work successfully; budgets get cut, priorities shift, and SharePoint becomes so mission-critical that you can’t tear it down once it’s in place. Sure, there are ways to patch things up and put band-aids on here and there but a bad architecture will always be a bad architecture.
My advice is to engage an expert with a proven track record early and have them help you design it properly from the ground up. If that’s not possible, at least educate yourself thoroughly before attempting to put a solution in place. Be sure to take everything into account – size of content databases, search indexes, crawl servers, Enterprise features (BDC, SSO, Forms Services, etc.), disaster recovery, load balancing, caching, compression, network connectivity – the whole nine yards. You can’t blame the product if you failed to implement it in a way that properly meets the needs of your organization.
3. SharePoint Is Not Suitable For Large Public-Facing Web Sites
I’ll admit, this one makes me cock my head like an ol’ hound dog listening to a strange noise. I’m never sure what people mean by this. Do they mean that it isn’t a proper web content management system (it is)? Or that it doesn’t have robust search capabilities (it does)? Or that it’s too hard to brand (it’s not)? Or that big organizations don’t use it (they do)? Just what are they trying to say exactly?
I would argue that the Library of Congress, State of West Virginia, Energizer, Kraft Foods, Hawaiian Airlines and hundreds of others have pretty well proven that SharePoint works just as well on the Internet as it does on the Intranet. Sure, it has its quirks and (as mentioned above) it has to be architected well to serve in high-traffic environments but what platform doesn’t? I remember the good ol’ days of the Internet boom, working with Interwoven and Vignette trying to get those beasts configured for major dotcoms. Easy? Hardly. Expensive? Very. But they worked – more or less – even though you had to completely change the way you managed content to use them. And training web authors to work around all their idiosyncrasies was like teaching hamsters to run in formation. How is SharePoint any different? Sure, it has its own terminology and ways of doing things but that’s to be expected.
The most common knock I hear about SharePoint on the Internet is the cost of the MOSSFIS license. Anyone who complains about this is simply not familiar with the true cost of web CMS systems in general. I assure you that the big packages in use by major dotcom sites are way, way more costly than SharePoint ever will be. I admit that it’s out of reach for most small to medium sized businesses but that’s not its core market – MOSSFIS is targeted at big sites for whom a million-dollar investment in a public web presence is chump change. Could we use a scaled-down version for smaller sites? You bet – and if we all ask Microsoft for it, often, and in no uncertain terms, then we might just get it. But let’s not diss the product on account of market positioning, m’kay?
Another negative comment I hear in this arena is around branding. Holy smokes, has anyone ever tried to slap a custom design template on top of a Java-based product? You think SharePoint is hard to brand? I vehemently disagree. SharePoint branding isn’t hard once you learn what you are doing; it’s complex, yes, but not hard – there’s an important difference. SharePoint gives you so many methods out of the box for branding (Themes, Templates, SharePoint Designer, Site Definitions) that it’s a wonder we haven’t all become branding experts by now (just kidding, Heather – your job is safe). I do custom site definitions all the time – both in WSS and MOSS – for branding purposes and while it’s vitally important to have strong HTML and CSS skills the actual branding process isn’t all that challenging. Learning what SharePoint does behind the scenes, how it renders various controls, the proper use of content types and publishing fields, properly constructed layout pages, accessibility concerns – these are all important topics but there’s lots of information and training available. If you want to learn how to be a SharePoint design expert you can. Just don’t say it’s hard until you’ve tried to do the same thing with other comparable products.
As for performance on the public web, this is all down to configuration and tuning. Yes, there are some page payload items that are excessive (core.js, anyone?) and you have to do a lot of tinkering with IIS to get it just right but why should that be a show-stopper? Knowing what I know about the underlying architecture, data structure, rendering methods and memory management routines, I’m shocked it runs as well as it does. I’m familiar with a whole slew of public sites that have met their sub-eight-second render time goals with relatively few modifications to out-of-the-box functionality. Just as with list scalability, you have to design your architecture properly to perform adequately on the web where hundreds of thousands – or possibly millions – of users may be hitting your site on a daily basis.
4. SharePoint Isn’t A Scalable Enterprise Document Management System
It is hardly debatable that there are products in the marketplace which scale to levels that SharePoint cannot aspire to in its current form and such products are often assumed to perform faster or more efficiently. But do they really? In a case study developed jointly with Microsoft and Fujitsu, KnowledgeLake constructed a scenario to test the overall scalability of SharePoint’s document management features. The tests involved loading more than a terabyte of data into SharePoint, indexing that content with MOSS search, and simulating thousands of concurrent users searching, viewing and updating documents. The results were nothing short of astounding and proved that, with the proper architecture, SharePoint can handle huge volumes of data.
Therein lies the challenge. Very few enterprises have made a sufficient investment in properly architecting their SharePoint infrastructure. They assume – wrongly – that the out of the box configuration parameters are optimal and that scalability can be achieved by adding additional front end web servers or by increasing the physical resources of their SQL servers. While both of these methods are certainly valid, they are only a small part of the overall picture; proper infrastructure design must be done on a case-by-case basis with the needs of each particular scenario in mind.
If you truly intend to use the platform as a fully-fledged document management system then it is incumbent upon you to construct an architectural design that will support those requirements. In other words, don’t blame the product (or Microsoft, or the consultant, or even your own team members) if you fail to plan properly. There is so much information available on such obscure topics as database configuration, web application distribution, IIS compression, MOSS caching and the like that you really have no excuse for deploying SharePoint in a manner that doesn’t scale well within your organization.
Do your research first and don’t be afraid to tell management what it’s *really* going to cost, not what you think they want to hear. I am constantly amazed at people who compare a $250,000 SharePoint deployment with, say, a multimillion dollar Documentum or FileNet implementation. That’s just silly. I’m not saying that SharePoint does everything that these products do (but to be fair, they don’t do even a small portion of what SharePoint does) but I am saying you should take a step back and realize what you’re getting yourself into when you talk about “enterprise scalability”. It can’t be done on the cheap – SharePoint will do it but you have to write the same big checks you would write to other enterprise vendors in order to make it happen.
5. SharePoint Pages Take Too Long to Render Over the WAN
There are a number of ways in which any real or perceived page lag over the WAN can be mitigated. To begin with, MOSS implements a very robust caching engine which reduces the number of database trips required to serve frequently requested content. If you have deployed only WSS and are wondering what MOSS can add to the mix besides the obvious Enterprise features, this is one area that provides a lot of bang for the buck; in fact, I would go so far as to say it’s worth the cost of the standard CAL all by itself (but that’s just me and I’ve been known to be off in the fringe a bit from time to time). In general, there are three basic kinds of caching in MOSS – Page, Object and BLOB – each of which has a specific role to play in the overall process of reducing the amount of data sent across the wire to the client or retrieved from the database before page rendering can commence. The good news is that MOSS does all these things behind the scenes without requiring any special configuration or input; administrators can tweak these settings, of course, but right out of the box they do a pretty darn good job of handling what is, by nature, a very dynamic application.
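If the idea of a cache cutting down on database trips feels abstract, the following toy sketch may help. It is emphatically not how MOSS implements its Page, Object or BLOB caches; the TtlCache class, the load_page_from_database function and the page URL are all made up for illustration. It only shows the general principle: the first request pays for the trip to the backing store, and repeat requests inside the expiry window are answered from memory.

```python
import time


class TtlCache:
    """Tiny time-based cache; a toy stand-in, not how MOSS implements caching."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expiry_time, value)

    def get(self, key, loader):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit: no trip to the backing store
        value = loader(key)  # miss or expired: go to the backing store
        self._entries[key] = (now + self.ttl_seconds, value)
        return value


db_trips = 0


def load_page_from_database(url):
    """Hypothetical expensive lookup; counts how often it is called."""
    global db_trips
    db_trips += 1
    return f"<html>content for {url}</html>"


cache = TtlCache(ttl_seconds=60)
for _ in range(5):
    page = cache.get("/pages/home.aspx", load_page_from_database)

print(db_trips)  # five requests, but only the first one reached the "database"
```

The trade-off is the usual one: a longer expiry window means fewer trips to the database but a longer wait before content changes show up, which is why the window is worth tuning to how dynamic your content really is.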
Finally, I have often found that organizations fail to set user expectations before implementing SharePoint as the newest and greatest thing that is going to make everyone’s work life better (which I have yet to see actually happen but that’s what the marketing spinmeisters would have you believe). Let’s not forget that this is a collaborative platform, designed to facilitate interaction between human beings and decentralize the ownership of content. It is *not* a replacement for network file shares nor a massive Wiki-like free-for-all. Users must understand the implications of trying to interact with large data repositories (lists) using the built-in view mechanisms. Come to that, they also need to be prepared for the general funkiness that comes with using a completely browser-based interface. The old eight second rule is nothing more than a number to throw around in conversation; so what if a page takes twenty seconds to load so long as it makes the user more productive? I would happily trade moderately slower page render times (notice I qualified that statement – there is such a thing as waiting too long to get the information you’re after) for greater functionality and flexibility. But if we don’t set user expectations early then they will be disappointed – rightly or wrongly – when the product doesn’t measure up to whatever standards they had established in their own minds.
So there you have it – my top five list of SharePoint performance myths. There are certainly other myths out there that need a good and proper debunking but these are the ones I come across most often. If you see a related scenario often that you feel needs to be addressed, post a comment and let’s put it out there for everyone to benefit from. The more we address these issues the less FUD (Fear, Uncertainty and Doubt) we’ll have to deal with in our everyday lives. And let’s face it – everyone could use less FUD, couldn’t they?