Sunday 25 January 2015

Test and Monitor your SQL storage performance and some good practices for SharePoint.

In the last couple of weeks I've been busy analysing and optimising one of our customers' SharePoint 2013 deployments. This is a huge topic to discuss. SharePoint itself is not a slow application, but there are countless reasons that can make your SharePoint work and feel slow. In my case the source of the performance issues was SQL Server.
You should pay as much attention to SQL Server as to any other component when you plan and deploy your SharePoint solution. SQL Server is often referred to as the storage tier for SharePoint. Apart from the binaries and the static application files, SharePoint stores almost everything (94% to be accurate) in SQL Server. After all, a SharePoint farm is just a configuration database plus your user content: sites and webs are stored in content databases.
Now, I am neither a SQL Server DBA nor a storage expert, but for me the storage that hosts your databases is one of the most important components for SQL Server performance. You may have enough RAM for your workload, enough CPU power, and you may have followed all the good practices for SQL Server hosting SharePoint, but if your storage takes too long to access and deliver data (high latency) or cannot deliver enough throughput (IOPS), SQL Server will be slow, and therefore your SharePoint will work and feel slow.
In this post I am going to give you links to, and short descriptions of, articles and tools (free/open source) you can use to test and monitor your SQL storage, either at the planning phase or, as it was in my case, after the system has been deployed.
I mentioned the good practices for SQL Server hosting SharePoint. If you do not know what these practices are, you can learn about them from the following resources:

    1. Tuning SQL Server 2012 for SharePoint 2013 - These are MVA videos hosted by Bill Baer (Senior Technical Product Manager) and Brian Alderman. In these demo-rich videos they cover some of the best practices for integrating SQL Server 2012 and SharePoint 2013.

    2. Maximizing SQL 2012 Performance for SharePoint 2013 Whitepaper - This is a great whitepaper on the good practices that can maximize your SQL Server performance for SharePoint. The author, Vlad Catrinescu (SharePoint MVP), covers most of the points from the videos above, plus many additional ones.

I had to implement a lot of these practices before going on to investigate any storage bottlenecks.

Now I would like to highlight one thing that has become more important with the rise of cloud services: distribute your database files across as many disks as possible. Every cloud provider I've worked with limits VHDs by IOPS. For example, in Azure the A- and D-series VMs give you 500 IOPS per data disk. However, different VM sizes can attach different numbers of disks. With an A4, for example, you can attach up to 16 data disks, which makes 16 x 500 = 8,000 IOPS in total. So why not attach as many disks as possible, if it will not affect your budget (most providers charge for storage per GB)? A minimal sketch of spreading a database over files on separate disks follows below.
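Here is that sketch - adding extra data files for a (hypothetical) content database, each file on its own data disk. The server name SQL01, the database name WSS_Content, the file names and the G: and H: drives are all assumptions for illustration; size and place the files according to your own design.

Import-Module SQLPS -DisableNameChecking   # provides Invoke-Sqlcmd

# Add two extra data files, each on a separate data disk.
$query = @'
ALTER DATABASE [WSS_Content]
    ADD FILE (NAME = N'WSS_Content_2', FILENAME = N'G:\SQLData\WSS_Content_2.ndf', SIZE = 10GB);
ALTER DATABASE [WSS_Content]
    ADD FILE (NAME = N'WSS_Content_3', FILENAME = N'H:\SQLData\WSS_Content_3.ndf', SIZE = 10GB);
'@
Invoke-Sqlcmd -ServerInstance "SQL01" -Database "master" -Query $query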

One of the first things to check if you are experiencing SharePoint slowness and you suspect slow SQL is the "Developer Dashboard". There you can find the SQL execution time and many more useful stats about your request execution.

Developer Dashboard

You can find information on how to enable and use the dashboard in SharePoint 2013 in the following article. A quick way to switch it on with PowerShell is sketched below.
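Here is that minimal sketch. Run it on a farm server under an account with farm administration rights; once the dashboard is on, you open it from its icon in the suite bar.

Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Turn the Developer Dashboard on for the whole farm.
$dds = [Microsoft.SharePoint.Administration.SPWebService]::ContentService.DeveloperDashboardSettings
$dds.DisplayLevel = [Microsoft.SharePoint.Administration.SPDeveloperDashboardLevel]::On
$dds.Update()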

At the planning phase of your solution, before you have installed anything, you can use various tools to test the performance of your storage. Here are some of them:

     1. SQLIO - In this article from Brent Ozar you will find a download link for the tool and instructions on how to use it. The tool is really straightforward and its output shows you basic performance information for your storage: IOs/sec, MBs/sec and latency (a sample command line is sketched after this list). In the article you will also find links to other tools like SQLIOSim that can simulate SQL Server OLTP and OLAP workloads.

     2. DiskSpd - Now this is a more robust tool for testing your storage. With it you can test local disks as well as SMB shares. There is a great article from Jose Barreto (a member of the File Server team at Microsoft) where you will find everything about DiskSpd, and the tool also comes with very comprehensive documentation. However, this is a powerful tool and you should not run it against storage with live data, because it can destroy that data. Sample command lines for both tools follow right after this list.
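Here are those sample command lines, run from the folder where each tool lives. The T: volume, file names and parameter values are purely illustrative assumptions - check each tool's documentation before running, and never point them at volumes that hold live data.

# SQLIO: 8 KB random reads for 120 seconds, 2 threads, 8 outstanding IOs, latency stats (-LS).
.\sqlio.exe -kR -t2 -s120 -o8 -b8 -frandom -LS T:\sqlio_test.dat

# DiskSpd: create a 10 GB test file, run for 60 seconds with 8 KB blocks, 25% writes,
# 4 threads, 8 outstanding IOs per thread, random IO (-r), and capture latency (-L).
.\diskspd.exe -c10G -d60 -b8K -w25 -t4 -o8 -r -L T:\diskspd_test.dat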

Tools that you can use for monitoring after the system is deployed:

     1. PAL - This is a great universal tool for Performance Analysis of Logs. It is written by Clint Huffman (a PFE at Microsoft) and it is a tool not just for SQL, but for a number of products. If you do not know how to interpret the performance logs for a certain Microsoft product, or you need an aggregated performance data report, PAL is just the tool for you. PAL works with threshold templates for the different products. If you do not know which PerfMon counters to capture, you can export the PAL template to a PerfMon template and then create a regular Data Collector Set in the Performance Monitor MMC (or from the command line, as sketched after the report sample below). You can see an example of a PAL HTML report below.

PAL Report Sample
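For reference, here is a minimal sketch of creating and starting a Data Collector Set from a PerfMon template exported from PAL, using logman instead of the MMC. The collector set and template file names are hypothetical.

# Import the exported PAL template as a Data Collector Set and start collecting.
logman import "PAL-SQLServer" -xml .\SQLServer2012_PAL_Template.xml
logman start "PAL-SQLServer"

# ...let it run during the problem window, then stop it and feed the resulting .blg file to PAL.
logman stop "PAL-SQLServer"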


     2. SSMS Activity Monitor - This is a great tool that you can launch from SQL Server Management Studio. Just open SSMS, right-click on the instance and open Activity Monitor. There you will find current stats in the Overview, Active User Tasks, Resource Waits, Data File I/O and Recent Expensive Queries panes. If you do not know what Activity Monitor looks like, you can see it below (a small query that gives a similar "what is running right now" view follows the screenshot):

Activity Monitor
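If you want roughly the same "what is running right now" information without the GUI, here is a minimal sketch against sys.dm_exec_requests. The server name is an assumption and the login needs VIEW SERVER STATE permission.

Import-Module SQLPS -DisableNameChecking   # provides Invoke-Sqlcmd

$query = @'
SELECT r.session_id, r.status, r.command, r.wait_type, r.wait_time,
       r.cpu_time, r.total_elapsed_time, DB_NAME(r.database_id) AS database_name
FROM sys.dm_exec_requests AS r
WHERE r.session_id > 50;   -- skip most system sessions
'@
Invoke-Sqlcmd -ServerInstance "SQL01" -Query $query | Format-Table -AutoSize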

     3. Triage Wait Stats in SQL Server - Here is another article from Brent Ozar where you can find an explanation of what resource waits are, a script that can capture wait stats for a period of time, and references to other articles on the subject. For me, waits are one of the most important things to look at when it comes to SQL Server performance in general. A very simplified, cumulative look at the top waits is sketched below; the article's script is far more complete.
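Here is that simplified sketch: the top accumulated waits since the last instance restart. Unlike the script from the article, it is cumulative and unfiltered, so treat it only as a first glance. The server name is an assumption.

Import-Module SQLPS -DisableNameChecking   # provides Invoke-Sqlcmd

$query = @'
SELECT TOP (10)
       wait_type,
       waiting_tasks_count,
       wait_time_ms / 1000.0        AS wait_time_sec,
       signal_wait_time_ms / 1000.0 AS signal_wait_sec
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;   -- for real triage, filter out benign waits (see the article)
'@
Invoke-Sqlcmd -ServerInstance "SQL01" -Query $query | Format-Table -AutoSize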

     4. Capturing IO latencies for a period of time - This article is last but not least. It is written by Paul Randal from www.sqlskills.com. There you can find a script that for me is the ultimate tool for identifying storage bottlenecks. One thing that is not mentioned in the article: you can set the time period by changing the WAITFOR DELAY value. In the output you can see the average read/write latency per file, the average bytes per read/write operation and more. You can see example output below, and a much simpler cumulative per-file latency query is sketched underneath it.
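And here is the promised simpler version - average read/write latency per database file, cumulative since the instance started, built on sys.dm_io_virtual_file_stats. This is not Paul Randal's script (his measures a delta over a chosen time window, which is far more useful), and the server name is an assumption.

Import-Module SQLPS -DisableNameChecking   # provides Invoke-Sqlcmd

$query = @'
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       CASE WHEN vfs.num_of_reads  = 0 THEN 0
            ELSE vfs.io_stall_read_ms  / vfs.num_of_reads  END AS avg_read_latency_ms,
       CASE WHEN vfs.num_of_writes = 0 THEN 0
            ELSE vfs.io_stall_write_ms / vfs.num_of_writes END AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id AND mf.file_id = vfs.file_id
ORDER BY avg_read_latency_ms DESC;
'@
Invoke-Sqlcmd -ServerInstance "SQL01" -Query $query | Format-Table -AutoSize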



And here are the final notes. SharePoint is an application where we cannot, for example, edit the database schema. It is an enterprise product, not some in-house build. If you have followed the above-mentioned best practices, you have correctly scaled your deployment and you are sure there is no customisation that can cause issues, but you suddenly take a performance hit, it is most probably caused by other systems that are more likely to change, like storage, network, Windows and so on. And of course there is no magic number or silver bullet to fix all issues.
If you are just a SharePoint administrator and deep diving into SQL and storage is not your responsibility, notify the SQL and storage guys and push them to do their work and fix the issues. If you are not confident with the above-mentioned tools, test them in a non-production environment first. As far as I understand T-SQL, the scripts above do not touch any user database.

I hope this was useful for you.

3 comments:

  1. Good one, Ivan. Seems you've gone deeper than most SharePoint admins would go. In more than 50% of cases I've seen, the SQL boxes are not provisioned with the right amount of RAM and they're paging heavily... which leads to poor performance due to IOPS, as you mention. Azure is getting better than its early days in this area, obviously.

  2. Thanks Dimitar! A RAM increase was one of the steps to improve the performance. We have a WCM farm in Azure and the storage is one of the things we are pretty happy with :)
