Getting the Most Out of SQL Server Profiler Using Templates

In case you haven’t discovered this for yourself, SQL Server Profiler is one of the most powerful tools that ship with SQL Server. It can gather huge amounts of data to help you identify and troubleshoot many different SQL Server problems. In fact, Profiler can capture about 180 different events, and for each event it can collect up to 64 different data columns.
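
If you are curious exactly which events and data columns your own instance exposes, you can query them directly. Here is a minimal sketch, assuming SQL Server 2005 or later, where the trace catalog views are available:

```sql
-- Count the trace events and data columns available on this instance
-- (sys.trace_events, sys.trace_columns, and sys.trace_categories are
-- catalog views available in SQL Server 2005 and later).
SELECT COUNT(*) AS available_events  FROM sys.trace_events;
SELECT COUNT(*) AS available_columns FROM sys.trace_columns;

-- List every event, grouped by its category.
SELECT c.name AS category_name, e.name AS event_name
FROM sys.trace_events AS e
JOIN sys.trace_categories AS c
    ON e.category_id = c.category_id
ORDER BY c.name, e.name;
```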

When I was a novice DBA and needed to run a trace, I would select New Trace within Profiler and build a trace customized to collect the data I needed to identify and troubleshoot a particular problem. Once I had gathered that data and fixed the problem, I would exit Profiler without saving the trace definition or the data I had collected. I didn’t see any need to save either once the immediate problem was resolved.


Free T-SQL Database Maintenance Scripts

I often get questions in e-mails, or at user group meetings, asking for advice on writing T-SQL database maintenance scripts. There seem to be a lot of DBAs who need more flexibility than the Database Maintenance Wizard offers, which means they must use either T-SQL or PowerShell scripts to get it. Unfortunately, many of the DBAs who ask me this question don’t have a lot of time, or much T-SQL experience, so they don’t know where to start.

The advice I generally give them is not to reinvent the wheel, but to see what others have done and emulate it; this applies especially to DBAs with less experience. For example, the following websites offer T-SQL database maintenance scripts that are freely available to use with little or no modification, or to study as a learning tool to see how other DBAs have tackled this issue.
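
If you want a feel for what such a script looks like before digging into those links, here is a minimal sketch that runs DBCC CHECKDB against every online user database. It is illustrative only, written by me as an example rather than taken from any of those sites; the free scripts linked above are far more complete.

```sql
-- Minimal illustration: run DBCC CHECKDB against every online user database.
DECLARE @db  sysname;
DECLARE @sql nvarchar(512);

DECLARE db_cursor CURSOR FOR
    SELECT name
    FROM sys.databases
    WHERE database_id > 4              -- skip the system databases
      AND state_desc = 'ONLINE';

OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @db;

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = N'DBCC CHECKDB (' + QUOTENAME(@db) + N') WITH NO_INFOMSGS;';
    PRINT @sql;      -- log what is about to run
    EXEC (@sql);
    FETCH NEXT FROM db_cursor INTO @db;
END

CLOSE db_cursor;
DEALLOCATE db_cursor;
```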


Help Design a New SQL Server Monitoring Tool


In an experiment in SQL Server community involvement, the members of Red Gate Software’s usability team have created a new website called www.thefutureofmonitoring.com. This team, along with other teams at Red Gate Software, is working on Version 2 of the company’s popular SQL Response software.

The goal of the new website is to get your input on what you would like to see in a SQL Server monitoring tool. For example, what do you dislike about software installers, or what memory counters do you think are the most important? In fact, you can even help the usability team design a new dashboard.

Whatever you think about SQL Server monitoring tools, feel free to share those thoughts with the usability team at their new website. In fact, I have already given the team my input on the next version of SQL Response, and they would really like to hear from you too.

SQL Server 2008: Performance Data Collector

SQL Server 2008’s Performance Data Collector lets us create a central repository to store performance data and includes three built-in Data Collection sets that can be used to collect and store this data. Three built-in reports can then be used to access the stored data, so we can identify and troubleshoot SQL Server performance problems. Want to find out how the Performance Data Collector works and how it can help you? Read more…
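
If you prefer T-SQL to Management Studio, the collection sets can also be inspected and started with the data collector stored procedures in msdb. A minimal sketch, assuming the Management Data Warehouse has already been configured through the wizard:

```sql
USE msdb;
GO

-- See which collection sets exist and whether they are running.
SELECT collection_set_id, name, is_running
FROM dbo.syscollector_collection_sets;

-- Start one of the built-in collection sets by name.
EXEC dbo.sp_syscollector_start_collection_set @name = N'Server Activity';
```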

Reasons Why You May Not Want to Use a Heap

I have started compiling a list of reasons why heaps aren’t usually a good choice when designing a database. Now, I didn’t say that heaps are always bad, but in most cases, I think they should be avoided (read the list to see why).

As I compiled the list, in some cases I had the opportunity to verify that the reasons are true, but in other cases I have not had the time to verify them. I would like your input on the list. Is there anything on it that is not true, or only partially true (under some circumstances, but not others)? And what other reasons are there to avoid using heaps that I have left out?

I look forward to your feedback.

  1. If non-clustered indexes are not added to a heap, then all queries against a heap will require table scans to retrieve the requested data. If the heap is large, then these queries will be very resource intensive and hurt SQL Server’s overall performance.
  2. Since the data in a heap is unordered, performing a table scan on a heap can cause a lot of extra I/O activity, because inefficient random reads, rather than efficient sequential reads, tend to be the norm.
  3. While a non-clustered index can be added to a heap to speed up some queries, when the non-clustered index is non-covering, a RID bookmark lookup is required. A RID lookup means that once the record(s) to be returned by the query are identified in the non-clustered index, additional reads (the RID bookmark lookups) must be made of the related rows in the heap, so that all of the data requested by the query can be returned. This is not very I/O efficient, especially if many rows are returned, and at some point it may be faster for SQL Server to do a table scan than to use the non-clustered index. On the other hand, if the non-clustered index is covering, then it can be used to return the data immediately, without having to look up anything in the heap (see the sketch after this list).
  4. If you want to create an XML index on an XML data column, a clustered index must exist on the table.
  5. If you want to create a spatial index on a spatial data column (GEOMETRY or GEOGRAPHY), a clustered index must exist on that table.
  6. If a heap has a non-clustered index on it (as the primary key), and data is inserted into the table, two writes have to occur: one to insert the row, and one to update the non-clustered index. On the other hand, if a table has a clustered index as the primary key, inserts take only one write, because a clustered index and its data are one and the same. Because of this, it is faster to insert rows into a table with a clustered index as the primary key than into a heap that has a non-clustered index as its primary key. This is true whether or not the primary key is monotonically increasing.
  7. When data in a heap is updated, and the updated row is larger than the old row and can’t fit into the old space, a forwarding record is left in the original location that points to the new location of the row. If this happens a lot, a lot of space is wasted in the database maintaining the forwarding records, and it also contributes to additional I/O activity, as both the forwarding record and the row have to be read (a way to detect forwarded records is shown in the sketch after this list).
  8. Even if the updated data in a heap is not larger than the old row (that is, it is smaller than or the same size as the original data), updating a heap with a non-clustered primary key is slower than updating the same table with a clustered index as the primary key, because updating a table with a clustered index is less write intensive than updating a heap with a non-clustered index as its primary key.
  9. Deleting a row from a heap with a non-clustered index as its primary key is slower than deleting the same row from the same table with a clustered index as its primary key, because the delete takes more I/O on the heap than on the clustered index.
  10. When data is deleted from a heap, the space on the page is not compressed (reclaimed), and even when all of the rows on a heap page are deleted, the entire page often cannot be reclaimed. This not only wastes space, it contributes to fragmentation of the data pages within a database.
  11. If you take two otherwise identical tables, one a heap with a non-clustered index as its primary key and the other with a clustered index as its primary key, the heap with the non-clustered index will be substantially larger, wasting valuable space and increasing disk I/O.
  12. The ALTER INDEX rebuild and reorganize options cannot be used to defragment and reclaim space in a heap (though they can be used to defragment the non-clustered indexes on a heap). If you want to defragment a heap in SQL Server 2005, you have three options: 1) create a clustered index on the heap, then drop the clustered index; 2) use SELECT INTO to copy the old table to a new table; or 3) use BCP or SSIS to move the data from the old table to a new table. In SQL Server 2008, the ALTER TABLE command has been changed so that it can now rebuild a heap directly. The sketch below illustrates these options.
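
To make items 3, 7, and 12 a little more concrete, here is a minimal T-SQL sketch. The table dbo.MyHeap, its columns, and the index names are all hypothetical, invented purely for illustration; the forwarded-record check assumes SQL Server 2005 or later, and the ALTER TABLE ... REBUILD line assumes SQL Server 2008.

```sql
-- Hypothetical heap (no clustered index) for illustration.
CREATE TABLE dbo.MyHeap
(
    OrderID  int         NOT NULL,
    Customer varchar(50) NOT NULL,
    Amount   money       NOT NULL
);

-- Item 3: a covering non-clustered index lets this query avoid RID lookups,
-- because every column the query references is in the index.
CREATE NONCLUSTERED INDEX IX_MyHeap_Customer
    ON dbo.MyHeap (Customer)
    INCLUDE (Amount);

SELECT Customer, Amount
FROM dbo.MyHeap
WHERE Customer = 'ACME';   -- satisfied entirely from the index

-- Item 7: check the heap for forwarded records
-- (index_id 0 selects the heap; DETAILED mode populates the count).
SELECT index_type_desc, forwarded_record_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID('dbo.MyHeap'), 0, NULL, 'DETAILED');

-- Item 12, SQL Server 2008: rebuild the heap directly.
ALTER TABLE dbo.MyHeap REBUILD;

-- Item 12, SQL Server 2005 workaround: create, then drop, a clustered index.
CREATE CLUSTERED INDEX IX_MyHeap_Defrag ON dbo.MyHeap (OrderID);
DROP INDEX IX_MyHeap_Defrag ON dbo.MyHeap;
```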

Check Out the Free Performance Analysis of Logs (PAL) Tool

Traditionally, analyzing Performance Monitor data has not been an easy task. It often involved viewing Performance Monitor logs in graph view, seeing what was happening, and hoping that your knowledge of SQL Server and the Windows OS was enough to pinpoint specific performance-related issues.

Thanks to SQL Server MVP Jonathan Kehayias, I recently learned about a tool that can greatly benefit DBAs who want to automatically analyze Performance Monitor logs. The tool is called “Performance Analysis of Logs”, or “PAL” for short. It was developed by Clint Huffman of Microsoft, and it is free and open source, available for download from CodePlex at www.codeplex.com/PAL.

In brief, the tool takes a Performance Monitor log file (.blg), analyzes it, and provides you with an HTML report that presents a graphical analysis of the data, including alerts for those counters that exceed a specified threshold. The thresholds for the counters are based on those recommended by various Microsoft product teams and Microsoft support.

The tool includes many features, and is customizable, allowing you to configure it to meet your specific needs. While it won’t help resolve all of your performance problems, it provides a great start, and I highly recommend that all DBAs download it and give it a try.

Below is a very brief overview of the tool.

Using PAL

The first step is to create a Performance Monitor log file, as this will be the source of the data that PAL analyzes. In the example below, I am using the Reliability and Performance Monitor, and I have created a data collector called the “PAL Data Collector.” You can include as many counters as you want, but PAL won’t analyze all Performance Monitor counters, only those that it knows about (more on this later). In addition, you will want to collect data over a representative period of time, such as one hour, one day, and so on. Keep in mind that collecting Performance Monitor logs can become resource intensive if you collect a lot of counters for long periods of time. Once you have collected the data, copy the log file to a local desktop computer; this way, running PAL won’t negatively affect the performance of the server you are analyzing.

[Screenshot: creating the “PAL Data Collector” in Reliability and Performance Monitor]

Now, it’s time to start PAL. The tool is wizard driven, so once the Welcome screen appears, click “Next.”

[Screenshot: the PAL Welcome screen]

The next step is to tell PAL the location of the Performance Monitor log file you want to analyze. Notice (below) that you can analyze an entire log file or just a portion of it, restricted by date and time.

[Screenshot: selecting the Performance Monitor log file to analyze]

The “Threshold File” screen is probably the most important screen to complete correctly. Notice near the top of the screen below that “Microsoft SQL Server 2005” is listed next to “Threshold File Title.” This is actually a drop-down box where you can select different analysis templates.

[Screenshot: the Threshold File screen]

Below is an example of the analysis templates that are available. As you can see, PAL is a multipurpose tool that can be used to analyze the performance of many different Microsoft products. There are templates for both Microsoft SQL Server 2000 and 2005; there isn’t yet a SQL Server 2008 template, although some of the existing templates can be used with SQL Server 2008.

[Screenshot: the available analysis templates]

Once you have selected an analysis template, you can optionally click the “Edit” button, and a screen similar to the one below appears. This screen lists every Performance Monitor counter analyzed by the template and allows you to edit them. What I find really useful about this screen is that it lists all of the threshold figures Microsoft uses to determine whether a particular counter is exceeding its recommended value. When you create your Performance Monitor logs, I highly suggest you collect all of the counters listed in the analysis template you want to use, so you can take full advantage of PAL’s analysis.

[Screenshot: editing the counters and thresholds in an analysis template]

Once you have selected an analysis template (SQL Server 2005 in my case), the next step is to answer some questions about your server. The questions that are presented depend on which analysis template you have selected. Below, notice that there are five questions that you need to answer. This data is used when the analysis report is created later.

[Screenshot: the server questions for the selected analysis template]

Next, the wizard presents the Analysis Interval screen. Here, we tell PAL how we want to slice and dice the data. For example, if we want to see Performance Monitor counter analysis for every 60-second period, or for every hour, we can; this way, you can see what was happening during whatever interval you specify. If you choose “AUTO,” PAL will automatically slice your data into 30 time slices. As this is not a great choice in most cases, you will probably want to pick your own time period.

[Screenshot: the Analysis Interval screen]

The Output Options screen (below) allows you to specify where the reports will be written. You can use the default values, or change them as you like.

[Screenshot: the Output Options screen]

The Queue screen (see below) shows you the script that PAL will run to analyze your data. In most cases, you will just click “Next” to continue.

[Screenshot: the Queue screen]

Now we are ready to analyze the Performance Monitor log file. In most cases, you will want to use the default “Execute: Execute what currently is in the queue” option, and the analysis begins. If you have collected a lot of data, it might take some time for the analysis to complete.

[Screenshot: executing the analysis]

Once the analysis is over, a single-page (and very long) HTML report is produced. The top part of the report (shown below) is a table of contents and allows you to quickly drill down into the area(s) you want to focus on. It also shows you those areas where alerts have been produced. As you might imagine, an alert is not a good thing.

[Screenshot: the table of contents at the top of the HTML report]

If you scroll past the table of contents (assuming you don’t want to drill down right away), you come to the “Alert” section of the report (see below). This is one of the most powerful parts of the report: the Performance Monitor counters are summarized for each Analysis Interval you selected. For example, I set my Analysis Interval to 5 seconds, so in the figure below you can see three 5-second intervals (the last interval is cut off), along with the alerts created during each one. In the interval starting at 6:26:35 PM there were three alerts; in the next interval, two alerts; and in the last interval, seven alerts. In other words, PAL has summarized the counters for each Analysis Interval and, based on the thresholds set inside PAL, produced various Performance Monitor counter warnings, along with supporting data for each counter. If a performance counter was fine during a time period (as defined by the predefined counter thresholds), it won’t appear in this report.

Blue alerts are informational, yellow alerts are warnings, and red alerts are critical. So if you see a lot of yellow or red alerts, this is where you want to focus your time, as these are strong indications of where your server may be having a problem. You can also drill down into this information by clicking on an alert (it is a hyperlink), which takes you to the location on the page with detailed data about that alert, providing additional information to help you identify and resolve the specific problem.

[Screenshot: the Alert section of the report, summarized by Analysis Interval]

For example, let’s say that I want to find out more about the red alert for the first interval for “Page Life Expectancy < 300 Seconds,” which is shown in the above screen. To find out more about this alert, I click on the hyperlink and the following detail screen appears.

[Screenshot: the Page Life Expectancy alert detail]

In the above detail section of the report, I can see a graph of this counter over time (in this case, just a few seconds), along with the minimum, average, maximum, hourly trend (mine is negative because I only captured a few minutes of data), and other statistical data.

If you don’t know what this counter means to you as a DBA, there is an explanation at the top of the screen that tells you that if this value is below 300 (which it is), then adding more memory to your SQL Server instance would most likely boost performance. Each alert in the report provides similar information, providing you with a “short course” in Performance Monitor counter analysis.
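
Incidentally, you don’t have to wait for a Performance Monitor capture to check this particular counter; SQL Server exposes it through a dynamic management view. A quick sketch, assuming SQL Server 2005 or later:

```sql
-- Check Page Life Expectancy directly from SQL Server (2005 and later).
-- The object_name prefix varies with the instance name, hence the LIKE.
SELECT [object_name],
       [counter_name],
       cntr_value AS page_life_expectancy_sec
FROM sys.dm_os_performance_counters
WHERE [counter_name] = 'Page life expectancy'
  AND [object_name] LIKE '%Buffer Manager%';
```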

Summary

PAL can take a Performance Monitor log file, analyze it, and highlight those counters that exceed commonly accepted thresholds, helping you determine what problems, if any, your server is experiencing. This information, along with other data you gather, can help you figure out what is happening with your server and, hopefully, provide you with enough information to correct it.

This short blog entry just shows you a very small part of what PAL can do for you. Hopefully, you see the potential power this tool has, and you will download it and give it a try.