It’s been awhile since I have worked on my checklist series of blog posts that will eventually be turned into a book. This time around the checklist focuses on the kinds of things you might want to monitor on your SQL Server instance to ensure that they are working as you expect.
Unlike some of my previous checklists, this one is different in the fact that the checklist is asking if you are monitoring a particular aspect of a SQL Server instance, and if you are, how you are performing the monitoring, and how often are you performing the monitoring. My goal is not to suggest what you should monitor, how you should monitor, or how often you should monitor, but only to get started thinking about what kinds of things you should be monitoring on a periodic basis. Towards the end of the list I have provided a list of some of the more common ways you might consider monitoring different aspects of your SQL Server instances.
As with all checklists, this one is not intended to cover 100% of all the possible things you might want to monitor, but to include the key ones. Some of the items on the list may not be applicable to you, some may be applicable but you don’t care, and in other cases they are applicable and you do care.
Below is the current version of my SQL Server Monitoring Checklist, which is in the form of a spreadsheet that can be used not only to collect and store information, but it can be customized by DBAs as needed to better meet their unique environment.
Hardware Monitoring | Yes/No | How | How Often |
Is Physical Hardware Monitored: | |||
Is I/O Subsystem (DAS, SAN) Monitored: | |||
Is Free Disk Space % Monitored: | |||
Is Physical Disk Fragmentation Monitored: | |||
Overall Server Performance | Yes/No | How | How Often |
Is CPU Utilization Monitored: | |||
Is Memory Utilization Monitored: | |||
Is I/O Utilization Monitored: | |||
Is Network Utilization Monitored: | |||
High Availability | Yes/No | How | How Often |
Is Clustering Monitored: | |||
Is Database Mirroring Monitored: | |||
Is Log Shipping Monitored: | |||
Are Backups Monitored: | |||
Are Offsite Backups Monitored: | |||
Operating System Monitoring | Yes/No | How | How Often |
Are OS Events Monitored: | |||
Are Hardware Event Logs Monitored: | |||
Are System Logs Monitored: | |||
Are Application Logs Monitored: | |||
Are Security Audit Failures Monitored: | |||
Are Other Windows Logs Monitored: | |||
Is Virtualization Monitored: | |||
SQL Server Instance Monitoring | Yes/No | How | How Often |
Is SQL Server Connectivity Monitored: | |||
Is the SQL Server Service Monitored: | |||
Is the SQL Server Agent Service Monitored: | |||
Is the Analysis Services Service Monitored: | |||
Is the Reporting Services Service Monitored: | |||
Is the SSIS Service Monitored: | |||
Is the Service Broker Service Monitored: | |||
Is the Full-Text Search Service Monitored: |
|||
Are SQL Server Agent Jobs Monitored: | |||
Are SQL Server Alerts Monitored: | |||
Is the Database Mail Log Monitored: | |||
Is the SQL Server Log Monitored | |||
Is the SQL Server Agent Log Monitored: | |||
Are Policy Violations Monitored: | |||
Are SQL Audit Events Monitored: | |||
Is Resource Governor Monitored: | |||
Is the Default Trace Monitored: | |||
Database Monitoring | Yes/No | How | How Often |
Is Database Status Monitored: | |||
Are MDF and LDF Sizes Monitored: | |||
Is tempdb Size and Performance Monitored: | |||
Is Blocking Monitored: | |||
Are Deadlocks Monitored: | |||
Is Query Performance Monitored: | |||
Are Indexing Needs Monitored: | |||
Is Database Corruption Monitored: | |||
Is Replication Monitored: | |||
Documentation | Yes/No | How | How Often |
Is Server Documentation Monitored: | |||
Is the Disaster Recovery Documentation Monitored: | |||
Microsoft SQL Server Monitoring Tools Used | Yes/No | How | |
SQL Server Configuration Manager: | |||
SSMS Reports: | |||
SSMS Activity Monitor: | |||
SQL Server Alerts: | |||
OS Alerts: | |||
Performance Monitor: | |||
Profiler/SQL Trace: | |||
SQL Server 2005 Performance Dashboard: | |||
SQL Server 2008 Data Collector: | |||
SQL Audit: | |||
SQL Server 2008 Best Practices Analyzer | |||
RML Utilities: | |||
PAL: | |||
Extended Events: | |||
DMVs & Custom T-SQL: | |||
Custom PowerShell Scripts: | |||
Other Monitoring Tools Used | Yes/No | How | |
List Any Third-Party Monitoring Tools: |
I would like your input on my SQL Server Monitoring Checklist. For
example, I would like your input on:
- What is missing from the list? What other aspects of SQL Server should be monitored on a periodic basis? Keep in mind that I can’t include every possible variation.
- What on the list could be removed because it is not very important? Keep in mind that the list is designed to be generic, so there will be items on the list that will not be applicable to all SQL Server environments.
- Does my wording make sense, or should I change any of the wording so that it is more understandable or more accurate? Should I change some of the categories, or move some of the items to different categories?
Please add your comments below, and as I get feedback, I will update the checklist. Thanks!
Pingback: Anonymous
Regarding “List Any Third-Party Monitoring Tools:” I would like to know at a glance what such tools are being used for. This is probably something that could utilize the third column (where “Yes/No” is the first and “How” is the second), perhaps called “Use”, “Task” or whatever is appropriate. While the day-to-day DBA might know what that third-party tool does, this checklist could be viewed at some point by a person who doesn’t know this (for example, a newly hired junior DBA or someone in management).
Thanks Brad for this exhaustive list of items for tracking. It makes administrators life easy as well as organized. Please keep sharing your experiences to the community. It helps 🙂
Nice post.
What i tend to find is that for large shops, some of the monitoring items are designated for different departments, for example in my shop the windows team will take care of operating system/hardware tasks, whilst the DBA checks anything database related. I have also automated some of the checks that i do on a day to day basis, what i have done is to create an SSIS package that will import logs, dmv results, database sizes etc onto a central database, so that I can write trend reports etc from it, and also find out for example when i issue started. I will start some kind of blog post on this, see http://rahmanagoro.wordpress.com/