SQL Server Monitoring Checklist

I have worked with a lot of organizations who have just a small handful of SQL Server instances. In most of these cases, these SQL Servers contain mission-critical data, but on the other hand, these organizations are not big enough to justify a full-time DBA, so either they subcontract the DBA work to an outside organization, or they assign someone from within the company to be the a “part-time” DBA.

Often, many of these part-time DBAs don’t have the time, or the interest, in learning how to properly maintain the SQL Servers under their management. Because of this, many of these mission-critical SQL Servers are at the brink of disaster, just one small step away from going down, potentially bringing down their organization at the same time.

When I run across such SQL Server instances, I fix up their instances the best I can (there are usually limitations imposed on me on what I can and cannot do), and I do my best to mentor the “part-time DBA” on common best practices. While some of these “part-time DBAs” are eager to learn, many of them don’t always retain everything I tell them. To help overcome this problem, I like to create checklists (cheat sheets) for them so they can better remember what they need to do.

If these part-time DBAs tell me they don’t have the time to follow my checklists (which is often the case), then I will recommend a third-party monitoring tool, such as Red Gate Software’s SQL Response monitoring tool, to help them monitor their SQL Servers. SQL Response has the ability to monitor SQL Server instances and alert DBAs to potential problems, and is easy for “part-time” DBAs to implement and use.

While I generally create customized checklists for each organization (as each organization is slightly different), many of the items in my checklists are common to most SQL Servers. For example, one of the checklists I generally prepare is what I call my Day-to-Day DBA Checklist, which includes these general suggestions. I recommend that DBAs perform these tasks on a daily basis (or as applicable).

1. Make sure each of your production databases are available and accessible to their users.

2. Check all processes – such as clustering, replication, service broker, log shipping, database mirroring – to verify that they are running correctly.

3. Check OS, SQL Server, Security, and other logs for unusual events.

4. Verify that all scheduled jobs have run successfully, including checking for long running or “hung” jobs.

5. Confirm that backups have been made and successfully saved to a secure location.

6. Monitor disk space to ensure your SQL Servers won’t run out of disk space. For best performance, all disks should have 15% or more of free space.

7. Throughout the day, periodically monitor SQL Server performance using Performance Monitor, Profiler/SQL Trace, or other performance monitoring tools.

8. Regularly monitor and identify blocking issues.

9. Keep a log of any changes you make to servers, including documentation of any performance issues you identify and correct.

10. Regularly restore backups to a test server in order to verify that you can restore them. You don’t need to restore all backups every day, but do so often to ensure that you are confident you have good backups.

11. Take some time to learn something new as a DBA to further your professional development.

12. Automate as many of these day-to-day tasks as possible.

If you are an experienced DBA, your SQL Server Monitoring list is probably much longer, and you probably automate most of these tasks. Because many of the people I work with are “part-time” DBAs, I chose to keep the list as simple as possible.

So my question to you is, what additional items would you add to my Day-to-Day SQL Server Checklist, keeping in mind that the audience for this list is for “part-time DBAs” without a lot of SQL Server experience?

You can download a free PDF poster version of my Day-to-Day SQL Server Checklist here.