DBAmon
What DBAmon Monitors

Home | Index / DBAmon Documentation | DBAmon Change History | DBAmon Event/Error Documentation |
What DBAmon Monitors | Free Oracle Tool: orastat | Request Support | DBAmon Download | DBAmon Gadget

What DBAmon Monitors: MSSQL

Note: While it may appear that DBAmon does not monitor as many metrics for MSSQL as for other DBMS's,
MSSQL actually does a good job of monitoring itself. See #5 below -- DBAmon leverages MSSQL alerts.

Category Event Metrics What
DBAmon
Monitors
Metric Data Source AutoFix
Action
Taken
(If Any)
Backup 1. Database Backup Age
(AutoFix)
BCKAGE_D This Metric monitors The number of hours since the last successful D (Differential) or F (Full) backup is measured. This metric will be measured for databases listed in the DBC "Backup_Check_DBs:" parameter. If the DBC "Backup_Age:" number of hours is exceeded, then this event will occur. Data Dictionary Table: msdb..backupset If specified, the DBC "Backup_Command:" is run.
Availability 2. Instance Running INS The DB is verified to be running and can be connected to. "net start" Is run to see which services are running + Data Dictionary View: master..sysdatabases + An ODBC connect is attempted  
Availability 3. Alerts Any MSSQL Alert (SQL Studio Management -> SQL Server Logs) with a severity of 17 or higher will cause a DBAmon Critical Event. the xp_readerrorlog procedure is used to read the log. System Proc: master..xp_readerrorlog
Availability 4. Agent Running A check is done to ensure that the SQL Agent is running. "net start"
Availability 5. Transaction Log Full TLOG_PCT_FULL The DBC "T_Tlog_Full:" parameter is used for the WARNING and CRITICAL thresholds. If the SQL Transaction Log:
  • Size versus the User Specified TLog MAX SIZE -or-
  • Size versus the DISK Free Space
... is exceeded, then a DBA657 or DBA658 event will occur.
DBCC sqlperf (logspace) + sp_helpdb
Availability 6. Disk Drive Full DRIVE_SPACE All disks which contain MSSQL DB Files are checked for capacity and freespace. If any are >= the DBC "T_Disk_Full:" percent full, an event will occur. dbamon_mssql_di.exe + sp_spaceused
Availability 7. Filegroup Full FILEGROUP_FULL If ALL of the DB files in the file group DO NOT have MAX SIZE set to UNLIMITED, then this check will take place. It will compare the TOTAL SIZE of all DB file to the TOTAL MAX SIZE of all DB files. If the DBC "T_FG_Full:" percent full is exceeded (warning and critical threhsolds) are exceeded, a DBA850 event will occur. Again, this check will not occur if ANY of the DB files are MAX SIZE UNLIMTED. In that case, the DRIVE FULL monitoring will be used. sp_helpdb + sp_helpfilegroup
Availability 8. Disk Potential Growth DISK_POTENTIAL In the FILEGROUP check above, the POTENTIAL GROWTH for each DB file (the MAX size minus the CURRENT size) is calculated for ALL DB file on ALL DISKS. If the total potential growth for ALL DB file, by disk, exceeds the FREE SPACE for that drive, then a DBA851 event will occur. sp_helpdb + sp_helpfilegroup
Availability 9. DBMail Running OK DBMAIL_STATUS A check is performed to endure that DBMail is working properly. if the proper status is not detected, then a DBA853 event will occur. msdb..sysmail_allitems
Availability 10. Database SUSPECT PAGES SQL Server will populate MSDB..SUSPECT_PAGES when a corrupt page is detected in SQL 2005+. This check looks for rows in this table, which would indicate DB CORRUPTION. The resulting DBA856 event will occur in this case. msdb..suspect_pages
Availability 11. Database Recovery Model Inconsistent With TLOG Backups This metric checks for the condition where a SQL Database has been created with RECOVERY MODEL set to FULL or BULK-LOGGED, but there are not any TLOG backups on record (during the last 30 days). See DBA519 for details. SELECT name FROM master.sys.databases WHERE recovery_model_desc != 'SIMPLE' AND is_read_only = 0 AND name NOT IN ( SELECT DISTINCT database_name FROM [msdb].[dbo].[backupset] WHERE type = 'L' AND backup_start_date > DATEADD(dd, -30, GETDATE()) )
Management 12. TLog Size vs. DB Size Ratio TLOG_RATIO If it is found for any MSSQL DB that the SQL Transaction Log is at least 1gb in size and at least 5 times the total size of all Database files, a DBA656 event will occur. sp_helpdb
Management 13. Zero Non-System Databases If a MSSQL instance is found to contain ZERO non-system databases, a DBA663 event will occur. master..sysdatabases
Management 14. OFFLINE Database(s) Found If at least 1 database is OFFLINE, a DBA804 event will occur. An ODBC connect at attempted
Management 15. SQL Jobs Running It is sometimes necessary in a SQL Instance to ensure that certain jobs are always running. An example may be a SQL Distribution agent job, etc. By specifying the new DBC "MSSQL_DDP_Job_Cat_MBR:" and DBC "MSSQL_DDP_Job_Name_MBR:" parameters, you can specify the JOB CATEGORY or JOB NAME of jobs that should always be running. if a job that is configured in this way is NOT running, then a DBA351 event will occur. sp_help_job
Management 16. SQL Job Failure If the most recent exection of an Enabled SQL Job ended with a Failed status, the DBA354 event will occur. This check will occur if the DBC "MSSQL_Job_Fail_Check:" parameter is set to Y. sp_help_job
Management 17. Hung SQL Backups If a SQL backup is found to be executing for at least 24 hours, an event will occur. This is usually due to a backup process being HUNG. In this case a DBA854 event will occur. master..sysprocesses
Management 18. MSSQL DB Ownership This may be a matter of opinion, but it is (or can be) a poor practice for a production MSSQL database to be owned by an individual. If that person were to be removed from DNS, some functionality would be lost, as well as the obvious security concern when a person changes jobs, etc. The remedy is to have all DBs owned by SA or some other generic ID (which you can specify), which is a good practice. DBAmon checks all DBs to see that they are either owned by SA or by a userid specified in the DBC "MSSQL_DB_OK_Owners:" parameter. If any DBs are found not owned by SA or the userids that you specify in the MSSQL_DB_OK_Owners: DBC parameter, a DBA353 event will occur. Corrective DDL is supplied in the event log. sp_helpdb
Management 19. MSSQL Job Ownership This may be a matter of opinion, but it is (or can be) a poor practice for a production MSSQL job to be owned by an individual. If that person were to be removed from DNS, some functionality would be lost, as well as the obvious security concern when a person changes jobs, etc. The remedy is to have all Jobs owned by SA or some other generic ID (which you can specify), which is a good practice. DBAmon checks all Jobs to see that they are either owned by SA or by a userid specified in the DBC "MSSQL_DB_Job_Owners:" parameter. If any Jobs are found not owned by SA or the userids that you specify in the MSSQL_Job_OK_Owners: DBC parameter, a DBA352 event will occur. Corrective DDL is supplied in the event log. sp_helpjob
Management 20. SQL Log String Checking It is possible to tell DBAmon to check for a string of your choice in the SQL Log, and then to set a threshold on the number of occurrences of this string. The DBC "MSSQL_String_Check:" parameter. Look at the desciption of this DBC parameter (URL above) for more information. xp_readerrorlog
Performance 21. *ANY* MSSQL or Server Performance Counter Performance monitoring for SQL server and the servers that it runs on is simple because of DBAmon's ability to monitor any Windows Performance Counter . This Windows feature allows the OS or any underlying program to expose performance data as a Performance Counter. Do a Google search for more information. With DBAmon it is possible to take reading from any Performance Counter and to set a threshold. A DBA852 event will occur if a counter sample exceeds the threshold that you have set. *ANY* MSSQL or Windows Performance Counter.
Performance 22. Known BAD DB Options If it is found that any DBs have options set which are known to cause performance problems (now just AUTOCLOSE and AUTOSHRINK), the a new DBA855 event will occur. DATABASEPROPERTYEX
Performance 22. "I/O Delay" Messages in SQL Log Any messages with the text "I/O Delay" in the SQL Log indicate that the is I/O slowness. A DBA665 Performance event will occur in this case. xp_readerrorlog
Performance 24. "A significant part of SQL Memory has been paged out" Messages in SQL Log Any messages with the text "A significant part of SQL Memory has been paged out" in the SQL Log indicate that the is I/O slowness. A DBA670 Performance event will occur in this case. xp_readerrorlog
Performance 25. TEMPDB Database File Count PERF_TEMPDB_FILE_COUNT To prevent PAGE ALLOCATION contention, it is a good practice to have 1 TEMPDB Datbase File per usable CPU (CORE), up to a maximum of 8. See the documentation for DBAmon (P)erformance event: DBA857 for more information. sp_helpdb
Performance 26. SQL Server Instance "Signal Wait" Percent PERF_SQL_SIGNAL_WAIT_PCT SQL DMO's are queried to determine the current SIGNAL WAIT PERCENT. This basically is the percent of ALL wait time which is spent WAITING FOR CPU. A high value here (over the 25% threshold) over a long period of time indicates a need for additional CPUs. See DBAmon event: DBA858 for more information. master.sys.dm_os_wait_stats
Performance 27. SQL Server VLF (Virtual Log File) Count Per Database PERF_SQL_VLF_COUNT Having too many (you can set the threshold) VLFs can cause performance problems with SQL Updates/Inserts/Deletes and with DB recovery at startup. A good article of VLFs can be found at: Kimberly Tripp's BLOG Entry . DBAmon simply counts the number VLF in each SQL database by running DBCC LOGINFO. The could of VLFs for each DB is then compared to the DBC "T_MSSQL_VLF_Count:" parameter. If the number of VLFs is greater then or equal to the T_MSSQL_VLF_Count: value, then a DBA859 event will occur. The text for this event contains URLs to good internet articles that will show you how to reduce the number of VLFs. DBCC LOGINFO
Performance 28. SQL Server Processes Blocked by Other Process PERF_SQL_BLOCKEDPROCSOTHERPROC The value that SQL exposes to the performance counter for BLOCKED PROCESSES is misleading (and NOT THERE AT ALL for SQL 2000). When an application uses MSSQL Parallel Processing (you know; MAXDOP), it is common for a process to be waiting for its own parallel subprocesses. This metric better give a picture of SPIDs that are waiting for other processes. select count(*) from master..sysprocesses (nolock) where blocked != 0 and blocked != spid
Performance 29. SQL Server "Lock Pages In Memory" Setting PERF_SQL_LPIM_LOCKED_KB It seems common that when we do a SQL install that we grant the server role to enable LOCK PAGES IN MEMORY (LPIM), but for a variety of reasons it is not really enabled. A NON-ZERO value for this metric indicates that LPIM is enabled. select locked_page_allocations_kb from master.sys.dm_os_process_memory
Performance 30. SQL Server "Lock Pages In Memory" Setting PERF_SQL_RUNNABLE_REQS This metric shows the number of RUNNABLE requests. A non-zero value here indicates a possible CPU bottleneck. select count(*) from master.sys.dm_exec_requests where status = 'runnable'


DBAmon.com
This Document: http://www.dbamon.com/misc/monitors_mssql.shtml