SQL Server Management Studio
Index and Statistics
The course is part of this learning path
Information is at the heart of most software systems and the lifeblood of many organizations, so you want the database that stores this information to be efficient and reliable. But as we know, things happen; sometimes bad things. One of the ways that we can prevent bad things from happening is to know about them in advance like the old saying says, "To be forewarned is to be forearmed."
Azure SQL in its many forms has an abundance of features that help you to monitor the state of your databases and database server. Ranging from prebuilt automated monitoring that is augmented with artificial intelligence through to dynamic management views, SQL Server monitors and logs all aspects of the database engine’s operation and configuration. Intelligent Insights and Azure SQL analytics enable you to easily access the wealth of diagnostic and performance data in an easily digestible format.
This course introduces you to the different database monitoring and notification technologies available, how they work, and how to use them. If you have any feedback relating to this course, feel free to contact us at firstname.lastname@example.org.
- Understand the key elements of database monitoring
- Learn about the features of Intelligent Insights, Azure's AI-based database monitoring service
- Create graphical reports using SQL Server Management Studio
- Understand how wait statistics can show you where threads have to wait and how this can be used to monitor performance
- View and fix index fragmentation
- Monitor database storage
- Implement notification alerts on various database platforms
This course is aimed at database administrators or anyone who wants to learn how to implement systems that can find potential issues that may disrupt the delivery of their database services.
To get the most out of this course, you should have experience with SQL Server Management Studio, be familiar with reading and writing SQL, and have an understanding of basic database architecture and administration tasks, like indexes and backups.
Course Related SQL Scripts
As I mentioned earlier, outdated statistics and indexes can adversely affect query performance as the query plans are based on erroneous data. You can use T-SQL scripts to view table statistics and index fragmentation and if necessary, update those statistics or reorganize or rebuild fragmented indexes.
For all versions of SQL Server or Azure SQL, you can run scripts to update statistics and rebuild indexes.
Within SQL Server Management Studio you can create maintenance plans to update statistics and rebuild indexes for SQL Server on-premise, or in a VM. You can schedule these types of maintenance tasks for Azure SQL using an Azure runbook.
By querying sys.dm_db_index_physical_stats joined with sys.indexes, I can see the fragmentation of indexes on a particular table, in this case customer. I've just created an index on the code column so there is 0% fragmentation.
Now, if I insert just shy of 1 million records with a randomly generated customer code field and re-run the query, the resulting fragmentation is 86%. You have two options when addressing index fragmentation. You can either reorganize the index or you can rebuild it. Microsoft suggests, as a rough guide, that if row store index fragmentation is greater than 30%, you should rebuild the index, and if it's between 5 and 30% fragmented, then reorganize the index.
You can rebuild or reorganize all indexes on a table at once by using the ALL keyword instead of the index name. When a column store index's fragmentation exceeds 20%, it is recommended to reorganize the index. You can rebuild and reorganize indexes on an SQL Server hosted database through SQL Server Management Studio by right-clicking on the index and selecting Rebuild or Reorganize from the context menu.
You can use the DBCC SHOW_STATISTICS command to display statistics for an index. The command returns three result sets. The first one displays the name of the index, the number of rows and the number of rows that were sampled to calculate the statistic.
Steps refers to the number of divisions in the histogram breakdown, which is the third result set. Density is one divided by the distinct values, but is not used by the query optimizer and is only displayed for compatibility for versions of SQL Server prior to 2008. String index means that separate statistics are created when the first column of an index is a char or varchar type of column, for better performance when using the like operator.
If the statistics have been filtered with a predicate that will be displayed in the filter expression column, and as there is no filter expression, the unfiltered rows are the same as the total rows. In terms of the density vector, all density is calculated as one divided by the number of distinct values.
In this case, the index is on a single field called code, which is a varchar 18 and the table has a primary clustered index on an integer identity key. The average length of all code values is 13 more or less, and when we add in the four-byte primary key, we get 17 bytes. The histogram breakdown is the distribution of values over a series of intervals, in this case, 166, but the maximum number of intervals SQL Server will display is 200.
The range hi key is the upper bound of the interval. Range rows are the number of rows in the interval, excluding the upper bound. EQ_rows are the number of rows that equal the upper boundary, while the distinct_range_rows is the number of distinct values within the interval, also excluding the upper boundary, and finally, average_range_rows is the average number of duplicate values within the range interval.
This data lends itself to being displayed as a histogram, hence the name. If I scroll down to the bottom of the histogram breakdown there is no sign of any of the million new records I've inserted with the randomly generated code values. So, what I'll do is run an update statistics command on the customer code index.
Now that's completed I can see that the sampled rows are considerably less than the total number of rows. The show statistics command has used about 13% of the rows to update the statistics on this index. Scrolling down to the bottom of my histogram dataset, I can now see those new values.
If you wanna include all the rows in your statistics update, you can run it with the full scan option. Because of the high degree of fragmentation, I'll rebuild the index and then check the fragmentation again. As you would expect, the rebuild has resulted in no fragmentation.
In terms of automating index and statistics maintenance, you can use the maintenance plans feature within the management node of SQL Server Management Studio if you're working with SQL Server, either on-premise or on a VM or a managed instance. Alternatively, you can run your own scripts as a scheduled job using SQL Server Agent.
In terms of Azure SQL platform as a service, you have a couple of options for automating index and statistics maintenance if you wanna go beyond the auto-tuning facility already present. You can use a runbook through Azure Automation with PowerShell scripts to perform the tasks, or the new Elastic Job agent, which is another way of scheduling PowerShell script jobs.
Hallam is a software architect with over 20 years experience across a wide range of industries. He began his software career as a Delphi/Interbase disciple but changed his allegiance to Microsoft with its deep and broad ecosystem. While Hallam has designed and crafted custom software utilizing web, mobile and desktop technologies, good quality reliable data is the key to a successful solution. The challenge of quickly turning data into useful information for digestion by humans and machines has led Hallam to specialize in database design and process automation. Showing customers how leverage new technology to change and improve their business processes is one of the key drivers keeping Hallam coming back to the keyboard.