Storage & Compression
The course is part of this learning path
SQL is the language for interacting with databases, and like all computer languages, there are many ways to achieve the right result. And like other software languages, some ways will be better and more efficient than others. What sets SQL and database programming apart from most other computer programming is the vast amount of data that is regularly dealt with, and how other factors that are not immediately obvious can impact the SQL query code you write.
This course looks at ways that you can see how the SQL Server engine views your queries and how it intends to execute them. Using that information, we look at other factors like table and index design that can positively or negatively impact your query execution.
For any feedback relating to this course, please contact us at email@example.com.
- Understand what impacts query performance when using SQL
- Learn how the SQL Server engine processes queries
- Use execution plans to see how a query is executed
- Instruct the query engine
- Understand how different query types can be used
- Learn how to design tables to make queries more efficient
- Use database file configuration to improve performance
- Understand how database compression can impact performance
This course is aimed at database administrators, data engineers, or anyone who wants to learn how to improve database query execution.
To get the most out of this course, you should be familiar with SQL query syntax, SQL data types, and SQL table structure.
Whichever version of SQL Server you are working with, whether that's Azure SQL, SQL Managed Instance, or the traditional server product, they all have the same internal mechanisms for processing queries. The query engine has two modes of operation, row mode, which is the one that most people think of where each row or record is processed one at a time. By that, I mean each row or an index related to it is inspected to see if the values it contains match the ones you are looking for. That can be when joining one table to another, or in the where clause of the query.
Batch mode can process values from multiple rows simultaneously, but this is dependent on using Columnstore technology, which we will look at later. When you execute a query against the database, your SQL statement is pulled apart into its constituent clauses, turned into expressions the query engine can understand, and then set off to retrieve or manipulate data. Well, that's kind of how it works, but it's considerably more involved and complicated.
So yes, the SQL takes is pulled apart so the query engine can understand what it needs to do, but that's not the end of the story. A query optimization engine then uses schema information to examine the structure of the tables containing the information you want, plus all the index related to those tables and the density of those indexes, which is part of the database statistics.
The density of an index is a measure of granularity. What I mean by that as a product has a unique barcode, but it also might belong to several product categories. A barcode index is the least dense as it has a one-to-one relationship with the product table.
While an index on the supplier is denser because many products share the same supplier. Because many indexes are made up of multiple columns, so the density calculation is the overall density of those columns values. The query optimizer wants to strike a balance between returning the results sets as fast as possible and using the least resources as possible. It will use index statistics to determine which index will deliver the best, least cost performance based on the query.
The distillation of the query optimization engines calculations is a query plan, also known as an execution plan. Before we get into looking at query plans, I just wanna talk about SQL Server query store. As the name suggests, the query store is where SQL Server and Azure SQL records details of query execution, their plans, and the related statistics. This feature is useful for analyzing historical query execution data.
If you notice a degradation in query performance, also known as query regression, you can use the query store to help pinpoint when performance started to drop off. There are two ways you can enable a database as query store. In SQL Server management studio, you can right click on the database and select properties. Then go to the query store page and set the operation mode to read write. You can also use the alter database command with set query store equals on and operation mode equals read write.
Having executed the alter database command with query store on, we can go back to the database properties and see that the operation mode has been updated to read write. Once the query store has been enabled, we get a query store node under the database with several child nodes that are predefined reports on query performance.
I'll go into top resource consuming queries, which isn't displaying anything right now, but if I go to the top right-hand corner and click configure, I can specify what type of metric I want to view, the timespan I want to view it over and the number of results to return. So I'm just going to leave it as is, except for the time interval, which I will change to the last day.
The first query I have here is one where I'm inserting data into a table that I'll be using later on in the course. In fact, the first few queries are the same SQL insert statement, but the first time I ran it obviously took the longest. But once SQL Server had warmed up, then the subsequent executions were much quicker.
The next execution, which is for query 145, we can see in the plan summary window on the right that it has two plan IDs. These are the execution plans SQL Server has worked out for this query. In the bottom panel, we can see the query execution plan. And as I flick from plan five to plan 12, we can see the plan changes in the bottom pane. The reason for this is that between the executions, I placed a clustered index on the sales data table that was being inserted into.
We can also see in the green font an index that is suggested to improve performance. And by right clicking on it and selecting missing index details, we get the SQL statement for creating the index. Regressed queries are those where performance has degraded over time. Here is query 145 that we looked at previously. And the reason it appears in regressed queries is that placing a clustered index on sales data has slowed down into performance as you would expect. Not that it makes a lot of sense in the situation, but you can force a query to use a particular plan. So I'll force the query to use plan 12.
Tracked queries let you view a query's performance over time. Enter the query ID and the tracking query field and click the green play button. We can see how the query performed under different planned scenarios at different times. This view is useful for identifying query performance problems related to operational loads on the server.
In terms of server-wide performance issues, we can look at overall resource consumption. Once again, clicking on configure at the top right allows you to specify which metrics you want displayed and over what time period. Another predefined view is queries with forced plans. And we can see query 145 that I previously forced to use plan 12.
If I connect to an Azure SQL Server, I can view the same predefined query store metrics as with SQL Server on premise. Queries 323 and 308, that we shall see more of later in the course are shown here. When I go to the Azure portal and go into my database blade, I can see similar predefined query reports and a query performance insight. Once again, top five queries by CPU, data IO or log IO. And if I go into the custom page, I have similar functionality as we saw with the configure options in SQL Server management studio. Here I can see the two queries that we looked at before 323 and 308. And while we don't have play in detail, we can view resource consumption over time.
Hallam is a software architect with over 20 years experience across a wide range of industries. He began his software career as a Delphi/Interbase disciple but changed his allegiance to Microsoft with its deep and broad ecosystem. While Hallam has designed and crafted custom software utilizing web, mobile and desktop technologies, good quality reliable data is the key to a successful solution. The challenge of quickly turning data into useful information for digestion by humans and machines has led Hallam to specialize in database design and process automation. Showing customers how leverage new technology to change and improve their business processes is one of the key drivers keeping Hallam coming back to the keyboard.