SQL is the language for interacting with databases, and like all computer languages, there are many ways to achieve the right result. And like other software languages, some ways will be better and more efficient than others. What sets SQL and database programming apart from most other computer programming is the vast amount of data that is regularly dealt with, and how other factors that are not immediately obvious can impact the SQL query code you write.
This course looks at ways that you can see how the SQL Server engine views your queries and how it intends to execute them. Using that information, we look at other factors like table and index design that can positively or negatively impact your query execution.
For any feedback relating to this course, please contact us at support@cloudacademy.com.
Learning Objectives
- Understand what impacts query performance when using SQL
- Learn how the SQL Server engine processes queries
- Use execution plans to see how a query is executed
- Instruct the query engine
- Understand how different query types can be used
- Learn how to design tables to make queries more efficient
- Use database file configuration to improve performance
- Understand how database compression can impact performance
Intended Audience
This course is aimed at database administrators, data engineers, or anyone who wants to learn how to improve database query execution.
Prerequisites
To get the most out of this course, you should be familiar with SQL query syntax, SQL data types, and SQL table structure.
Resources
https://github.com/cloudacademy/azure-sql-query-tuning-dp-300
We've seen query plans in the Query Store, so let's look at them in more detail. Query or execution plans come in three flavors. The Estimated Execution Plan is the initial query plan the optimization engine comes up with. The Actual Execution Plan is how the query ended up being executed. In the majority of cases, the Estimated and Actual Execution Plans are the same. Live Query Statistics display in real-time how much work is involved in each step of the query's execution.
Within SQL Server Management Studio you can view a query's Estimated Execution Plan by right-clicking on the query and selecting Display Estimated Execution Plan, or clicking the Display Estimated Execution Plan button on the toolbar, or using the Control + L shortcut key.
The Estimated Plan is exactly that, the query optimization engine calculates the plan it deems most appropriate for the query and returns it to you without actually executing the query. Here we have the same execution plan graphic as we saw in the Query Store.
An execution plan is read from right to left with each graphic representing a different operation and the arrows between them representing the relative volume of data, although in this example it's really only noticeable on the table scan of items where it joins TransLines in the hash match inner join.
Each operation is a relative cost to the overall query execution, so you can see which aspect of your query is most expensive. Once again, we also get a suggested index to apply to increase the overall performance. If I click on an operation's graphic, I can see details of that operation.
Clicking on the Translines table scan I can see the physical and logical operation which is indeed a table scan, and various estimated costs like I/O and CPU. Also, the number of rows to be read and at the bottom, we can see our predicate or where clause expression plus the list of columns that are output by that operation.
Clicking on a data flow arrow, I can also get information about the number of rows and the estimated total size of data plus the row size. As with the Estimated Execution Plan you can view an Actual Execution Plan either by right-clicking on the query and selecting Include Actual Execution Plan or by clicking the Include Actual Execution Plan toolbar button or use the Control + M key shortcut.
Most of the time, the Actual Execution Plan is the same as the estimated one, and we can see, that is definitely the case here. The estimated cost percent is now augmented with time values for each operation and we get the actual number of rows as a percentage of the estimated number.
When you find a significant variation between the estimated and actual rows in a particular operation, say the estimate is out by more than 100%, this could indicate a problem with the statistics the query engine is using for calculating its estimates. Without addressing the problem behind this discrepancy, query performance will never improve as the optimization engine is working with out of date or wrong data. The operation details are the same estimated metrics with the addition of the actual metrics.
Live Query Statistics are another variation on the Query Plan Theme. As we have previously seen, you can enable this by right-clicking on the query and selecting from the context menu, or clicking the Include Live Query Statistics toolbar button. When you execute the query, the previously static graphic of the query plan comes alive with animation and real-time updates of the operations as they progress.
Next to the SQL statement on the left, is an overall estimated query progress display. Once execution has finished, we can go to the Results tab and see a text printout of the query execution through its various stages. The graphical query plan is read right-to-left and this text version is read from the bottom to the top.
So, we start at row 13 with the table scan of Translines with the predicate of UPC does not equal TPAY2. The first column is the number of rows output from each operation, then we have statement text which is the operation for that row. That is followed by the statement ID column, which is one as there is only one SQL statement. Then we have the node ID and parent columns that describe the relationship between the different operations, followed by the physical and logical op columns.
There are various estimated metrics in this table as well as the output list column which details the output fields of each operation. You can also view the query plan with the Set Show Plan Text on command. You can get the complete range of statistics as we saw with Live Query Statistics but by using the Set Show Plan All command. These SQL plan commands aren't additive, so you will need to turn off Show Plan Text before setting Show Plan All to on.
As interesting as this all is, the real purpose behind this is to improve query performance. One of the first things you may have noticed about this plan is that it starts with a table scan, which means it's going to read the whole table. As we have a where condition on the TransLines table, we want to exclude a UPC from our query, we should have an index on the UPC field. It turns out there is an index on UPC, but it is just that field, so SQL Server will have to scan the index and then link it to the table to retrieve the appropriate data.
Based on what it knows of the index and the table, it's obviously decided that it would be just as easy to do a complete table scan. Therefore, the suggested index is also on the UPC column but it includes the fields that are needed to be returned for this query.
So let's implement the suggested index. I'll just create that index and then re-run the query. Straightaway we can see that the execution plan has changed as it starts with an index seek on the newly created index. If I go back to Top Resource Consuming Queries under the Query Store, which by the way defaults to the last hour, I can see that query before and after the index was created, and there has been a significant three-fold improvement in execution time. You can access query and plan information from the Query Store using SQL statements.
By looking into the query_store_plan, query_store_query, and query_store_query_text tables we can retrieve information from the Query Store. By adding a where clause, we can get information on a particular query, in this case, query 109 which is the insert into SalesData that we have been looking at. Then I can pull back the actual query plan from the Query Store Plan Table. The plan is stored as XML but clearly not in an XML field. If I rerun this query with plan cast as XML we can have a look at it, and that looks like SQL Server knows how to automatically display the query plan XML as a graphical plan.
Hallam is a software architect with over 20 years experience across a wide range of industries. He began his software career as a Delphi/Interbase disciple but changed his allegiance to Microsoft with its deep and broad ecosystem. While Hallam has designed and crafted custom software utilizing web, mobile and desktop technologies, good quality reliable data is the key to a successful solution. The challenge of quickly turning data into useful information for digestion by humans and machines has led Hallam to specialize in database design and process automation. Showing customers how leverage new technology to change and improve their business processes is one of the key drivers keeping Hallam coming back to the keyboard.