Optimizing a Power BI Data Model
The course is part of this learning path
In Optimizing a Power BI Data Model we start by investigating the underlying structure and operation of Power BI's VertiPaq database engine. Armed with this knowledge, we investigate various strategies for optimizing performance. Not to give the game away, most of the strategies involve reducing the size of the data model and loading pre-summarized data. We also look at tools within Power BI and Power Query Editor for analyzing report and query performance as well as internal error tracking and diagnosis.
- Remove unnecessary rows and tailor data sets for report audiences
- Change data types to improve storage and split columns into multiple columns to reduce the overall number of unique values
- Remove redundant or duplicate data and replace columns with measures where possible
- Use Performance Analyzer to highlight report bottlenecks
- Create aggregations for imported and direct query data models
- Configure and use Power BI's and Power Query's diagnostic tools
This course is intended for anyone who wants to improve the speed and responsiveness of their reports by improving the underlying data model.
We would highly recommend taking the Developing a Power BI Data Model and Using DAX to Build Measures in Power BI courses first, as they go into depth about how to implement several of the topics discussed in this course.
There are two types of diagnostics available to us within Power BI and Power Query editor. Power BI desktop has the ability to record crash dump files that contain much more information than an error dialogue might present. This functionality has to be turned on in diagnostic options within options and settings. As well as error tracing, we can also enable crash dump collection that will record detailed errors if Power BI desktop freezes and crashes on you.
I'll just enable tracing and then try to connect an SQL Server instance on a non-existent server. In this case, the error dialogue is very descriptive, and the information comes directly from SQL Server but let's go and have a peek at the trace files to see what they look like. Go back into diagnostics and options and settings and click on the open crash dump traces folder link. The Microsoft mash-up container folders is produced by the Power BI query engine, whereas the PBI desktop file is created by the Power BI desktop application.
The PBI file's size shows as 0 kB and won't be populated until you close Power BI desktop. Because this is a data error as opposed to an application error, I'll open up the larger of the mash up container files. You can use any text editor to have a look at the trace file, and I'm using visual studio code. It's quite easy to see that there are two error lines in the file, and if I scroll across to the right-hand side, sure enough, we have a detailed message telling us what we already know, and that is the server cannot be found. If you are only interested in the mashup files and don't close Power BI desktop, remember to uncheck enable tracing once you've finished having a look at your trace files; otherwise, Power BI will continue to produce them while it remains open.
It's Query diagnostics within Power Query editor that's going to be more useful on a day-to-day basis. Under the Tools tab, you can start recording diagnostic data with the start diagnostics button before performing some action like refreshing or pulling your data. I'll start recording diagnostics and then refresh the preview. Once I've finished, I'll stop diagnostics, and we can see a new diagnostics folder is created with three files and it.
If this was a purely imported model instead of direct query with an aggregation, we just have two files, a diagnostic and a diagnostic detail. Query column relates to the table in the query's pain on the left, while the category column groups the operation type, found in the operation's column. There is a plethora of information available in these diagnostic files. You may want to create visualizations to help you interpret and easily pinpoint issues with your DAX queries and data loading procedures.
Hallam is a software architect with over 20 years experience across a wide range of industries. He began his software career as a Delphi/Interbase disciple but changed his allegiance to Microsoft with its deep and broad ecosystem. While Hallam has designed and crafted custom software utilizing web, mobile and desktop technologies, good quality reliable data is the key to a successful solution. The challenge of quickly turning data into useful information for digestion by humans and machines has led Hallam to specialize in database design and process automation. Showing customers how leverage new technology to change and improve their business processes is one of the key drivers keeping Hallam coming back to the keyboard.