Optimizing a Power BI Data Model
In Optimizing a Power BI Data Model we start by investigating the underlying structure and operation of Power BI's VertiPaq database engine. Armed with this knowledge, we investigate various strategies for optimizing performance. Not to give the game away, most of the strategies involve reducing the size of the data model and loading pre-summarized data. We also look at tools within Power BI and Power Query Editor for analyzing report and query performance as well as internal error tracking and diagnosis.
- Remove unnecessary rows and tailor data sets for report audiences
- Change data types to improve storage and split columns into multiple columns to reduce the overall number of unique values
- Remove redundant or duplicate data and replace columns with measures where possible
- Use Performance Analyzer to highlight report bottlenecks
- Create aggregations for imported and direct query data models
- Configure and use Power BI's and Power Query's diagnostic tools
This course is intended for anyone who wants to improve the speed and responsiveness of their reports by improving the underlying data model.
We would highly recommend taking the Developing a Power BI Data Model and Using DAX to Build Measures in Power BI courses first, as they go into depth about how to implement several of the topics discussed in this course.
What happens after you have optimized and streamlined your data model but your report performance is still sluggish or below expectations? Power BI Desktop has a Performance Analyzer tool, found under the View tab, which records the time taken by each of the processes required to render a report. Clicking on Performance Analyzer opens a new pane, where you click Start recording and then refresh the report. Whether you refresh just the visuals by clicking Refresh visuals or do a complete data refresh, Performance Analyzer records the same events; that is, it won't record the time spent pulling data from the original source.
This report is made up of multiple card elements with text box titles, and Performance Analyzer records the same three metrics for each visual element: DAX query, Visual display, and Other. DAX query, which covers not only calculated measures but any request for data from the model, is the time in milliseconds from requesting the data to when the results are returned to the report. Visual display is the time taken for the graphical element to be rendered on screen, including pulling any images or graphics from web-based sources. Other seems to me like a bit of a catch-all category.
Microsoft says it's the time for visual elements to prepare queries, time spent waiting for other visual elements to complete rendering, or performing some other processing. In other words, Other is essentially anything that isn't a DAX query executing or a visual element rendering. Unfortunately, there doesn't appear to be a total report rendering time, just a total for each graphical element. Looking at each of the card elements, we can see that Other makes up far and away the majority of the time taken to render the result. Gross sales is pulled directly from the LineTotal field, so the DAX is just summing LineTotal, filtered by the region table.
When I open the corporate margin card, which displays the margin calculated measure, we can see that the DAX query takes almost twice as long but is still minuscule in comparison with the Other category. Looking at all the other elements, we can see the same trend: the nondescriptive Other category makes up the vast bulk of the time taken. This report took 5.8 seconds to render. How can we improve the performance? Clearly, the biggest improvement is to be had by reducing the impact of the Other category, and the easiest way to do that is to reduce the number of visual elements that need to be rendered.
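As a rough sketch, the two measures behind these cards might look something like the following. The table and column names (Sales, LineTotal, LineCost) are assumptions for illustration; the transcript only tells us that gross sales sums the LineTotal field and that margin is a calculated measure built on it:

```dax
-- Hypothetical gross sales measure: a simple sum of the LineTotal column.
-- The region table's relationship to the fact table supplies the filter context,
-- so no explicit filter logic is needed here.
Gross Sales = SUM ( Sales[LineTotal] )

-- Hypothetical margin measure layered on Gross Sales.
-- DIVIDE handles a zero denominator safely, returning BLANK instead of an error.
Margin =
DIVIDE ( [Gross Sales] - SUM ( Sales[LineCost] ), [Gross Sales] )
```

A measure this simple explains why the DAX query time for these cards is only a few hundred milliseconds: the VertiPaq engine can satisfy a plain SUM over a compressed column very quickly.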
I've added two other reports that display the same information, but in slightly different ways, each using a different type of visual element. The page two report uses a multi-card graphic. In the first report, I went with the default number formatting, millions and thousands depending on the data value. I quite like that, but to keep it consistent across reports I've had to develop two other measures, formatted LineTotal and formatted margin.
Both of these use an IF statement to determine whether the value is greater or less than 1 million and format the number appropriately. Formatted margin is slightly different because its data source is another measure. Visually, the reports aren't exactly the same, but that's beside the point of the exercise. Now when I record the report rendering after clicking Refresh visuals in Performance Analyzer, there is only one visual element involved, so the total for that element is the report's total: a substantially faster 1.8 seconds. What is really interesting, apart from the massive drop in the Other category, is the increase in the DAX query.
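The formatting measures described above might be sketched like this. The exact names, thresholds, and format strings are assumptions based on the transcript's description (an IF around the 1 million mark, with formatted margin wrapping the margin measure):

```dax
-- Hypothetical formatted measure: pick the display scale at the 1 million mark.
-- FORMAT returns text, so this measure produces a string rather than a number.
Formatted LineTotal =
VAR Total = SUM ( Sales[LineTotal] )
RETURN
    IF (
        Total >= 1000000,
        FORMAT ( Total / 1000000, "#,0.00" ) & "M",
        FORMAT ( Total / 1000, "#,0.00" ) & "K"
    )

-- Formatted margin wraps the existing [Margin] measure --
-- the "measure within a measure" mentioned in the transcript.
Formatted Margin =
VAR M = [Margin]
RETURN
    IF (
        M >= 1000000,
        FORMAT ( M / 1000000, "#,0.00" ) & "M",
        FORMAT ( M / 1000, "#,0.00" ) & "K"
    )
```

Because FORMAT converts the result to text inside the measure, the engine is doing per-value string work on top of the aggregation, which may go some way toward explaining the jump in DAX query time that follows.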
In the first report, where I went with the default formatting that Power BI provided, the DAX query took 280 ms. In this report, where I have essentially added two more calculated measures, one of which is a measure within a measure, the DAX query time is now almost 1.7 seconds, a six-fold increase. The third report uses the same measures as the second but displays the information in a table. As we can see, there is practically no difference in the time taken to render a table report over a multi-card one.
To be honest, I am quite staggered by the increase in the DAX portion of the report rendering time, so I'm going to create another report and drop report two's multi-card element on the page, but this time go with the basic LineTotal column and the original margin measure. I'll start recording and hit Refresh visuals. Wow! That is phenomenal, so much so that I'm going to have to hit Refresh visuals again. Performance Analyzer has shown us that multiple graphical elements are significantly slower to render than a single element, and it has helped us identify some less-than-optimal DAX in the form of the formatted LineTotal and margin measures.
Hallam is a software architect with over 20 years' experience across a wide range of industries. He began his software career as a Delphi/Interbase disciple but changed his allegiance to Microsoft with its deep and broad ecosystem. While Hallam has designed and crafted custom software utilizing web, mobile, and desktop technologies, good-quality, reliable data is the key to a successful solution. The challenge of quickly turning data into useful information for digestion by humans and machines has led Hallam to specialize in database design and process automation. Showing customers how to leverage new technology to change and improve their business processes is one of the key drivers keeping Hallam coming back to the keyboard.