Optimizing a Power BI Data Model
The course is part of this learning path
In Optimizing a Power BI Data Model we start by investigating the underlying structure and operation of Power BI's VertiPaq database engine. Armed with this knowledge, we investigate various strategies for optimizing performance. Not to give the game away, most of the strategies involve reducing the size of the data model and loading pre-summarized data. We also look at tools within Power BI and Power Query Editor for analyzing report and query performance as well as internal error tracking and diagnosis.
- Remove unnecessary rows and tailor data sets for report audiences
- Change data types to improve storage and split columns into multiple columns to reduce the overall number of unique values
- Remove redundant or duplicate data and replace columns with measures where possible
- Use Performance Analyzer to highlight report bottlenecks
- Create aggregations for imported and direct query data models
- Configure and use Power BI's and Power Query's diagnostic tools
This course is intended for anyone who wants to improve the speed and responsiveness of their reports by improving the underlying data model.
We would highly recommend taking the Developing a Power BI Data Model and Using DAX to Build Measures in Power BI courses first, as they go into depth about how to implement several of the topics discussed in this course.
Optimizing a Power BI data model is conceptually very simple when you understand that the underlying Vertipaq database engine architecture is a column store of unique values. Less unique values, the smaller the dataset and the less work the engine has to do to find a value. This is why column cardinality is intrinsic to Power Bi's performance. There are a variety of strategies to reduce a data set's size, and depending on the nature of your reporting data, you can choose to implement one or any combination of them.
First and foremost, you only want to have data you are reporting on in the data model. There is no point in populating or refreshing a model with data that will never be seen in reports. Start with the minimum amount of data, including only the columns necessary to meet your reporting requirements. It's easy enough to add additional fields later. If your report users are restricted in what they can view, then consider separate and smaller datasets for each audience.
When possible, replace columns with calculated measures to make the data model smaller and improve performance as measures tend to be faster than interrogating large datasets. Measures can also enable you to massage data into more meaningful information. Some fields can be split into multiple columns that reduce the overall cardinality of the data. Splitting date-time into date and time is one way to significantly reduce cardinality, or perhaps drop the time component altogether and have data aggregated only by date.
Aggregation is another way to both reduce the dataset size and have information already summarised for reporting. Power BI has an aggregation feature that pulls data from a direct query source into hidden aggregation tables that Power BI automatically manages.
Power Query Editor has diagnostic tools that help you look into and investigate the processes and queries that populate your data model. You can record actions such as refreshing data and see exactly what is going on in the background and how long each step takes. Performance analysis is not restricted just to the data model. Performance analyzer within the reporting function allows you to see which graphical elements take the longest to retrieve data and render it.
My name is Hallam Webber, and I've enjoyed presenting this course on optimizing a Power Bi data model. I hope you have also enjoyed it and learned how to optimize your own data models.
Hallam is a software architect with over 20 years experience across a wide range of industries. He began his software career as a Delphi/Interbase disciple but changed his allegiance to Microsoft with its deep and broad ecosystem. While Hallam has designed and crafted custom software utilizing web, mobile and desktop technologies, good quality reliable data is the key to a successful solution. The challenge of quickly turning data into useful information for digestion by humans and machines has led Hallam to specialize in database design and process automation. Showing customers how leverage new technology to change and improve their business processes is one of the key drivers keeping Hallam coming back to the keyboard.