Power BI has changed the business intelligence (BI) landscape forever, enabling BI professionals and regular Excel users alike to work with big data and build beautiful and insightful dashboards. Integral to the Power BI platform is the data model.
This course discusses fundamental concepts of data modeling design within Power BI. We explore Microsoft's recommended best practices as well as common data modeling complications and the solutions to them. This course will prepare learners to develop data models strategically to achieve the best performance.
Learning Objectives
- Identify the primary components of data models and describe flat schemas and star schemas
- Diagnose common Power BI data modeling issues within models and recommend appropriate solutions to them
- Identify common challenges in Power BI data models, implement smart solutions, and avoid common mistakes
Intended Audience
- Business professionals whose job requires them to design and build data models in Power BI
- Anyone preparing to take the Microsoft DA-100 exam
Prerequisites
To get the most out of this course, you should be familiar with preparing data using Power BI.
There are lots of ways to define a good data model, and in this course, we will be talking about model schemes to achieve best performance and strategic relationships to obtain best data flow, and other vital functional items. But I'd like to start by talking about what it means to have a good data model from our user's perspective.
Now we may be designing a data model for ourselves, or designing it for colleagues, or for clients, but whoever the end user is, if they can't use it, all the analysis power in the world isn't worth much. So let's start here, by keeping the end user in mind when designing a data model.
First off, data models that are complex or convoluted take more time to navigate, require more complex data analysis expressions, and can just be frustrating for the end user. Simple models, on the other hand, offer a shorter startup time, and more energy can be spent on discovering insights. So it's good to simplify our data models as much as possible.
In this video, we're gonna look at some of the ways data models get messy and they just about always do get messy, and ways to simplify them and tidy them up. Data models that have fewer tables are just easier to navigate. The model view in PowerBI gives us an eagle-eye view of our data model. Each box in the model view represents a table in the model, and each line in the boxes represents a column in that table. If too many tables are added to the model, it can just be visually overwhelming for the user. And also, too many tables can sometimes cause functional issues or even lead to report inaccuracies. So this is why it is super helpful to know when you can and should consolidate your data into fewer tables, and how to do that.
In this model, I have these three tables, which are queried from three different sources that contain the same type of data. Notice that even though they come from different sources, their structure is really similar and they all also have the same level of detail. This is a really common real-world scenario, and it's a great example of data that can and should be combined into a single table. In this case, we would append the data using the Append tool in the query editor. The Append tool will keep the columns just as they are and just tack each table onto the bottom of the previous table so they become one long table.
We'll start by opening the query editor by clicking on Transform Data. Inside this command group, I'll find my Append tool. I do have three or more tables, and we'll see this current table is already included on the right side here, so let's add those other two tables, and click Okay. And because we're being really mindful right now of our end users and our model's readability, I'm just gonna rename this query right now to better describe this new combined data table.
All right, I have now combined that data into one table, but looking here, I can see those precedent tables where I got that data, they're still hanging out there in my model. You can actually remove them from that model view from right here in the query editor. You can right-click on the query and turn off enable load. This safely removes from the model view without ruining any of the dependent queries. This one adjustment we just did did some important things, it removes some visual clutter from our model, but I have also exponentially multiplied my analysis options by consolidating all of that data into one table.
Now, I can analyze those regions together, I can also compare them to each other, I can drill down into segments of those regions for a more granular analysis, and I can look at those at the same time on the same visual. This is a win-win. Every column that isn't needed should be removed from your data model, because every column adds a data point for every row in that data set. So in the case of fact tables or long tables, for instance, each column can actually add millions of additional data points, and this affects the performance of your data model in longer scan times, longer load and refresh times, and it can throttle your computer's memory.
In addition to that, it's just not very user-friendly to have a hundred columns to scan through to find the relevant field. So every column we don't need should be removed to improve performance and readability. But there are cases when even though the report builders won't be using it on their reports, we can't totally remove it because it's essential for some important calculation, or security settings. These can actually be hidden from report view to reduce that visual clutter.
You can do this from the model view by right-clicking the field and choosing Hide in Report View. You can also do this from report view by right-clicking and choosing Hide. You can even hide entire tables if they aren't necessary in report view. The following tools are gonna allow us to make our model view more readable as well. First of all, we have a lot of flexibility to adjust the model layout. We can just click and drag to move tables in a more logical arrangement.
If you find you need to scroll to see all the tables, remember that you can scale to fit using the button in the bottom right along with that zoom slider as needed. But even tidied up models can look excessive if the reader is only there to consume a small portion of the data model. For this reason, PowerBI introduced a feature that allows us to focus on a subsection of the model. You can start by adding a new view tab at the bottom of your modeling view, and then we'll open up our field pane and drag in just the tables we care to see in this view. We can right-click on a table and choose to bring in its related tables.
When I've got it the way that I want it, I'm just gonna make sure to name it so that the users that need it will be able to read it. After carefully combing our model and removing dispensable data, and consolidating information, our users will be able to hit the ground running and have a much better experience within our data models.
Now that we have a better understanding of what makes a good data model from the user standpoint, let's delve into designing data models for technical performance.
Chelsea Dohemann is a Senior Technical Trainer and Microsoft Certified Master with almost a decade of experience in technology training. She has taught an array of applications from Microsoft products including Office 365 web apps, Microsoft Office Suite, Power BI, VBA for Excel, and SharePoint to Adobe Acrobat Pro and Creative Cloud. Being a persistent learner herself, Chelsea is acutely in-tune with the challenges of learning. She presents her topics in plain language, with real-world examples, reducing complex concepts down to their simple parts.