Assessing Column Characteristics for Power BI
The course is part of this learning path
Assessing Column Data looks at the tools and features in Power BI Desktop and Power Query Editor for assessing and profiling data within a dataset's columns. We cover essential characteristics such as column data types through to distribution and statistical properties of a column's data. This course shows you how to use the various graphical tools in Power Query Editor to visualize data distribution, including uniqueness and distinct values. These tools provide access to filtering functions and detection of empty column values and potential errors due to data type mismatches.
This course also looks at factors that impact correctly importing table relationships into a Power BI data model and pitfalls to look for when assessing and profiling column data.
- Learn how to carry out column profiling and assess table relationships in Power BI and Power Query Editor
This course is intended for anyone who wants to use Power BI for data analysis and wants to learn how to assess the characteristics of their data before doing so.
To get the most out of this course, you should have some experience with Power BI.
Before we dive into a Power BI demo, I want to give you an overview of the data we will be using. It's real but somewhat old data from a point-of-sale system used in a company that sold mobile phones. Apart from being old, I'll say up front that the data structure is not best of breed. Also, being a point-of-sale system, its design is transactional as opposed to a star structure or dimension and fact tables, as you would expect in a reporting database.
I'll be using a subset of tables from the SQL Server database, and they will be as follows. The products being sold are items categorized in three ways described by the departments, subdepartments, and classes tables. A customer walks into a store which is called a branch. When the customer buys something, a sales transaction record is created and all the items bought are recorded in the Translines table. Three fields uniquely identify a sales transaction. Branch, then station, the checkout or point of sale terminal within the branch, and the transno field.
Each station has its own database, so transaction numbers are unique to that station. The TransHeaders and Translines tables have a many-to-one relationship based on branch, station, and transno. The Translines table has another field called line, which uniquely identifies each record in combination with branch, station, and transno. Items are uniquely identified by a text field called UPC, also present in the Translines table. B_Users is the users' table where the unique combination of branch and login name uniquely identifies a user.
Hallam is a software architect with over 20 years experience across a wide range of industries. He began his software career as a Delphi/Interbase disciple but changed his allegiance to Microsoft with its deep and broad ecosystem. While Hallam has designed and crafted custom software utilizing web, mobile and desktop technologies, good quality reliable data is the key to a successful solution. The challenge of quickly turning data into useful information for digestion by humans and machines has led Hallam to specialize in database design and process automation. Showing customers how leverage new technology to change and improve their business processes is one of the key drivers keeping Hallam coming back to the keyboard.