Advanced Analysis with Power BI
Advanced Analysis with Power BI examines various methods for teasing out insights from data using statistical methodologies and presenting significant findings in visually compelling formats. The course starts with basic statistics such as standard deviation and then progresses to AI and machine learning analysis, where Power BI does all the heavy lifting, allowing the user to investigate and dynamically explore significant findings.
- How to use Z-scores to display outliers and use the Outlier Detection visualization from Microsoft
- How to use Power BI's Anomaly Detection and Fluctuation Analysis functionality
- Use time-series forecasting to predict future data points with varying degrees of certainty
- Use groups to classify categorical data and bins to categorize continuous data
- Learn about Key Influencers
- Use the Decomposition tree to drill down into a metric manually using known factors or let AI functionality determine which factors are the major contributors
- Use the power of Azure's AI and machine learning to analyze text for positive and negative sentiment, keywords and phrases, and image tagging
This course is intended for anyone who wants to discover insights hidden in their data.
- Have a basic understanding of statistics, such as the difference between a mean and a median, what a normal distribution is, and conceptually how standard deviation relates to it
- Know how to connect a data source, load data, and generally use the Power BI Desktop and Power Query Editor environments
- The AI Insights demonstration requires a Power BI Premium account on powerbi.com
Typically, when you analyze data, you look for trends or exceptions to a trend, such as an outlier or an anomaly. You can go old school and detect outliers by creating calculated columns and measures using DAX formulas, or you can use analysis functions that are built into Power BI, downloaded from the Power BI marketplace, or available through AI and machine learning integration.
Outlier Detection is a visualization from Microsoft that you can freely add to your visualizations palette. The Outlier Detection component uses R-based analysis to detect and display outliers in a scatter plot, box plot, or density plot. Depending on the distribution of the data in question, you can select from various statistical methods, like Z-score, Tukey, local outlier factor, or Cook's distance regression. In all of these methods, you can set the detection sensitivity level.
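To make the Z-score and Tukey methods concrete, here is a minimal Python sketch of how each one flags outliers. It is not the R code behind the Outlier Detection visual; the function names, the `threshold` and `k` parameters (standing in for the sensitivity setting), and the sample `sales` values are all assumptions made for illustration.

```python
from statistics import mean, pstdev, quantiles


def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute Z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing can be an outlier
    return [v for v in values if abs((v - mu) / sigma) > threshold]


def tukey_outliers(values, k=1.5):
    """Return values outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical sample data: ten ordinary values and one extreme value.
sales = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 95]
print(zscore_outliers(sales, threshold=2.5))  # [95]
print(tukey_outliers(sales))                  # [95]
```

Lowering `threshold` or `k` makes detection more sensitive, which is roughly the effect of the sensitivity setting described above. The Z-score method assumes roughly normal data, whereas the Tukey fences rely on quartiles and cope better with skewed distributions.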
You can think of anomaly detection as outlier detection with extras: Power BI tries to determine from your dataset which other factors contribute to, or at least are associated with, the anomalous data point. Detecting anomalies works well, but identifying the factors associated with an anomaly is less reliable at this time. Power BI also includes an analysis feature that explains fluctuations in some types of charts. Right-clicking a data point lets you select an "analyze the increase or decrease" function, which displays charts for all the factors that Power BI believes are correlated with the change in the data.
Forecasting is another feature available with time-series data. You can plot future values based on past data points at various confidence levels. The forecast length parameter specifies how far into the future you want to plot values, whereas the "ignore the last" data points parameter lets the forecast overlap current data; that is, forecasting starts before the end of the time series.
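The interaction between forecast length, ignored data points, and confidence level can be sketched in a few lines of Python. This example uses simple exponential smoothing as a stand-in and should not be read as Power BI's forecasting model; the `forecast` function, its parameter names, the smoothing factor `alpha`, and the `monthly` series are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist, stdev


def forecast(series, forecast_length, ignore_last=0, alpha=0.5, confidence=0.95):
    """Forecast with simple exponential smoothing (an assumed stand-in model).

    Returns one (point, lower, upper) tuple per forecast step.
    """
    # Drop the most recent points so forecasting starts before the series ends.
    train = list(series[:len(series) - ignore_last])
    if len(train) < 3:
        raise ValueError("need at least three training points")

    level = train[0]
    errors = []
    for actual in train[1:]:
        errors.append(actual - level)  # one-step-ahead forecast error
        level = alpha * actual + (1 - alpha) * level

    sigma = stdev(errors)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    points = []
    for h in range(1, forecast_length + 1):
        # The band widens with the horizon h under this model.
        margin = z * sigma * sqrt(1 + (h - 1) * alpha ** 2)
        points.append((level, level - margin, level + margin))
    return points


# Hypothetical monthly values; the last two are held back from training.
monthly = [10, 12, 11, 13, 12, 14]
for point, lower, upper in forecast(monthly, forecast_length=3, ignore_last=2):
    print(f"{point:.1f} ({lower:.1f} to {upper:.1f})")
```

Because the last two points are ignored, the first two forecast steps fall on dates that already have actual values, which makes it easy to compare the forecast against what really happened. A higher `confidence` value produces a wider band.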
Groups and bins enable you to classify categorical and continuous data, respectively. You can create a group by Ctrl-clicking a chart's data points. A group becomes another column within your dataset, and you can edit and manage further groupings of data values by right-clicking the group column. Bins are groups for continuous data, where you specify bin membership by either the size of the data values or the number of data points that each bin should contain. In the case of bin size, Power BI divides the data range into equally sized value segments and assigns each data point to a segment based on its value. If allocation is based on the number of bins, Power BI divides the data so that an equal number of data points will be in each bin. From a user's point of view, this is a very quick and easy way to categorize continuous data, but the one-size-fits-all nature of bin distribution is not suitable for all scenarios.
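The two binning strategies described above, equal-width segments and equal-count allocation, can be compared with a small Python sketch. This is a generic illustration of the two ideas rather than Power BI's own implementation; the function names and the `ages` sample are hypothetical.

```python
def equal_width_bins(values, bin_size):
    """Assign each value to a bin by its offset from the minimum value."""
    lowest = min(values)
    return [int((v - lowest) // bin_size) for v in values]


def equal_count_bins(values, num_bins):
    """Assign bins so that each holds roughly the same number of points."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, index in enumerate(order):
        bins[index] = rank * num_bins // len(values)
    return bins


# Hypothetical skewed data: most values are small, a few are large.
ages = [18, 19, 21, 22, 25, 31, 44, 67]
print(equal_width_bins(ages, bin_size=10))  # [0, 0, 0, 0, 0, 1, 2, 4]
print(equal_count_bins(ages, num_bins=4))   # [0, 0, 1, 1, 2, 2, 3, 3]
```

With skewed data, equal-width bins crowd most points into the first bin and leave others empty, while equal-count bins spread the points evenly but cover very different value ranges. Neither suits every scenario.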
The Key Influencers visualization analyzes the "explain by" fields you specify to determine which fields, and which values within them, are most influential in determining the value of the column you're investigating. It does this by seeing which values within a categorical column are greater than the average of the whole column and how each value is related to the analysis column. You need to specify which fields you want to include as explanations. The top segments tab of Key Influencers displays segments along with their size, their impact on the variable being analyzed, and their key data attributes.
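The idea of comparing each field value against the overall average can be sketched as follows. This is a deliberately simplified illustration and not the statistical model that the Key Influencers visual runs; the `rank_influencers` function, the field names, and the `customers` rows are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def rank_influencers(rows, target, explain_by):
    """Rank (field, value) pairs by how far their target mean sits above
    the overall target mean. Returns (field, value, lift, row_count) tuples.
    """
    overall = mean(row[target] for row in rows)
    results = []
    for field in explain_by:
        groups = defaultdict(list)
        for row in rows:
            groups[row[field]].append(row[target])
        for value, targets in groups.items():
            results.append((field, value, mean(targets) - overall, len(targets)))
    # Largest positive difference from the overall mean comes first.
    return sorted(results, key=lambda r: r[2], reverse=True)


# Hypothetical customer rows; churned is 1 for yes and 0 for no.
customers = [
    {"plan": "basic", "region": "north", "churned": 1},
    {"plan": "basic", "region": "south", "churned": 1},
    {"plan": "basic", "region": "north", "churned": 1},
    {"plan": "pro", "region": "south", "churned": 0},
    {"plan": "pro", "region": "north", "churned": 0},
    {"plan": "pro", "region": "south", "churned": 1},
]
for field, value, lift, count in rank_influencers(customers, "churned", ["plan", "region"]):
    print(f"{field}={value}: {lift:+.2f} vs. average ({count} rows)")
```

In this made-up data the basic plan stands out, while region shows no difference from the average. A difference in averages is only an association, so findings like this still need to be verified.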
The Decomposition tree chart is another way to segment and explore your data using "explain by" fields that you specify. You can drill down into the data by manually selecting fields of interest or by using the AI function, where Power BI determines the highest or lowest absolute or relative value in the next branch level.
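The absolute form of that AI split can be approximated in Python: look across every candidate field and pick the single value with the highest or lowest total for the metric. This is a rough sketch of the idea only, not Power BI's algorithm, and it ignores the relative mode; the `next_split` function and the `orders` rows are hypothetical.

```python
from collections import defaultdict


def next_split(rows, measure, candidate_fields, mode="high"):
    """Return the (field, value, total) with the highest or lowest total
    of the measure across all candidate fields."""
    best = None
    for field in candidate_fields:
        totals = defaultdict(float)
        for row in rows:
            totals[row[field]] += row[measure]
        for value, total in totals.items():
            if best is None:
                best = (field, value, total)
            elif mode == "high" and total > best[2]:
                best = (field, value, total)
            elif mode == "low" and total < best[2]:
                best = (field, value, total)
    return best


# Hypothetical order rows with a backorder count as the metric.
orders = [
    {"region": "east", "product": "bike", "backorders": 5},
    {"region": "east", "product": "helmet", "backorders": 2},
    {"region": "west", "product": "bike", "backorders": 9},
    {"region": "west", "product": "helmet", "backorders": 1},
]
print(next_split(orders, "backorders", ["region", "product"], mode="high"))
# ('product', 'bike', 14.0)
print(next_split(orders, "backorders", ["region", "product"], mode="low"))
# ('product', 'helmet', 3.0)
```

To continue drilling down, you would filter the rows to the chosen value and call the function again with the remaining fields, which mirrors expanding the next branch level of the tree.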
AI Insights is Power BI's umbrella term for Azure Cognitive Services integration. This feature requires a Power BI Premium subscription, and all processing is done online, even when you work in Power BI Desktop. You can use it to analyze textual data, like comments, for positive or negative sentiment. It has an extract key phrases function that you can use to graph keywords against some other variable. It also includes an image tagging function that allocates a text tag to an image based on AI and machine learning image recognition.
My name is Hallam Webber, and we've been looking at some of the advanced analytics features of Power BI. While we've covered a lot of ground, we really only scratched the surface in terms of what is possible with these AI-enabled analytics. As we've seen, not all of these features appear to be fully mature, and while they have potential, always be skeptical and verify findings that you feel don't make sense.
Hallam is a software architect with over 20 years of experience across a wide range of industries. He began his software career as a Delphi/Interbase disciple but changed his allegiance to Microsoft with its deep and broad ecosystem. While Hallam has designed and crafted custom software utilizing web, mobile, and desktop technologies, he believes good-quality, reliable data is the key to a successful solution. The challenge of quickly turning data into useful information for digestion by humans and machines has led Hallam to specialize in database design and process automation. Showing customers how to leverage new technology to change and improve their business processes is one of the key drivers keeping Hallam coming back to the keyboard.