Advanced Analysis with Power BI
Advanced Analysis with Power BI examines various methods for teasing out insights from data using statistical methodologies and presenting significant findings in visually compelling formats. The course starts with basic statistics such as standard deviation and then progresses to AI and machine learning analysis, where Power BI does all the heavy lifting, allowing the user to investigate and dynamically explore significant findings.
- How to use Z-scores to display outliers and use the Outlier Detection visualization from Microsoft
- How to use Power BI's Anomaly Detection and Fluctuation Analysis functionality
- Use time-series forecasting to predict future data points with varying degrees of certainty
- Use groups to classify categorical data and bins to categorize continuous data
- Learn about Key Influencers
- Use the Decomposition tree to drill down into a metric manually using known factors or let AI functionality determine which factors are the major contributors
- Use the power of Azure's AI and machine learning to analyze text for positive and negative sentiment, keywords and phrases, and image tagging
This course is intended for anyone who wants to discover insights hidden in their data.
- Have a basic understanding of statistics, like knowing the difference between a mean and median, a normal distribution, and conceptually how standard deviation is related to that
- Know how to connect a data source, load data, and generally use the Power BI Desktop and Power Query Editor environments
- The AI Insights demonstration requires a PowerBI.com Premium account
AI insights is the Power BI umbrella term for accessing Azure's cognitive services. These AI services include elementary natural language assessment, basic image evaluation and tagging, and Azure machine learning integration. You can only access the services from a Power BI premium account. In this demonstration, I want to look at a couple of language processing features using sample data from Microsoft. Let's start with PowerBI.com, where I'll create a new workspace called AI insights.
As you can see, this is a premium-enabled workspace as denoted by the diamond icon, and I'll change my license mode to premium per user. Next, I'll create a new data flow that will pick up a file from Azure blob storage. This is a small text file of comments from interactions with customer support of a fictitious online sales company. I'll click transform data to take me to the online version of Power Query editor. Next, I'll click the AI insights button on the right of the toolbar. Within cognitive services, there are four operations available to us. Tag images uses machine learning and AI to come up with a brief textual description of a picture. Extract key phrases analyzes text fields for meaningful words and phrases. Detect language tries to determine the text's language, and score sentiment gives a rating between zero and one indicating the negativity or positivity of the text in question.
I'm going to score the sentiment of the comments in this text file. In the first text drop-down, I'll select use values in a column and select the comments column. I could also enter the language ISO code here, which would be EN-US, but cognitive services automatically figures out the language when doing the analysis. I'm not sure why this warning has come up as I only have one data source, so that's not very intelligent, but I'll just click continue. Now that's finished processing, if I scroll across to the right-hand side, we can see the new cognitive services score sentiment column with its ratings. While I'm here, I'll rename that field to sentiment score. We have one warning saying that the new column is currently untyped, so I'll change it to a decimal column. Right, let's save and close, and I'll call the data flow comment rating. I've got a sentiment rating or score, and now I will extract key phrases from within Power BI desktop.
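As a rough illustration of how a score between zero and one might be read once it lands in the new column, here is a minimal Python sketch. The function name and the 0.4 and 0.6 cut-offs are assumptions made up for this example; they are not thresholds defined by Power BI or cognitive services.

```python
def label_sentiment(score: float, low: float = 0.4, high: float = 0.6) -> str:
    """Map a 0-to-1 sentiment score to a coarse label.

    The low/high cut-offs are illustrative assumptions, not service defaults.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("sentiment score must be between 0 and 1")
    if score < low:
        return "negative"
    if score > high:
        return "positive"
    return "neutral"


# Hypothetical scores, not taken from the sample data
for s in (0.05, 0.5, 0.93):
    print(s, label_sentiment(s))
```

Bucketing like this is only a reading aid; the underlying decimal score is what the data flow stores.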
I'll connect to my Power BI data flow and open up the comments table within Power Query editor. In the desktop version of Power Query editor, there are three buttons within the AI insights toolbar section, and key phrases is under text analytics. I want to extract the key phrases from the comments column. There is a drop-down for selecting which premium capacity you want to use for your AI insights processing on the bottom left. Because we are sending our data across the Internet to have our AI processing done in the Azure cloud, we just need to absolve Microsoft of any responsibility for our data being intercepted in transit. I'll check ignore privacy levels and click save. Once the key phrase extraction has finished, it looks like records have been duplicated. In fact, a couple of extra columns have been added. One, key phrases, contains all of the keywords and phrases as a single text field, while key phrases.key phrase has each key phrase or word on its own.
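The apparent duplication can be pictured as each comment being repeated once per extracted phrase. The following is a minimal Python sketch of that shape, not the service's own logic. It assumes the combined field is comma-separated, which the demonstration does not state, and the field names and sample row are hypothetical.

```python
def expand_key_phrases(rows, combined_field="key phrases", single_field="key phrase"):
    """Return one output row per key phrase, repeating the other columns.

    Assumes the combined field holds comma-separated phrases (an assumption).
    """
    expanded = []
    for row in rows:
        phrases = [p.strip() for p in row[combined_field].split(",") if p.strip()]
        # Keep a row with an empty phrase when nothing was extracted
        for phrase in phrases or [""]:
            expanded.append({**row, single_field: phrase})
    return expanded


# Hypothetical comment, not taken from the sample file
rows = [{"comment": "Great support, slow delivery",
         "key phrases": "Great support, slow delivery"}]

for r in expand_key_phrases(rows):
    print(r["comment"], "|", r["key phrase"])
```

Each original comment shows up once per phrase, which is why the row count grows even though no comment has really been duplicated.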
So how can we use this textual data? One option is to use a treemap where we group by the user's name, have the sentiment score as the detail, and another field like votes as the values. Also, I'll drop the first comment in as a tooltip. Now just a word of caution about the sentiment score. I initially tried this with the notes field from a point of sale database with 72,000 comments, mostly of a dry and neutral nature. I did notice that comments including a company name that ended in the word limited were given a low or negative sentiment rating, apparently because of the negative connotation of limited.
Other comments that included positive words, but were themselves either neutral or negative, did get a positive sentiment score. Remembering that zero is the most negative and one is the most positive sentiment score, when I hover over these areas, more often than not I feel there is a disconnect between the sentiment score and the comments. Perhaps a better way to view this natural language data is to use the word cloud visualization from Microsoft. I'll use key phrases as the category and try sentiment score as the values. That's pretty neutral and not that informative, although we must remember that this is completely made-up sample data. I'll try votes as the value. That's a little bit better. In the format pane, I can specify words to exclude like of, lot, and customers under stop words. I wouldn't describe the natural language processing of AI insights as a magic bullet, but, used with caution and skepticism, it could be a very useful tool in some scenarios.
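To make the idea of weighting phrases by votes and excluding stop words concrete, here is a small Python sketch. It is not how the word cloud visual is implemented; the function name and the sample phrases and vote counts are hypothetical.

```python
from collections import Counter


def phrase_weights(rows, stop_words):
    """Sum votes per phrase, skipping any phrase listed as a stop word.

    rows is an iterable of (phrase, votes) pairs; matching is case-insensitive.
    """
    stop = {w.lower() for w in stop_words}
    weights = Counter()
    for phrase, votes in rows:
        key = phrase.lower()
        if key in stop:
            continue
        weights[key] += votes
    return weights


# Hypothetical phrases and vote counts, not taken from the sample data
rows = [("delivery", 3), ("lot", 5), ("Delivery", 2), ("refund", 4), ("customers", 7)]
weights = phrase_weights(rows, stop_words=["of", "lot", "customers"])

for phrase, total in weights.most_common():
    print(phrase, total)
```

The larger a phrase's total, the more prominently it would be drawn, so removing frequent but uninformative words lets the meaningful ones stand out.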
Hallam is a software architect with over 20 years' experience across a wide range of industries. He began his software career as a Delphi/Interbase disciple but changed his allegiance to Microsoft with its deep and broad ecosystem. While Hallam has designed and crafted custom software utilizing web, mobile and desktop technologies, he believes good-quality, reliable data is the key to a successful solution. The challenge of quickly turning data into useful information for digestion by humans and machines has led Hallam to specialize in database design and process automation. Showing customers how to leverage new technology to change and improve their business processes is one of the key drivers keeping Hallam coming back to the keyboard.