Introduction and Overview
Track Data Changes
The course is part of this learning path
In the Information age data is the new currency and like anything valuable, it needs to be protected. Azure SQL and its environment provide a range of mechanisms for protecting your data from a multitude of hazards. The potential threats range from bad actors trying to steal information to unintentional human error corrupting your data. To cover all eventualities Azure provides pre-emptive protection in the form of network security, several types of data encryption, data classification, and vulnerability assessment services. After the fact protection is available in the form of built-in data change tracking. This course shows not only tells you about what protection is available for your database but also how to implement it.
If you have any feedback relating to this course, feel free to contact us at email@example.com.
- Learn what security components are available within Azure SQL
- Understand how these elements work together to provide a secure environment
- Learn how to implement infrastructure security
- Learn how to secure your data from external and internal hazards
- Learn how to implement data change tracking
- Anyone who wants to learn how to implement secure Azure SQL databases
- Those preparing for Microsoft’s DP-300 exam
To get the most out of this course, you have should a general understanding of the fundamentals of Microsoft Azure. Experience using databases — especially SQL Server — would also be beneficial.
The GitHub repository for this course can be found here: https://github.com/cloudacademy/azure-sql-data-security-dp-300
Within Advanced Data Security of Azure SQL, managed instances, and Azure Synapse Analytics you can classify your data by information type and sensitivity and perform a vulnerability assessment on a database. The data classification service will allow you to report when sensitive data was accessed and by whom. Information types are categories like contact info, credentials, and financial, while sensitivity can be general, public, confidential, and highly confidential, As part of this classification feature, Azure offers a data discovery service that will look at the table columns within your database and try and determine the information type and data sensitivity that should be applied. The vulnerability assessment identifies aspects of your database and server configuration that are potential security weak points. Let’s have a look at the data discovery and classification service, along with vulnerability assessment within Azure SQL.
In the Azure portal go to your database and under security select advanced data security. I’ll go into vulnerability assessment and set that up. The first thing I need to do is set a storage account for the vulnerability reports to be stored in. I’ll also have scan reports and alerts sent to me. As you can see advanced data security is set up at the server level rather than at the individual database. Here I’ll click scan to do the first assessment. Right, now the results are in and I haven’t completely covered myself in glory with seven failing checks, but 43 passing. The first thing I’ll do is download the scan results to an Excel spreadsheet. The summary page is pretty much what we see in the portal, and the results sheet is quite a comprehensive report. Another feature of the vulnerability assessment is you can add a finding as a baseline metric. An example of this is the database-level firewall rules. This is just a test database that I access from my personal computer. By approving this as a baseline this warning will not come up in future vulnerability scans. Before we move on from vulnerability scanning let’s quickly have a look at the data that was saved to the storage account. We downloaded the Excel report, but there is also a JSON file that gets saved to the storage account. I’ll just download that and have a quick look inside. This JSON file allows you to integrate assessment reports with a Rest API. Turning our attention to data discovery and classification, as this hasn’t been set up yet the overview page has nothing to report. Now, if we go into classification we are presented with columns and their suggested information type and sensitivity labels. To me, it appears as if the information type is dictated mainly by the name of the column. For example, credit limit is classified as credit card which doesn’t really ring true, I would have classified it as financial. There are also cost fields in the items table that don’t feature at all, that I would consider worth classifying so salespeople don’t sell something below cost. I’ll select a few of the fields and click accept selected recommendations. Obviously, you can adjust the sensitivity label and the information type using the drop-down lists. I’ll save those data classifications and add a classification for the cost column in the items table. Information type will be financial, and I’ll make the sensitivity label confidential. On the advanced data security summary page, we can see the classifications that we have just set up. Having set up the data classifications we want to be able to report on the columns being accessed. To do this we go into auditing under security and select a destination for the audit log. You can choose to send your audit log to storage, log analytics, event hub or any combination of these destinations. I’m going to send my audit log to a new log analytics workspace. I’ll create a new workspace called whosbeenlookingatmydata and click okay. While you haven’t been looking, I’ve been hitting a couple of tables in the database with queries. We can see in the audit records dashboard that I’ve been accessing 2 confidential, 1 general data protection regulation confidential field and one highly confidential field. You can view the same data through log analytics, and at the end of the dataset is the data sensitivity information field. To get a better look I will remove most of the other fields. Now we can properly see what makes up that data sensitivity information field. You’re not limited to the portal for accessing or manipulating data classifications. You can also do it through T-SQL. The sys.sensitivity_classifications table tells you what data classification types are being used in your database. If you want to join this table to the relevant columns you link the major ID field to get the tables and the minor ID to get the columns. You can also add or modify a data sensitivity classification with the add sensitivity classification command. Here I’m adding the highly confidential label with information type of financial and a rank of critical to the ppuValue and CostAve columns of the items table. Remember at the beginning of the demo I asked for alert emails to be sent to me? Here is an example of a potential data exfiltration alert email that notifies me of sensitive data being queried.
Dynamic data masking is a way to protect sensitive information without completely hiding it or making it totally unintelligible. In Azure SQL, SQL managed instance or Azure synapse analytics you can specify particular table columns to have their contents partially masked or hidden when being returned in a result set. This is not an entirely new concept and is something that most of us have seen. A common example is when a saved credit card number is displayed in a shopping cart checkout confirmation where all but the last four digits are replaced with stars.
To set up data masking within the Azure portal go to your database and under security click data masking. You will be presented with data masking recommendations in terms of columns to be added to your masking rules. Here I’ll add a mask for customers phone1 field by clicking the add mask button. Now I will click on the rule so we can have a look at it in more detail. All I can change here is the masking field format. We can see there are some pre-set masks like the ubiquitous credit card and email plus a custom string mask. As phone1 is a string column the random number option is disabled. I’ll create a custom mask showing the first two digits or characters of the phone number and the last two, hiding everything in between. Next, I’ll click update, close the edit window and click save. Switching over to SQL Server management studio and executing a select on the customers table we can see the masked phone1 data returned. I have connected to the database as the sales support user which has only select permissions. The default data masking rule behavior is to exclude the database and server admin roles. Both of these roles plus the SQL security manager role can configure data masking rules. Now, I’ll add a couple of rules using the preconfigured masks. First, I’ll mask the email, then I’ll create a new mask on the cost column of the items table using the random number mask. When I query both of these tables, we can see the data has been correctly masked. I’ll go back to the portal and add the sales support user to be excluded from the data masking, not forgetting to click save. Now when I execute the queries, I can see the unmasked data. Having demonstrated masked and unmasked data with the sales support user, I now want to include them back into the data masking rules. Using PowerShell I’ll get the current masking policy with
Get-AzSqlDatabaseDataMaskingPolicy, which takes the resource group name, the server name, and the database name as parameters and returns the data masking state, either enabled or disabled and privileged users, which are those excluded from the data masking rules. I’ll include sales support back in the masking policy by executing the set-AZSqlDatabaseDataMaskingPolicy command, setting privileged users as an empty string. Having executed these commands we can go back to the portal and see that sales support has now disappeared from the excluded users field, and when I execute the select statements, we can see that the data is masked again.
Hallam is a software architect with over 20 years experience across a wide range of industries. He began his software career as a Delphi/Interbase disciple but changed his allegiance to Microsoft with its deep and broad ecosystem. While Hallam has designed and crafted custom software utilizing web, mobile and desktop technologies, good quality reliable data is the key to a successful solution. The challenge of quickly turning data into useful information for digestion by humans and machines has led Hallam to specialize in database design and process automation. Showing customers how leverage new technology to change and improve their business processes is one of the key drivers keeping Hallam coming back to the keyboard.