Enriching Metadata
Difficulty
Intermediate
Duration
45m
Description

In this course, users will explore the suite of tools available in Microsoft Purview for registering and scanning data sources, connecting a business glossary, searching the data catalog, and customizing metadata with enrichments and classifications. In addition, this course will review some of the management and administrative functionality in Purview, including creating roles, managing authorizations, and using the Apache Atlas API for custom implementations. This course will also review deployment best practices and network security considerations. By completing this course, users will have a strong understanding of the suite of functionality currently available in Purview and how these tools support a larger governance initiative within an organization.  

Learning Objectives

  • Provision and install Microsoft Purview
  • Create and manage a role
  • Register and scan data sources
  • Create a business glossary
  • Enrich metadata with classifications
  • Review data lineage tooling
  • Understand deployment best practices
  • Take network security considerations into account

Intended Audience

This course is designed for individuals who are responsible for setting up, monitoring, or exploring data catalog and governance programs within their organization.  

Prerequisites 

To get the most from this course, you should have some familiarity and experience with governance tooling as well as a basic understanding of the Azure portal.

Transcript

Enriching Metadata. As we just learned, we can discover assets in the Microsoft Purview data catalog by either browsing or searching the catalog. Once we find the asset we're looking for, we can view all of its details. Let's look at the details available for each asset. The Overview tab shows an asset's basic details like description, classification, hierarchy, and glossary terms. The Properties tab contains the technical metadata and relationships discovered in the data source. The Schema tab shows the schema of the asset, including column names, data types, column-level classifications, terms, and descriptions. The Lineage tab contains the lineage graph for assets where lineage is available.

The Contacts tab contains any assigned owners and experts. The Related tab lets us navigate through the technical hierarchy of assets that are related to the asset we are currently viewing. Elsewhere on the Overview tab, we can view the full asset hierarchy. As an example, if we had selected an asset that was a SQL table, this hierarchy view would show us the schema, the database, and the server this table belongs to. At the bottom of the page are the asset classifications. These identify the kind of data being represented and are applied either manually or during a scan.
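To make the SQL table hierarchy concrete, here is a minimal Python sketch that splits a table's qualified name into the server, database, schema, and table levels described above. The qualified-name layout (mssql://server/database/schema/table), the function name, and the sample value are assumptions made for illustration; they are not taken from the course.

```python
from urllib.parse import urlparse


def sql_table_hierarchy(qualified_name: str) -> dict:
    """Split a table's qualified name into server, database, schema, and table.

    Assumes the form mssql://<server>/<database>/<schema>/<table>. This
    layout is an assumption for illustration only.
    """
    parsed = urlparse(qualified_name)
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) != 3:
        raise ValueError(f"expected database/schema/table, got: {parsed.path!r}")
    database, schema, table = parts
    return {
        "server": parsed.netloc,
        "database": database,
        "schema": schema,
        "table": table,
    }


# Hypothetical qualified name, used only to show the hierarchy levels.
hierarchy = sql_table_hierarchy(
    "mssql://contoso.database.windows.net/SalesDb/dbo/Customers"
)
for level in ("server", "database", "schema", "table"):
    print(f"{level}: {hierarchy[level]}")
```

Reading the printed levels from bottom to top mirrors what the hierarchy view shows for a table: the table sits inside a schema, which sits inside a database, which sits on a server.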

For example, a passport number is a supported classification. The Overview tab reflects both the asset-level classifications and the column-level classifications that have been applied, which we can also view as part of the schema. Asset glossary terms are a managed vocabulary of business terms that can be used to categorize or relate assets across our environment. Examples include terms like customer, buyer, or cost center, or any other terms that give our users context for the data. We can view the glossary terms for an asset in the overview section, and we can add a glossary term to an asset by editing the asset. We can edit an asset by selecting the edit icon in the top left corner of the asset.
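As a rough illustration of how a classification such as passport number might be attached to a column during a scan, and then surfaced at the overview level, here is a small Python sketch. The pattern, the match threshold, and all function names are hypothetical; real passport formats vary by country, and this is not how Purview itself implements its classification rules.

```python
import re

# Hypothetical pattern: one uppercase letter followed by eight digits.
# This is an assumption for illustration, not Purview's actual rule.
PASSPORT_PATTERN = re.compile(r"[A-Z]\d{8}")


def classify_column(values, pattern=PASSPORT_PATTERN, min_match_ratio=0.6):
    """Return True if enough non-empty sample values fully match the pattern."""
    samples = [v.strip() for v in values if v and v.strip()]
    if not samples:
        return False
    matches = sum(1 for v in samples if pattern.fullmatch(v))
    return matches / len(samples) >= min_match_ratio


def classify_columns(columns):
    """Map each column name to the list of classifications found for it."""
    return {
        name: ["Passport Number"] if classify_column(values) else []
        for name, values in columns.items()
    }


def overview_classifications(column_level, asset_level=()):
    """Combine asset-level and column-level labels, as an overview might."""
    labels = set(asset_level)
    for column_labels in column_level.values():
        labels.update(column_labels)
    return sorted(labels)


# Made-up sample data for two columns of a table asset.
sample_columns = {
    "passport_no": ["A12345678", "B87654321", "n/a", "C00000001"],
    "first_name": ["Ana", "Ben", "Chen", "Dee"],
}
column_level = classify_columns(sample_columns)
overview = overview_classifications(column_level)
print(column_level)
print(overview)
```

The threshold lets a column be classified even when a few sample values, such as the "n/a" entry here, do not match the pattern.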

At the asset level, we can edit or add a description, classification, or glossary term by staying on the Overview tab. We can navigate to the Schema tab of the edit screen to update column names, data types, column-level classifications, terms, or descriptions. On the Contacts tab of the edit screen, we can update the owners and experts linked to the asset. We can search by full name or email to find the relevant person within our Azure Active Directory. If we edit an asset by adding a description, an asset-level classification, a glossary term, or a contact, later scans will still update the asset schema. If we make some column-level updates, like adding a description, column-level classification, or glossary term, subsequent scans will also update the asset schema. Importantly, if we update the name or data type of a column in a Microsoft Purview asset, later scans will not update the asset schema. New columns and classifications will not be detected.
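The rule about which edits affect later scans can be summarized in a few lines of Python. This is only a sketch of the behavior described above; the edit labels and the function name are hypothetical and do not correspond to any Purview API.

```python
# Hypothetical labels for the kinds of edits described in the transcript.
# Only these two stop later scans from updating the asset schema.
SCHEMA_FREEZING_EDITS = {"column_name", "column_data_type"}


def later_scans_update_schema(edits):
    """Return True if subsequent scans will still update the asset schema."""
    return not (set(edits) & SCHEMA_FREEZING_EDITS)


# Asset-level edits: scans keep updating the schema.
print(later_scans_update_schema({"description", "glossary_term", "contact"}))

# Column-level description or classification: scans keep updating the schema.
print(later_scans_update_schema({"column_description", "column_classification"}))

# Renaming a column or changing its data type: scans stop updating the schema,
# so new columns and classifications are no longer detected.
print(later_scans_update_schema({"description", "column_name"}))
```

The three calls print True, True, and False, matching the three cases in the transcript.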

 

About the Author

Steve is an experienced Solutions Architect with over 10 years of experience serving customers in the data and data engineering space. He has a proven track record of delivering solutions across a broad range of business areas that increase overall satisfaction and retention. He has worked across many industries, both public and private, and found many ways to drive the use of data and business intelligence tools to achieve business objectives. He is a persuasive communicator, presenter, and quite effective at building productive working relationships across all levels in the organization based on collegiality, transparency, and trust.