In this course, users will explore the suite of tools available in Microsoft Purview for registering and scanning data sources, connecting a business glossary, searching the data catalog, and customizing metadata with enrichments and classifications. In addition, this course will review some of the management and administrative functionality in Purview, including creating roles, managing authorizations, and using the Apache Atlas API for custom implementations. This course will also review deployment best practices and network security considerations. By completing this course, users will have a strong understanding of the suite of functionality currently available in Purview and how these tools support a larger governance initiative within an organization.
Learning Objectives
- Provision and install Microsoft Purview
- Create and manage a role
- Register and scan data sources
- Create a business glossary
- Enrich metadata with classifications
- Review data lineage tooling
- Understand deployment best practices
- Take network security considerations into account
Intended Audience
This course is designed for individuals who are responsible for setting up, monitoring, or exploring data catalog and governance programs within their organization.
Prerequisites
To get the most from this course, you should have some familiarity and experience with governance tooling as well as a basic understanding of the Azure portal.
Explore Management Functionality
The management portal within Purview provides a place to review and set up various aspects of the catalog, including lineage connections, workflows, the Purview instance name, and security and access. Let's take a look at these features in more detail.
Data integration and ETL tools can push lineage into Microsoft Purview at execution time. Tools such as Data Factory, Data Share, Synapse, and Azure Databricks belong to this category of data processing systems. Lineage connections for Data Factory and Data Share are set up in the lineage connections section of the management pane.
Microsoft Purview admins can use Azure Monitor to track the operational state of Microsoft Purview accounts.
Metrics are collected to provide data points that help us track potential problems, troubleshoot, and improve the reliability of the Microsoft Purview account. The list of available metrics is shown on the screen.
Workflows are automated, repeatable business processes that users can create within Microsoft Purview to validate and orchestrate create, update, and delete operations on their data entities. Enabling these processes allows organizations to track changes, enforce policy compliance, and ensure quality data across their data landscape. Because workflows are created and managed within Microsoft Purview, manual change monitoring and approval are no longer required to ensure quality updates to the data catalog.
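To make the idea of routing a catalog operation through a workflow more concrete, here is a minimal sketch in plain Python. It is a toy model only, not the Microsoft Purview API: the `Catalog` class, the `bind_workflow` and `delete_term` methods, the `steward_only` approver, and the user names are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# An approver receives (term_name, requesting_user) and returns True to approve.
Approver = Callable[[str, str], bool]


@dataclass
class Catalog:
    """Toy stand-in for a business glossary (hypothetical, not the Purview API)."""
    terms: Dict[str, str] = field(default_factory=dict)
    workflows: Dict[str, Approver] = field(default_factory=dict)
    audit_log: List[str] = field(default_factory=list)

    def bind_workflow(self, operation: str, approver: Approver) -> None:
        # Attach an approval step to an operation such as "delete".
        self.workflows[operation] = approver

    def delete_term(self, name: str, requested_by: str) -> bool:
        approver = self.workflows.get("delete")
        if approver is not None:
            # The workflow runs before the delete and can stop it entirely.
            approved = approver(name, requested_by)
            outcome = "approved" if approved else "rejected"
            self.audit_log.append(f"delete {name} by {requested_by}: {outcome}")
            if not approved:
                return False
        # Returns True only if the term existed and was removed.
        return self.terms.pop(name, None) is not None


def steward_only(term_name: str, requested_by: str) -> bool:
    # Hypothetical policy: only the data steward may delete glossary terms.
    return requested_by == "data.steward"


catalog = Catalog(terms={"Customer": "A person or organization that buys goods."})
catalog.bind_workflow("delete", steward_only)

first_attempt = catalog.delete_term("Customer", requested_by="analyst")
second_attempt = catalog.delete_term("Customer", requested_by="data.steward")
```

In this sketch the first request is rejected and the term survives, while the second is approved and the term is removed. Both attempts are recorded in `audit_log`, which mirrors the change-tracking benefit described above.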
For example, if a user attempts to delete a business glossary term that is bound to a workflow, then when the user submits the operation, the workflow runs through its actions instead of, or before, the original delete operation. Currently, there are two kinds of workflows: data governance workflows, which cover data policy, access governance, and loss prevention and are scoped at the collection level; and data catalog workflows, which manage approvals for create, update, and delete operations on glossary terms and are scoped at the glossary level. These workflows can be built from the workflow templates provided in the Microsoft Purview governance portal, and they are fully customizable using the available workflow connectors.
A credential is authentication information that Microsoft Purview can use to authenticate to our registered data sources.
A credential object can be created for various types of authentication scenarios, such as basic authentication requiring a user name and password. Credentials capture the specific information required to authenticate, based on the chosen authentication method. During the credential creation process, credentials use our existing Azure Key Vault secrets to retrieve sensitive authentication information.
We can use Azure private endpoints for our Microsoft Purview accounts to allow users on a virtual network (VNet) to securely access the catalog over a private link. A private endpoint uses an IP address from the VNet address space for our Microsoft Purview account. Network traffic between the client on the VNet and the Microsoft Purview account traverses the VNet and a private link on the Microsoft backbone network.
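The statement that a private endpoint takes its IP address from the VNet address space can be checked with a few lines of Python. This is a small sketch using only the standard-library `ipaddress` module; the address range and the two IP addresses are made-up example values, and `is_in_vnet` is a hypothetical helper name.

```python
import ipaddress


def is_in_vnet(endpoint_ip: str, vnet_cidr: str) -> bool:
    """Return True if endpoint_ip falls inside the VNet address space."""
    return ipaddress.ip_address(endpoint_ip) in ipaddress.ip_network(vnet_cidr)


# Assumed example values, not taken from any real deployment.
VNET_ADDRESS_SPACE = "10.0.0.0/16"
private_endpoint_ip = "10.0.1.5"   # address assigned from the VNet range
public_ip = "20.50.1.5"            # an address outside the VNet range

endpoint_is_private = is_in_vnet(private_endpoint_ip, VNET_ADDRESS_SPACE)
public_is_private = is_in_vnet(public_ip, VNET_ADDRESS_SPACE)
```

A check like this can be a quick sanity test when troubleshooting name resolution: if the Purview account's host name resolves to an address outside the VNet range, the client is probably not reaching it through the private endpoint.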
Steve is an experienced Solutions Architect with over 10 years of experience serving customers in the data and data engineering space. He has a proven track record of delivering solutions across a broad range of business areas that increase overall satisfaction and retention. He has worked across many industries, both public and private, and found many ways to drive the use of data and business intelligence tools to achieve business objectives. He is a persuasive communicator and presenter, and he is effective at building productive working relationships across all levels of the organization based on collegiality, transparency, and trust.