Azure Data Factory
Difficulty: Beginner
Duration: 1h 10m
Students: 7880
Ratings: 4.4/5
Description

This Designing Data Flows in Azure course will enable you to implement best practices for data flows in your own team. Starting from the basics, you will learn how data flows work from beginning to end. Though we do recommend having some idea of what data flows are and how they are used, this course contains several demonstration lectures to make sure you grasp the concepts. By better understanding the key components available in Azure for designing and deploying efficient data flows, you will help your organization reap the benefits.

This course is made up of 19 comprehensive lectures, including an overview, demonstrations, and a conclusion.

Learning Objectives

  • Review the features, concepts, and requirements that are necessary for designing data flows
  • Learn the basic principles of data flows and common data flow scenarios
  • Understand how to implement data flows within Microsoft Azure

Intended Audience

  • IT professionals who are interested in obtaining an Azure certification
  • Those looking to implement data flows within their organizations

Prerequisites

  • A basic understanding of data flows and their uses

Related Training Content

For more training content related to this course, visit our dedicated MS Azure Content Training Library.

Transcript

As far as big data goes, unorganized raw data is often stored in multiple storage systems, including relational and non-relational systems. The raw data on its own doesn't have much context, nor does it offer any meaningful insight. As such, it requires services that orchestrate and operationalize the processes that refine that data and convert it into useful business information. Azure Data Factory is an Azure offering that's built for complex hybrid ETL, ELT, and data integration projects. To illustrate the usefulness of Azure Data Factory, consider a casino that collects petabytes of game logs from the slot machines on its floor. The casino needs to analyze these logs in order to gain insight into customer gameplay, customer demographics, and other useful information. The casino wants to identify reward offers that match its players and develop new games that drive growth and provide a better experience to those players. In order to analyze its gaming logs, the casino needs to leverage data such as customer info, game selection info, and marketing campaign information that is hosted in its on-prem data store.

The casino wants to utilize this on-prem data, combining it with additional information that it has in a cloud data store. To extract meaningful information from this data, the casino needs to process the joined data by using a Spark cluster, such as Azure HDInsight, and then it wants to publish the transformed data to a cloud data warehouse such as Azure SQL Data Warehouse. Doing so will allow the casino to build a report on top of the data. The casino wants to automate the workflow, monitoring and managing it on a daily schedule. The entire data flow process needs to kick off when files land in the blob store container. For this example, Azure Data Factory can be leveraged because it's a cloud-based data integration service. It allows organizations to create data-driven workflows in the cloud for orchestrating and automating data movement and data transformation. By leveraging Azure Data Factory, the casino can create and schedule pipelines, or data-driven workflows, that can ingest data from different data stores. Azure Data Factory can process and transform the casino's data by using several different compute services, including Azure HDInsight Hadoop, Spark, Azure Data Lake Analytics, and even Azure Machine Learning. The output data can then be published to a data store like Azure SQL Data Warehouse, where BI applications can consume it. When all is said and done, Azure Data Factory allows the casino to organize its raw data into meaningful data stores and data lakes for better business decisions.
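
To make the casino scenario a little more concrete, here is a minimal Python sketch of how such a workflow might be laid out as a pipeline definition plus a blob-event trigger. It only builds plain dictionaries loosely modeled on the JSON shape that Azure Data Factory uses for pipelines and triggers; it does not call Azure. Every name in it (the pipeline, activity, trigger, and container names, and the `execution_order` helper) is a hypothetical chosen for illustration. A real definition would also need datasets, linked services, per-activity type properties, and a storage account scope on the trigger, all of which are left out here.

```python
import json


def depends_on(*activity_names):
    """Build a dependsOn list: run only after the named activities succeed."""
    return [
        {"activity": name, "dependencyConditions": ["Succeeded"]}
        for name in activity_names
    ]


def build_pipeline():
    """Sketch of the casino workflow: ingest, transform with Spark, publish.

    Activity and pipeline names are hypothetical. Inputs, outputs, and
    typeProperties are omitted to keep the sketch short.
    """
    return {
        "name": "CasinoGameLogPipeline",
        "properties": {
            "activities": [
                {"name": "CopyOnPremCustomerData", "type": "Copy",
                 "dependsOn": []},
                {"name": "CopyCloudGameLogs", "type": "Copy",
                 "dependsOn": []},
                {"name": "TransformWithSpark", "type": "HDInsightSpark",
                 "dependsOn": depends_on("CopyOnPremCustomerData",
                                         "CopyCloudGameLogs")},
                {"name": "PublishToWarehouse", "type": "Copy",
                 "dependsOn": depends_on("TransformWithSpark")},
            ]
        },
    }


def build_trigger(pipeline_name):
    """Sketch of a trigger that starts the pipeline when a blob is created.

    The container name 'gamelogs' is an assumption. A real trigger also
    needs the storage account's resource ID as its scope.
    """
    return {
        "name": "OnGameLogArrival",
        "properties": {
            "type": "BlobEventsTrigger",
            "typeProperties": {
                "blobPathBeginsWith": "/gamelogs/blobs/",
                "events": ["Microsoft.Storage.BlobCreated"],
            },
            "pipelines": [
                {"pipelineReference": {"referenceName": pipeline_name,
                                       "type": "PipelineReference"}}
            ],
        },
    }


def execution_order(pipeline):
    """Return activity names in an order that respects every dependsOn."""
    activities = pipeline["properties"]["activities"]
    remaining = {
        a["name"]: {d["activity"] for d in a["dependsOn"]} for a in activities
    }
    order = []
    while remaining:
        done = set(order)
        # Activities whose dependencies have all finished can run now.
        ready = sorted(n for n, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("dependency cycle or unknown activity")
        order.extend(ready)
        for name in ready:
            del remaining[name]
    return order


if __name__ == "__main__":
    pipeline = build_pipeline()
    print(json.dumps(build_trigger(pipeline["name"]), indent=2))
    print(execution_order(pipeline))
```

The dependency lists capture the ordering described in the lecture: both ingestion steps must succeed before the Spark transformation runs, and the publish step waits on the transformation. The helper that walks those dependencies exists only to make the ordering visible; in the actual service, the orchestration engine resolves it.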

About the Author
Students: 90159
Courses: 89
Learning Paths: 56

Tom is a 25+ year veteran of the IT industry, having worked in environments as large as 40k seats and as small as 50 seats. Throughout the course of a long and interesting career, he has built an in-depth skillset that spans numerous IT disciplines. Tom has designed and architected small, large, and global IT solutions.

In addition to the Cloud Platform and Infrastructure MCSE certification, Tom also carries several other Microsoft certifications. His ability to see things from a strategic perspective allows Tom to architect solutions that closely align with business needs.

In his spare time, Tom enjoys camping, fishing, and playing poker.