In this course, we're going to review the features, concepts, and requirements that are necessary for designing data flows, and how to implement them in Microsoft Azure. We're also going to cover the basics of data flows, common data flow scenarios, and what's involved in designing a typical data flow.
Learning Objectives
- Understand key components that are available in Azure that can be used to design and deploy data flows
- Know how the components fit together
Intended Audience
This course is intended for IT professionals who are interested in earning an Azure certification and for those who need to work with data flows in Azure.
Prerequisites
To get the most from this course, you should have at least a basic understanding of data flows and what they are used for.
Another offering that often finds its way into many data flows is Azure Synapse Analytics. In this lesson, we’ll take a quick look at what it is.
Azure Synapse is a cloud-based analytics service in Azure that combines enterprise data warehousing and Big Data analytics. This service allows you to query data at scale using either serverless on-demand or provisioned resources. At a high level, what Azure Synapse does is allow you to ingest, prepare, manage, and serve data for immediate business intelligence and machine learning needs.
Azure Synapse Analytics combines several technologies into one offering. It combines SQL technologies that are often used in enterprise data warehousing, Spark technologies that are used for big data, Data Explorer for log and time series analytics, and Pipelines for data integration, ETL, and ELT. It also integrates with other Azure offerings, like Power BI and Cosmos DB.
Azure Synapse allows you to import big data and to run high-performance analytics on that data. Because data warehousing is a huge piece of any cloud-based big data solution, Azure Synapse plays a key role.
For example, in a typical cloud-based big data solution, the data is first ingested into big data stores from various sources. Once the data makes its way into the big data store, it's prepared and used for training by technologies like Hadoop and Spark, along with machine learning algorithms.
After being prepared and trained, the data is then ready for analysis. At this point, the dedicated SQL pool queries the big data stores, using PolyBase, which, in turn, uses standard T-SQL queries to pull the data into dedicated SQL pool tables.
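To make that loading step concrete, pulling data from a big data store into a dedicated SQL pool typically means defining an external table over the files via PolyBase and then running a CREATE TABLE AS SELECT (CTAS) statement. The sketch below is a minimal, hypothetical example; the object names (`MyDataSource`, `ParquetFormat`, `ext.Sales`, `dbo.Sales`) and columns are illustrative assumptions, not part of this course.

```sql
-- Hypothetical example: expose files in the big data store as an
-- external table, then load them into a dedicated SQL pool table.
-- All object names here are illustrative.
CREATE EXTERNAL TABLE ext.Sales (
    SaleId   INT,
    SaleDate DATE,
    Amount   DECIMAL(18, 2)
)
WITH (
    LOCATION = '/sales/',          -- folder in the big data store
    DATA_SOURCE = MyDataSource,    -- points at the external storage account
    FILE_FORMAT = ParquetFormat    -- describes the file layout
);

-- Pull the external data into a distributed table backed by
-- columnar (clustered columnstore) storage.
CREATE TABLE dbo.Sales
WITH (
    DISTRIBUTION = HASH(SaleId),
    CLUSTERED COLUMNSTORE INDEX
)
AS
SELECT * FROM ext.Sales;
```

Once loaded, `dbo.Sales` can be queried with ordinary T-SQL, and the columnstore layout is what enables the storage and query-performance benefits described next.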
The SQL pool in Synapse stores the data in relational tables with columnar storage, which reduces data storage costs while also improving query performance. You can then run analytics on the stored data at massive scale. Analysis queries that would take minutes on traditional database systems will often complete in seconds.
Results of the analysis can be sent to reporting databases or applications all over the world, where business analysts will use the information to make better-informed business decisions. If you are interested in reading about the nuts and bolts of the architecture of Azure Synapse Analytics, visit the URL that you see on your screen.
Tom is a 25+ year veteran of the IT industry, having worked in environments as large as 40,000 seats and as small as 50 seats. Throughout the course of a long and interesting career, he has built an in-depth skillset that spans numerous IT disciplines. Tom has designed and architected small, large, and global IT solutions.
In addition to the Cloud Platform and Infrastructure MCSE certification, Tom also carries several other Microsoft certifications. His ability to see things from a strategic perspective allows Tom to architect solutions that closely align with business needs.
In his spare time, Tom enjoys camping, fishing, and playing poker.