Data Flow Basics
Data Flow Components
Building a Dataflow with Azure Data Factory
In this course, we're going to review the features, concepts, and requirements needed to design data flows, and how to implement them in Microsoft Azure. We're also going to cover the basics of data flows, common data flow scenarios, and what is involved in designing a typical data flow.
- Understand key components that are available in Azure that can be used to design and deploy data flows
- Know how the components fit together
This course is intended for IT professionals who are interested in earning Azure certification and for those who need to work with data flows in Azure.
To get the most from this course, you should have at least a basic understanding of data flows and what they are used for.
Hello, and welcome to Data Lifecycle. In this lecture, we’re going to discuss the data lifecycle. It’s important to understand the data lifecycle because the different stages of the lifecycle affect data flow.
Overall, there are five key stages in the data lifecycle: Collection of the data, Preparation of the collected data, Ingestion of the data into storage, Processing or transformation of the data into a usable form, and Analysis of the transformed data.
During collection, data is acquired from other processes or maybe even user input. Such data might be in varied formats, or it may be unstructured. Preparation of the collected data may or may not happen next, depending on the process. In cases where ETL is in play, and data needs to be transformed before it is ingested or loaded, there is certainly a preparation step that occurs. Once data has been collected and prepared, it needs to be ingested into storage. In the context of this discussion, the data would typically be ingested into cloud storage. Once the data has been ingested into storage, it needs to be processed or, if ELT is being used, transformed into a usable format. Finally, once the data has progressed through all previous stages of the lifecycle, it can be analyzed and interpreted.
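To make the ordering of the stages concrete, here is a minimal Python sketch of the lifecycle described above. It is an illustration only, not Azure code: the function names, the in-memory list standing in for cloud storage, and the sample values are all assumptions made for this example. It shows the same data passing through the stages in ETL order (prepare before ingestion) and in ELT order (ingest raw data, transform afterwards).

```python
def collect():
    # Collection: raw input in varied formats (hypothetical sample data).
    return [" 42 ", "17", "n/a", "8 "]


def prepare(records):
    # Preparation/transformation: trim whitespace, drop non-numeric values.
    cleaned = [r.strip() for r in records]
    return [c for c in cleaned if c.isdigit()]


def ingest(records, storage):
    # Ingestion: a plain list stands in for cloud storage here.
    storage.extend(records)
    return storage


def process(records):
    # Processing: convert stored text into a usable numeric form.
    return [int(r) for r in records]


def analyze(values):
    # Analysis: answer a simple question about the data (the average).
    return sum(values) / len(values)


def run_etl():
    # ETL: data is prepared BEFORE it is ingested into storage.
    storage = ingest(prepare(collect()), [])
    return analyze(process(storage))


def run_elt():
    # ELT: raw data is ingested first, then transformed inside storage.
    storage = ingest(collect(), [])
    return analyze(process(prepare(storage)))
```

In this sketch both orderings produce the same answer; what changes is whether storage holds cleaned data (ETL) or raw data (ELT), which is the kind of difference that affects how a data flow is designed.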
With the typical data lifecycle in mind, you then need to think about some things as you design a data flow that encompasses this data. You need to think about where the data is coming from, and in what format it’s arriving. You need to determine whether it needs to be transformed and, if so, how that needs to be done. You need to think about the ultimate destination of this data and its analysis. Where is the data going and what questions does it need to answer?
Only after considering all these concepts can you begin to formulate a data flow plan and come up with data flow requirements.
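One way to keep track of these design questions is to record the answers in a small structure and check which ones are still open. The sketch below is a hypothetical helper written for this illustration; the class name, field names, and sample values are assumptions, not part of any Azure tooling.

```python
from dataclasses import dataclass, field


@dataclass
class DataFlowRequirements:
    # Each field corresponds to one of the design questions above.
    source: str = ""
    source_format: str = ""
    needs_transformation: bool = False
    transformation_notes: str = ""
    destination: str = ""
    questions_to_answer: list = field(default_factory=list)

    def open_items(self):
        """Return the design questions that have not been answered yet."""
        items = []
        if not self.source:
            items.append("Where is the data coming from?")
        if not self.source_format:
            items.append("In what format is it arriving?")
        if self.needs_transformation and not self.transformation_notes:
            items.append("How does it need to be transformed?")
        if not self.destination:
            items.append("Where is the data going?")
        if not self.questions_to_answer:
            items.append("What questions does it need to answer?")
        return items
```

For example, `DataFlowRequirements(source="point-of-sale exports", source_format="CSV").open_items()` would still list the destination and analysis questions, signalling that the plan is not yet complete.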
Tom is a 25+ year veteran of the IT industry, having worked in environments as large as 40k seats and as small as 50 seats. Throughout the course of a long and interesting career, he has built an in-depth skillset that spans numerous IT disciplines. Tom has designed and architected small, large, and global IT solutions.
In addition to the Cloud Platform and Infrastructure MCSE certification, Tom also carries several other Microsoft certifications. His ability to see things from a strategic perspective allows Tom to architect solutions that closely align with business needs.
In his spare time, Tom enjoys camping, fishing, and playing poker.