Azure HDInsight

1h 10m

This Designing Data Flows in Azure course will enable you to implement best practices for data flows in your own team. Starting from the basics, you will learn how data flows work from beginning to end. Though we do recommend having an idea of what data flows are and how they are used, this course contains some demonstration lectures to make sure you grasp the concepts. By better understanding the key components available in Azure to design and deploy efficient data flows, you will help your organization reap the benefits.

This course is made up of 19 comprehensive lectures, including an overview, demonstrations, and a conclusion.

Learning Objectives

  • Review the features, concepts, and requirements that are necessary for designing data flows
  • Learn the basic principles of data flows and common data flow scenarios
  • Understand how to implement data flows within Microsoft Azure

Intended Audience

  • IT professionals who are interested in obtaining an Azure certification
  • Those looking to implement data flows within their organizations

Prerequisites


  • A basic understanding of data flows and their uses

Related Training Content

For more training content related to this course, visit our dedicated MS Azure Content Training Library.


In this lecture, I want to talk a little bit about HDInsight. Azure HDInsight is an offering that's built for big data analysis. It's a fully managed, open-source analytics service that's been designed from the ground up for enterprises. It's a cloud service that allows for fast, easy, and cost-effective processing of large amounts of data, and it supports many scenarios, including ETL, data warehousing, machine learning, and even IoT. HDInsight is a solution that's going to work in virtually all of those scenarios. It also supports multiple cluster types, including Hadoop, Spark, and more. The role that HDInsight plays in a data flow is that of the engine, if you will: it powers the transformations that need to be completed. HDInsight also integrates seamlessly with many different Azure data stores and services.

Such services include SQL Data Warehouse, Azure Cosmos DB, and Data Lake Storage. HDInsight also integrates with other offerings, such as Blob Storage, Event Hubs, and Data Factory. Because an HDInsight cluster cannot be paused, you might want to create a cluster only when some analysis or transformation work is required and then delete it when the work is complete. This would typically be part of an automated process, and it's something you'd do when processing data infrequently in batches. When designing a data flow that incorporates an HDInsight cluster, the programming languages that you use in other parts of the data flow are going to be dictated, in part, by the cluster type that you deploy. For example, if you use Spark, you might consider using Scala, Python, or Java. That takes us back to one of the initial questions we touched on earlier in the course: what skill sets do I have in-house? When designing a data flow that includes HDInsight, be sure to choose technologies and services that are supportable by your staff.
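
To make the create-then-delete idea a little more concrete, here is a minimal Python sketch of that automation pattern. It is only an illustration: `FakeClusterService`, `ephemeral_cluster`, and `run_batch` are hypothetical names made up for this example, and the service is an in-memory stand-in, not the Azure SDK or any real HDInsight API. The point is simply that the delete step sits in a `finally` block, so the cluster is removed even if the batch job fails.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


@dataclass
class FakeClusterService:
    """Hypothetical in-memory stand-in for a cluster provisioning API.

    This is NOT an Azure SDK class; it only records what would be
    created and deleted so the pattern can be run locally.
    """
    active: List[str] = field(default_factory=list)
    log: List[str] = field(default_factory=list)

    def create(self, name: str, cluster_type: str) -> None:
        self.active.append(name)
        self.log.append(f"create {name} ({cluster_type})")

    def delete(self, name: str) -> None:
        self.active.remove(name)
        self.log.append(f"delete {name}")


@contextmanager
def ephemeral_cluster(
    service: FakeClusterService, name: str, cluster_type: str
) -> Iterator[str]:
    """Create a cluster for the duration of a block, then always delete it."""
    service.create(name, cluster_type)
    try:
        yield name
    finally:
        # The cluster cannot be paused, so it is deleted once the work
        # is done, whether the job succeeded or raised an exception.
        service.delete(name)


def run_batch(
    service: FakeClusterService,
    name: str,
    cluster_type: str,
    job: Callable[[str], str],
) -> str:
    """Run one batch job on a short-lived cluster and return its result."""
    with ephemeral_cluster(service, name, cluster_type) as cluster:
        return job(cluster)


if __name__ == "__main__":
    service = FakeClusterService()
    result = run_batch(
        service, "nightly-etl", "Spark", lambda c: f"transformed on {c}"
    )
    print(result)
    print(service.log)
```

In a real data flow, the create and delete calls would be replaced by whatever provisioning mechanism your team already supports, and the whole thing would be triggered by your scheduler or orchestration tool.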

About the Author

Tom is a 25+ year veteran of the IT industry, having worked in environments as large as 40k seats and as small as 50 seats. Throughout the course of a long and interesting career, he has built an in-depth skillset that spans numerous IT disciplines. Tom has designed and architected small, large, and global IT solutions.

In addition to the Cloud Platform and Infrastructure MCSE certification, Tom also carries several other Microsoft certifications. His ability to see things from a strategic perspective allows Tom to architect solutions that closely align with business needs.

In his spare time, Tom enjoys camping, fishing, and playing poker.