Adding Transformation Logic Demo
Difficulty: Intermediate
Duration: 1h 5m
Students: 1272
Ratings: 4.7/5
Description

In this course, we're going to review the features, concepts, and requirements that are necessary for designing data flows, and how to implement them in Microsoft Azure. We're also going to cover the basics of data flows, common data flow scenarios, and what's involved in designing a typical data flow.

Learning Objectives

  • Understand the key components available in Azure that can be used to design and deploy data flows
  • Know how these components fit together

Intended Audience

This course is intended for IT professionals who are interested in earning Azure certification and for those who need to work with data flows in Azure.

Prerequisites 

To get the most from this course, you should have at least a basic understanding of data flows and what they are used for.

Transcript

Hello and welcome back. So, in the last demonstration, which was a little bit longer than I wanted it to be, we went through and created the first bit of transformation logic. In this demonstration, we're going to perform a little more transformation by configuring an aggregate transformation. To add it, we click the plus button next to our FilterYears transformation, and then we can either search for it or just select Aggregate from the list.

Now, we'll call this transformation AggregateComedyRatings. Again, we're following Microsoft's official tutorial here, so we're trying to stay within their naming conventions. We're going to group the movies by the year they came out. Then, for the aggregates, we'll call our aggregate column AverageComedyRating, and for the expression, we'll just take the average of the ratings, cast to integers.
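To make that logic concrete, here's a rough local sketch of what the aggregate step computes, written in Python with pandas rather than in the data flow itself. The column names year and Rating are assumptions based on the tutorial's movies dataset, and the integer cast mirrors the "average of the ratings, cast to integers" expression described above.

```python
# Hedged sketch: a local pandas equivalent of the AggregateComedyRatings step.
# Column names ("year", "Rating") are assumptions based on the tutorial's movies data.
import pandas as pd

def aggregate_comedy_ratings(filtered: pd.DataFrame) -> pd.DataFrame:
    """Group the filtered comedy rows by year and average their ratings."""
    out = filtered.copy()
    out["Rating"] = out["Rating"].astype(int)  # mirrors casting the rating to an integer
    return (
        out.groupby("year", as_index=False)["Rating"]  # group by release year
           .mean()                                     # average rating per year
           .rename(columns={"Rating": "AverageComedyRating"})
    )
```

Running something like this against the filtered rows produces the same two columns you'll see in the data preview: year and AverageComedyRating.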

And then from here, we'll go to our data preview and refresh it. What we get are two columns: year and AverageComedyRating. At this point, we want to add one more transformation, and that's a sink transformation. Basically, this is where we want our transformed data to be stored, or dumped into, I guess. Sunk, sink, get it?

So, we'll click our plus button and search for sink. Again, the sink is the destination; it's where everything's going to be stored at the end of the transformation, at the end of the data flow. We'll call it Sink and keep things simple. Then we need to create the new dataset. This is the final dataset that everything gets written to, and it's going to go to Azure Data Lake Storage Gen2. I'll click Continue, choose delimited text as the format, and we'll call the new dataset MoviesSink.

And we already have a linked service, so we'll select the ADLS Gen2 linked service that we created earlier in our demonstrations. We're going to store the output in the same container, sample-data, but in a new directory called output. So, the file path will still be sample-data, but this time it'll point to an output folder. Now, this folder doesn't have to already exist; if it doesn't, the data flow will create it when it runs.
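If you want to see how that path layout resolves outside the portal, here's a minimal sketch using the azure-storage-file-datalake Python SDK. The storage account name and credential are placeholders, not values from the course; the only point it illustrates is that the sample-data container already exists while the output directory can be created on demand.

```python
# Hedged sketch: resolving the sink path sample-data/output in ADLS Gen2.
# The account URL and credential below are placeholders, not values from the course.
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",
    credential="<account-key-or-sas-token>",
)

file_system = service.get_file_system_client("sample-data")  # existing container
output_dir = file_system.get_directory_client("output")      # may not exist yet

# The data flow creates the folder at run time if it's missing; this mirrors that behavior.
if not output_dir.exists():
    output_dir.create_directory()
```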

And as was the case with the original data, we do have the first row as a header, but we're not importing any schema at this point; we're just going to write out what we have, and then we can click OK. Let's take a look at our data preview, and this is what we end up with. So, now that we've finished the data flow, we can run it in our pipeline, and we'll do that in the next demonstration. Let's call it a wrap here, and I'll see you over there.
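As a final illustration of what the sink writes out, here's a hedged sketch that serializes an aggregated result as delimited text with the first row as a header and uploads it into the output folder. The toy input rows and the file name moviesSink.csv are purely illustrative; Data Factory names and partitions the actual output files itself when the data flow runs.

```python
# Hedged sketch: writing the aggregated result as delimited text (first row = header)
# into sample-data/output. Toy input rows and the file name are illustrative only.
import pandas as pd
from azure.storage.filedatalake import DataLakeServiceClient

# Toy stand-in for the filtered comedy rows -- not real data from the course.
toy = pd.DataFrame({"year": [1930, 1930, 1931], "Rating": [6, 7, 5]})
aggregated = (
    toy.groupby("year", as_index=False)["Rating"].mean()
       .rename(columns={"Rating": "AverageComedyRating"})
)

# header=True puts the column names in the first row, matching the sink settings.
csv_text = aggregated.to_csv(index=False, header=True)

service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",
    credential="<account-key-or-sas-token>",
)
file_client = (
    service.get_file_system_client("sample-data")
           .get_directory_client("output")
           .create_file("moviesSink.csv")  # illustrative name; ADF writes its own part files
)
file_client.upload_data(csv_text.encode("utf-8"), overwrite=True)
```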

 

About the Author
Students: 84510
Courses: 82
Learning Paths: 62

Tom is a 25+ year veteran of the IT industry, having worked in environments as large as 40k seats and as small as 50 seats. Throughout the course of a long and interesting career, he has built an in-depth skill set that spans numerous IT disciplines. Tom has designed and architected small, large, and global IT solutions.

In addition to the Cloud Platform and Infrastructure MCSE certification, Tom also carries several other Microsoft certifications. His ability to see things from a strategic perspective allows Tom to architect solutions that closely align with business needs.

In his spare time, Tom enjoys camping, fishing, and playing poker.