Instructor: Alana Layton
AWS Glue DataBrew vs. Glue Studio
A few years ago, Glue released another transformation tool called Glue DataBrew. On the surface, DataBrew looks very similar to Glue Studio. So, what is Glue DataBrew?
Glue DataBrew is a true no-code service for transforming data. Here’s how it works:
You first upload your data. You can upload it directly to the service, or connect to other data sources like Amazon S3, Amazon Aurora, Amazon Redshift, the Glue Data Catalog, or other JDBC connections. It can additionally connect to Amazon AppFlow, AWS Data Exchange, and Snowflake.
Once you upload your data, you can preview your data in a visual interface. From there you can choose from hundreds of built-in transformations. Some of these transformations include formatting your data, modifying columns, working with duplicate or missing values, encoding data, and more.
Once you apply your transformation, you can store the output in Amazon S3. Note that Amazon S3 is the only place you can store your transformed data.
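To make the upload, transform, and store steps above more concrete, here is a minimal sketch in plain Python. It does not use DataBrew or any AWS API; it only imitates two of the transformation types mentioned (missing values and duplicates) on a small in-memory CSV. The sample data and the function names load_rows, fill_missing, drop_duplicates, and to_csv are hypothetical, chosen for illustration only.

```python
import csv
import io

# Hypothetical sample data standing in for an uploaded dataset.
RAW_CSV = """id,name,city
1,Ana,Lisbon
2,Ben,
2,Ben,
3,Cara,Oslo
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def fill_missing(rows, column, default):
    """Replace empty values in `column` with `default`."""
    return [{**row, column: row[column] or default} for row in rows]


def drop_duplicates(rows):
    """Remove exact duplicate rows, keeping the first occurrence."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row.items())
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def to_csv(rows):
    """Serialize rows back to CSV text (the 'output' step of the sketch)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    rows = load_rows(RAW_CSV)
    cleaned = drop_duplicates(fill_missing(rows, "city", "unknown"))
    print(to_csv(cleaned))
```

In DataBrew itself you would pick equivalent steps from the visual interface rather than writing functions like these; the sketch is only meant to show what such transformations do to the data.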
Glue DataBrew vs Glue Studio
So if both of these services provide transformations, function in similar ways, and if Glue Studio also provides some no-code options, which service do you use?
Well, there are four main differences between the two that might help you distinguish when to use each service:
1. No-Code vs Custom Code
Glue DataBrew is a no-code tool. Unlike Glue Studio, you can’t write your own custom code for transformations even if you wanted to. In exchange, DataBrew provides far more built-in transformations: over 250, while Glue Studio has around 10. These transformations are different as well. Glue Studio's built-in transformations focus mostly on ETL, while DataBrew's transformations mostly prepare data for machine learning.
2. Different Tools for Different Users
These services are meant for different audiences. Glue Studio is meant for ETL engineers and is focused on ETL itself, while Glue DataBrew is mostly for business analysts and data scientists who may not have coding experience. You don’t need specialized expertise to transform data with DataBrew.
3. Programmatic Creation of ETL Jobs
Both services provide a graphical interface for visualizing your transformations. Glue Studio, however, is the only option that provides programmatic opportunities for working with ETL through Jupyter notebooks and shell scripts.
4. Data Profiling
DataBrew has a profiling feature, which enables you to get statistics about your data. For example, with profiling, you can get information about how many rows you have in your data set or how many unique values you have in each column. Glue Studio does not have a data profiling feature.
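As a rough illustration of the two statistics mentioned above (row count and unique values per column), here is a small plain-Python sketch. It is not how DataBrew computes its profile and does not call any AWS API; the profile function and the SAMPLE rows are hypothetical.

```python
# Hypothetical rows standing in for a dataset loaded into a profiling tool.
SAMPLE = [
    {"id": "1", "city": "Lisbon"},
    {"id": "2", "city": "Oslo"},
    {"id": "3", "city": "Lisbon"},
]


def profile(rows):
    """Return the row count and the number of distinct values per column."""
    columns = list(rows[0].keys()) if rows else []
    return {
        "row_count": len(rows),
        "unique_values": {
            col: len({row[col] for row in rows}) for col in columns
        },
    }


if __name__ == "__main__":
    print(profile(SAMPLE))
```

For the sample rows, the profile reports three rows, three distinct id values, and two distinct city values.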
That’s it for this one - see you next time!
This section provides detail on the AWS management services relevant to the Solutions Architect Associate exam. These services help you audit, monitor, and evaluate your AWS infrastructure and resources, and they form a core component of running resilient and performant architectures.
- Understand the benefits of using Amazon CloudWatch and audit logs to manage your infrastructure
- Learn how to record and track API requests using AWS CloudTrail
- Learn what AWS Config is and its components
- Manage your accounts with AWS Organizations, including single sign-on with AWS SSO
- Learn how to carry out logging with CloudWatch, CloudTrail, CloudFront, and VPC Flow Logs
- Understand how to design cost-optimized architectures in AWS
- Learn about AWS data transformation tools such as AWS Glue, and data analysis and visualization services like Amazon Athena and Amazon QuickSight
Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation.
To date, Stuart has created 150+ courses relating to Cloud reaching over 180,000 students, mostly within the AWS category and with a heavy focus on security and compliance.
Stuart is a member of the AWS Community Builders Program for his contributions towards AWS.
He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.
In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community.
Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.