Efficiently Storing Data in S3 for Data Analytics Solutions

Lab Steps

  • Connecting to the Virtual Machine using EC2 Instance Connect
  • Transferring Data from EC2 to S3
  • Partitioning Data in S3 for Use with Athena
  • Converting Data Files in S3 for Use With Athena
  • Compressing Data Files in S3 for Use With Athena

Difficulty: Beginner
Time Limit: 1h 15m
Students: 108
Rating: 5/5

Description

Amazon S3 is a fully managed service for storing data in the cloud. S3 frees you from managing servers, NAS and SAN devices, and from worrying about individual physical disks.

S3 is very flexible, and as a result it is used in many different types of solutions. When you are building a solution in AWS and need storage, S3 is likely the best option in terms of cost and performance. If you are using S3 with AWS Data Analytics services, there are several things you should be aware of to minimize costs and maximize the performance of your Data Analytics solution.

In this lab, you will create data, store it in S3, and transform it so that it is faster and cheaper to query.

Learning Objectives

Upon completion of this beginner-level lab, you will be able to:

  • Use the AWS command-line tool to copy data from an EC2 instance to S3
  • Partition data files in S3
  • Compress data to reduce costs
  • Convert data into different formats to reduce costs and maximize performance
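
As a rough illustration of the partitioning and compression objectives above, the following Python sketch builds a Hive-style partitioned object key (the `key=value` prefix layout that Athena can use to skip partitions a query does not need) and compares the size of a small data set before and after gzip compression. It is only a sketch under stated assumptions: the prefix `sensor-data`, the file name, the record fields, and both helper functions are hypothetical and are not taken from the lab itself. Nothing is uploaded to S3 here; the lab uses the AWS command-line tool for that step.

```python
import gzip
import json
from datetime import date


def partition_key(prefix, day, filename):
    """Build a Hive-style partitioned object key (year=/month=/day=)."""
    return (
        f"{prefix}/year={day.year:04d}/month={day.month:02d}/"
        f"day={day.day:02d}/{filename}"
    )


def gzip_json_lines(records):
    """Serialize records as JSON Lines; return (raw, gzip-compressed) bytes."""
    raw = "\n".join(json.dumps(r) for r in records).encode("utf-8")
    return raw, gzip.compress(raw)


# Hypothetical sample data, made up for this illustration only.
records = [{"sensor_id": i % 5, "reading": 20.0 + (i % 7)} for i in range(1000)]

raw, compressed = gzip_json_lines(records)
key = partition_key("sensor-data", date(2024, 3, 9), "readings.json.gz")

print(key)  # sensor-data/year=2024/month=03/day=09/readings.json.gz
print(f"raw: {len(raw)} bytes, gzip: {len(compressed)} bytes")
```

Because Athena charges by the amount of data scanned, smaller objects and a layout that lets queries skip irrelevant partitions both tend to reduce cost. The exact savings depend on the data and the queries.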

Intended Audience

  • Candidates for the AWS Certified Data Analytics - Specialty exam
  • Data Engineers
  • Cloud Engineers

Prerequisites

Experience with S3 and the Linux command line will be beneficial but is not required.

About the Author

Andrew is a Labs Developer with previous experience in the internet service provider, audio streaming, and cryptocurrency industries. He has also been a DevOps Engineer and enjoys working with CI/CD and Kubernetes. He holds the AWS Certified Developer - Associate certification.