DEMO: Creating an Amazon Redshift Cluster

Overview
Difficulty: Intermediate
Duration: 1h 6m
Students: 81
Rating: 5/5

Description

This course introduces a number of different AWS database services that are commonly used with data analytics solutions and that will likely be referenced within the AWS Data Analytics Specialty certification.

As such, this course explores the following database services: Amazon RDS, Amazon DynamoDB, Amazon ElastiCache, and Amazon Redshift. There are also guided demos on the AWS platform showing you exactly how to use each of these services.

If you have any questions relating to this course, feel free to contact us at support@cloudacademy.com.

Learning Objectives

  • Gain an overall understanding of the different database services available in AWS, and which service would be best suited to your needs
  • Learn how to create databases with both Amazon RDS and Amazon DynamoDB
  • Learn how to create clusters with Amazon ElastiCache and Amazon Redshift

Intended Audience

This course has been designed to help those who are preparing to take the AWS Data Analytics Specialty certification.

Prerequisites

As a prerequisite to this course, you should have a basic understanding of database architectures and of the AWS global architecture. For more info on the latter, please see our existing blog post here. You should also have a general understanding of the principles behind different EC2 instance families.

Transcript

Hello and welcome to this lecture. This is going to be a quick demonstration of how to set up an Amazon Redshift cluster. As you can see, I'm in the AWS Management Console at the moment, and to find Redshift, we can scroll down to the Database category, where we can see Amazon Redshift. If we select that, we're taken to this splash screen. I don't have any Redshift clusters at the moment.

So to start, all I need to do is click the orange Create cluster button, which takes us to the configuration page. The first thing we need to do is give the cluster a name, so I'm just going to call this my cluster. Then we can choose our node type. We have the RA3 nodes and the dense compute nodes here. AWS recommends the RA3 nodes because of their high performance and managed storage; the alternative is the dense compute nodes.

For this demonstration, I'm just going to select a dense compute node. If we scroll down, we can select the number of nodes that we would like. As you can see, it ranges from 1 to 32, so we can scroll up or down. Notice also that in the configuration summary section, the cost per month and the total compressed storage change with the number of nodes that I select. I'm just going to leave it as two nodes.
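For readers who prefer to script this step, here is a minimal Python sketch of how the node choices might map onto the keyword arguments of boto3's Redshift `create_cluster` call. The helper name `node_settings`, the identifier `my-cluster`, and the node size `dc2.large` are assumptions for illustration; the demo only says that a dense compute node was chosen. The 1 to 32 check mirrors the range shown in the console during the demo.

```python
def node_settings(cluster_id: str, node_type: str, node_count: int) -> dict:
    """Build the node-related keyword arguments for create_cluster."""
    if not 1 <= node_count <= 32:  # range offered in the demo's console view
        raise ValueError("node_count must be between 1 and 32")
    settings = {"ClusterIdentifier": cluster_id, "NodeType": node_type}
    if node_count == 1:
        settings["ClusterType"] = "single-node"
    else:
        settings["ClusterType"] = "multi-node"
        settings["NumberOfNodes"] = node_count  # required for multi-node clusters
    return settings


# "my-cluster" and "dc2.large" are illustrative placeholders.
params = node_settings("my-cluster", "dc2.large", 2)
print(params)
```

The resulting dictionary is only part of a full request; the database name, credentials, and other settings would be merged in before calling `create_cluster`.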

Now I can scroll down to the database configurations. We can give the database a name, set the port that it's going to use, and set a master username and password. So let me just enter a password. Then we have cluster permissions, which is an optional step. If you want your Amazon Redshift cluster to interact with other AWS services on your behalf, for example to import data from Amazon S3, then you can associate an IAM role that has access to S3 to allow that process to happen. But as I said, this is an optional component.
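As a rough scripted equivalent of the database configuration and cluster permissions steps, the sketch below builds the matching `create_cluster` keyword arguments. The helper name `database_settings`, the environment variable `REDSHIFT_MASTER_PASSWORD`, the role ARN, and the values `dev`, `5439`, and `awsuser` are all placeholders chosen for illustration, not values taken from the demo.

```python
import os


def database_settings(db_name, port, username, password, iam_role_arns=()):
    """Build the database and permission arguments for create_cluster."""
    settings = {
        "DBName": db_name,
        "Port": port,
        "MasterUsername": username,
        "MasterUserPassword": password,
    }
    if iam_role_arns:  # optional, as in the console
        settings["IamRoles"] = list(iam_role_arns)
    return settings


# Placeholder fallback only; load real passwords from a secrets store.
password = os.environ.get("REDSHIFT_MASTER_PASSWORD", "Placeholder1")
params = database_settings(
    "dev",
    5439,
    "awsuser",
    password,
    # Hypothetical role that grants read access to S3.
    ["arn:aws:iam::123456789012:role/ExampleRedshiftS3Role"],
)
print(sorted(params))
```

Leaving `iam_role_arns` empty simply omits `IamRoles`, matching the optional nature of the permissions step.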

Now, at the very bottom here, we have additional configuration. These are the default settings, so we have a default network, default backup options, maintenance, default security groups, and also a parameter group. But if you turn off those default settings, then you can go through and modify any of those components. For example, under network and security, you can select the VPC for the cluster to run in. You can select the security groups that are associated with your cluster to define which resources can access it. You can also define a subnet group, which defines which subnets the cluster will be launched in, and also any availability zones. You can also specify whether you want cluster traffic to route purely through your VPC, and whether you want your cluster to be publicly accessible or not. So there are a few network and security features that you can change there.
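The network and security options could be expressed in code roughly as follows. This is a sketch under assumptions: the helper name `network_settings` and every ID in the example call are hypothetical. Note that `create_cluster` has no separate VPC argument; the VPC follows from the cluster subnet group.

```python
def network_settings(security_group_ids, subnet_group_name,
                     availability_zone=None,
                     enhanced_vpc_routing=False,
                     publicly_accessible=False):
    """Build the network and security arguments for create_cluster."""
    settings = {
        "VpcSecurityGroupIds": list(security_group_ids),
        # The subnet group determines the VPC and subnets used.
        "ClusterSubnetGroupName": subnet_group_name,
        # True forces cluster traffic to route through the VPC.
        "EnhancedVpcRouting": enhanced_vpc_routing,
        "PubliclyAccessible": publicly_accessible,
    }
    if availability_zone:
        settings["AvailabilityZone"] = availability_zone
    return settings


# All identifiers below are made-up placeholders.
params = network_settings(
    ["sg-0123456789abcdef0"],
    "example-subnet-group",
    availability_zone="eu-west-1a",
    enhanced_vpc_routing=True,
)
print(params)
```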

Looking at database configurations, here you can select a parameter group if you have any configured. You can also configure encryption using AWS KMS, choosing either the default Redshift key or one of your own CMKs. For example, I have a CMK here in my account. For this demonstration, I'm just going to disable encryption.
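The encryption choice can be sketched as a small helper that produces the `Encrypted` and `KmsKeyId` arguments of `create_cluster`. The helper name `encryption_settings` and the key ARN are hypothetical. The sketch assumes that omitting `KmsKeyId` while `Encrypted` is true makes Redshift fall back to its default key.

```python
from typing import Optional


def encryption_settings(encrypted: bool, kms_key_id: Optional[str] = None) -> dict:
    """Build the encryption arguments for create_cluster."""
    if kms_key_id and not encrypted:
        raise ValueError("a KMS key only makes sense when encryption is enabled")
    settings = {"Encrypted": encrypted}
    if kms_key_id:
        settings["KmsKeyId"] = kms_key_id  # your own CMK instead of the default key
    return settings


# The demo disables encryption:
demo = encryption_settings(False)
# Hypothetical CMK ARN, shown only to illustrate the alternative:
with_cmk = encryption_settings(
    True, "arn:aws:kms:eu-west-1:123456789012:key/00000000-0000-0000-0000-000000000000"
)
print(demo, with_cmk)
```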

Under maintenance, you can set a maintenance window, which is the day and time of the week when any maintenance will be carried out on your cluster. You can also specify which cluster version you'd like, and you have three options: use the most current approved cluster version, use the cluster version before the current version, or use the cluster version with beta releases of new versions. I'll just leave that as current.
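In the API, the maintenance window is a string of the form `ddd:hh24:mi-ddd:hh24:mi` (in UTC), and the version choice is expressed as a maintenance track name. The sketch below assumes the track names `current` and `trailing` for the first two console options; the name of any preview track varies, so it is left to the caller. The helper name and the Saturday window are illustrative only.

```python
VALID_DAYS = ("mon", "tue", "wed", "thu", "fri", "sat", "sun")


def maintenance_settings(day: str, start: str, end: str, track: str = "current") -> dict:
    """Build the maintenance arguments for create_cluster."""
    if day not in VALID_DAYS:
        raise ValueError(f"day must be one of {VALID_DAYS}")
    return {
        # Format: ddd:hh24:mi-ddd:hh24:mi
        "PreferredMaintenanceWindow": f"{day}:{start}-{day}:{end}",
        "MaintenanceTrackName": track,
    }


# Illustrative 30-minute window early on Saturday.
params = maintenance_settings("sat", "03:00", "03:30")
print(params)
```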

Under monitoring, you can have CloudWatch alarms. For example, you can create a new alarm that triggers when disk usage reaches a threshold of 80%, and then notify people via an SNS topic that you might already have configured. I'll say no alarms. Then we have backup, where you can specify your snapshot retention, which is how long you'll keep the backups for. And finally, you can enable or disable cross-region snapshots, which back up your cluster to a different region. If you enable this, you can then select a region other than the one where your cluster currently resides. I'm just going to disable that.
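Although the demo leaves both options off, the disk usage alarm and the cross-region snapshot copy can be sketched as request builders for `cloudwatch.put_metric_alarm` and `redshift.enable_snapshot_copy`. The helper names, the alarm naming scheme, the SNS topic ARN, the destination region, and the 7-day retention are assumptions for illustration.

```python
def disk_alarm_request(cluster_id: str, sns_topic_arn: str, threshold: float = 80.0) -> dict:
    """Keyword arguments for cloudwatch.put_metric_alarm."""
    return {
        "AlarmName": f"{cluster_id}-disk-usage",  # hypothetical naming scheme
        "Namespace": "AWS/Redshift",
        "MetricName": "PercentageDiskSpaceUsed",
        "Dimensions": [{"Name": "ClusterIdentifier", "Value": cluster_id}],
        "Statistic": "Average",
        "Period": 300,  # seconds
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],  # notify via an existing SNS topic
    }


def snapshot_copy_request(cluster_id: str, source_region: str,
                          destination_region: str, retention_days: int = 7) -> dict:
    """Keyword arguments for redshift.enable_snapshot_copy."""
    if destination_region == source_region:
        raise ValueError("destination must differ from the cluster's region")
    return {
        "ClusterIdentifier": cluster_id,
        "DestinationRegion": destination_region,
        "RetentionPeriod": retention_days,
    }


alarm = disk_alarm_request("my-cluster", "arn:aws:sns:eu-west-1:123456789012:example-topic")
copy = snapshot_copy_request("my-cluster", "eu-west-1", "eu-central-1")
print(alarm["Threshold"], copy["DestinationRegion"])
```

The snapshot retention for automated snapshots in the cluster's own region is a separate `create_cluster` argument, `AutomatedSnapshotRetentionPeriod`.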

So those are the different options that are available, but I'm just going to select the defaults that were already suggested. Once you're happy with your settings, simply click Create cluster. As we can see here, it's now creating our cluster. This might take a few minutes, so I'll come back when that's done.
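Clicking the button and waiting for the cluster corresponds roughly to three boto3 calls: `create_cluster`, the `cluster_available` waiter, and `describe_clusters`. The sketch below takes the client as an argument so that it can run offline against a small stand-in; `DryRunClient` is a hypothetical stub written for this example, not part of boto3, and the parameter values are placeholders.

```python
def create_and_wait(client, params: dict) -> dict:
    """Create a cluster, block until it is available, and return its description."""
    client.create_cluster(**params)
    waiter = client.get_waiter("cluster_available")
    waiter.wait(ClusterIdentifier=params["ClusterIdentifier"])
    response = client.describe_clusters(ClusterIdentifier=params["ClusterIdentifier"])
    return response["Clusters"][0]


class DryRunClient:
    """Hypothetical stand-in for boto3.client("redshift"); makes no AWS calls."""

    def __init__(self):
        self.calls = []
        self._params = {}

    def create_cluster(self, **kwargs):
        self.calls.append("create_cluster")
        self._params = kwargs
        return {"Cluster": {"ClusterStatus": "creating"}}

    def get_waiter(self, name):
        self.calls.append(f"get_waiter:{name}")
        return self  # the stub doubles as its own waiter

    def wait(self, **kwargs):
        self.calls.append("wait")

    def describe_clusters(self, **kwargs):
        self.calls.append("describe_clusters")
        return {"Clusters": [{
            "ClusterIdentifier": kwargs["ClusterIdentifier"],
            "ClusterStatus": "available",
            "NumberOfNodes": self._params.get("NumberOfNodes", 1),
        }]}


params = {  # placeholder values
    "ClusterIdentifier": "my-cluster",
    "NodeType": "dc2.large",
    "ClusterType": "multi-node",
    "NumberOfNodes": 2,
    "MasterUsername": "awsuser",
    "MasterUserPassword": "Placeholder1",
}
client = DryRunClient()
cluster = create_and_wait(client, params)
print(cluster["ClusterStatus"])
```

With real credentials, passing `boto3.client("redshift", region_name="eu-west-1")` instead of the stub would issue the actual requests, and the waiter would poll until the cluster reports as available.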

Okay, as you can see, the cluster is now available. If we select the dashboard, then we can see that we have one new cluster in the Ireland region with two nodes, and we can see that it's already taken an automated snapshot, as well.

In the cluster overview here, we can see the number of queries, any database connections, disk space used, and CPU utilization. As you can see, there's not much going on at the moment, because we've only just created it. We can also see any alarms, and down here, any events, and also a query overview. I won't go into any more detail than that.
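Some of the figures on this overview are also published as CloudWatch metrics in the `AWS/Redshift` namespace. As a sketch, the helper below builds request arguments for `cloudwatch.get_metric_statistics`; the helper name, the one-hour window, and the five-minute period are assumptions, and the fixed timestamp is only there to keep the example deterministic.

```python
from datetime import datetime, timedelta, timezone

# Metrics corresponding to the connection, disk, and CPU charts.
OVERVIEW_METRICS = ("CPUUtilization", "DatabaseConnections", "PercentageDiskSpaceUsed")


def metric_request(cluster_id: str, metric_name: str, end: datetime, hours: int = 1) -> dict:
    """Keyword arguments for cloudwatch.get_metric_statistics."""
    return {
        "Namespace": "AWS/Redshift",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "ClusterIdentifier", "Value": cluster_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,  # five-minute data points
        "Statistics": ["Average"],
    }


end = datetime(2021, 1, 1, 12, 0, tzinfo=timezone.utc)  # illustrative timestamp
requests = [metric_request("my-cluster", name, end) for name in OVERVIEW_METRICS]
print([r["MetricName"] for r in requests])
```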

This is just a very high-level, quick introduction on how to create an Amazon Redshift cluster. And that's it.

Lectures

  • Course Introduction
  • Amazon Relational Database Service
  • DEMO: Creating an Amazon RDS Database
  • Amazon DynamoDB
  • DEMO: Creating a DynamoDB Database
  • Amazon ElastiCache
  • DEMO: Creating an ElastiCache Cluster
  • Amazon Redshift

About the Author
Students: 113,956
Labs: 1
Courses: 95
Learning paths: 63

Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation.

To date, Stuart has created 90+ courses relating to Cloud reaching over 100,000 students, mostly within the AWS category and with a heavy focus on security and compliance.

Stuart is a member of the AWS Community Builders Program for his contributions towards AWS.

He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.

In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community.

Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.