This section of the Solutions Architect Associate learning path introduces you to the AWS database services relevant to the SAA-C03 exam. You will explore the service options available and learn how to select and apply AWS database services to meet specific design scenarios relevant to the Solutions Architect Associate exam.
- Understand the various database services that can be used when building cloud solutions on AWS
- Learn how to build databases using Amazon RDS, DynamoDB, Redshift, DocumentDB, Keyspaces, and QLDB
- Learn how to create ElastiCache and Neptune clusters
- Understand AWS database costs
- Learn about data lakes and how to build a data lake in AWS
Hello, and welcome to this short lecture, in which we'll look into the final database service of this course series, Amazon Keyspaces for Apache Cassandra. Firstly, let's answer the question that some people ask when seeing this service. What is Apache Cassandra? To summarize it quickly, Wikipedia explains that "Apache Cassandra is a free, open-source, distributed, wide column store, NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure."
So now we have a high-level awareness of Apache Cassandra. Let's see how Amazon Keyspaces fits into this. Keyspaces is a serverless, fully managed service designed to be highly scalable, highly available, and importantly, compatible with Apache Cassandra, meaning you can use the same tools and code as you do normally with your existing Apache Cassandra databases.
Being a serverless service, it removes the need for you to provision, patch, and manage instances yourself. Instead, all of this is taken care of by AWS on your behalf. Boasting virtually unlimited throughput, Amazon Keyspaces is designed for massive-scale solutions, allowing you to service business-critical workloads requiring thousands of requests per second. The key features of Amazon Keyspaces are that it offers extreme performance, scalability, and elasticity, and grows at the rate of demand for your applications, ensuring you only pay for what you use.
Traditionally, Cassandra architectures comprise a cluster of nodes, which have to be created, provisioned, managed, patched, and backed up by you. As your Cassandra database grows, so does the number of nodes, and with it the administrative effort needed to manage the infrastructure. Using Amazon Keyspaces removes the need for you to manage this infrastructure, and instead you can focus on the business logic of the database and the applications that interact with it to ensure you are getting the best performance possible.
Amazon Keyspaces is a great choice if you're looking to build applications where low latency is essential, for example, route optimization applications or trade monitoring. And of course, it is also a good fit if you're looking for an easier way of managing your existing Cassandra databases in the cloud without the burden of maintaining your own infrastructure.
To help understand the service in greater detail, let's look at some of the components of the service.
First, let me explain the difference between keyspaces and tables. In Cassandra, a keyspace is essentially a grouping of related tables that are used by your applications to read and write data. The keyspace in Cassandra also helps to define how your tables are replicated across multiple nodes in the cluster. However, because Amazon Keyspaces is fully managed and serverless, the entire storage layer is abstracted away, so we as customers do not administer or configure it. Instead, it is managed by AWS. And so here, the keyspace component in Amazon Keyspaces exists as a logical grouping only, rather than giving us the responsibility of managing any kind of replication.
Tables are where your database writes are stored; effectively, they hold the data within your database. Each table has a primary key that consists of a partition key and, optionally, one or more clustering columns. When a new table is created, encryption at rest is automatically enabled, and any clients that want to connect to your tables will require a Transport Layer Security (TLS) connection so that data is encrypted in transit.
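To make the relationship between a keyspace, a table, and a primary key a little more concrete, here is a minimal sketch that builds the corresponding CQL statements as Python strings. The names `demo_keyspace`, `trades`, and the column names are hypothetical, and the two helper functions are not part of any AWS SDK. The `SingleRegionStrategy` replication class is the one Amazon Keyspaces documents for single-Region keyspaces, but treat the exact options as an assumption to check against the current documentation.

```python
from typing import Dict, Sequence


def create_keyspace_cql(keyspace: str) -> str:
    """Build a CREATE KEYSPACE statement (hypothetical helper).

    Replication is managed by AWS, so the statement only names the
    strategy and carries no replication factor.
    """
    return (
        f"CREATE KEYSPACE {keyspace} "
        "WITH REPLICATION = {'class': 'SingleRegionStrategy'};"
    )


def create_table_cql(
    keyspace: str,
    table: str,
    columns: Dict[str, str],
    partition_key: str,
    clustering_columns: Sequence[str] = (),
) -> str:
    """Build a CREATE TABLE statement (hypothetical helper).

    The first name inside PRIMARY KEY (...) is the partition key;
    any names after it are clustering columns.
    """
    column_defs = ", ".join(f"{name} {cql_type}" for name, cql_type in columns.items())
    primary_key = ", ".join([partition_key, *clustering_columns])
    return (
        f"CREATE TABLE {keyspace}.{table} "
        f"({column_defs}, PRIMARY KEY ({primary_key}));"
    )


keyspace_statement = create_keyspace_cql("demo_keyspace")
table_statement = create_table_cql(
    keyspace="demo_keyspace",
    table="trades",
    columns={"account_id": "text", "trade_time": "timestamp", "amount": "decimal"},
    partition_key="account_id",
    clustering_columns=["trade_time"],
)
print(keyspace_statement)
print(table_statement)
```

The strings produced here could be pasted into the CQL editor or sent through a client driver. Building them by hand is only for illustration; real applications would normally use prepared statements for anything that takes user input.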
In the next lecture, I will show you how to set up a keyspace and then a table that will reside within that keyspace. Much like Amazon DynamoDB, Keyspaces offers two different throughput capacity modes for the reads and writes made against your tables. These options allow you to customize how your throughput is managed, helping you to optimize it for your workloads.
The options available are on-demand and provisioned. On-demand throughput capacity is the default option when creating your tables and is capable of processing thousands of requests per second. The pricing for this option is based upon the number of reads and writes made against your tables by your applications, meaning you only pay for what you're using.
As your workload fluctuates, on-demand mode can scale instantly to any throughput level that the table has previously reached. However, if additional throughput is required above and beyond that previous peak, then Amazon Keyspaces works quickly to meet the needs of your applications.
As a result, this can be a good selection for your throughput if you're dealing with unknown or unpredictable workloads.
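As a rough illustration of paying per read and write, the sketch below estimates an on-demand bill from request counts. It assumes the unit sizes AWS has documented for Keyspaces (one write unit per 1 KB of row data written, one read unit per 4 KB read at `LOCAL_QUORUM` consistency); verify these against the current pricing page. The prices passed in are made-up placeholders, not real AWS prices, and the function names are hypothetical.

```python
WRITE_UNIT_BYTES = 1024       # assumption: one write unit covers up to 1 KB per row
READ_UNIT_BYTES = 4 * 1024    # assumption: one read unit covers up to 4 KB


def _ceil_div(value: int, divisor: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-value // divisor)


def write_units(row_bytes: int) -> int:
    """Write units consumed by writing a single row of the given size."""
    return max(1, _ceil_div(row_bytes, WRITE_UNIT_BYTES))


def read_units(result_bytes: int) -> int:
    """Read units consumed by a single read returning the given amount of data."""
    return max(1, _ceil_div(result_bytes, READ_UNIT_BYTES))


def estimate_on_demand_cost(
    reads: int,
    writes: int,
    avg_read_bytes: int,
    avg_write_bytes: int,
    price_per_million_read_units: float,   # placeholder price, supplied by the caller
    price_per_million_write_units: float,  # placeholder price, supplied by the caller
) -> float:
    """Estimate a bill as (units consumed) x (price per million units)."""
    total_read_units = reads * read_units(avg_read_bytes)
    total_write_units = writes * write_units(avg_write_bytes)
    return (
        total_read_units * price_per_million_read_units / 1_000_000
        + total_write_units * price_per_million_write_units / 1_000_000
    )


# Example with placeholder prices: 2M small reads and 1M writes of about 2.5 KB each.
example_cost = estimate_on_demand_cost(
    reads=2_000_000,
    writes=1_000_000,
    avg_read_bytes=1000,
    avg_write_bytes=2500,
    price_per_million_read_units=0.25,
    price_per_million_write_units=1.25,
)
print(f"Estimated cost: {example_cost:.2f}")
```

The point of the sketch is that cost tracks actual usage: with no traffic the request charges drop to zero, which is why on-demand suits unknown or spiky workloads.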
Provisioned throughput capacity is a better choice if you are dealing with more predictable workloads. It allows you to specify your predicted number of reads and writes per second, which enables your tables to meet those throughput levels faster than on-demand would. You can also use automatic scaling to adjust the provisioned throughput if you experience fluctuation, or as your database naturally grows, using upper and lower thresholds.
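The way automatic scaling stays between an upper and a lower threshold can be sketched as follows. This is a simplified model of target-based scaling, not the actual algorithm AWS uses; the function name, the 70% target utilization, and the example numbers are all hypothetical.

```python
def next_provisioned_capacity(
    consumed_per_second: int,
    target_utilization_percent: int,
    min_capacity: int,
    max_capacity: int,
) -> int:
    """Pick a provisioned capacity so that consumption sits near the target
    utilization, clamped to the configured lower and upper thresholds."""
    if not 0 < target_utilization_percent <= 100:
        raise ValueError("target_utilization_percent must be in (0, 100]")
    if min_capacity > max_capacity:
        raise ValueError("min_capacity must not exceed max_capacity")

    # Capacity at which current consumption equals the target utilization
    # (integer ceiling division avoids floating-point rounding surprises).
    desired = -(-(consumed_per_second * 100) // target_utilization_percent)

    # Never go below the lower threshold or above the upper threshold.
    return max(min_capacity, min(max_capacity, desired))


# Steady load: 700 units/s at a 70% target suggests 1000 provisioned units.
steady = next_provisioned_capacity(700, 70, min_capacity=100, max_capacity=2000)
# Quiet period: the lower threshold keeps a floor of 100 units.
quiet = next_provisioned_capacity(10, 70, min_capacity=100, max_capacity=2000)
# Spike: the upper threshold caps capacity (and cost) at 2000 units.
spike = next_provisioned_capacity(5000, 70, min_capacity=100, max_capacity=2000)
print(steady, quiet, spike)
```

The upper threshold is effectively a cost ceiling: during the spike above, requests beyond the cap would be throttled rather than served, which is the trade-off to weigh when choosing the thresholds.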
When working with Amazon Keyspaces, you'll need to use CQL, the Cassandra Query Language, which is the language you use to communicate with Amazon Keyspaces. In many respects, it is similar to SQL (Structured Query Language). As a result, this helps to reduce the learning curve when moving from a relational database that uses SQL, such as MySQL.
There are a number of ways to run queries using CQL. Firstly, from within the Amazon Keyspaces dashboard in the AWS Management Console, you can use the CQL editor, which can return as many as a thousand records per query. If you are querying more than a thousand records, then you will need to run multiple queries together. You can run them on a cqlsh client, and more information on this can be found here, or you can run them programmatically using an Apache 2 licensed Cassandra client driver (more info here).
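One way to picture "running multiple queries together" for more than a thousand records is to fetch the data in pages, where each query continues after the last row of the previous one. The sketch below simulates this in memory so it runs without a database. `run_query` is a hypothetical stand-in for a CQL query of the form `SELECT ... WHERE partition_key = ? AND clustering_key > ? LIMIT ?`; in practice, Cassandra client drivers typically handle paging for you.

```python
from typing import Callable, Iterator, List, Optional, Tuple

PAGE_LIMIT = 1000  # the per-query record limit mentioned for the CQL editor


def fetch_all(
    run_query: Callable[[Optional[int], int], List[int]],
    page_limit: int = PAGE_LIMIT,
) -> Iterator[int]:
    """Yield every row by issuing repeated queries of at most page_limit rows.

    run_query(after, limit) must return up to `limit` rows whose clustering
    value is greater than `after` (or from the start when `after` is None).
    """
    last_seen: Optional[int] = None
    while True:
        page = run_query(last_seen, page_limit)
        yield from page
        if len(page) < page_limit:
            return  # a short (or empty) page means there is nothing left
        last_seen = page[-1]


def make_fake_table(row_count: int) -> Tuple[Callable[[Optional[int], int], List[int]], List[Optional[int]]]:
    """In-memory stand-in for a table whose rows are clustering values 0..row_count-1."""
    data = list(range(row_count))
    calls: List[Optional[int]] = []  # records the `after` value of each query issued

    def run_query(after: Optional[int], limit: int) -> List[int]:
        calls.append(after)
        start = 0 if after is None else after + 1
        return data[start:start + limit]

    return run_query, calls


run_query, calls = make_fake_table(2500)
rows = list(fetch_all(run_query))
print(len(rows), "rows fetched in", len(calls), "queries")
```

Continuing from the last row seen, rather than re-reading from the start, keeps each query bounded in size regardless of how large the partition grows.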
In the next lecture, I'll be demonstrating how to create a keyspace and then a table within that keyspace, so let's take a look.
Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation.
To date, Stuart has created 150+ courses relating to cloud, reaching over 180,000 students, mostly within the AWS category and with a heavy focus on security and compliance.
Stuart is a member of the AWS Community Builders Program for his contributions towards AWS.
He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.
In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community.
Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.