Security Best Practices for AWS Databases
This course explores the security best practices when working with AWS databases, specifically looking at RDS and DynamoDB with some extra content related to Aurora. If you have any feedback relating to this course, feel free to get in touch with us at email@example.com.
- Recognize common security vulnerabilities in DynamoDB and RDS
- Recommend ways to resolve these security issues as well as understand some best practices that will help create secure architectures for your database
This course is recommended for anyone who is looking to broaden and reinforce their AWS security understanding, or anyone who is interested in creating secure databases in general.
To get the most from this course, you should have a good understanding of cloud computing, preferably with Amazon Web Services, and you should be able to deploy and manage either RDS or DynamoDB databases on a basic level.
When you start looking at the best practices for creating secure and efficient architectures, one of the first things you should look to do is understand the value and the importance of the data you are working with. Not all data is necessarily created equal, as some must be stored with more care than others. For example: when looking after credit card information or personal health information - you want to be particularly careful with it and spend extra resources ensuring its governance and protection.
However, if you have hundreds of gigabytes of clickstream data that doesn't exactly have national secrets tied to it, you can afford to be more relaxed with its care. This data doesn't need the same kind of scrutiny as the credit card information. In fact, if you tried to treat all data the same way, your budget would balloon as you applied the highest level of protection to data that doesn't need it.
When classifying data you should create as many different levels as your solution requires. This could be as simple as just having two types: Confidential and nonconfidential - or having more granular levels of control such as: Secret, Protected, Confidential, and Public.
Each of these levels might allow end users and internal employees different methods of control and access requirements. On the secret end of things, this data might only be accessible from on-premises or through the corporate VPN; while the public data is something that anyone who accesses the external website will be able to see.
Some Best Practices for data classification:
- Keep it simple and easy to understand - you should be able to simply explain your data classification levels with just a few sentences.
- Perform a risk assessment of your data - there can be severe legal and regulatory consequences for data breaches or for accidentally leaking users' personal information. Once you know your categories of data, you need to know what data should go where.
- Re-evaluate your data on a regular basis - your data profile might change over time as your application, business model, or local laws change. Suddenly something that was benign before might need to be moved into a more secure data category.
Once you have your data classified, you can set up the level of protection that each category deserves.
Probably the easiest way to secure your data within AWS is to make sure it is encrypted. For many services, it is literally just a click of a button away. For example: when using RDS you can enable encryption at rest for free - although you have to enable it at creation time. This encryption uses the industry-standard AES-256 encryption algorithm to ensure all your underlying storage, automated backups, read replicas, and snapshots are protected.
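Outside the console, the same setting can be expressed in code. The sketch below builds the keyword arguments one might pass to boto3's `create_db_instance`. The identifier, engine, instance class, storage size, and user name are placeholder assumptions, and the actual API call is left commented out so the snippet runs without AWS credentials.

```python
from typing import Optional


def encrypted_rds_params(identifier: str, kms_key_id: Optional[str] = None) -> dict:
    """Build create_db_instance arguments with encryption at rest enabled."""
    params = {
        "DBInstanceIdentifier": identifier,
        "Engine": "mysql",                 # placeholder engine
        "DBInstanceClass": "db.t3.medium", # placeholder instance class
        "AllocatedStorage": 20,            # GiB, placeholder size
        "MasterUsername": "admin",         # placeholder user name
        "ManageMasterUserPassword": True,  # let RDS keep the password in Secrets Manager
        "StorageEncrypted": True,          # has to be set when the instance is created
    }
    if kms_key_id is not None:
        # Optional: use a specific KMS key instead of the default RDS key.
        params["KmsKeyId"] = kms_key_id
    return params


params = encrypted_rds_params("example-db")
print(params["StorageEncrypted"])

# With credentials configured, the instance could then be created with:
# import boto3
# boto3.client("rds").create_db_instance(**params)
```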
Amazon RDS has support for Transparent Data Encryption for SQL Server and Oracle. This encryption mode allows the server to automatically encrypt data before it is written to storage. This is extremely useful in case a bad actor had the ability to read data from that drive itself. With TDE there is never a moment where any meaningful data is left available.
Another general best practice for the security of your database is to encrypt your data in transit. That means you should keep the communication between your application and your database encrypted using SSL/TLS. Amazon RDS is responsible for creating an SSL certificate that is installed on the database on creation.
Now each database engine has its own process for implementing SSL/TLS, but here is a simple example for MySQL.
When you connect to the database with the MySQL client, simply launch it with the --ssl-ca parameter pointing at the certificate file.
You will, of course, have to download the AWS-provided certificate bundle first (and, for some clients, import the certificate into your operating system) for everything to work.
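A minimal sketch of that invocation, built as an argument list in Python. The hostname, user name, and certificate file name are placeholders. `--ssl-mode=VERIFY_IDENTITY` is an extra flag not mentioned above; it is available in MySQL 5.7.11 and later clients and makes the client check the server's hostname against the certificate.

```python
import shlex


def mysql_tls_command(host: str, user: str, ca_bundle_path: str) -> list:
    """Return the argument list for a TLS-verified MySQL client session."""
    return [
        "mysql",
        f"--host={host}",
        f"--user={user}",
        f"--ssl-ca={ca_bundle_path}",   # CA bundle downloaded from AWS
        "--ssl-mode=VERIFY_IDENTITY",   # assumption: MySQL 5.7.11+ client
        "--password",                   # no value given, so the client prompts
    ]


command = mysql_tls_command(
    "mydb.example.us-east-1.rds.amazonaws.com",  # placeholder endpoint
    "admin",                                     # placeholder user
    "global-bundle.pem",                         # placeholder file name
)
print(shlex.join(command))
```

The list can be handed to `subprocess.run(command)` on a machine that has the MySQL client installed.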
Much the same is true for Amazon DynamoDB, which encrypts all your data at rest by default. You also have the option to use your own custom keys to encrypt your data for regulatory compliance reasons. It too uses AES-256 encryption and still provides single-digit millisecond latency, so you shouldn't have to worry about any bottlenecks from the encryption side of things.
DynamoDB has a client-side encryption library that helps you protect your table data before you send it to Amazon DynamoDB. It encrypts the attribute values for each item in the table using a unique encryption key. The encryption library then signs the items to protect them from unwanted changes such as deletion, modification, and swapping of encrypted values.
This data can then be sent to DynamoDB without worrying about external parties, or even AWS themselves, seeing plaintext information.
Security of your database and your general health have a lot in common, specifically addressing issues and preventing problems before they arise. You will save a lot of trouble and heartache if you can create systems that prevent the problems before they happen.
A common issue these days is not necessarily someone trying to steal or compromise your data - but someone trying to stop access to it altogether. Attackers perform DDoS attacks for a variety of reasons, but in the end, if you can weather the storm, everything will be alright.
When thinking about protecting your databases from these attacks, we need to try to limit the attackable surface as much as possible. That means deploying our architectures in a layered fashion where we can control the flow of traffic within our network. Each level only allows the bare minimum of traffic through to the next level.
We might have an external web layer that would deal with the brunt of the DDoS attack but it in turn would only allow legitimate traffic onto the next layer - the application layer - from people who are trying to use the website for its intended purpose. This layer forwards on requests to the database which can provide the information the customers are looking for.
The allowable traffic is specified by the security groups of the individual instances and would look something like this:
Traffic comes in to the web tier on port 443 from the load balancer, or directly from clients that resolved the address through DNS, and can originate anywhere on the internet. The web server sends HTTP traffic on port 80 to the app tier, and the app tier talks to the database over TCP on port 3306.
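The three tiers described above can be sketched as ingress rules in the shape boto3's `authorize_security_group_ingress` accepts for `IpPermissions`. The security group IDs and the helper names are placeholders made up for this example.

```python
def ingress_from_cidr(port: int, cidr: str) -> dict:
    """Allow TCP traffic on one port from a CIDR range."""
    return {"IpProtocol": "tcp", "FromPort": port, "ToPort": port,
            "IpRanges": [{"CidrIp": cidr}]}


def ingress_from_group(port: int, source_group_id: str) -> dict:
    """Allow TCP traffic on one port from members of another security group."""
    return {"IpProtocol": "tcp", "FromPort": port, "ToPort": port,
            "UserIdGroupPairs": [{"GroupId": source_group_id}]}


# Placeholder security group IDs for the three tiers.
WEB_SG, APP_SG, DB_SG = "sg-web-placeholder", "sg-app-placeholder", "sg-db-placeholder"

TIER_INGRESS = {
    WEB_SG: [ingress_from_cidr(443, "0.0.0.0/0")],  # HTTPS from anywhere
    APP_SG: [ingress_from_group(80, WEB_SG)],       # HTTP only from the web tier
    DB_SG: [ingress_from_group(3306, APP_SG)],      # MySQL only from the app tier
}


def open_to_internet(rules: list) -> bool:
    """True if any rule admits traffic from every IPv4 address."""
    return any(
        ip_range["CidrIp"] == "0.0.0.0/0"
        for rule in rules
        for ip_range in rule.get("IpRanges", [])
    )


for group_id, rules in TIER_INGRESS.items():
    print(group_id, "open to internet:", open_to_internet(rules))
```

Only the web tier accepts traffic from arbitrary addresses; the database tier can be reached solely through the app tier's security group.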
With this setup our database is protected from the DDoS attack (even if the poor web tier gets taken down) and would allow those who are already connected with the app to continue to use it.
If our database allowed all traffic from anywhere, it could be attacked directly, preventing legitimate users from enjoying their sessions.
Creating secure applications that interact well with your database requires years of experience and a host of knowledge. One of the primary worries, besides allocating the right amount of throughput for a database, is protection against SQL injection attacks. These types of attacks make up two thirds of all attacks against web applications. And with the amount of pressure that developers can be put under to create prototypes and minimum viable products to “just get something out there”, they can leave their data up for grabs if they are unaware.
AWS WAF (web application firewall) is a service that gives you control over the type of traffic that is allowed to your application. This service provides the ability to create rules that can help prevent SQL injection attacks and cross-site scripting attacks.
These rules operate by creating a web access control list (web ACL) that monitors traffic through your Application Load Balancer, API Gateway, or CloudFront distribution.
If you do not want to build the rules yourself, AWS has recently created a set of managed rules that can, with the press of a button, give you “Instant protection” against SQL injection and a whole host of other common threats.
“The AWS Threat Research Team maintains the rules, with new ones being added as additional threats are identified.”
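As a sketch of what enabling a managed rule looks like outside the console, the dictionary below follows the rule shape that boto3's `wafv2` client takes in the `Rules` list of `create_web_acl`. It references the AWS managed SQL injection rule group; the rule name and metric name are placeholders chosen for this example.

```python
def managed_sqli_rule(priority: int = 0) -> dict:
    """Build a web ACL rule that references the AWS managed SQLi rule group."""
    return {
        "Name": "aws-managed-sqli",            # placeholder rule name
        "Priority": priority,
        "Statement": {
            "ManagedRuleGroupStatement": {
                "VendorName": "AWS",
                "Name": "AWSManagedRulesSQLiRuleSet",
            }
        },
        # Keep the actions defined inside the managed rule group.
        "OverrideAction": {"None": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": "aws-managed-sqli",  # placeholder metric name
        },
    }


rule = managed_sqli_rule()
print(rule["Statement"]["ManagedRuleGroupStatement"]["Name"])
```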
These rules are all supported by AWS CloudFormation, so you can have them enabled right as you build up your infrastructure. It should also be noted that there is no additional charge when using managed rules vs creating your own.
You will be charged for each web ACL that you create and for each rule that you create per web ACL. In addition, you will be charged for the number of web requests processed by the web ACL.
A potentially serious security concern is the malicious or accidental deletion of a production database. Even if you have been up to date in keeping backups, the amount of downtime and potential lost transactions could be disastrous to your application or company.
Just as a general reminder, it is extremely important to have appropriate RPO and RTO standards set up, with these expectations and SLAs made apparent to your customers where applicable.
Everyone has heard stories of some new intern or junior developer dropping or deleting all the tables from prod. One of the best ways to stop these kinds of actions is of course to not give the intern IAM permissions to do anything of that caliber. Here is a quick example of that :P
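The original example is not reproduced here, so the following is only an illustrative reconstruction: an IAM policy document, written as a Python dictionary, that explicitly denies database deletion. The choice of actions and the statement ID are assumptions; a real policy would be scoped to your own resources.

```python
import json

# Hypothetical policy that could be attached to an intern's user or group.
INTERN_DENY_DB_DELETION = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyDatabaseDeletion",
            "Effect": "Deny",  # an explicit Deny overrides any Allow
            "Action": [
                "rds:DeleteDBInstance",
                "rds:DeleteDBCluster",
                "dynamodb:DeleteTable",
            ],
            "Resource": "*",
        }
    ],
}

policy_json = json.dumps(INTERN_DENY_DB_DELETION, indent=2)
print(policy_json)
```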
However, even when we do lock down the intern's permissions to barely being able to spin up burstable EC2 instances, there is still the route for malicious actors to hijack high-level AWS accounts within your org and cause damage as well.
What do you do when your DB admin goes away on vacation for a week, and someone cracks their credentials? That account has all the power to demolish your workload. Well, with the simple addition of multi-factor authentication (MFA), a lot of these worries will be taken care of.
Take a look at the same IAM permissions we just showed, but with the extra addition of MFA being required. This alone might protect you one day if your credentials are ever leaked to the public - for example, through an accidental commit to a public repo.
And MFA is not limited to just protecting your databases. I highly recommend that everyone who has any kind of power within your AWS environment use MFA. It may add an extra 30 seconds of login time, but it could save you potentially hundreds of hours of heartache and lots of money.
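One way an MFA requirement might be expressed, again as an illustrative sketch and not the original example: a Deny statement that applies whenever the request was not authenticated with MFA. The `aws:MultiFactorAuthPresent` condition key is a standard IAM global key; the helper name and the action list are assumptions.

```python
import json

# Placeholder list of actions considered destructive for this example.
DESTRUCTIVE_ACTIONS = [
    "rds:DeleteDBInstance",
    "rds:DeleteDBCluster",
    "dynamodb:DeleteTable",
]


def deny_without_mfa(actions: list) -> dict:
    """Build a policy that blocks the given actions unless MFA was used."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyDestructiveActionsWithoutMFA",
                "Effect": "Deny",
                "Action": list(actions),
                "Resource": "*",
                "Condition": {
                    # BoolIfExists also matches requests where the key is absent,
                    # such as calls made with long-term access keys.
                    "BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}
                },
            }
        ],
    }


mfa_policy = deny_without_mfa(DESTRUCTIVE_ACTIONS)
print(json.dumps(mfa_policy, indent=2))
```

This statement only blocks; the identity still needs a separate Allow for the same actions before an MFA-authenticated request can succeed.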
As simple as it may sound, you need to make sure that only the people who absolutely need to access your database have the ability to do so.
The classic example of the new intern that deletes all of your production data is what always comes to mind when I think of least privilege.
When creating users that can access dangerous parts of your architectures, such as your database, make sure they can only change or modify what is currently necessary. That might mean they only have that permission for a week or a day.
Some services, such as DynamoDB, allow extremely fine-grained access control - even down to granting read-only or write-only access to certain attributes in a specific table.
One best practice I think everyone should look into is using roles more frequently, alongside least privilege, for the security and protection of your data.
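A sketch of attribute-level, read-only access using the `dynamodb:Attributes` and `dynamodb:Select` condition keys. The table ARN, account ID, and attribute names are placeholders, and the helper name is made up for this example.

```python
def read_only_attribute_policy(table_arn: str, attributes: list) -> dict:
    """Allow reads on one table, restricted to the listed attributes."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadOnlySelectedAttributes",
                "Effect": "Allow",
                "Action": [
                    "dynamodb:GetItem",
                    "dynamodb:BatchGetItem",
                    "dynamodb:Query",
                ],
                "Resource": table_arn,
                "Condition": {
                    # Every requested attribute must be in the allowed list.
                    "ForAllValues:StringEquals": {
                        "dynamodb:Attributes": list(attributes)
                    },
                    # Force callers to name attributes instead of fetching all.
                    "StringEqualsIfExists": {
                        "dynamodb:Select": "SPECIFIC_ATTRIBUTES"
                    },
                },
            }
        ],
    }


policy = read_only_attribute_policy(
    "arn:aws:dynamodb:us-east-1:123456789012:table/Orders",  # placeholder ARN
    ["OrderId", "Status"],                                   # placeholder attributes
)
print(policy["Statement"][0]["Condition"])
```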
For example, maybe you have several groups set up with various permissions according to need. However, no one has the direct power to delete a database themselves. In order to accomplish this task, they need to assume a role of DB admin. You could limit the assumption of this role to only your most trusted users.
Having your system set up in this way can prevent the accidental deletion, or even programmatic deletion, of system-critical components, because it is very hard to accidentally assume a role in the AWS console. It adds another hoop that a user has to jump through before they can do something dangerous by accident.
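Limiting who may assume such a role is done in the role's trust policy. The sketch below builds one for a hypothetical DB admin role; the user ARNs and the helper name are placeholders for illustration.

```python
def db_admin_trust_policy(trusted_user_arns: list) -> dict:
    """Build a trust policy that lets only the listed principals assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": list(trusted_user_arns)},
                "Action": "sts:AssumeRole",
            }
        ],
    }


trust_policy = db_admin_trust_policy([
    "arn:aws:iam::123456789012:user/alice",  # placeholder trusted user
    "arn:aws:iam::123456789012:user/bob",    # placeholder trusted user
])
print(trust_policy["Statement"][0]["Principal"]["AWS"])
```

The permissions to delete a database would then live only on the role, so nobody holds them until they deliberately assume it.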
Even with the best security measures in place, intrusion and security events will still happen. The goal of your architectures with regard to security is to detect these events as they happen or, at the bare minimum, to understand that something did occur and be able to update and react appropriately.
Amazon GuardDuty is an automated, machine learning-based security service that monitors VPC Flow Logs, CloudTrail, and DNS query logs to look for unauthorized access and activity. It can protect you by noticing strange behaviour trends from internal AWS accounts as well as external traffic within your network. Amazon GuardDuty can notify you through Amazon SNS based on CloudWatch Events rules (now Amazon EventBridge) that you specify - for example, by sending you an email when it detects suspicious behaviour.
GuardDuty groups its findings into three severity levels. Simply put, there are low, medium, and high threat levels. Amazon gives the following examples of what these mean:
“A “Low” severity level indicates suspicious or malicious activity that was blocked before it compromised your resource. A “Medium” severity level indicates suspicious activity. For example, a large amount of traffic being returned to a remote host that is hiding behind the Tor network, or activity that deviates from normally observed behavior. A “High” severity level indicates that the resource in question (e.g. an EC2 instance or a set of IAM user credentials) is compromised and is actively being used for unauthorized purposes.”
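To tie the severity levels back to notifications, the sketch below builds an event pattern that matches GuardDuty findings at or above a numeric severity. The threshold of 7 for "High" is an assumption based on GuardDuty's numeric severity scale and should be checked against the current documentation; the helper name is made up, and the rule's target (an SNS topic, for instance) is not shown.

```python
import json


def guardduty_event_pattern(min_severity: float) -> dict:
    """Match GuardDuty findings whose numeric severity is at least min_severity."""
    return {
        "source": ["aws.guardduty"],
        "detail-type": ["GuardDuty Finding"],
        "detail": {"severity": [{"numeric": [">=", min_severity]}]},
    }


# Assumed threshold: findings with severity 7 or more are treated as "High".
HIGH_SEVERITY_PATTERN = guardduty_event_pattern(7)
print(json.dumps(HIGH_SEVERITY_PATTERN, indent=2))
```

The JSON string is what would be supplied as the event pattern when creating the rule.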
With this level of detection and analysis, it's easy to see how GuardDuty would benefit most architectures by improving their security.
William Meadows is a passionately curious human currently living in the Bay Area in California. His career has included working with lasers, teaching teenagers how to code, and creating classes about cloud technology that are taught all over the world. His dedication to completing goals and helping others is what brings meaning to his life. In his free time, he enjoys reading Reddit, playing video games, and writing books.