Moving Beyond Spreadsheets
This course discusses some of the fundamental concepts of data management and looks at the differences between spreadsheets and databases for managing data. We'll look at some specific examples to understand when spreadsheets make sense and when it makes sense to switch over to a database, which is often a much better option for more complex datasets.
Specifically, this course aims to give students a practical hands-on introduction to database concepts. In addition, we'll gain an understanding of how to select the right database and we'll go through the basics of setting up an RDS instance on Amazon. This course includes a practical example of a company that is looking to choose a database, to give you an understanding of how databases work in the real world.
If you have any feedback relating to this course, please contact us at firstname.lastname@example.org.
- Understand the difference between spreadsheets and databases and when to use one or the other
- Learn about the different types of database available and the various features and characteristics to consider
- Learn how to choose the right database
- Learn how to deploy an Amazon Aurora instance
This course is designed for anyone who wants to improve their knowledge of databases and understand when it makes sense to use them as opposed to a spreadsheet.
To get the most out of this course, you should already have a basic understanding of simple data structures such as comma-separated values, as well as an understanding of cloud concepts in general.
To cap this class off and really go through a good final exercise, we're gonna walk you through, step by step, how to deploy this Aurora instance. The steps we're going through can easily be applied to other types of databases on Amazon, and of course the Cloud Academy directory and course content library have a lot more about other types. To start, navigate to the Relational Database Service, listed as RDS within Amazon, and hit the Create database button. This'll bring you to the first question, and basically it's asking: do you want Amazon to make some assumptions for you, or do you want to go through the more classic standard creation process?
For now, we are going to go through the standard creation process, just to showcase some of our choices, but if you want to go through the Easy Create, Amazon will gladly make some assumptions for you. Next, you'll be presented with the type of engine you want to run your database on. RDS does a fantastic job making all these engines run through a very similar administrative interface, and you're gonna notice you have your classics: MySQL, PostgreSQL, Oracle, and such. We're gonna just pick Aurora for now, and you'll notice we're selecting the MySQL compatibility option; for those of you who prefer PostgreSQL, that is also an option. Basically, this is where you pick the engine that runs under it, and the rest is managed by RDS. But since we're going through the standard install, there are a few more choices to make.
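For those who would rather script the engine choice than click through the console, here is a minimal sketch of how the same selection might be expressed with boto3, the AWS SDK for Python. The cluster name, user name, password, and region are placeholders invented for illustration, not values from the course, and the helper function names are hypothetical. The actual AWS call is wrapped in a function that is not run here, since it needs boto3 installed and valid credentials.

```python
# Sketch only: the identifier, user name, and password are made-up placeholders.

def aurora_cluster_params(cluster_id, username, password, postgres=False):
    """Build keyword arguments for boto3's rds.create_db_cluster call."""
    return {
        "DBClusterIdentifier": cluster_id,
        # Mirrors the console choice: MySQL- or PostgreSQL-compatible Aurora.
        "Engine": "aurora-postgresql" if postgres else "aurora-mysql",
        "MasterUsername": username,
        "MasterUserPassword": password,
    }


def create_cluster(params, region="us-east-1"):
    """Send the request to AWS. Requires boto3 and credentials; not called here."""
    import boto3  # imported lazily so the helper above works without boto3

    rds = boto3.client("rds", region_name=region)
    return rds.create_db_cluster(**params)


params = aurora_cluster_params("coffee-shop-db", "dbadmin", "change-me-please")
print(params["Engine"])
```

Keeping the parameters in a plain dictionary makes it easy to review the choices before anything is created in your account.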
Let's go through a couple more selections just to really showcase the flexibility of using RDS to manage your database for you. If you're going through the standard create, one of the next options you'll be presented with is whether you want a Production or a Dev/Test environment.
Now, I know on screen it says Production is for fast, consistent performance and Dev/Test is only for use in development. To let everyone in on a little secret, Dev/Test is also fast, consistent, and performant. What it's really trying to ask you is how robust you need your high availability, backups, and scalability to be.
If you're putting together something for your team as a demo, a sample, or a trial, there's no shame in going with Dev/Test. It's actually cheaper and easier to administer. If you need highly available data and multiple servers for failover, that's when Production is important, but don't feel ashamed of going with Dev/Test. That's what we do for many of our applications during their early stages.
When you click through that, you're going to come to what's maybe the most intimidating screen in the database creation process: the settings page. Simply put, give it a name and give it a password. Many of the other settings can just be bypassed for now if you're just doing a simple database. Don't worry about them; just get a good name in there, and a good user name and password, and make sure you note them down. And finally, the last screen that you'll have to enter options on is whether or not you want to make a replica. This is simply asking: do you want to have a hot-hot setup, with a second database available to quickly fail over to?
For the purposes of this coffee shop, and just to showcase what it looks like, we're going to select high availability, but making snapshots with a single database is also a completely valid option. When you make that selection, you're going to be asked to confirm it, and Amazon will automatically start creating this database for you. It has completely automated the administrative work of logging in and configuring it.
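As a rough illustration of what the high-availability choice amounts to when scripted, an Aurora cluster gets a replica when you add a second instance to it. The sketch below only builds the request parameters for boto3's `create_db_instance` call; the cluster name, the instance naming scheme, and the `db.t3.medium` instance class are assumptions chosen for the example, and nothing is sent to AWS.

```python
# Sketch only: the names and the instance class are illustrative assumptions.

def aurora_instance_params(cluster_id, instance_class="db.t3.medium", replicas=1):
    """Return one create_db_instance request per instance: a writer plus replicas."""
    requests = []
    for i in range(1 + replicas):
        requests.append({
            "DBInstanceIdentifier": f"{cluster_id}-instance-{i + 1}",
            "DBClusterIdentifier": cluster_id,  # attach the instance to the cluster
            "Engine": "aurora-mysql",
            "DBInstanceClass": instance_class,
        })
    return requests


# replicas=1 corresponds to the high-availability choice; replicas=0 gives a
# single database that you would protect with snapshots instead.
instance_requests = aurora_instance_params("coffee-shop-db", replicas=1)
for request in instance_requests:
    print(request["DBInstanceIdentifier"])
```

Each dictionary could then be passed as keyword arguments to `boto3.client("rds").create_db_instance(...)` once the cluster exists.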
After a little bit, you'll be able to click in and see the URL endpoint and port number. You hopefully wrote down your user name and password, so you'll be able to connect to it with the SQL tool of your choice and start to enter data into your database, or maybe, if you're a more advanced user, start to programmatically enter data with Python, Java, or your preferred programming language.
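To give a feel for entering data programmatically, here is a small Python sketch. So that it runs anywhere, it uses the standard library's sqlite3 module with an in-memory database as a stand-in; against Aurora you would instead connect with a MySQL driver using the endpoint, port, user name, and password from the console, and the parameter placeholder would typically be `%s` rather than `?`. The `orders` table and its rows are invented for the coffee shop example.

```python
import sqlite3

# In-memory stand-in for the real database connection.
conn = sqlite3.connect(":memory:")

# A hypothetical table of coffee shop orders, with prices stored in cents.
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, drink TEXT NOT NULL, price_cents INTEGER NOT NULL)"
)

# Parameterized inserts keep the data separate from the SQL text.
rows = [("latte", 450), ("espresso", 300), ("latte", 450)]
conn.executemany("INSERT INTO orders (drink, price_cents) VALUES (?, ?)", rows)
conn.commit()

# The kind of question that gets awkward in a spreadsheet as data grows:
# total revenue per drink.
totals = conn.execute(
    "SELECT drink, SUM(price_cents) FROM orders GROUP BY drink ORDER BY drink"
).fetchall()
print(totals)  # [('espresso', 300), ('latte', 900)]

conn.close()
```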
So for those of you that got this far, we'd love to hear your feedback on the course. This is part of a data series of courses, where we are showcasing practical, hands-on ways to immediately start using data. Hence this course on how to make a database, how to select one, and whether it's right for you. We'd love to hear your feedback on that, and do check for follow-on courses and associated labs, where we can start to really put this into practice.
Calculated Systems was founded by experts in Hadoop, Google Cloud and AWS. Calculated Systems enables code-free capture, mapping and transformation of data in the cloud based on Apache NiFi, an open source project originally developed within the NSA. Calculated Systems accelerates time to market for new innovations while maintaining data integrity. With cloud automation tools, deep industry expertise, and experience productionalizing workloads, development cycles are cut down to a fraction of their normal time. The ability to quickly develop large-scale data ingestion and processing decreases the risk companies face in long development cycles. Calculated Systems is one of the industry leaders in Big Data transformation and education of these complex technologies.