Moving Beyond Spreadsheets
This course discusses some of the fundamental concepts of data management and looks at the differences between spreadsheets and databases for managing data. We'll look at some specific examples to understand when spreadsheets make sense and when it makes sense to switch over to a database, which is often a much better option for more complex datasets.
Specifically, this course aims to give students a practical hands-on introduction to database concepts. In addition, we'll gain an understanding of how to select the right database and we'll go through the basics of setting up an RDS instance on Amazon. This course includes a practical example of a company that is looking to choose a database, to give you an understanding of how databases work in the real world.
If you have any feedback relating to this course, please contact us at firstname.lastname@example.org.
- Understand the difference between spreadsheets and databases and when to use one or the other
- Learn about the different types of database available and the various features and characteristics to consider
- Learn how to choose the right database
- Learn how to deploy an Amazon Aurora instance
This course is designed for anyone who wants to improve their knowledge of databases and understand when it makes sense to use them as opposed to a spreadsheet.
To get the most out of this course, you should already have a basic understanding of simple data structures such as comma-separated values, as well as an understanding of cloud concepts in general.
Beyond simply specifying which technology you want, be it SQL or NoSQL, Couchbase, MySQL, or Postgres, you also need to pick how you're going to host it. And, honestly, from my experience, your hosting solution, meaning which cloud provider you're on, is going to make a big difference.
Now, the good news is that for a lot of the classic ones, such as Postgres and MySQL, all the cloud providers have much the same offerings, each with a few twists. So let's go through each one and explain the twists that each cloud provider puts on its database offerings.
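To make the "same offerings" point concrete, here is a minimal Python sketch, assuming a hosted Postgres database on each provider. The hostnames, user, password, and database name are all hypothetical placeholders. It shows that, from the application's side, a standard PostgreSQL connection URI looks the same everywhere and only the host changes.

```python
from urllib.parse import quote

# Hypothetical endpoints: placeholders for illustration, not real servers.
HOSTS = {
    "aws_rds": "mydb.aws.example.com",
    "gcp_cloud_sql": "mydb.gcp.example.com",
    "azure": "mydb.azure.example.com",
}


def postgres_uri(host, user, password, dbname, port=5432):
    """Build a standard PostgreSQL connection URI.

    User, password, and database name are percent-encoded so that
    special characters do not break the URI.
    """
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(dbname, safe='')}"
    )


# Same credentials and database name on every provider (assumed values).
uris = {
    name: postgres_uri(host, "app_user", "s3cret!", "inventory")
    for name, host in HOSTS.items()
}

for name, uri in uris.items():
    print(name, uri)
```

In practice each provider adds its own twists around this, such as networking and authentication setup, so treat the sketch as the common core rather than a complete connection recipe.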
Amazon in particular has a lot of database options, way too many to cover in a course such as this. As we previously mentioned, Cloud Academy has a lot of deep dives into specific technologies, but the big ones you're going to run across typically start with RDS, or Relational Database Service. This is where you'll find your classics, MySQL, SQL Server, and Postgres, along with Amazon variants like Aurora, which is compatible with MySQL and Postgres.
For NoSQL, Amazon has offerings such as DynamoDB, which is their in-house NoSQL solution, along with hosted solutions for things like MongoDB. Finally, there are a few more exotic big data options available, like Athena and Redshift. These typically come into play when you have more than four terabytes of data and you need to scale. Just know that there are a lot of options. As a beginner to intermediate person deploying on the cloud, I would strongly suggest starting with RDS if you need relational, or DynamoDB or MongoDB if you need NoSQL.
If you're more of a Google person, or your company is centered on GCP, look to their Cloud SQL offering to get your standard core set of databases: once again, your MySQL, your Postgres, your SQL Server. Honestly, this is not terribly differentiated from Amazon's RDS; you're pretty much getting the same hosted server environment. The more exotic databases are where it starts to become a little more differentiated.
Google has Bigtable, which is one of the most scalable NoSQL solutions for going above that four-terabyte mark. They also have solutions such as Firestore, which is a very powerful NoSQL database. Typically speaking, though, with Google's offerings you're going to find yourself sticking to Cloud SQL, or maybe Firestore if you're more of a mobile app front-end developer.
And finally, just for the sake of covering the big three, let's go over Azure's offerings. They offer the same set of core, standard databases, with a particular focus on SQL Server, because Microsoft develops it. But in my opinion, Cosmos DB deserves a special note. Perhaps you've seen it called DocumentDB in older documentation; it has since been rebranded. I don't want to formally call it the best, but in my opinion it is one of the premier NoSQL solutions out there, and it is extremely scalable.
So, just to recap, Azure has a strong all-around offering just like the other two providers, but Cosmos DB, in my opinion, deserves a special callout as a phenomenal NoSQL database that is exclusive to Azure.
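The starting points suggested across the three providers can be summarized as a small lookup table. This is only a sketch of the rule of thumb from this lecture, not an official decision tool: the `STARTING_POINTS` table and the `starting_point` helper are hypothetical names introduced for illustration, and the "hosted SQL Server" label for Azure is a paraphrase, since no specific Azure relational service is named here.

```python
# Starting points named in the lecture, keyed by provider and data model.
# The table and function names are illustrative assumptions.
STARTING_POINTS = {
    "aws": {"relational": "RDS", "nosql": "DynamoDB"},
    "gcp": {"relational": "Cloud SQL", "nosql": "Firestore"},
    "azure": {"relational": "hosted SQL Server", "nosql": "Cosmos DB"},
}


def starting_point(provider, data_model):
    """Return the suggested first service to evaluate.

    provider:   "aws", "gcp", or "azure"
    data_model: "relational" or "nosql"
    """
    try:
        return STARTING_POINTS[provider.lower()][data_model.lower()]
    except KeyError:
        raise ValueError(
            f"no suggestion for provider={provider!r}, data_model={data_model!r}"
        ) from None


print(starting_point("aws", "relational"))
print(starting_point("azure", "nosql"))
```

The table deliberately leaves out the larger-scale options such as Redshift and Bigtable, which were described as something to reach for only once data grows past roughly four terabytes.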
Calculated Systems was founded by experts in Hadoop, Google Cloud and AWS. Calculated Systems enables code-free capture, mapping and transformation of data in the cloud based on Apache NiFi, an open source project originally developed within the NSA. Calculated Systems accelerates time to market for new innovations while maintaining data integrity. With cloud automation tools, deep industry expertise, and experience productionalizing workloads, development cycles are cut down to a fraction of their normal time. The ability to quickly develop large-scale data ingestion and processing decreases the risk companies face in long development cycles. Calculated Systems is one of the industry leaders in Big Data transformation and education of these complex technologies.