Wrestling with Data
In this course, we take a deep dive into the tools and techniques available for manipulating information and data sources, and we finish by showing how you can use them to solve some real-world problems.
If you are trying to handle increasingly complex data sets and round out your experience as a professional data engineer, this is a great course for gaining a practical, field-based understanding.
- Learn to determine when it's appropriate to use a programmatic approach versus pure SQL.
- Learn how to access and manipulate your files and data sources using programming techniques available in languages such as Python.
- Familiarity with relational databases and other data formats such as CSV and JSON.
- Baseline understanding of SQL.
If you don't have all of these, this course will still benefit you, but you might not be able to follow all of the examples.
Hopefully, with that architecture outlined, you begin to understand how to put the pieces together. There are so many ways to manipulate data that it's hard to capture them all in a self-contained class. Basically, as an inexperienced data engineer, as you start to work in the field and understand your options, you'll gain a better understanding of what the right tool is, be it raw SQL or a data access language with some tools assisting it.
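To make the choice between raw SQL and a programmatic approach a little more concrete, here is a minimal sketch that computes the same per-customer totals both ways. It is only an illustration and not taken from the course material: the `orders` table, the `ORDERS` sample rows, the function names, and the `min_amount` filter are all hypothetical, and it assumes nothing beyond Python's standard `sqlite3` module.

```python
import sqlite3

# Hypothetical sample data: (customer, amount) order rows.
ORDERS = [("alice", 30.0), ("bob", 12.5), ("alice", 20.0), ("carol", 7.5)]


def totals_with_sql(rows):
    """Aggregate inside the database with a single raw SQL statement."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT customer, SUM(amount) FROM orders GROUP BY customer"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


def totals_with_python(rows, min_amount=0.0):
    """Aggregate in application code, where custom rules are easy to add."""
    totals = {}
    for customer, amount in rows:
        # Hypothetical custom rule: skip orders below a threshold.
        if amount < min_amount:
            continue
        totals[customer] = totals.get(customer, 0.0) + amount
    return totals


if __name__ == "__main__":
    print(totals_with_sql(ORDERS))
    print(totals_with_python(ORDERS))
    print(totals_with_python(ORDERS, min_amount=10.0))
```

The SQL version is shorter and leaves the work to the database, while the Python version needs more code but makes it easy to add a custom rule such as the threshold filter.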
A rule of thumb, though, is that a programming language offers more flexibility and more ability to customize your experience than raw SQL, but that comes at the cost of some added complexity during initial setup and ongoing maintenance. As always, thanks for attending.
Please remember to rate this course and leave feedback if you have any thoughts. Keep your eyes open for more classes in the data engineering learning path, and be sure to check out some of the labs associated with this course. Thanks.
Calculated Systems was founded by experts in Hadoop, Google Cloud and AWS. Calculated Systems enables code-free capture, mapping and transformation of data in the cloud based on Apache NiFi, an open source project originally developed within the NSA. Calculated Systems accelerates time to market for new innovations while maintaining data integrity. With cloud automation tools, deep industry expertise, and experience productionalizing workloads, development cycles are cut down to a fraction of their normal time. The ability to quickly develop large-scale data ingestion and processing decreases the risk companies face in long development cycles. Calculated Systems is one of the industry leaders in Big Data transformation and education of these complex technologies.