What happens once your software is actually running in production? Ensuring that it stays up-and-running is important. And depending on what the system does, and how much traffic it needs to handle, that may not be particularly easy.
There are systems that allow developers to run their code without needing to think about operations. Platform-as-a-service options like Google’s App Engine go a long way toward reducing and, in some companies, removing operations work. However, not every system can or will run on such platforms, which means that having qualified operations engineers is important.
The role of an operations engineer is continually evolving, which isn’t a surprise since changes in technology never slow down.
So, if the job falls on you to keep a system up-and-running, where do you start? What needs to happen? These are the questions this course aims to answer.
In this course, we take a look at some of the tasks that operations engineers need to address. I use the term operations engineer as an umbrella to cover a wide variety of job titles. Titles such as ops engineer, operations engineer, site reliability engineer, and devops engineer, among others, all fall under this umbrella.
Regardless of the title, the responsibilities involve keeping a system up-and-running, with little or no downtime. And that’s a tough thing to do because there are a lot of moving parts.
If you’re just starting out, and are interested in one of those roles, then the fundamentals in this course may be just what you need. These fundamentals will prepare you for more advanced courses on specific cloud providers and their certifications.
Topics such as high availability are often covered in advanced courses; however, they tend to be specific to a cloud provider. So this course will help you learn the basics without needing to know a specific cloud provider.
If this all sounds interesting, check it out! :)
By the end of this course, you'll be able to:
- Identify some of the aspects of being an ops engineer
- Define why availability is important to ops
- Define why scalability is important to ops
- Identify some of the security concerns
- Define why monitoring is important
- Define why practicing failure is important
This is a beginner-level course for anyone who wants to learn, though it will probably be easier if you have either:
- Development experience
- Operations experience
What You'll Learn
| Lecture | What you'll learn |
|---|---|
| Intro | What will be covered in this course |
| Intro to Operational Concerns | What sort of things do operations engineers need to focus on? |
| Availability | What does availability mean in the context of a web application? |
| High Availability | How do we make systems more available than the underlying platform? |
| Scalability | What is scalability and why is it important? |
| Security | What security issues do ops engineers need to address? |
| Infrastructure as code | What is IaC and why is it important? |
| Monitoring | What things need to be monitored? |
| System Performance | Where are the bottlenecks? |
| Planning and Practicing Failure | How can you practice failure? |
| Summary | A review of the course |
Welcome to Introduction to Operations. I'm Ben Lambert and I'll be your instructor for this course.
This course will give you a solid foundation to build on in future courses.
We'll start the course with an overview of some of the challenges that operations engineers face.
We'll move on to talk about what availability is and follow up that discussion with the topic of high availability.
After that we'll cover scalability, what it means, and why it's important.
And then we'll have a brief overview of some security concerns.
We won't focus too much on security in this course because it's an important enough topic to warrant its own future course.
We'll talk about what infrastructure as code is and why it's so valuable.
And then we'll talk about monitoring and how it helps to ensure everyone has the data they need to make sound decisions.
Once we've covered monitoring we'll be ready to talk about system performance.
And then we'll have a brief overview of planning and practicing failure.
And after that we'll wrap up with a summary.
This course doesn't make any assumptions about your technical experience. It's a beginner-level course, and so I try to explain all of the important stuff. However, if you have some sort of technical background, maybe as a developer or sysadmin, then the concepts will probably be easier to understand.
Here's what I think you should get out of this course. By the end of it, you'll understand a bit more about some of the concerns operations engineers have to deal with.
You'll know why availability is important, and you'll also know why scalability is important.
You'll be able to identify some of the places where bottlenecks can degrade system performance.
And you'll be able to identify a few security concerns as well.
You'll know why monitoring is important, and you'll know what it means to practice failure.
Before we move on, I want to point out that you can adjust the speed of the video in the video player settings. So feel free to play around and find the speed that's right for you. Throughout this course we'll cover a lot of the fundamentals that will help you prepare for more advanced courses.
Once you've completed this course, or if you find that you're already familiar with most of the information that we're covering, then you're probably ready to move on to something that's geared toward a particular cloud platform. If you have something in mind, then go for it. If not, then I recommend you start with something like the AWS Solutions Architect Associate learning path. It's gonna serve as a practical application of the knowledge in this course.
So, if this course sounds interesting to you, let's get started.
Ben Lambert is a software engineer and was previously the lead author for DevOps and Microsoft Azure training content at Cloud Academy. His courses and learning paths covered Cloud Ecosystem technologies such as DC/OS, configuration management tools, and containers. As a software engineer, Ben’s experience includes building highly available web and mobile apps. When he’s not building software, he’s hiking, camping, or creating video games.