Machine Learning Pipelines with Scikit-Learn
This course is the second in a two-part series that covers how to build machine learning pipelines using scikit-learn, a library for the Python programming language. It is a hands-on course: you can follow along with the demonstrations to build your own machine learning models.
- Explore supervised-learning techniques used to train a model in scikit-learn by using a simple regression model
- Understand the concept of the bias-variance trade-off and regularized ML models
- Explore linear models for classification and how to evaluate them
- Learn how to choose a model and fit that model to a dataset
This course is intended for anyone interested in machine learning with Python.
To get the most out of this course, you should have first taken Part One of this two-part series.
The resources related to this course can be found in the following GitHub repo: https://github.com/cloudacademy/ca-machine-learning-with-scikit-learn
Congratulations! You've reached the end of this course! We've covered a lot and I hope you enjoyed yourself as much as I did. You learned how to fit a linear regression model in scikit-learn, and we explored how cross-validation makes the assessment of our estimator more robust and consistent. This is a vital step in building a machine learning pipeline, and the cross-validated score should generally serve as your benchmark when fitting a model. We then investigated regularized models, in particular the ridge model.
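As a recap of the cross-validation step described above, here is a minimal sketch. It assumes synthetic data from `make_regression` rather than the course dataset, and the choices of five folds and `alpha=1.0` are arbitrary values picked for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

# Synthetic data: an assumption for illustration, not the course dataset.
X, y = make_regression(n_samples=200, n_features=10, noise=15.0, random_state=0)

models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),  # alpha=1.0 is an arbitrary illustrative value
}

cv_scores = {}
for name, model in models.items():
    # 5-fold cross-validation; for regressors the default score is R^2.
    scores = cross_val_score(model, X, y, cv=5)
    cv_scores[name] = scores
    print(f"{name}: mean R^2 = {scores.mean():.3f} (std {scores.std():.3f})")
```

Comparing the mean and spread of the fold scores, rather than a single train/test split, is what gives a more stable benchmark for each candidate model.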
We saw that sometimes it's better to add bias to the model, since this can give you an estimator whose coefficients are less variable than the ones you would obtain with standard regression. Finally, we covered linear models for classification and looked at how to evaluate them using the precision and recall metrics. If you have any feedback or questions, please feel free to contact us at firstname.lastname@example.org. Thanks for watching.
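The bias-variance point from the recap above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data-generating process (two nearly collinear features, true coefficients of 1.0 each), the helper `simulate`, and `alpha=1.0` are all hypothetical choices made for illustration, not material from the course.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)

def simulate(n=60):
    # Hypothetical data: two nearly collinear features, true coefficients 1.0 and 1.0.
    x1 = rng.normal(size=n)
    x2 = x1 + rng.normal(scale=0.05, size=n)
    X = np.column_stack([x1, x2])
    y = x1 + x2 + rng.normal(scale=1.0, size=n)
    return X, y

ols_coefs, ridge_coefs = [], []
for _ in range(200):
    # Refit both models on a fresh sample to see how much the coefficients move.
    X, y = simulate()
    ols_coefs.append(LinearRegression().fit(X, y).coef_)
    ridge_coefs.append(Ridge(alpha=1.0).fit(X, y).coef_)

# Standard deviation of each coefficient across the repeated fits.
ols_std = np.std(ols_coefs, axis=0)
ridge_std = np.std(ridge_coefs, axis=0)
print("OLS coefficient std:  ", ols_std)
print("Ridge coefficient std:", ridge_std)
```

In this setup the ridge coefficients vary far less from sample to sample than the ordinary least squares ones, at the cost of being shrunk toward zero, which is the trade-off the recap describes.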
Andrea is a Data Scientist at Cloud Academy. He is passionate about statistical modeling and machine learning algorithms, especially for solving business tasks.
He holds a PhD in Statistics, and he has published in several peer-reviewed academic journals. He is also the author of the book Applied Machine Learning with Python.