When we saw how incredibly popular our blog post on Amazon Machine Learning was, we asked data and code guru James Counts to create this fantastic in-depth introduction to the principles and practice of Amazon Machine Learning so we could completely satisfy the demand for ML guidance within AWS.
James has got the subject completely covered:
- What exactly machine learning can do
- Why and when you should use it
- Working with data sources
- Manipulating data within Amazon Machine Learning to ensure a successful model
- Working with machine learning models
- Generating accurate predictions
Welcome to our lecture on when to use machine learning. In this lecture, we'll talk about what to consider before trying machine learning as a solution to a problem, and some situations where machine learning makes sense.
Machine learning is a great tool, but like so many tools in computer science that have come before it, it is not a magic bullet. There are problems that are appropriate for machine learning and other cases where a different solution makes more sense. If you can quickly and robustly cover all possible cases with simple rules, then you don't necessarily need a machine learning system. If you are programming a traffic light, the rules are simple. For example, you could write a rule which states that when traffic in one direction is given the green light, then traffic in the cross direction should be given the red light.
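To make the traffic light rule concrete, here is a minimal Python sketch. The names `Light` and `is_consistent` are hypothetical, chosen only for illustration. The sketch covers just the one rule stated above and is not a complete traffic controller.

```python
from enum import Enum


class Light(Enum):
    GREEN = "green"
    RED = "red"


def is_consistent(one_direction: Light, cross_direction: Light) -> bool:
    """Check the single rule from the lecture.

    If either direction has a green light, the other must be red.
    Both directions being red is allowed, since the rule does not forbid it.
    """
    if one_direction is Light.GREEN:
        return cross_direction is Light.RED
    if cross_direction is Light.GREEN:
        return one_direction is Light.RED
    return True


if __name__ == "__main__":
    for a in Light:
        for b in Light:
            print(a.value, b.value, is_consistent(a, b))
```

With only two lights and two colors there are four cases in total, so a hand-written rule covers every one of them. That is the kind of problem that does not need machine learning.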
But there are many tasks that humans can perform easily where the rules are not so easy to figure out. The classic example is recognizing whether an e-mail is spam or legitimate. There are a large number of factors which influence the right answer to this question. When the rules are difficult to code because of edge cases, exceptions and uncertainty, a machine learning-based solution can effectively create working rules for you.
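As a rough illustration of why hand-coded spam rules are hard to get right, consider the following Python sketch. The phrase list and the function name `looks_like_spam` are assumptions made up for this example and do not come from any real filter.

```python
# Hypothetical keyword list; a real rule set would be far larger and still incomplete.
SPAM_PHRASES = ("free money", "act now", "winner")


def looks_like_spam(subject: str) -> bool:
    """Flag a subject line if it contains any known spam phrase (case-insensitive)."""
    text = subject.lower()
    return any(phrase in text for phrase in SPAM_PHRASES)


if __name__ == "__main__":
    subjects = [
        "You are a WINNER, act now!",                    # spam, and flagged
        "Congrats to the winner of our team hackathon",  # legitimate, but flagged
        "Fr3e m0ney inside",                             # spam, but missed
    ]
    for subject in subjects:
        print(looks_like_spam(subject), subject)
```

The second and third subjects show the edge cases and exceptions mentioned above: one simple rule both flags a legitimate message and misses an obfuscated one. Each patch to the rules tends to create new exceptions, which is where a learned model can help.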
Let's go back to that example of detecting spam. Even when a problem like detecting spam is fairly easy for humans to figure out (although not all humans are good at it), you may still wish to consider a machine learning solution.
The sheer volume of e-mail makes it prohibitively expensive to submit each mail to human review to determine whether it's spam. Machine learning may be an effective solution for problems beyond your normal ability to scale.
As with any problem, you will want to use the most cost-effective solution that meets your requirements. In many cases, machine learning is more cost-effective than taking the time to manually code and maintain a rule-based system or scaling up human capital in order to take advantage of natural brain power.
Let's look at some common use cases for machine learning. Many organizations use machine learning systems to find patterns that indicate fraud, either in real time during the transaction or after the fact.
Providing product or service recommendations based on customer history and behavior is another common use of machine learning techniques. Amazon itself uses machine learning to provide recommendations, as do many others such as Netflix. Customer churn analysis is used to find customers who are at high risk of attrition, such as customers who are considering changing mobile providers, or even just a customer who is likely to abandon a shopping cart without making a purchase. After identifying these customers, we can take action to try to retain them.
Marketing can be expensive, and targeted marketing campaigns attempt to direct offers toward the customers who are most likely to receive them favorably. Classification has many uses, and document classification in particular has many applications: detecting spam, identifying documents that contain personal data, or identifying customer sentiment based on reviews and social media posts.
Customer outreach is another interesting case. Machine learning can be used to identify customers with support issues and automatically connect them with customer care. These are just a few of the most common use cases. And throughout our series, we will look at a targeted marketing example as we work with Amazon ML.
James is most happy when creating or fixing code. He tries to learn more and stay up to date with recent industry developments.
James recently completed his Master’s Degree in Computer Science and enjoys attending or speaking at community events like CodeCamps or user groups.
He is also a regular contributor to the ApprovalTests.net open source projects, and is the author of the C++ and Perl ports of that library.