A look at what was new and interesting from Swami Sivasubramanian’s keynote.
Today we all got an hour or two of Swami's time as he went over the many machine learning-focused releases from AWS (13 in all). Dr. Sivasubramanian is the VP of Amazon Machine Learning, and it's always cool to hear about anything coming from his department.
Machine learning in AWS has been a long time in the works, and I have watched with piqued interest to see how it has evolved over time. When I think back to re:Invent 2017 and the release of Amazon SageMaker it’s amazing to see just how far AWS has pushed the democratization of machine learning technology in just a handful of years.
Before SageMaker it was quite a production to get any kind of machine learning workload running in the cloud. It was technically doable of course, but you needed to have a large amount of AWS experience as well as a wealth of machine learning expertise. However, with each passing year since the release of AWS’s most important machine learning service, the technology has become easier and easier to get into for people of almost any background.
It was with this excitement for the future of machine learning that I turned on the stream for today’s keynote. I grabbed a cup of coffee and sat back with great hopes that AWS would once again push the bar a little higher and make machine learning a little more friendly for the rest of us.
Amazon DevOps Guru for RDS
For today's first announcement, Swami opens with machine learning-powered performance insights and availability for RDS in the form of Amazon DevOps Guru for RDS. This new feature is an expansion of the already released Amazon DevOps Guru service, which came out in early May of this year. Not quite what I was pumped up for, but hey, go ML!
DevOps Guru is focused on using machine learning to help developers improve their applications' availability by detecting operational issues. It does so with metrics collected via events and log data pulled from other AWS services. This new branch extends the service into RDS and gives developers deeper insight into performance issues that might be extremely difficult to diagnose in a more conventional way.
I don’t think this one pushes the bar much for helping someone get into machine learning, but if you are a frustrated database admin looking to pull just a shred more capacity through your overburdened systems – this one might just make your year.
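DevOps Guru's actual models are AWS's secret sauce, but the core idea of flagging metric values that drift far from a learned baseline can be sketched with a simple z-score check. This is just an illustration of the flavor of anomaly detection, not the service's algorithm; the latency samples and threshold below are made up:

```python
import statistics

def find_anomalies(latencies_ms, z_threshold=3.0):
    """Flag latency samples that sit far outside the baseline.

    A crude stand-in for the kind of statistical baseline DevOps Guru
    learns from RDS metrics; the real detection is far more sophisticated.
    """
    mean = statistics.mean(latencies_ms)
    stdev = statistics.stdev(latencies_ms)
    return [
        (i, v) for i, v in enumerate(latencies_ms)
        if stdev and abs(v - mean) / stdev > z_threshold
    ]

# A steady baseline around 12 ms with one obvious spike.
samples = [12, 11, 13, 12, 12, 11, 13, 12, 250, 12, 11, 13]
print(find_anomalies(samples))
```

The payoff of the managed service, of course, is that you never write or tune any of this yourself.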
Amazon RDS Custom (Now with support for SQL Server applications)
Striking again at the heart of the audience, Swami throws another database-related release like a quick jab to keep you off balance. I will be frank, I was not prepared for the first database-related release from the machine learning VP, much less a second.
Amazon RDS Custom is a managed database service that lets your applications run on customized operating systems and database environments. Just at the end of October, the service expanded to include Oracle within its purview, and today it adds support for SQL Server applications.
This service is generally used for legacy, custom, and packaged applications so it’s probably not something the vast majority of people will be super excited about. If you were however hanging on the edge of your seat for SQL Server support on Amazon RDS Custom, holiday presents came early.
Amazon DynamoDB Standard-Infrequent Access Table Class
By now I'm starting to think that Swami is gunning for Raju Gulabani's job (the VP of AWS Databases and Analytics), because we get another database release. Can you be VP of two organizations? Should you be VP of two organizations? When do we get to learn about robots and machine learning stuff? These questions rattle through my brain as I finish my first cup of coffee.
Anywhoo, we are now introduced to the Amazon DynamoDB Standard-Infrequent Access (Standard-IA) table class. This new way to store your DynamoDB table threatens to reduce storage costs by up to 60%. This class is ideal for long-term storage of infrequently accessed data.
It's very similar to how S3 Infrequent Access works: if you don't plan to access your data all that often, but it still needs to be ready at a moment's notice, this can be a huge cost saving. Using this table class does mean that your reads and writes will be more expensive.
On the AWS pricing page for the new feature, they post an example comparing the two table classes:
DynamoDB Standard: 42.5 million writes at $1.25 per million, 42.5 million reads at $0.25 per million
DynamoDB Standard-IA: 42.5 million writes at $1.56 per million, 42.5 million reads at $0.31 per million
Paired with the much cheaper storage, this does show quite a nice savings overall, but be careful of those higher per-request costs!
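To see where the per-request penalty lands, here's a quick back-of-the-napkin calculation using the per-million request prices for the two classes at the time of writing (Standard: $1.25 writes / $0.25 reads; Standard-IA: $1.56 / $0.31). Storage pricing, where Standard-IA actually pays off, is deliberately left out:

```python
def monthly_request_cost(millions_of_requests, price_per_million_usd):
    """Request cost only -- storage, which drives the savings, is excluded."""
    return millions_of_requests * price_per_million_usd

# 42.5 million writes and 42.5 million reads per month under each class.
standard = monthly_request_cost(42.5, 1.25) + monthly_request_cost(42.5, 0.25)
standard_ia = monthly_request_cost(42.5, 1.56) + monthly_request_cost(42.5, 0.31)

print(f"Standard:    ${standard:.2f}")     # request costs, Standard class
print(f"Standard-IA: ${standard_ia:.2f}")  # request costs, Standard-IA class
```

So on requests alone, Standard-IA is the pricier option; the table class only wins once the 60% storage discount on a large, rarely touched table outweighs that gap.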
AWS Database Migration Service Fleet Advisor
He did it again… relentlessly releasing database services and improvements like they just fall from the sky. Someday we will return to the path of machine learning and see its great gifts once more; however, we might have to wait up to five minutes for the next announcement.
This new service addition, the AWS Database Migration Service Fleet Advisor, promises to help accelerate your database migrations. This is another add-on to the standard Database Migration Service, focused on automating the discovery and analysis of your fleet. It collects and analyzes your database schemas and objects to help you build a customized migration plan. In theory, this plan will make it easier to move into AWS without using any third-party tools or outside migration experts.
Time will tell with this one I think… Moving onwards!
Amazon SageMaker Ground Truth Plus
It’s happening! By god, it’s finally happening! They said it couldn’t be done, but here we are with some machine learning content! Although what seems to be another theme of today besides databases is add-on services – however, this add-on is impressive in many ways.
This new version of Ground Truth is a hybrid data labeling service that uses both machine learning and an 'expert workforce' to help label your data. The original version of Ground Truth simply used Amazon Mechanical Turk to farm out data labeling to third-party vendors or your own private teams.
This new and improved Ground Truth Plus has a few more in-depth steps that should theoretically improve the quality of your data labels. When setting up your data labeling job, you fill out a form that explains the requirements of the labeling project. This is followed by a call from AWS experts who will discuss your project and presumably begin to assemble the workforce your situation needs. You then upload your data to a predetermined S3 bucket for labeling by the service.
Using the base-level Ground Truth tools already in the service, the appointed experts can begin the labeling job as normal. From there, an ML system hops into the fray and begins to pre-label the images in your dataset based on what the experts have already done. These ML-labeled images are then sent back to the experts, who double-check the model's work. This teamwork approach should greatly increase the speed of labeling.
Overall, I rate this new feature pretty cool out of ten!
Amazon SageMaker Studio Notebook
Two in a row for our machine learning update! I feel like this is a good trendline and we are back to heading in the right direction. Swami returns to center stage and unveils another add-on service, but a good one!
Today we are introduced to Amazon SageMaker Studio Notebook. This update adds collaborative notebooks that can be quickly launched because they don’t require compute resources and file storage setup ahead of time. They use a set of instances which are referred to as ‘Fast Launch’ types which are designed to spin up within two minutes (this is WAY faster than normal for a notebook). Amazon SageMaker Studio Notebooks can also easily connect with Amazon EMR and Amazon S3 to import your datasets so you can transform and analyze them as you see fit.
These notebooks provide persistent storage that even lets you view and share a notebook while its instances are offline. When sharing a notebook, you can create a read-only URL that won't allow any changes to your underlying architecture. When your recipient opens the URL, they can choose to create a copy of the notebook. This will duplicate the instance type and SageMaker image that your notebook was running on. Overall, this is a very cool addition to the SageMaker portfolio.
Three New Amazon SageMaker Infrastructure Updates:
Next up we got three infrastructure upgrades for Amazon SageMaker – we are now officially on a roll with the ML content.
The first update is Amazon SageMaker Training Compiler. This new feature allows you to accelerate your deep learning model training by up to 50%. This is accomplished by using the underlying instance's GPU 'more efficiently'. When digging deeper into this update, I found that the Training Compiler speeds up training by converting your deep learning model (which is written in a high-level framework) into hardware-optimized instructions that the GPU can make better use of. Neat!
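Turning it on is meant to be roughly a one-line change in the SageMaker Python SDK. A minimal sketch, assuming the sagemaker library's Hugging Face estimator; the script name, role ARN, bucket, and version numbers are placeholders, and exact parameter names may shift between SDK versions:

```python
from sagemaker.huggingface import HuggingFace, TrainingCompilerConfig

# Placeholder entry point, role, bucket, and versions -- swap in your own.
estimator = HuggingFace(
    entry_point="train.py",
    role="arn:aws:iam::123456789012:role/SageMakerRole",
    instance_type="ml.p3.2xlarge",
    instance_count=1,
    transformers_version="4.11",
    pytorch_version="1.9",
    py_version="py38",
    # The line that enables SageMaker Training Compiler:
    compiler_config=TrainingCompilerConfig(),
)
estimator.fit("s3://my-bucket/training-data")
```

This is a configuration sketch rather than something you can run outside an AWS account with SageMaker permissions.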
The second update is Amazon SageMaker Inference Recommender. This service was designed to help you choose the best compute options and configurations for deploying your machine learning models, balancing inference performance against cost.
Since there are over 70 ML instance options to choose from, each with differing resource availability, finding the correct one for your model can be very time-consuming. This new update promises to automatically select the right instance type for you. It also covers the number of instances you should run, container parameters, model optimizations, and what have you, to give you the best performance-per-cost ratio. If all of this works out as described, it will be an amazing quality-of-life upgrade for pretty much everyone working on machine learning within AWS.
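The service benchmarks your actual model on your behalf, but the selection logic it automates boils down to something like the toy picker below. The instance names are real SageMaker instance types, while the throughput and price figures are invented purely for illustration, not real benchmarks or AWS pricing:

```python
# Hypothetical benchmark results: (instance_type, requests_per_sec, usd_per_hour).
# Figures are illustrative only -- not real SageMaker pricing or measurements.
candidates = [
    ("ml.c5.large",    120,  0.10),
    ("ml.c5.xlarge",   210,  0.20),
    ("ml.g4dn.xlarge", 900,  0.74),
    ("ml.inf1.xlarge", 1100, 0.30),
]

def best_perf_per_cost(benchmarks):
    """Pick the instance with the highest throughput per dollar-hour."""
    return max(benchmarks, key=lambda b: b[1] / b[2])

winner = best_perf_per_cost(candidates)
print(winner[0])
```

The hard part, which Inference Recommender takes off your plate, is producing trustworthy numbers for that `candidates` table across 70-plus instance types.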
And the final update to the infrastructure is Amazon SageMaker Serverless Inference. This new inference option allows you to deploy machine learning models for inference without needing to create or manage the underlying infrastructure. Everything is automatically provisioned and scaled for you, and compute is turned off when no longer needed. As with many services of this type, you pay only for the duration of running the inference code and the amount of data processed.
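That billing model (compute duration plus data processed) makes cost estimation simple arithmetic. A quick sketch with made-up rate parameters, since I'm not quoting published AWS pricing here:

```python
def serverless_inference_cost(num_requests, ms_per_request, gb_processed,
                              usd_per_compute_second, usd_per_gb):
    """Rough pay-per-use estimate: compute duration plus data processed.

    The rate parameters are placeholders, not published AWS pricing.
    """
    compute_seconds = num_requests * ms_per_request / 1000
    return compute_seconds * usd_per_compute_second + gb_processed * usd_per_gb

# 1M requests at 80 ms each, 5 GB of payload, with hypothetical rates.
estimate = serverless_inference_cost(1_000_000, 80, 5, 0.00002, 0.016)
print(f"${estimate:.2f}")
```

The appeal is that when traffic drops to zero, so does the compute term, which is exactly what a spiky inference workload wants.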
Amazon SageMaker Canvas
Now we are really getting into the good stuff, the machine learning for machine learners, the Crème de la crème. These kinds of updates are what really bring people to re:Invent in my opinion, the things that make technology easier for the masses. Paint by numbers machine learning brought to you by AWS.
Well, it's not quite that simple, but Amazon SageMaker Canvas allows you to create and run an entire ML workflow through a drag-and-drop user interface. This new feature of SageMaker allows laymen to start creating ML systems that can help with business analysis and predictions without writing any code or requiring any ML experience.
All you need is some preexisting data in CSV format, like a product catalog and some historical shipping data. This can be imported into SageMaker Canvas manually, or fetched from Amazon S3, Amazon Redshift, or even Snowflake.
With this information, you could create a predictive model within SageMaker Canvas based on the data. Using the model it creates, you can then get a forecast of your next shipments and if they would be late or arrive on time based on any of the factors within your dataset. Based on what I’ve seen so far in the keynote and in the press release online, this is an incredible addition to SageMaker and will gain a lot of traction in the future.
Amazon Kendra Experience Builder
Well, we had a good run on the machine learning updates relating to building your own models, time to get back to brass tacks. By this point in the stream, I’ve made it onto coffee number two and Swami swings in with glee to introduce Amazon Kendra Experience Builder.
In case you are unfamiliar with the base service (like I was), Amazon Kendra is an ML-powered search service designed for the enterprise environment, helping your employees find content scattered throughout the organization. This content might live in documents, repositories, reports, guides, S3, Salesforce, and many other locations.
Amazon Kendra Experience Builder works on top of all of that to help you deploy a fully customizable search application within a few clicks. It doesn't require any programming or machine learning experience. Everything is built through an intuitive visual workflow that allows you to create a powerful search engine for your disparate files and figures. It's very reminiscent of how Amazon Cognito lets you create your own login page for SSO actions and what have you.
Amazon Lex Automated Chatbot Designer
Nearing the end of Swami's talk, Amazon Lex gets a service update with the Amazon Lex Automated Chatbot Designer. This new feature of the older chatbot service promises to reduce the time it takes to create and design an advanced natural-language chatbot. The update expands the design phase within Amazon Lex by using machine learning to provide an initial bot design. You then get to refine and update this base design, hopefully gaining a head start over creating one from scratch.
It does this by using conversation transcripts between your callers and the agents to create common intents and related information. Given enough transcripts, this will greatly reduce the amount of grunt work required to produce a moderately tolerable chatbot. You will of course need to tune and prune the results to fit your use cases, but hey, I like doing less work!
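Lex's designer uses far heavier NLP than this, but as a toy illustration of mining transcripts for intents, here's a crude keyword-based grouper. The intent names, keywords, and utterances are all made up; the point is only the shape of the transcripts-in, intents-out workflow:

```python
from collections import defaultdict

# Made-up keyword -> intent mapping; Lex infers this from transcripts instead.
INTENT_KEYWORDS = {
    "refund": "RequestRefund",
    "track": "TrackOrder",
    "cancel": "CancelOrder",
}

def group_utterances(transcripts):
    """Bucket caller utterances under a guessed intent by keyword match."""
    intents = defaultdict(list)
    for line in transcripts:
        for keyword, intent in INTENT_KEYWORDS.items():
            if keyword in line.lower():
                intents[intent].append(line)
                break  # first matching keyword wins
    return dict(intents)

calls = [
    "I want to track my package",
    "Can I get a refund for this?",
    "Please cancel my order",
    "Where can I track order 1234?",
]
grouped = group_utterances(calls)
print(grouped)
```

The grouped utterances then become the sample utterances you tune and prune for each intent, which is exactly the grunt work the Automated Chatbot Designer aims to shortcut.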
Amazon SageMaker Studio Lab
Our second to last release from AWS in the machine learning realm comes to us as another addition to SageMaker, and that would be Amazon SageMaker Studio Lab. This is a free service offered up to help people begin their journey into machine learning.
AWS has created a way for people to quickly hop in, without an AWS account, without a credit card, heck with zero cloud knowledge whatsoever, and begin building an ML model. This is the type of update I would be looking for if I was just starting my ML journey or even remotely curious about the space.
After signing up for your free Studio Lab account, you will be given access to a machine learning environment that requires no setup or configuration. You can use any framework you want, such as PyTorch or TensorFlow, and you get up to 12 hours of CPU or 4 hours of GPU time per session to enjoy as you please. Studio Lab is integrated with GitHub, so you can download, edit, and run any notebook you want. This is perfect for those who just want to get their feet wet, without any repercussions to their AWS bill.
AWS AI & ML Scholarship Program
Now at the end of the session, Swami lets us know that AWS is offering a scholarship for those interested in learning about AI and machine learning. This funding is available for underrepresented and underserved high school and college students. The goal of the program, obviously, is to help these students prepare for careers in the artificial intelligence and machine learning fields.
If you happen to fit into those categories, I would highly recommend you take a look at this program, or any program really that is related to ML. Machine learning is the future, and the future is coming a lot faster than many people may realize.