“Add GPU acceleration to any Amazon EC2 instance for faster inference at much lower cost (up to 75% savings)”
So you’ve just kicked off the training phase of your multilayered deep neural network. The training phase is leveraging Amazon EC2 P3 instances to keep the training time to a minimum, but it’s still going to take a while. With time in hand, you begin to contemplate what infrastructure you’ll use to run your inferences.
You’re already familiar with the merits of using GPUs for the training phase. GPUs can parallelize massive numbers of simple math computations, which makes them perfect for training neural networks. GPUs are more expensive to run than CPUs, but because they can parallelize the number crunching, you don’t need to run them for as long as the equivalent training would take on CPUs. In fact, training on GPUs can be orders of magnitude quicker. So it may cost you more per hour to run a GPU, but you won’t need to run it anywhere near as long as you would a CPU. Besides factoring in cost, training your models faster allows you to get them into production quicker to perform inferences. So in terms of the training phase, it makes complete sense to go with GPUs.
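The cost argument above can be sketched with a few lines of arithmetic. The hourly rates and run times below are hypothetical values chosen purely for illustration; they are not real AWS prices or benchmark results.

```python
def total_cost(hourly_rate_usd: float, hours: float) -> float:
    """Total cost of a run that is billed by the hour."""
    return hourly_rate_usd * hours


# Hypothetical figures for illustration only (not real AWS prices or timings).
cpu_rate, cpu_hours = 1.0, 100.0  # cheaper per hour, but slow to train
gpu_rate, gpu_hours = 10.0, 5.0   # 10x the hourly rate, but 20x faster

cpu_cost = total_cost(cpu_rate, cpu_hours)
gpu_cost = total_cost(gpu_rate, gpu_hours)

print(f"CPU training: ${cpu_cost:.2f} over {cpu_hours:.0f} hours")
print(f"GPU training: ${gpu_cost:.2f} over {gpu_hours:.0f} hours")
```

With these made-up numbers, the GPU run is both cheaper overall and finished far sooner, even though its hourly rate is ten times higher.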
So your contemplation now focuses on whether to use GPU or CPU infrastructure to perform inferencing once the training completes and your model is ready. We know that GPUs cost more per hour to run. Performing inference through a trained neural network is far less taxing, both in the computation required and in the volume of data that needs to be ingested and processed. Therefore, CPUs seem to be the way to go. However, you know from past experience that over time, your CPU-hosted inferencing tends to bottleneck under overwhelming demand. This makes you reconsider running the inferencing on GPUs, but you now need to budget for the extra cost as a project consideration. This dilemma of whether to use GPUs or CPUs for inferencing, with respect to both cost and performance, is all too familiar to many organizations. Until now, choosing between a GPU and a CPU has been a largely either-or decision made up front when using EC2. As of today, this is no longer the case.
Amazon Elastic Inference is a new service from AWS that lets you complement your EC2 CPU instances with GPU acceleration, which is perfect for hosting your inferencing models. You can now select an appropriately sized CPU EC2 instance and boost its number-crunching ability with GPU processing. As with many other AWS services, you pay only for the accelerator hours you actually use. This means you can get GPU processing power for up to 75% less than the cost of running an equivalent GPU-sized EC2 instance.
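To make the “up to 75%” figure concrete, here is a minimal sketch of how such a saving would be calculated. All three hourly rates are hypothetical, picked only so that the result lands on the headline number; check current AWS pricing for real values.

```python
def hourly_savings(gpu_instance_rate: float,
                   cpu_instance_rate: float,
                   accelerator_rate: float) -> float:
    """Fraction saved per hour by running a CPU instance plus an
    accelerator instead of a standalone GPU instance."""
    combined_rate = cpu_instance_rate + accelerator_rate
    return 1.0 - combined_rate / gpu_instance_rate


# Hypothetical hourly rates in USD (not real AWS prices).
savings = hourly_savings(gpu_instance_rate=3.00,
                         cpu_instance_rate=0.20,
                         accelerator_rate=0.55)

print(f"Savings versus the GPU instance: {savings:.0%}")
```

The actual saving depends on which CPU instance and accelerator size you pair, and on how that pairing compares with the GPU instance you would otherwise have chosen.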
For starters, Amazon Elastic Inference is launching with three accelerator sizes, each rated in teraflops of mixed-precision performance: eia1.medium, eia1.large, and eia1.xlarge.
Amazon Elastic Inference has been seamlessly integrated into both the AWS EC2 console and the AWS CLI. In the EC2 console, attaching GPU acceleration is as simple as enabling the “Add an Elastic Inference accelerator” option.
The equivalent AWS CLI command looks like the following. Note that the existing run-instances command has been extended with a new optional --elastic-inference-accelerator parameter:
aws ec2 run-instances \
  --image-id ami-00ffbd996ef2211e3 \
  --key-name DNN_Key \
  --security-group-ids sg-12345678 \
  --subnet-id subnet-12345678 \
  --instance-type c5.xlarge \
  --elastic-inference-accelerator Type=eia1.large \
  --iam-instance-profile Name="InferenceAcceleratorProfile"
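If you prefer to script the launch from Python, the same request can be sketched with boto3. This is a minimal sketch, not an official recipe: it assumes boto3 is installed, that AWS credentials and a region are already configured, and that the EC2 client's run_instances call accepts an ElasticInferenceAccelerators list mirroring the CLI parameter. The helper names build_run_instances_params and launch are hypothetical, and the IDs are the placeholder values from the CLI command above.

```python
def build_run_instances_params(accelerator_type: str = "eia1.large") -> dict:
    """Build keyword arguments mirroring the CLI command above.

    The IDs are the article's placeholder values; substitute your own.
    """
    return {
        "ImageId": "ami-00ffbd996ef2211e3",
        "KeyName": "DNN_Key",
        "SecurityGroupIds": ["sg-12345678"],
        "SubnetId": "subnet-12345678",
        "InstanceType": "c5.xlarge",
        "MinCount": 1,  # the API requires explicit counts; launch one instance
        "MaxCount": 1,
        # Assumed Python counterpart of --elastic-inference-accelerator.
        "ElasticInferenceAccelerators": [{"Type": accelerator_type}],
        "IamInstanceProfile": {"Name": "InferenceAcceleratorProfile"},
    }


def launch(params: dict) -> str:
    """Launch the instance and return its ID (needs boto3 and credentials)."""
    import boto3  # third-party; imported here so the builder works without it

    ec2 = boto3.client("ec2")
    response = ec2.run_instances(**params)
    return response["Instances"][0]["InstanceId"]


params = build_run_instances_params()
print(params["InstanceType"], params["ElasticInferenceAccelerators"])
```

Calling launch(params) would submit the request. Keeping the parameter construction separate from the API call makes it easy to inspect or adjust the request, for example to try a different accelerator size, before anything is launched.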
The following list itemizes several prerequisites that need to be in place to leverage Amazon Elastic Inference:
As you can see, with a few extra configuration options in place you can have the best of both worlds: CPU-hosted inferencing with GPU acceleration. You no longer need to spend time weighing CPUs against GPUs – take both!
Another game changer in the machine learning space from AWS – give it a try!!