Amazon Machine Images (AWS AMI) offers two types of virtualization: Paravirtual (PV) and Hardware Virtual Machine (HVM). Each solution offers its own advantages.
Today we’re going to talk about an important aspect of Amazon Machine Images that somehow fails to capture our attention. Choosing an AWS AMI virtualization type may not seem critical or relevant at first, but I believe everyone should have at least a basic understanding of how the different virtualization options function.
How many times have you actually thought about which kind of virtualization is best suited to your needs before you select your AWS AMI? Or better: how often have you thought about it, but ignored it and just started working anyway? When you select an AWS AMI to launch an instance you will see something like this:
What are these highlighted terms all about? I’ll explain.
The AWS AMI and the Xen hypervisor
Every AWS AMI uses the Xen hypervisor on bare metal. Xen offers two kinds of virtualization: HVM (Hardware Virtual Machine) and PV (Paravirtualization). But before we discuss these virtualization capabilities, it’s important to understand how Xen architecture works. Below is a high-level representation of Xen components:
Virtual machines (also known as Guests) run on top of a hypervisor. The hypervisor takes care of CPU scheduling and memory partitioning, but it is unaware of networking, external storage devices, video, or any other common I/O functions found on a computing system.
These guest VMs can be either HVM or PV.
The AWS AMI and HVM vs. PV
HVM guests are fully virtualized. It means that the VMs running on top of their hypervisors are not aware that they are sharing processing time with other clients on the same hardware. The host should have the capability to emulate underlying hardware for each of its guest machines. This virtualization type provides the ability to run an operating system directly on top of a virtual machine without any modification — as if it were run on the bare-metal hardware. The advantage of this is that HVMs can use hardware extensions which provide very fast access to underlying hardware on the host system.
Paravirtualization, on the other hand, is a lighter form of virtualization. This technique is fast, and provides near native speed in comparison to full virtualization. With Paravirtualization, the guest operating system requires some modification before everything can work. These modifications allow the hypervisor to export a modified version of the underlying hardware to the VMs, allowing them near-native performance. All PV machines running on a hypervisor are basically modified operating systems like Solaris or various Linux distributions.
This is in contrast to HVM, which requires no modifications to the guest OS and the host OS is completely unaware of the virtualization. This may add to the performance penalty because it places an extra burden on the hypervisor.
Let’s extend this discussion to the AWS AMI. AWS supports Hardware Virtual Machine (HVM) for Windows instances as well as Paravirtualization (PV) for Linux instances. Years ago, AWS would encourage users to use Paravirtualized guest VMs, because they were then considered more efficient than HVM.
Take note here, there is one major disadvantage with Paravirtualization. You need a region-specific kernel object for each Linux instance. Consider a scenario where you want to recover or build an instance in some other AWS region. In that scenario, you need to find a matching kernel — which can be tedious and complex. Nevertheless, I can’t say that this is the only reason that Amazon now recommends using the HVM virtualization versions of latest generation of their instances: there are a number of additional recent enhancements in HVM virtualization which have improved its performance greatly.
Here are some key factors that contributed to Hardware Virtual Machine’s closing the performance gap with Paravirtualization:
- Improvements to the Xen Hypervisor.
- Newer generation CPUs with new instruction sets.
- EC2 driver improvements.
- Overall infrastructure changes in AWS.
This table shows which flavor of AWS AMI (Amazon Linux) are recommended for each Amazon EC2 instance type:
Amazon currently recommends users choose HVM instead of PV. Ignoring their advice can have very real consequences. For example, in the AWS Frankfurt region, if you try to select an AWS AMI (Amazon Linux) using PV, you will be greatly restricted in your choice of instance types:
As you can see, the cheapest instance type you can select here is m3.medium. But going with the Amazon Linux AMI on HVM, the cheapest instance type available to you is t2.micro.
I am not sure it will work this way in all AWS regions, but this should serve to make you aware about the relevance of virtualization type — which we ignore at our own peril.
Traditionally, Paravirtualized guests performed better with storage and network operations than HVM guests, because they could avoid the overhead of emulating network and disk hardware. This is no longer the case with HVM guests. They must translate these instructions (I/O) every time to effectively emulated hardware. Things have also improved since the introduction of PV drivers for HVM guest’s. HVM guests will also experience performance advantages in storage and network I/O.
Because Amazon is changing their approach towards the AWS AMI, we have no choice but to address this topic. It is possible that in near future you may see HVM types completely replacing PV types. If this happens it is critical that you make informed decisions today. If you want to know how to create an Amazon AMI take a look at one of our labs.
I have not personally completed any migration from PV to HVM yet. If you have, it would be a great service to our community if you would share your thoughts/experience right here.