This was Adam Selipsky’s first re:Invent Keynote as the AWS CEO, and as always it was packed full of feature announcements and new service releases. In this post I want to take a look at some of those and what they offer you as customers of AWS, starting with Graviton3.
The first announcement made by Adam focused on Compute, and a new custom-designed processor was unveiled, the Graviton3 processor! This was immediately highlighted as being 25% faster than the existing Graviton2 processor, but what else do we know about it?
Well, it’s been designed to achieve better price/performance than previous Graviton processors and delivers up to twice the floating-point performance when running compute-intensive workloads, for example when crunching scientific and machine learning algorithms across Amazon EC2. In addition to this, it is also up to twice as fast at cryptographic operations compared to Graviton2. So it’s a great processor choice if you are looking to run compute-intensive workloads, such as batch processing, scientific modeling, and media encoding, to name just a few.
The performance offered by Graviton3 also gives it the capacity to accelerate ML workloads. With this in mind, it offers support for bfloat16 (Brain Floating Point), with up to 3x enhanced performance for these workloads.
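To make the bfloat16 format a little more concrete, here is a minimal Python sketch of how a 32-bit float maps onto it. bfloat16 keeps the sign bit, the full 8-bit exponent, and only the top 7 bits of the mantissa, so it is effectively the upper half of a float32. The function names are my own, and the sketch simply truncates the lower 16 bits; real hardware typically rounds instead, so treat this as an illustration of the layout rather than of how Graviton3 implements it.

```python
import struct


def float_to_bfloat16_bits(value: float) -> int:
    """Return the 16-bit bfloat16 pattern for value (by truncation, not rounding)."""
    # Reinterpret the value as an IEEE 754 float32 bit pattern.
    (bits32,) = struct.unpack(">I", struct.pack(">f", value))
    # Keep the sign bit, the 8 exponent bits, and the top 7 mantissa bits.
    return bits32 >> 16


def bfloat16_bits_to_float(bits16: int) -> float:
    """Expand a bfloat16 bit pattern back into a Python float."""
    (value,) = struct.unpack(">f", struct.pack(">I", (bits16 & 0xFFFF) << 16))
    return value


def round_trip(value: float) -> float:
    """Show what a value looks like after being stored as bfloat16."""
    return bfloat16_bits_to_float(float_to_bfloat16_bits(value))


if __name__ == "__main__":
    print(hex(float_to_bfloat16_bits(1.0)))  # 0x3f80
    print(round_trip(3.14159))               # 3.140625
```

The range of a float32 is preserved, but precision drops to roughly two or three decimal digits, which is a trade-off many ML training and inference workloads tolerate well.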
To learn more about Graviton processors, please see here: https://aws.amazon.com/ec2/graviton/
The release of this new processor was quickly followed by a further announcement, the C7g instances for EC2.
C7g instances for EC2
This new instance is built upon the previously announced Graviton3 processor and joins the already substantial array of EC2 instance types available. With this in mind, it’s clear that C7g instances are designed primarily for workloads that demand a high level of compute performance. This makes them a perfect choice for workloads such as HPC, gaming, and distributed analytics, and across industries such as media and entertainment, financial services, and scientific research.
One interesting point to mention about C7g instances is that they are the first instance type offered by AWS that comes with DDR5 memory, giving you enhanced speeds when accessing any data stored in memory. This is a significant improvement over the previous C6g instances, providing a 50% increase in memory bandwidth. Other technical specs include a 20% increase in network bandwidth, running at up to 30 Gbps, and support for Elastic Fabric Adapter (EFA). These instances also offer a rich security feature set covering:
- Always-on memory encryption
- Support for pointer authentication
- Dedicated cache for each vCPU on the instance
- Support for encrypted EBS volumes by default
The C7g instance was not the only new instance type announced. It was soon followed by the release of the Trn1 instances for EC2.
Trn1 Instances for EC2
The scale at which Machine Learning is growing within the industry has called for AWS to develop their underlying instances to provide even better performance. With this in mind, Adam announced the new Trn1 instances, which are the first of the AWS Trainium-based EC2 instances.
These have been custom designed by AWS specifically for training deep learning models, such as voice recognition, NLP, image classification, and more, while achieving the best price performance. This means that Trainium instances now boast the fastest training performance and provide the most teraflops of compute when used for ML training, giving you the opportunity to reach a wider scope of ML applications with even faster results.
From a specification point of view, Trn1 instances come with high-speed intra-instance connectivity to support any ML training running on your EC2 instances. They also support up to 16 Trainium accelerators and offer twice the network bandwidth of GPU-based instances, with 800 Gbps of Elastic Fabric Adapter network throughput.
Trn1 instances are deployed using the EC2 UltraCluster framework. This allows the combined compute power to be scaled to thousands of Trainium accelerators, connected by a staggering petabit-scale, non-blocking network. These are the largest UltraClusters now available on AWS, giving you phenomenal scalability and power to train your deep learning models at incredible speeds.
By making use of both the existing AWS Inferentia chips and the new Trainium chips, customers can now leverage the best price performance for ML inference in addition to cost-efficient model training.
We now move away from chips and machine learning models with the introduction of the next announcement.
AWS Mainframe Modernization
There are some organizations across the globe that are running services and workloads on Mainframe technology. Mainframes were first introduced into the tech world back in the 1930s and remain in place as legacy systems across many large-scale organizations to this day. Some examples of their use cases include bulk data processing, large-scale transaction processing for critical workloads, and generally handling high-volume I/O. They are also known for their reliability, storage capabilities, and processing power.
AWS has announced the new AWS Mainframe Modernization platform. It provides a set of defined resources and tooling options to help you migrate your existing mainframe workloads into AWS-managed runtime environments. This allows you to build an end-to-end migration journey, supported by AWS-defined toolchains that have been tried and tested across existing successful migrations.
It uses a 4-stage process to help modernize your Mainframe workloads.
- Analyze – Perform analysis on your current applications running across your Mainframe and the dependencies associated with those applications.
- Develop – At this stage you must refactor/replatform your applications in preparation for the migration to AWS.
- Deploy – Implement runtime environments for your selected applications.
- Operate – Using CI/CD, manage, run, and automate your applications in your runtime environments.
From the get-go, AWS Mainframe Modernization will work with a range of existing AWS services to support your migration and modernization journey, including: Amazon AppStream, AWS CloudFormation, AWS Migration Hub, AWS DMS, Amazon S3, Amazon RDS, Amazon FSx, and Amazon EFS.
To get started with AWS Mainframe Modernization, see the documentation here: https://docs.aws.amazon.com/m2/latest/userguide/getting-started.html
AWS Private 5G
With this new announcement falling under the domain of Networking and Content Delivery, Adam introduced us to a new service, AWS Private 5G, allowing you to set up and scale a private, low-latency, high-bandwidth mobile network in a matter of days.
Operating as a new managed AWS service, it will now be possible to implement your own private cellular network allowing you to connect your resources wherever they might be using both hardware and software supplied by AWS. With its flexible approach to scaling there are no long planning cycles, and expanding your network can be achieved in a matter of minutes with a few clicks.
After you place your order directly from the AWS console, defining your coverage and capacity needs, AWS will deliver your network hardware. This may consist of small cell radio base stations and server infrastructure. Once your hardware infrastructure is installed and connected, you can activate it ready for automated network configuration. When this configuration is complete and the network is operational, you can insert the SIM cards into your 5G-ready devices to connect to the network. Using the AWS Management Console you can then manage your 5G cellular network and resources much like you would any other AWS resource.
By utilizing AWS Private 5G you can take advantage of 5G technology with your existing network, allowing you to extend the edge of your infrastructure even further, helping you to manage and run a new era of workloads.
For more information on Private 5G, please see here: https://aws.amazon.com/private5g/
Quickly shifting from 5G networks, Adam then turned his attention to AWS Lake Formation with two new announcements.
Row and cell-level security for Lake Formation
AWS Lake Formation makes it easy to set up a secure data lake, which is effectively a storage location where your business or enterprise can collect structured or unstructured data and analyze it to gain meaningful business insights.
This new feature offers additional security, and as we have learned over the years, security is always the number one priority of AWS. Row and cell-level security enables you to control access to specific rows and columns in your query results, as well as in the ETL jobs that you run using AWS Glue. This security is centered around the identity of the user who is performing the actions and accessing the data. The feature removes the need for you to create multiple subsets of data, each configured and destined for different roles within your organization.
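The idea behind identity-based row and cell filtering can be sketched in a few lines of Python. This is a conceptual illustration only, not the Lake Formation API: the `DataFilter` class, the principal names, and the sample `ORDERS` data are all hypothetical. It shows a single copy of the data being served through per-identity filters that decide which rows and which columns come back.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass(frozen=True)
class DataFilter:
    """Hypothetical filter: which rows and which columns a principal may see."""
    row_predicate: Callable[[Row], bool]
    allowed_columns: frozenset


def apply_filter(rows: List[Row], data_filter: DataFilter) -> List[Row]:
    """Drop rows the predicate rejects, then drop columns that are not allowed."""
    visible = []
    for row in rows:
        if not data_filter.row_predicate(row):
            continue
        visible.append(
            {col: val for col, val in row.items() if col in data_filter.allowed_columns}
        )
    return visible


# One shared dataset; no per-role copies are created.
ORDERS = [
    {"order_id": 1, "country": "UK", "customer_email": "a@example.com", "total": 40},
    {"order_id": 2, "country": "US", "customer_email": "b@example.com", "total": 75},
    {"order_id": 3, "country": "UK", "customer_email": "c@example.com", "total": 12},
]

# Filters are keyed by the identity of whoever runs the query.
FILTERS = {
    "uk_analyst": DataFilter(
        row_predicate=lambda row: row["country"] == "UK",
        allowed_columns=frozenset({"order_id", "country", "total"}),
    ),
    "admin": DataFilter(
        row_predicate=lambda row: True,
        allowed_columns=frozenset({"order_id", "country", "customer_email", "total"}),
    ),
}


def query_as(principal: str) -> List[Row]:
    """Return the rows and columns visible to the given principal."""
    return apply_filter(ORDERS, FILTERS[principal])
```

Here `query_as("uk_analyst")` returns only the UK orders with the email column removed, while `query_as("admin")` sees everything, all from the same underlying data.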
For more information on how to implement row and cell-level security, please see this post here: https://aws.amazon.com/blogs/big-data/part-4-effective-data-lakes-using-aws-lake-formation-part-4-implementing-cell-level-and-row-level-security/
Transactions for governed tables in Lake Formation
The 2nd announcement on Lake Formation was the introduction of Transactions for governed tables, which was quickly announced following the security enhancements I just highlighted.
Governed Tables are in fact a new type of Amazon S3 table which supports ACID (atomic, consistent, isolated, and durable) transactions, allowing multiple users to ingest, manage, modify, and delete data at scale at the same time, across multiple governed tables.
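As a rough sketch of what the transactional pattern looks like in code, the Python below wraps the start, commit, and cancel steps in a context manager so that a failure part-way through leaves nothing half-applied. The `start_transaction`, `commit_transaction`, and `cancel_transaction` calls mirror the operations on the boto3 Lake Formation client as I understand them, so please check them against the current SDK documentation. The client is passed in rather than created here, and `FakeLakeFormationClient` is a hypothetical stand-in that lets the sketch run without AWS credentials.

```python
from contextlib import contextmanager


@contextmanager
def governed_transaction(lakeformation_client):
    """Start a transaction, commit it on success, cancel it on any exception."""
    response = lakeformation_client.start_transaction(TransactionType="READ_AND_WRITE")
    transaction_id = response["TransactionId"]
    try:
        # The caller performs its reads and writes using this transaction ID.
        yield transaction_id
    except Exception:
        # Atomicity: nothing from a failed block should become visible.
        lakeformation_client.cancel_transaction(TransactionId=transaction_id)
        raise
    else:
        lakeformation_client.commit_transaction(TransactionId=transaction_id)


class FakeLakeFormationClient:
    """Hypothetical in-memory stand-in that records the calls it receives."""

    def __init__(self):
        self.calls = []

    def start_transaction(self, TransactionType):
        self.calls.append(("start", TransactionType))
        return {"TransactionId": "tx-1"}

    def commit_transaction(self, TransactionId):
        self.calls.append(("commit", TransactionId))
        return {}

    def cancel_transaction(self, TransactionId):
        self.calls.append(("cancel", TransactionId))
        return {}
```

With a real client you would pass something like `boto3.client("lakeformation")` and use the yielded transaction ID in your table operations inside the `with` block.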
To help control the size of these governed tables and improve query performance, AWS Lake Formation will automatically review them to both compact and optimize the storage.
For more information on governed tables take a look at this post here: https://aws.amazon.com/blogs/big-data/part-1-effective-data-lakes-using-aws-lake-formation-part-1-getting-started-with-governed-tables/
It wasn’t long before Adam got onto the topic of Serverless, which makes an appearance every year one way or another, this time within the domain of Analytics services.
Serverless and on-demand Analytics
Here we have four new services being announced:
- Amazon Redshift Serverless
- Amazon EMR Serverless
- Amazon MSK Serverless
- Amazon Kinesis On-demand
Amazon Redshift Serverless
Amazon Redshift Serverless has been developed to simplify the operation of running analytics within AWS while maintaining performance and scalability. As with other AWS-managed serverless services, much of the heavy lifting has been removed, allowing you to focus on the business requirement at hand. As a result, there are no clusters to configure and set up, and you only pay for your data warehouse while you’re using it, for example when loading data, with usage billed by the second. Thankfully, whenever your data warehouse is idle, the billing stops!
With automatic provisioning of the required compute resources, it’s quick and easy to get going, and if the demand and workload increase, then so do your underlying resources without you having to provision them.
For more information on Amazon Redshift Serverless see here: https://aws.amazon.com/blogs/aws/introducing-amazon-redshift-serverless-run-analytics-at-any-scale-without-having-to-manage-infrastructure/
Amazon EMR Serverless
Amazon EMR Serverless allows you to quickly and cost-effectively run data analytics at petabyte scale, and all this comes without the need to configure and optimize any EMR clusters, avoiding the need to guess your cluster size. As a result, with EMR Serverless you no longer need to provision, configure, and secure these clusters, as this is handled for you by the managed serverless architecture. Again, any compute or memory resources that are required by your applications built using Apache Spark, Hive, Presto, etc., are automatically scaled and provisioned for you.
Another benefit of using EMR Serverless is that it runs at the Regional level, which means that an Availability Zone outage would not impact your analytics workloads, as the EMR job can simply be run in a different AZ.
For more information on Amazon EMR Serverless please see here: https://aws.amazon.com/blogs/big-data/announcing-amazon-emr-serverless-preview-run-big-data-applications-without-managing-servers/
Amazon MSK Serverless
You can probably already see a pattern forming here, but Amazon MSK Serverless allows you to run applications that utilize Apache Kafka within AWS without the need to manage capacity across these new highly available and secure MSK clusters. This managed AWS service will automatically handle the provisioning and scaling of both the required storage and compute resources for your MSK workloads. Amazon MSK Serverless offers throughput-based pricing, meaning you only pay for the data you stream and retain.
For more information on Amazon MSK Serverless see here: https://aws.amazon.com/msk/features/msk-serverless/
Amazon Kinesis On-demand
Amazon Kinesis On-demand offers a new capacity mode for Amazon Kinesis Data Streams. The introduction of this mode removes the need to manage and specify the number of shards for your stream.
More and more organizations are working with streaming workloads, and in some cases, these workloads can fluctuate on a huge scale. This means that teams are having to spend time planning their deployments focusing on capacity, and monitoring their throughput allowing them to change capacity when required.
With Kinesis on-demand you no longer need to worry about managing the capacity for your streaming data. Instead, the on-demand capacity mode will automatically scale depending on your workload and data traffic.
This mode works with any existing Kinesis Data Streams that you might already be running, without you having to worry about any new APIs, while still maintaining high availability, low latency, and secure streams.
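To give a feel for how little there is to specify, here is a minimal Python sketch of the two requests involved: creating a new stream in on-demand mode, and switching an existing stream over. The `create_stream` and `update_stream_mode` calls and the `StreamModeDetails` parameter follow the boto3 Kinesis client as I understand it, so verify them against the SDK documentation before relying on them. The helper function names, the stream name, and the ARN are placeholders of my own, and `RecordingClient` is a hypothetical stand-in so the sketch runs without AWS credentials.

```python
ON_DEMAND = {"StreamMode": "ON_DEMAND"}


def create_on_demand_stream(kinesis_client, stream_name: str) -> dict:
    """Create a stream in on-demand mode; note there is no ShardCount to choose."""
    request = {"StreamName": stream_name, "StreamModeDetails": ON_DEMAND}
    kinesis_client.create_stream(**request)
    return request


def switch_to_on_demand(kinesis_client, stream_arn: str) -> dict:
    """Move an existing provisioned stream over to on-demand capacity mode."""
    request = {"StreamARN": stream_arn, "StreamModeDetails": ON_DEMAND}
    kinesis_client.update_stream_mode(**request)
    return request


class RecordingClient:
    """Hypothetical stand-in for a Kinesis client that records requests."""

    def __init__(self):
        self.requests = []

    def create_stream(self, **kwargs):
        self.requests.append(("create_stream", kwargs))

    def update_stream_mode(self, **kwargs):
        self.requests.append(("update_stream_mode", kwargs))


if __name__ == "__main__":
    client = RecordingClient()  # swap for boto3.client("kinesis") in real use
    create_on_demand_stream(client, "example-clickstream")
    print(client.requests)
```

The records you put into and read from the stream are unaffected by the capacity mode; only the way capacity is managed changes.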
For more information on Kinesis on-demand, see the article here: https://aws.amazon.com/blogs/aws/amazon-kinesis-data-streams-on-demand-stream-data-at-scale-without-managing-capacity/
Amazon SageMaker Canvas
This is another new service in the arsenal of Machine Learning services offered by AWS, which has been designed to bring business predictions through the use of ML architecture and applications closer to those who have no previous experience with Machine Learning or Data Science. It offers an opportunity to learn and implement an entire ML workflow via a visual drag and drop interface, without having to write any code.
Amazon SageMaker Canvas empowers those in your business such as business analysts to gather and identify business predictions that could affect various decisions and business outcomes without having to write any code or understand ML algorithms and training parameters.
Using an intuitive UI, you can quickly and easily select data sources, pick a training model, and deliver predictions with ease, all powered by the same underlying infrastructure and technology that’s offered by Amazon SageMaker.
Next up, Adam moved into the field of Financial Services, with the new announcement of Goldman Sachs Financial Cloud for Data.
Goldman Sachs Financial Cloud for Data
This collaboration between Goldman Sachs and AWS provides and highlights a new set of analytic solutions focused on helping those working within the financial sector. It will provide guidance and help to those looking to discover and analyze huge financial data sets at scale in AWS. Thanks to the many years of experience that Goldman Sachs has had with data management and analytics in this sector, customers can now benefit from the best practices and innovations learned.
The last sector that Adam had new announcements for was IoT, with the first being AWS IoT TwinMaker.
AWS IoT TwinMaker
This is a new service that makes it easy to create and use digital twins of real-world systems, but what is a digital twin?
A digital twin is essentially a digital depiction of a physical system, for example, a factory. This digital factory would be built using real-world data to ensure that it matches its physical counterpart as closely as possible.
The preview of AWS IoT TwinMaker has been designed to help you get off the ground with creating your own digital twins more easily by connecting various data sources together, such as IoT sensors, business apps, and video feeds, without having to migrate all of this data into a single location. Built-in data connectors allow IoT TwinMaker to integrate with different AWS services with ease. These include:
- AWS IoT SiteWise
- Amazon Kinesis Video Streams
- Amazon S3
In addition to these built-in connectors, you are also able to create your own data connectors, giving you the flexibility to use data sources like Snowflake and Siemens MindSphere.
By combining these different data sources across the integrated AWS services, AWS IoT TwinMaker can help you create your digital twin, allowing you to model your real-world environment using 3D visualizations, knowledge graphs, and dashboards.
For more information on AWS IoT TwinMaker, please review the AWS announcement here: https://aws.amazon.com/about-aws/whats-new/2021/11/aws-iot-twinmaker-build-digital-twins/
Finally, the last announcement made was AWS IoT FleetWise.
AWS IoT FleetWise
This new service will allow you to easily collect data from millions of vehicles and analyze it within AWS in near-real time.