Stuart Scott

November 30, 2022

New at AWS re:Invent: Adam Selipsky Keynote

In this post I want to recap and highlight some of the biggest announcements made by AWS CEO Adam Selipsky during his keynote speech here in Las Vegas. As always, there was much anticipation as to what new services and features would be unveiled, and we were not disappointed. I will be covering 17 announcements that were made, so let’s get straight into it!

It began with a big focus on data, and the first big reveal was Amazon OpenSearch Service Serverless (Preview).

Amazon OpenSearch Service Serverless (Preview)

It’s not surprising that we are seeing more and more services becoming serverless, given the customer advantages that a serverless offering brings: less management, less overhead, and more time to dedicate to the task in hand. With the announcement of Amazon OpenSearch Service Serverless, you will no longer need to manage your own OpenSearch clusters, making it even easier to run your large-scale search and analytics workloads. Instead, you can spend more time focusing on gaining valuable insights from your data with reduced complexity.

This means that any resources required are automatically provisioned and managed for you, and can scale to cope with the most demanding of workloads, removing the need to spend time optimizing your existing self-managed OpenSearch clusters. You can feel confident that no matter how complex your queries may be against data sets of varying sizes, the underlying infrastructure will scale and deliver the results you need.

Next up, we were introduced to Amazon Aurora Zero-ETL integration with Amazon Redshift.

Amazon Aurora Zero-ETL integration with Amazon Redshift (Preview)

Amazon Aurora is the fastest growing AWS service of all time, and so continued development of this service is always to be expected. With that, we see that Aurora now supports zero-ETL (Extract, Transform, Load) integration with Amazon Redshift. So what does that offer you as a customer? Well, if you are working with massive-scale transactional data sets in Aurora, and we are talking petabytes of data here, then as soon as your data has been written to Aurora, you can gain near real-time analytics of this data from within Amazon Redshift using built-in ML. This removes the need for you to build and maintain your own ETL processes and operations, simplifying your data pipelines.

Another benefit of this feature is that you can combine transactional data from more than one Aurora database, enabling you to gain additional insights from multiple clusters at once.

Amazon Redshift integration for Apache Spark (GA)

In a nutshell, this allows you to easily run your Apache Spark applications against Amazon Redshift and Amazon Redshift Serverless. This is great, as it removes the need for any third-party connectors that you may have been using previously to read from and write to Redshift from your Apache Spark applications. As a result, this reduces complexity, configuration effort, and security concerns, and helps to maintain reliability with seamless integration capabilities.

With support for multiple languages, including Scala and Python, it’s very easy to quickly create new Apache Spark applications that integrate with Amazon Redshift, all while maintaining a high level of performance, enabling you to achieve up to a 10x application performance increase over existing connectors.
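As a rough illustration of what reading a Redshift table from a Python Spark application can look like, the sketch below collects the connector options and hands them to an existing Spark session. The data source name and the option keys (url, dbtable, tempdir, aws_iam_role) are the ones I know from the open source spark-redshift community connector, and I am assuming they carry over to this integration, so do check them against the current AWS documentation. The cluster endpoint, bucket, role ARN, and table name are all made-up placeholders.

```python
# Data source name from the spark-redshift community connector (assumed to
# apply to the AWS integration as well; confirm in the AWS documentation).
REDSHIFT_FORMAT = "io.github.spark_redshift_community.spark.redshift"


def redshift_read_options(jdbc_url, table, temp_dir, iam_role_arn):
    """Collect the options the connector needs to read one table."""
    return {
        "url": jdbc_url,               # JDBC endpoint of the cluster or workgroup
        "dbtable": table,              # table to load into a DataFrame
        "tempdir": temp_dir,           # S3 staging location used by the connector
        "aws_iam_role": iam_role_arn,  # role Redshift assumes to reach S3
    }


def read_redshift_table(spark, options):
    """Load a Redshift table as a DataFrame using an existing SparkSession."""
    reader = spark.read.format(REDSHIFT_FORMAT)
    for key, value in options.items():
        reader = reader.option(key, value)
    return reader.load()


# Placeholder values only; none of these resources exist.
options = redshift_read_options(
    jdbc_url="jdbc:redshift://example-cluster.abc123.us-east-1.redshift.amazonaws.com:5439/dev",
    table="public.sales",
    temp_dir="s3://example-bucket/spark-temp/",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
```

In a real application you would pass your own SparkSession to read_redshift_table(spark, options) and then work with the returned DataFrame as usual.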

Amazon DataZone (Coming soon)

Amazon DataZone is a brand new service that provides a way of discovering, combining, and, perhaps more importantly, sharing data, enabling collaboration across your whole organization. This data management service helps your organization to work cross-functionally, giving you greater insight into your business data and helping you to make better strategic decisions.

There are 4 main components to this service:

Business data catalog – Used to collate data from across the business for everyone to use, from different sources such as cloud providers including AWS, Azure, and GCP, on-premises databases, and many SaaS applications.
Publish/subscribe workflow with access management – Enables security management and auditing of your data, ensuring people can only access what they are allowed to access, in addition to enforcing security between data producers and data consumers.
Data projects – Used to implement data groupings to simplify the management and security of working with different data sets. This enables effective collaboration between members of a given data project.
Data portal – The DataZone portal provides a personalized homepage to help you explore and work with your data outside of the AWS Management Console.

As you would expect from a data management service, it integrates with many other AWS services as producer and consumer data sources, for example AWS Glue, Amazon Redshift, Redshift Spectrum, Amazon S3, RDS, DynamoDB, and Athena, as well as SaaS providers such as Salesforce, SAP, and more.

ML Powered Forecasting with Q (GA)

This is the first of two new question types that were announced for Amazon QuickSight Q. You can now ask it to forecast future business performance based upon historical data metrics using ML and NLP.

This is ideal for those who do not have a data science background, or do not have the time to invest in learning and developing new formulas to gain this insight. Instead, forecasting with Q allows anyone to forecast using up to 3 different metrics to determine the likely future outlook of business performance.

The second question type is “Why.”

“Why” questions with Q (GA)

Again using Amazon QuickSight Q, you can now ask the all-important “Why” question. This is a great way to get answers to help you drive business innovation and make the right decisions for you and your customers. Analyzing data manually to determine these same answers can be a lengthy process and requires a unique skill set in analytics.

Understanding the different factors that contributed to certain outcomes based on historical data, and pinpointing exactly where they occurred, is essential to gaining an advantage over your competitors. You no longer need highly skilled and trained analysts to find these answers for you. Instead, you can simply ask the “Why” question and let Q return the answers in a graphical and easy-to-understand format.

Following this focus on data, Adam then shifted gears and moved the conversation towards security, a topic close to AWS’ heart, and something that is iterated upon every single year at re:Invent.

Amazon Security Lake (Preview)

As we have already seen with many other AWS services, centralized management is a key feature with many advantages, and Amazon Security Lake is no different. This is a new security service that enables you to gather, collate, monitor, and analyze security data in a single centralized location from multiple different sources, including the cloud, your own on-premises environments, and specific custom sources. Having a centralized source simplifies your ability to analyze critical security data, helping you to understand weaknesses and potential threats across your entire organization and infrastructure.

Using automation, Amazon Security Lake will take care of collating your security data across multiple Regions, enabling your security teams to work on identifying gaps in your environment, preventing security issues, and responding quickly to any security incidents that may occur. The Security Lake is stored in a specific Region and pulls data from a variety of security sources, including AWS CloudTrail, VPC Flow Logs, Route 53 Resolver query logs, and Security Hub, in addition to third-party security solutions, with all data normalized using OCSF (Open Cybersecurity Schema Framework).
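To make the idea of normalization a little more concrete, here is a minimal Python sketch that maps two differently shaped log records onto one common shape so they can be queried together. The field names and sample records are simplified inventions for illustration only. They are not the real OCSF schema, they are not the actual format of any AWS log, and this is not how Security Lake itself is implemented.

```python
def normalize_api_event(record):
    """Map a hypothetical API audit record onto the common shape."""
    return {
        "source": "api_audit",
        "time": record["eventTime"],
        "actor": record["userName"],
        "action": record["eventName"],
        "src_ip": record["sourceIPAddress"],
    }


def normalize_flow_event(record):
    """Map a hypothetical network flow record onto the common shape."""
    return {
        "source": "network_flow",
        "time": record["start"],
        "actor": None,  # flow records carry no user identity
        "action": record["verdict"],
        "src_ip": record["srcaddr"],
    }


# One normalizer per source type (both source types are illustrative).
NORMALIZERS = {"api": normalize_api_event, "flow": normalize_flow_event}


def normalize(kind, record):
    """Dispatch a raw record to the normalizer for its source type."""
    return NORMALIZERS[kind](record)


raw_events = [
    ("api", {"eventTime": "2022-11-30T10:00:00Z", "userName": "alice",
             "eventName": "DeleteBucket", "sourceIPAddress": "203.0.113.7"}),
    ("flow", {"start": "2022-11-30T10:00:05Z", "verdict": "REJECT",
              "srcaddr": "203.0.113.7"}),
]

events = [normalize(kind, rec) for kind, rec in raw_events]

# With one shape, a single filter works across both sources.
from_suspect_ip = [e for e in events if e["src_ip"] == "203.0.113.7"]
```

The point of the sketch is the last line: once every source shares the same field names, one query can span all of them, which is the benefit a common schema such as OCSF is aiming for.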

Inf2 instances for EC2 (Preview)

New instance types generally make an appearance at most re:Invent keynotes, and this year we have the introduction of Amazon EC2 Inf2 instances, which are designed for deep learning inference workloads. As you could probably predict, these instances promise to deliver better performance at a lower cost. So what performance do they offer? You can expect up to 3x better compute performance, up to 4x higher throughput, and up to 10x lower latency when compared to the existing Inf1 instance types.

From a hardware perspective, they come packed with up to 2.3 petaflops of DL performance, up to 384 GB of accelerator memory with 9.8 TB/s bandwidth, and NeuronLink, an intra-instance ultra-high-speed, nonblocking interconnect.

These are highly optimized to help you run workloads centered around natural language understanding, speech recognition, video and image generation, helping you to deploy large language models at scale.

Hpc6id instances (GA)

In addition to the Inf2 instances, another instance type was also announced: the Hpc6id instances, which are built on top of the AWS Nitro System and offer up to 200 Gbps of low-latency inter-node connectivity, designed specifically for your high performance computing (HPC) workloads.

These Hpc6id instances are powered by 3rd Gen Intel Xeon Scalable processors and provide a cost-efficient option when running and scaling your high performance computing clusters. You should select these instance types if your workloads are data-intensive and memory-bound, and, as they are optimized for vCPU performance, you could reduce your compute costs by consolidating your workloads across a smaller fleet.

AWS SimSpace Weaver (GA)

This new managed service focuses on large-scale spatial simulations, allowing you to create virtual worlds and run them without having to manage any underlying infrastructure yourself. Spatial simulations let you build virtual environments in which hundreds of thousands or even millions of virtual objects interact, allowing you to study the patterns that emerge from their behavior. They are often used in areas such as ecology, geography, and geoinformatics.

As you would imagine, running large-scale virtual worlds can become hugely complex and generally requires a massive amount of compute power from a single host, which can become restrictive. AWS SimSpace Weaver alleviates this problem by allowing you to divide the world into smaller spatial areas and spread them across a cluster of EC2 instances, giving you a wider distribution of computing power. The service manages the provisioning of your resources, in addition to the management and operation of synchronization, replication, and object transfer.
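The idea of splitting one large world into smaller spatial areas can be sketched in a few lines of Python. This is only an illustration of grid-based spatial partitioning in general, not the SimSpace Weaver SDK. The cell size, grid width, worker count, and function names are all assumptions made for the example.

```python
from collections import defaultdict

CELL_SIZE = 100.0   # side length of one square partition (assumed)
GRID_WIDTH = 4      # the example world is 4 x 4 cells (assumed)
NUM_WORKERS = 4     # compute instances sharing the world (assumed)


def cell_for(x, y):
    """Return the (column, row) grid cell containing a position."""
    return int(x // CELL_SIZE), int(y // CELL_SIZE)


def worker_for(cell):
    """Assign each cell to a worker in round-robin order."""
    col, row = cell
    return (row * GRID_WIDTH + col) % NUM_WORKERS


def partition(entities):
    """Group entity ids by the worker that owns their current cell."""
    by_worker = defaultdict(list)
    for entity_id, (x, y) in entities.items():
        by_worker[worker_for(cell_for(x, y))].append(entity_id)
    return dict(by_worker)


entities = {
    "car-1": (10.0, 20.0),       # cell (0, 0)
    "car-2": (150.0, 20.0),      # cell (1, 0)
    "person-1": (30.0, 40.0),    # cell (0, 0)
    "person-2": (399.0, 399.0),  # cell (3, 3)
}

assignment = partition(entities)
```

When an entity moves across a cell boundary, cell_for returns a different cell and ownership may shift to another worker. Handing the object over, and keeping neighbouring workers in sync, is the kind of work the service is described as taking off your hands.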

Amazon Connect – ML driven forecasting, capacity planning, and scheduling (GA)

There were 3 new announcements relating to the existing service Amazon Connect, which allows you to create a customer service contact center. The first was ML driven forecasting, capacity planning, and scheduling, which can be used to manage your staffing levels more accurately by predicting, allocating, and verifying that you have the correct number of agents to meet the demand of your incoming requests.

Powered by ML, you can now manage your contact center more efficiently by using forecasting and scheduling features to determine when you might need to alter staffing models to meet service targets based upon predicted volume rates, in turn enhancing customer satisfaction rates.

Amazon Connect – Contact Lens with agent performance management (Preview)

The 2nd Amazon Connect announcement was Contact Lens with agent performance management. This new feature, again powered by machine learning, helps you to gain a better insight into your conversations by analyzing them to spot trends, and the sentiments of the conversation itself using conversational analytics, much like Amazon Comprehend can do with text-based documents.

By classifying your conversations, it can help your contact center improve its agents’ interactions with your customers, ensuring they are following processes and reading any required scripts during calls. It’s also a great way to capture any issues raised in customer feedback, spotting both positive and negative sentiment, which could be passed on to your product teams to address if required.

This real-time analysis greatly reduces the amount of manual monitoring of your agents by your call center management team, and highlights the areas that agents need to improve upon.

Amazon Connect – Agent workspace with guided step-by-step actions (Preview)

The 3rd and final feature for Amazon Connect was Agent workspace with guided step-by-step actions. It is focused on improving customer experiences by ensuring that customer issues are resolved quickly and efficiently and that onboarding is a simple and smooth process, all through the use of a single application available to your fleet of agents.

Reducing the number of tools and applications your agents have to use will greatly reduce the amount of waiting time your customers have to go through, in addition to making your agents more empowered and efficient through a single pane of glass approach to handling customer requests. The agent workspace integrates with the many different agent tools, bringing them together in a single screen. This ensures your agents have all the necessary information about the customer, in addition to step-by-step actions on how to resolve customer queries and issues.

AWS Supply Chain (Preview)

This has been developed to help you manage your supply chains and is powered by machine learning to give you a greater understanding of how to improve your end to end delivery process by giving you visibility and actionable insights.

With its ability to provide you with information to make more informed decisions, it can save you time and money by alleviating supply chain bottlenecks that could cause low stock issues or, on the flip side, by ensuring you do not have an issue with overstocking.

ML is now being used across many AWS services, and one of its key advantages is that it’s often able to surface actionable insights, which is one of the key features of this service. With these insights you can reduce errors, highlight risks, deliver on your customer promises, and in turn improve your overall customer experience.

AWS Supply Chain can connect to your existing ERP management systems so you don’t have to worry about having to replatform your infrastructure or spend additional cost on licensing. With a real-time visual map you can quickly and easily view interactive dashboards to help you see your current inventory levels at various locations and facilities. This provides you with a high-level overview of the overall health of your supply chain.

AWS Clean Rooms (Preview)

AWS Clean Rooms has been designed to allow multiple parties to collaborate and gain insights into collective datasets without revealing, or allowing access to, the raw underlying data itself to one another. This removes the need for storing a copy of your data outside of your AWS environment when collaborating with partners, which reduces risk and minimizes security concerns. Instead, you can create a Clean Room using the AWS Management Console, select your dataset, choose which partners or parties you’d like to collaborate with, and of course set the restrictions that apply to those partners when accessing your dataset.

This allows everyone to collaborate on the same data without it leaving the confines of your AWS environment, and enables the participants to carry out the necessary queries. With configurable analysis rules, AWS Clean Rooms will ensure that control is maintained by only allowing queries to be run that have been authorized for each partner. With security being a top priority, cryptographic tools are used for all queries and to ensure that all data remains encrypted.
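One way to picture an analysis rule is as a gate that only lets authorized, aggregate results through. The Python sketch below is a toy model of that idea and not the AWS Clean Rooms API. The rule format, the min_group_size threshold, the column names, and the sample rows are all hypothetical.

```python
from collections import defaultdict

# Hypothetical rule: partners may only group by these columns, and any
# group with fewer distinct customers than the threshold is suppressed.
RULE = {"allowed_group_by": {"region"}, "min_group_size": 2}

ROWS = [
    {"customer_id": "c1", "region": "EU", "spend": 120},
    {"customer_id": "c2", "region": "EU", "spend": 80},
    {"customer_id": "c3", "region": "US", "spend": 200},
]


def total_spend_by(column, rows, rule):
    """Return total spend per group, enforcing the analysis rule."""
    if column not in rule["allowed_group_by"]:
        raise PermissionError(f"grouping by {column!r} is not authorized")

    customers = defaultdict(set)
    totals = defaultdict(int)
    for row in rows:
        customers[row[column]].add(row["customer_id"])
        totals[row[column]] += row["spend"]

    # Drop small groups so results cannot be traced back to one customer.
    return {
        group: total
        for group, total in totals.items()
        if len(customers[group]) >= rule["min_group_size"]
    }


result = total_spend_by("region", ROWS, RULE)
```

Here the partner receives a total for the EU group only, because the US group contains a single customer and is withheld. A request to group by customer_id would be rejected outright, so the raw rows never leave the owner's side.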

Amazon Omics (GA)

Amazon Omics is a brand new service that can be used to store, query, analyze, and generate insights from genomic and other omics data. This is going to be hugely beneficial to those working as scientists, researchers, and bioinformaticians as it will help them to generate faster insights and discoveries when working with genomic data, helping to speed up research in these critical areas for medical advancement.

The service itself can be broken down into 3 different components. Firstly, Omics Storage has the capability to store and share massive amounts of data at petabyte scale at a low cost, and offers data provenance by identifying which data sets originated from the same source. Genomics generates a huge amount of data, and so providing a storage platform with optimized costs was an essential part of Amazon Omics.

Next, Omics Analytics helps you take your stored data and prepare it for deep analysis, allowing you to create and manage variant stores, import variant data and manage import jobs, create and manage annotation stores, and import and manage annotation jobs.

Lastly, Omics Workflows helps you to manage, provision, and scale the resources required to run your computations across your data, using container images stored in Amazon Elastic Container Registry.