Object Storage for SAP on AWS


In this course, we provide an overview of storage options for SAP environments running on AWS.

Learning Objectives

  • Gain a greater understanding of the various storage offerings available when architecting SAP workloads on AWS
  • Understand the use cases associated with each storage option and be able to describe the enhanced flexibility, durability, and security they provide

Intended Audience

  • Anyone responsible for implementing and managing SAP workloads on AWS
  • Anyone looking to take the AWS Certified: SAP on AWS - Specialty certification exam




Hello, and welcome to this lecture, where I will be discussing object storage using Amazon S3 for SAP workloads on AWS. In this lecture, you’ll learn how Amazon S3 can be used for storage of everything from SAP backups, to snapshots of Amazon EBS volumes, to Amazon Machine Images, or AMIs, of your SAP software baselines. S3 is also an ideal storage solution for SAP data archiving, which is an SAP-supported method of removing business-complete application data from the SAP database to improve application performance and reduce database storage requirements.

Amazon S3 is a secure, inexpensive, highly available, and virtually unlimited cloud service for object storage that allows you to pay only for the storage you use. It is designed for eleven nines, or 99.999999999%, of durability, making it the ideal choice for long-term storage of SAP backups and a much more reliable alternative to on-premises backup storage, which generally relies on magnetic disk or tape storage. As an added benefit, backup storage in Amazon S3 is an integral component of a good disaster recovery architecture, even for SAP deployments that are still running on-premises. This is because all data in S3 is stored off-site and replicated between multiple physical locations within an AWS Region. Data in S3 can be further replicated between regions using Cross-Region Replication, or CRR. Objects stored in S3 may also be encrypted for additional security.
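To put the eleven-nines figure in perspective, here is a small worked calculation. It is only a sketch: it assumes the durability figure can be read as an independent, per-object annual survival probability, which is a simplification, and the object count is a hypothetical example rather than anything from this lecture.

```python
from fractions import Fraction

# Annual durability quoted in the lecture: eleven nines.
# Fraction keeps the arithmetic exact (floats would introduce rounding noise).
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)


def expected_annual_losses(object_count: int) -> Fraction:
    """Expected number of objects lost per year at the quoted durability."""
    return object_count * (1 - DURABILITY)


def years_per_expected_loss(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_annual_losses(object_count)


# Hypothetical example: ten million stored backup objects.
print(years_per_expected_loss(10_000_000))  # prints 10000
```

Under those assumptions, ten million stored objects would lose one object roughly every 10,000 years on average.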

And speaking of security, for SAP architectures that reside completely within AWS, Amazon S3 can be accessed privately from within your VPC by using what’s known as a VPC Endpoint. VPC Endpoints allow you to maintain an architecture where no network traffic ever needs to traverse the public internet, and instances in private subnets can access S3 without the use of a NAT Gateway.
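One common way to pair a VPC Endpoint with a backup bucket is a bucket policy that rejects requests arriving from anywhere else. The sketch below only builds the policy document as a Python dictionary. The bucket name and endpoint ID are hypothetical placeholders, and the function name is my own for illustration. The `aws:SourceVpce` condition key is the standard one for matching a VPC endpoint.

```python
import json


def vpce_only_bucket_policy(bucket: str, vpce_id: str) -> dict:
    """Build a bucket policy that denies S3 access from outside one VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAccessOutsideVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object inside it.
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
            }
        ],
    }


# Hypothetical bucket name and endpoint ID, for illustration only.
policy = vpce_only_bucket_policy("example-sap-backups", "vpce-0123456789abcdef0")
print(json.dumps(policy, indent=2))
```

Be careful with a policy like this: because it is an explicit deny, it also blocks access from the AWS console and from any host outside the VPC, so it should be tested on a non-critical bucket first.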

Now when it comes to storing data, Amazon S3 provides a series of storage classes that offer flexibility and potential cost savings based on how often you need to access your data. And different objects within an S3 bucket may be assigned to different storage classes, so you can decide which storage class is appropriate for each object in your S3 buckets on a case-by-case basis. So let’s touch briefly on each of these storage classes.

The default Amazon S3 storage class is the S3 Standard class. This class offers the best performance for frequently accessed data, with millisecond-level access times, and has no required minimum storage duration.

For data that needs to be accessed less frequently, you can opt to use the S3 Standard-IA, or “infrequent access” class instead. Using this class still enables you to retrieve objects quickly when needed, but will incur a retrieval fee whenever you need to access an object. Objects must also be stored for a minimum of 30 days. So this class might be more appropriate for things like SAP backup files, which you may never need to access, but must be retrievable very quickly if and when you do need to access them.

Now when it comes to archiving objects for true long-term storage, you’ll probably want to use the S3 Glacier class. Glacier is ideal for long-term storage of backups and data archives that are infrequently accessed and can afford to be retrieved on the order of minutes to hours instead of milliseconds. Objects must be stored in Glacier for a minimum of 90 days. So this generally applies to things like archives that you may only need to maintain for regulatory or compliance purposes. And in exchange for this slower retrieval time, Glacier storage is much less expensive than S3 Standard or Standard-IA storage.

But by far, the least expensive storage class is the S3 Glacier Deep Archive class, which is best suited for very long-term data archiving that can afford to take up to 12 hours to retrieve. Objects must be stored in Glacier Deep Archive storage for a minimum of 180 days.
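The minimum storage durations just described can be captured in a small lookup, which is handy when deciding whether a short-lived backup belongs in a colder class at all. This is a minimal sketch: the keys are the storage class identifiers used by the S3 API, the durations are the ones quoted in this lecture, and the helper function is a hypothetical name of my own.

```python
# Minimum storage duration in days, as quoted in this lecture.
# Keys are the storage class identifiers used by the S3 API.
MIN_STORAGE_DAYS = {
    "STANDARD": 0,
    "STANDARD_IA": 30,
    "GLACIER": 90,
    "DEEP_ARCHIVE": 180,
}


def days_short_of_minimum(storage_class: str, age_days: int) -> int:
    """Return how many days remain before an object meets its class minimum."""
    return max(0, MIN_STORAGE_DAYS[storage_class] - age_days)


# A 45-day-old object in Glacier is still 45 days short of the 90-day minimum.
print(days_short_of_minimum("GLACIER", 45))  # prints 45
```

If you expect to delete a backup after two weeks, for example, this shows at a glance that every class other than S3 Standard would leave it short of its minimum.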

Now for data with unknown or changing access patterns, there is also an S3 Intelligent-Tiering class that will monitor how frequently objects are accessed and automatically move them to the most cost-effective access tier when these patterns change. To learn more about S3 storage classes in detail, including how to configure lifecycle policies that enable you to automatically manage the storage classes used for objects over time, I invite you to check out this course:
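As a brief aside on the lifecycle policies mentioned above, here is roughly what one could look like for a backup prefix. This is a sketch under stated assumptions: the rule ID, the prefix, and the day thresholds are hypothetical choices of mine, not recommendations from this lecture. The dictionary follows the shape that the S3 lifecycle configuration API expects.

```python
def backup_lifecycle_configuration(prefix: str) -> dict:
    """Build a lifecycle configuration that moves backups to colder classes over time."""
    return {
        "Rules": [
            {
                "ID": "tier-sap-backups",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Thresholds are illustrative; each step respects the
                # minimum storage duration of the class before it.
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }


config = backup_lifecycle_configuration("hana/backups/")
print([t["StorageClass"] for t in config["Rules"][0]["Transitions"]])
```

With boto3 installed, a dictionary like this can be passed as the `LifecycleConfiguration` argument of the S3 client's `put_bucket_lifecycle_configuration` call.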

So we’ve established that Amazon S3 should be your service of choice for SAP object storage backup in AWS. And one of the tools that you can use to assist with backing up your SAP HANA workloads is the AWS Backint Agent for SAP HANA, which is an SAP-certified backup and restore utility for your HANA databases and catalogs running on Amazon EC2 instances. The AWS Backint Agent for SAP HANA uses Amazon S3 to store these backups and supports full, incremental, and differential database backups, as well as backups for your SAP HANA logs. And these backups can then be restored using SAP HANA Cockpit, SAP HANA Studio, or traditional SQL commands.
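Since the lecture mentions SQL commands, here is a hedged sketch of what triggering a Backint-based data backup from SQL can look like. The helper below only builds the statement text. It assumes the `BACKUP DATA ... USING BACKINT` form from the SAP HANA SQL reference, and both the function name and the backup prefix are hypothetical. Check the exact syntax against the SAP documentation for your HANA version before relying on it.

```python
# Maps the backup types named in the lecture to the SQL keyword (if any).
_BACKUP_KEYWORDS = {
    "full": "",
    "incremental": "INCREMENTAL ",
    "differential": "DIFFERENTIAL ",
}


def backint_backup_sql(prefix: str, kind: str = "full") -> str:
    """Build a HANA BACKUP DATA statement that targets the Backint interface."""
    if kind not in _BACKUP_KEYWORDS:
        raise ValueError(f"unknown backup kind: {kind!r}")
    if "'" in prefix:
        # Refuse quotes rather than attempt escaping in this sketch.
        raise ValueError("prefix must not contain single quotes")
    return f"BACKUP DATA {_BACKUP_KEYWORDS[kind]}USING BACKINT ('{prefix}')"


# Hypothetical backup prefix, for illustration only.
print(backint_backup_sql("WEEKLY_FULL"))
print(backint_backup_sql("DAILY_INCR", kind="incremental"))
```

The statements produced would then be run through whichever SQL client you already use against the HANA database.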

Aside from the AWS Backint Agent, when it comes to your EC2 instances hosting SAP workloads and their attached EBS volumes, you can also leverage EBS snapshots to create point-in-time backups that are stored in Amazon S3, which can then be restored to new EBS volumes if necessary. Or to capture a full backup of an EC2 instance that also includes any pre-configured software, settings, and data, you can create an Amazon Machine Image, or AMI. AMIs are stored in S3 and are useful for launching new instances that need to conform to an already established baseline. And finally, you can use AWS-managed tools such as AWS Backup to centralize and automate the scheduling of backup resources that can be stored in S3.
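The EBS snapshot step described above can be sketched with the EC2 API's `create_snapshot` operation. To keep the example self-contained, the function takes the client as a parameter instead of importing boto3. The function name, the tag key, and the description text are assumptions of mine for illustration.

```python
def snapshot_volume(ec2_client, volume_id: str, sid: str) -> str:
    """Request a point-in-time snapshot of one EBS volume and return its ID.

    `ec2_client` is expected to behave like boto3.client("ec2").
    """
    response = ec2_client.create_snapshot(
        VolumeId=volume_id,
        Description=f"SAP {sid} point-in-time backup of {volume_id}",
        TagSpecifications=[
            {
                "ResourceType": "snapshot",
                # Hypothetical tag key, used to find snapshots per SAP system.
                "Tags": [{"Key": "sap-sid", "Value": sid}],
            }
        ],
    )
    return response["SnapshotId"]


# Usage with boto3 installed and credentials configured (not run here):
#   import boto3
#   snapshot_volume(boto3.client("ec2"), "vol-0123456789abcdef0", "HDB")
```

Passing the client in also makes the function easy to exercise with a stub in tests, without touching a real AWS account.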

Now it’s also possible to leverage things like custom scripts or other third-party utilities to automate the storage of SAP backup files and data in Amazon S3, but this would obviously require more effort to develop and maintain over time. You should instead strive to use tools like the AWS Backint Agent or managed services like AWS Backup wherever possible. And if you’d like more information about backup and restore strategies for SAP workloads running on AWS, please check out this course:

And as a final note: to monitor your use of S3, you can leverage Amazon CloudWatch, where S3 sends data points regarding storage usage, number of requests, and object replication. For more information on infrastructure monitoring using Amazon CloudWatch, check out this course:
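To make the monitoring note above a little more concrete, here is a sketch of the parameters for querying a bucket's stored size from CloudWatch. It assumes the `AWS/S3` namespace with the `BucketSizeBytes` metric, which S3 reports once per day. The bucket name and the function name are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def bucket_size_query(bucket: str, days: int = 7, now=None) -> dict:
    """Build get_metric_statistics parameters for a bucket's daily stored bytes."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/S3",
        "MetricName": "BucketSizeBytes",
        "Dimensions": [
            {"Name": "BucketName", "Value": bucket},
            # Storage metrics are reported per storage type.
            {"Name": "StorageType", "Value": "StandardStorage"},
        ],
        "StartTime": end - timedelta(days=days),
        "EndTime": end,
        "Period": 86400,  # one data point per day, in seconds
        "Statistics": ["Average"],
    }


# Hypothetical bucket name, for illustration only.
params = bucket_size_query("example-sap-backups")
print(params["MetricName"], params["Period"])
```

With boto3 installed, these parameters can be unpacked into the CloudWatch client's `get_metric_statistics` call.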

About the Author

Danny has over 20 years of IT experience as a software developer, cloud engineer, and technical trainer. After attending a conference on cloud computing in 2009, he knew he wanted to build his career around what was still a very new, emerging technology at the time — and share this transformational knowledge with others. He has spoken to IT professional audiences at local, regional, and national user groups and conferences. He has delivered in-person classroom and virtual training, interactive webinars, and authored video training courses covering many different technologies, including Amazon Web Services. He currently has six active AWS certifications, including certifications at the Professional and Specialty level.