SAP HANA Data Tiering on AWS
** Not all content covered in the course introduction has been added to the course at this time. Additional content is scheduled to be added to this course in the future. **
In this section of the AWS Certified: SAP on AWS Specialty learning path, we introduce you to strategies for deploying your SAP applications and databases to AWS.
- Understand how to use AWS CloudFormation to provision and manage resources in AWS
- Identify solution architectures for SAP deployments on AWS, including SAP HANA
- Describe various approaches for migrating on-premises SAP workloads to AWS
The AWS Certified: SAP on AWS Specialty certification has been designed for anyone who has experience managing and operating SAP workloads. Ideally, you’ll also have some exposure to the design and implementation of SAP workloads on AWS, including migrating these workloads from on-premises environments. Many exam questions will require a solutions architect level of knowledge of many AWS services. All of the AWS Cloud concepts introduced in this course will be explained and reinforced from the ground up.
Hello, and welcome to this lecture, where I will discuss how to implement SAP HANA Data Tiering on AWS. Before we begin, let me explain the concept of Data Tiering in a little more detail. SAP HANA is a highly performant, in-memory, multi-model data store that supports advanced, real-time analytics – so in essence, it’s a database that uses memory instead of disk for storing data. And while this makes it incredibly fast, it also forces you to use servers and EC2 instances that have extremely large quantities of memory – enough memory to hold literally all of your data.
But what if only a small subset of your data actually needs to be accessed or updated in real-time? Data Tiering allows you to categorize your SAP HANA data into hot, warm, and cold tiers based on how fast and how frequently the data needs to be accessed, as well as whether or not the data ever needs to be updated or if it should remain in a read-only, archived state.
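As a rough illustration of the categorization described above, here is a minimal Python sketch that assigns a dataset to a tier based on access frequency and on whether it still needs updates. The `Dataset` fields, the `choose_tier` function, and the reads-per-day threshold are all hypothetical and chosen only for illustration; they are not part of SAP HANA or any AWS service.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    reads_per_day: float   # how often the data is queried (illustrative metric)
    needs_updates: bool    # False means the data can stay read-only / archived
    needs_realtime: bool   # True means it must be accessible in real time


def choose_tier(ds: Dataset, warm_reads_per_day: float = 1.0) -> str:
    """Return 'hot', 'warm', or 'cold' for a dataset.

    The threshold is a made-up value; real tiering decisions depend on
    your own workload and on the SAP product you are running.
    """
    if ds.needs_realtime:
        return "hot"
    # Data that must still be updated cannot live in the read-only cold tier.
    if ds.needs_updates or ds.reads_per_day >= warm_reads_per_day:
        return "warm"
    return "cold"


if __name__ == "__main__":
    samples = [
        Dataset("open_orders", 5000, True, True),
        Dataset("last_year_orders", 20, True, False),
        Dataset("orders_2012", 0.01, False, False),
    ]
    for ds in samples:
        print(ds.name, "->", choose_tier(ds))
```

The point of the sketch is the ordering of the checks: real-time need decides the hot tier first, and only data that is both rarely read and never updated falls through to cold.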
So the data in your hot tier will always remain in memory directly on your SAP HANA instances, where it can always be queried and updated in near real-time. Your hot tier should be reserved for your organization’s most mission-critical data. And you should always use EC2 instances that have been certified for SAP HANA in your hot tier. For more information about supported EC2 instance types for SAP HANA, please check out the related course.
Now unlike the data in your hot tier, data in your warm tier may be offloaded from your primary SAP HANA database instance’s memory onto another instance. And while this may add some latency to the retrieval of data when compared with the hot tier, data in the warm tier can still be both queried and updated very efficiently. And as an added bonus, data in both your hot and warm tiers is integrated into a single transparent view. So in other words, applications that need to access data in both the hot and warm tiers won’t need to know that this data may actually exist in different physical locations across these two tiers.
And finally, there’s the cold tier. The cold tier is reserved for data that will never need to be updated and doesn’t need to be retrieved quickly, either. We’ll see later in this lecture how services such as Amazon EFS, S3, and even S3 Glacier can be used to significantly reduce your costs when storing your non-critical SAP HANA data in the cold tier.
Now as I mentioned earlier, you’ll always want to use certified SAP HANA EC2 instances for your hot tier. But your options for supported SAP HANA Data Tiering solutions for your warm and cold tier storage will depend on which SAP product you are using: native SAP HANA, SAP Business Warehouse on HANA or BW/4HANA, or SAP Business Suite on HANA or S/4HANA. So let’s talk about all of these options in a little more detail, beginning with warm data tiering.
Your first option for warm data tiering is to use what’s known as SAP HANA Dynamic Tiering. Dynamic Tiering is an optional add-on for SAP HANA that extends the HANA in-memory data store with a database that stores its data on disk and allows you to store up to five times more data in the warm tier than in the hot tier. And to do this, it uses a service process called esserver, which runs on a separate dedicated server. So this is great for your native SAP HANA use cases.
And you can even achieve high availability by replicating this setup across two different availability zones and leveraging SAP HANA System Replication to keep the two instances in sync with each other.
Your next option for warm-tier data storage is to use what are known as Extension Nodes within an SAP HANA scale-out deployment. Now unlike Dynamic Tiering, which runs in a separate process, Extension Nodes are actually separate full instances of SAP HANA. And because of that, your Extension Nodes can leverage the full feature set of the SAP HANA database. Now you can have one or more of these scale-out instances serving as Extension Nodes to store your warm tier data, and each node can store data up to two times the total amount of memory on that node. So for instance, if your extension node instance has 1 TB of memory, it can store up to 2 TB of warm-tier data. And the performance of warm tier data in Extension Nodes is nearly equivalent to the in-memory performance you get with your hot tier storage as well.
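The sizing rule for Extension Nodes mentioned above can be written out as simple arithmetic. This is a minimal sketch that assumes the 2x memory ratio quoted in the lecture; the function names are hypothetical, and real sizing should follow SAP's own sizing guidance.

```python
import math


def extension_node_warm_capacity_tb(node_memory_tb: float, ratio: float = 2.0) -> float:
    """Warm data one extension node can hold, using the 2x ratio from the lecture."""
    return node_memory_tb * ratio


def extension_nodes_needed(warm_data_tb: float, node_memory_tb: float,
                           ratio: float = 2.0) -> int:
    """Smallest number of identical extension nodes that can hold the warm data."""
    capacity = extension_node_warm_capacity_tb(node_memory_tb, ratio)
    return math.ceil(warm_data_tb / capacity)


if __name__ == "__main__":
    # The lecture's example: a node with 1 TB of memory holds up to 2 TB of warm data.
    print(extension_node_warm_capacity_tb(1.0))
    # Hypothetical follow-up: 5 TB of warm data on 1 TB nodes.
    print(extension_nodes_needed(5.0, 1.0))
```

With 1 TB nodes, 5 TB of warm data rounds up to three nodes, because two nodes would only cover 4 TB.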
Now if you’re running SAP Business Suite on HANA or SAP S/4HANA, you also have the option to use what’s known as Data Aging for your warm-tier storage. Data Aging allows you to move older data that is less frequently accessed from memory, or what is called the “current area,” onto disk, or what is called the “historical area.” And when you do need to access this historical data to read or update it, Data Aging will automatically load the data back into memory, or the “current area,” again.
So that’s hot and warm data tiering. And the thing to remember here is that your hot data is your most critical, most frequently accessed data, and it will always remain in memory on your SAP HANA instance. But depending on the solution you choose to implement for warm data tiering, your warm tier data may be offloaded onto a Dynamic Tiering instance, or an extension node, or even onto disk using Data Aging. But your warm tier data can always still be both queried and updated. Now for your legacy or archive data that doesn’t ever need to be updated, you can leverage cold data tiering instead. So let’s briefly talk about your options for cold data tiering.
Now your options for managing cold tier data will depend on the flavor of SAP HANA that you’re running, whether you’re running native SAP HANA, SAP Business Warehouse on HANA, or SAP S/4HANA or Suite on HANA. But no matter which option you choose, your cold tier data will always reside on some sort of external data store. So let’s discuss all of your options, beginning with native SAP HANA.
If you’re running native SAP HANA, you’ll use something called the Data Lifecycle Manager, or DLM tool, to move data from memory to your external cold storage location. Your first option is to use DLM with the SAP Data Hub to move data to and from your cold store. The SAP Data Hub can be deployed on a series of Kubernetes nodes using the Amazon Elastic Kubernetes Service, or EKS. And from there, it leverages Amazon S3 to store your cold tier data.
Your other option is to use the SAP HANA Spark Controller, which works alongside the Spark SQL SDA adapter. Now to use the Spark Controller, instead of having Kubernetes nodes running the SAP Data Hub, you’ll use a service like Amazon EMR, or Elastic MapReduce, to run a Hadoop cluster instead. And just like the previous use case with the SAP Data Hub, your Hadoop cluster can persist the cold tier data in S3.
Now if you’re running SAP Business Warehouse on HANA, or SAP BW/4HANA, you can either use what’s called Near-Line Storage, or NLS, or Data Tiering Optimization, or DTO, instead of the DLM you would use with native HANA. And if you choose to use NLS, you’ll have two options from there. Your first option with NLS is to use SAP IQ to store your cold tier data, which can run on an EC2 instance like you see depicted here.
Or instead of SAP IQ, your other option is to use NLS with Hadoop. From there, you can leverage a third-party connector to persist your cold tier Hadoop data in S3.
Your final option is to use DTO with the SAP Data Hub, which also stores your cold tier data in S3. Now this option is only available if you’re running SAP BW/4HANA, but you’ll notice the architecture looks just like the DLM with SAP Data Hub use case we saw earlier for native SAP HANA. So again, this will use Kubernetes nodes and EKS for the SAP Data Hub, and your cold tier data will be stored in S3.
And finally, I want to talk about cold tier storage options for SAP S/4HANA or Suite on HANA. For these, you can use SAP Information Lifecycle Management, or ILM, with SAP IQ. And this architecture should also look familiar, as it’s the same one we saw for SAP Business Warehouse on HANA that used NLS with SAP IQ. In this case, we’re using ILM with SAP IQ instead, but the idea here is the same: you’ll run SAP IQ on an EC2 instance to store your cold tier data.
Your next option involves an approach called SAP Archiving. So for SAP Archiving, you could also utilize ILM, or you can choose to just leverage your existing data archiving process. And for your shared file system with SAP Archiving, you can use Amazon EFS for your Linux-based workloads, or FSx for your Windows-based workloads. In either case, you’ll mount your shared file system as your archive file system and then use the SAP transaction code SARA to archive your cold tier data into this file system. And from there, you can also configure backups of your archive file system to go to S3.
Your other option for SAP Archiving is to use an EBS volume in place of the EFS or FSx shared file system. And in this case, it makes the most sense to use an sc1, or Cold HDD, volume type for your archive EBS volume. It’s the least expensive EBS volume type, and it’s also well suited for data archiving. And you still have the option to back up your archive data from this EBS volume to S3, where you can also set up a lifecycle policy that will ultimately transition this data to S3 Glacier for long-term retention and storage.
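To make the S3 lifecycle policy mentioned above more concrete, here is a minimal Python sketch that builds a lifecycle configuration moving archive backups to S3 Glacier after a set number of days. The prefix, the 90-day value, the rule ID, and the function name are assumptions made for illustration; the lecture does not specify them.

```python
def glacier_lifecycle_configuration(prefix: str, days_before_glacier: int) -> dict:
    """Build an S3 lifecycle configuration for archive backups.

    The returned dict follows the shape that boto3's
    put_bucket_lifecycle_configuration expects for its
    LifecycleConfiguration argument.
    """
    return {
        "Rules": [
            {
                "ID": "archive-backups-to-glacier",    # hypothetical rule name
                "Filter": {"Prefix": prefix},          # only objects under this prefix
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days_before_glacier, "StorageClass": "GLACIER"}
                ],
            }
        ]
    }


if __name__ == "__main__":
    # "sap-archive/" and 90 days are placeholder values, not recommendations.
    config = glacier_lifecycle_configuration("sap-archive/", 90)
    print(config)
    # Assuming boto3 is installed and credentials are configured, the
    # configuration could then be applied to a bucket you own, for example:
    #   import boto3
    #   boto3.client("s3").put_bucket_lifecycle_configuration(
    #       Bucket="my-sap-archive-backups",   # hypothetical bucket name
    #       LifecycleConfiguration=config,
    #   )
```

Building the configuration separately from the API call keeps the rule easy to review before it is applied to a real bucket.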
So that covers some of the most common options for setting up SAP HANA warm and cold data tiering, and now you have an idea how these architectures might look when you’re implementing them on AWS.
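The cold-tier options covered in this lecture can also be summarized as a small lookup table. This is only a sketch of the lecture's own content in Python form; the `ColdTierOption` type, the dictionary keys, and the `options_for` helper are hypothetical names introduced for illustration.

```python
from typing import Dict, List, NamedTuple


class ColdTierOption(NamedTuple):
    tool: str        # SAP-side mechanism that moves or archives the data
    runs_on: str     # where the supporting component runs on AWS
    cold_store: str  # where the cold data ends up


_BW_COMMON = [
    ColdTierOption("NLS with SAP IQ", "Amazon EC2", "SAP IQ"),
    ColdTierOption("NLS with Hadoop", "Hadoop cluster",
                   "Amazon S3 (via a third-party connector)"),
]

COLD_TIER_OPTIONS: Dict[str, List[ColdTierOption]] = {
    "native SAP HANA": [
        ColdTierOption("DLM with SAP Data Hub", "Amazon EKS", "Amazon S3"),
        ColdTierOption("DLM with SAP HANA Spark Controller",
                       "Amazon EMR (Hadoop)", "Amazon S3"),
    ],
    "SAP BW on HANA": list(_BW_COMMON),
    # DTO with SAP Data Hub is only available on SAP BW/4HANA.
    "SAP BW/4HANA": _BW_COMMON + [
        ColdTierOption("DTO with SAP Data Hub", "Amazon EKS", "Amazon S3"),
    ],
    "SAP S/4HANA or Suite on HANA": [
        ColdTierOption("ILM with SAP IQ", "Amazon EC2", "SAP IQ"),
        ColdTierOption("SAP Archiving (shared file system)",
                       "Amazon EFS or FSx", "archive file system, backed up to S3"),
        ColdTierOption("SAP Archiving (EBS sc1 volume)", "Amazon EBS",
                       "archive volume, backed up to S3 and S3 Glacier"),
    ],
}


def options_for(product: str) -> List[ColdTierOption]:
    """Return the cold-tier options the lecture lists for a product."""
    return COLD_TIER_OPTIONS.get(product, [])


if __name__ == "__main__":
    for option in options_for("SAP BW/4HANA"):
        print(option.tool, "|", option.runs_on, "|", option.cold_store)
```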
Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation.
To date, Stuart has created 150+ courses relating to Cloud reaching over 180,000 students, mostly within the AWS category and with a heavy focus on security and compliance.
Stuart is a member of the AWS Community Builders Program for his contributions towards AWS.
He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.
In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community.
Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.