AWS Step Functions

Domain One of the AWS Certified Solutions Architect - Associate (SAA-C03) exam guide requires us to be able to design a multi-tier architecture solution, so that is our topic for this section.
We cover the need-to-know aspects of how to design multi-tier solutions using AWS services.

Learning Objectives

  • Learn some of the essential services for creating multi-tier architectures on AWS, including the Simple Queue Service (SQS) and the Simple Notification Service (SNS)
  • Understand data streaming and how Amazon Kinesis can be used to stream data
  • Learn how to design a multi-tier solution on AWS, and the important aspects to take into consideration when doing so
  • Learn how to design cost-optimized AWS architectures
  • Understand how to leverage AWS services to migrate applications and databases to the AWS Cloud

The primary option that comes to mind when thinking about Amazon Web Services and Serverless workloads is AWS Lambda. It is a fantastic resource that allows for serverless compute without having to deal with the burden of the underlying compute infrastructure.

Unfortunately, Lambda is not exactly well known for its flexibility or its ability to perform long-running and complex operations. For example, Lambda was long limited to 5 minutes of execution time for your code, and that limit has since been extended to 15 minutes.

Now that may seem like quite a lot of time when you are thinking about running a script or some simple calculations, but for anything more complex, it might not be enough.

For example, if you have ever played around with building simple applications with Lambda, you might have wanted to retry a connection, wait until something becomes available before moving on to the next action, or simply run something in parallel. These are common workflows that many people desire and expect. Unfortunately, these features are not natively included with Lambda.

Don't let that dissuade you from using Lambda altogether, because this is where AWS Step Functions can take a leading role.

AWS Step Functions can help guide and shape these interactions and allow you to create interactive and complex systems that utilize all these features we just went over, and more, with complete orchestration and ease of transparency! With that in mind let’s dive in and talk about it.

AWS Step Functions can best be described as a state machine service. For those who don't know what a state machine is, think of your standard vending machine. 

A vending machine sits there waiting for a customer to come up to it and input money (that's its idle state). Once money has been added to the machine, it moves on to the next state, which would be item selection. The user inputs their choice, and the machine moves into the final state of vending the product. After the workflow has been completed, it returns to the idle state, waiting for another customer.

AWS Step Functions allow you to create workflows just like the vending machine, where you can have your system wait for inputs, make decisions, and process information based on the input variables.
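To make the vending machine analogy concrete, here is a minimal sketch of it as a state machine in plain Python. The state and event names are hypothetical and chosen only to mirror the story above; this is not how Step Functions itself is implemented.

```python
# Hypothetical states and events that mirror the vending machine story.
# Each (current state, event) pair maps to the state the machine moves to.
TRANSITIONS = {
    ("idle", "insert_money"): "selecting",
    ("selecting", "choose_item"): "vending",
    ("vending", "item_dispensed"): "idle",
}


def next_state(state, event):
    """Return the next state, or stay in the current one if the event is not valid here."""
    return TRANSITIONS.get((state, event), state)


def run(events, state="idle"):
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state
```

Calling `run(["insert_money", "choose_item", "item_dispensed"])` walks the full cycle and ends back in the idle state, while an event that makes no sense in the current state (choosing an item before paying, say) leaves the machine where it was.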

With this kind of orchestration, we are able to run Lambda functions in ways that are not inherently supported by the service itself.

For example, we can use Step Functions to run our code:

  • In parallel, for when you have multiple items or tasks you want to process at one time.
  • In sequence, for when order is important.
  • With retries, for when you want your code to keep executing until it succeeds or reaches a timeout of some sort.
  • With if-then logic, which allows branching and logical trees for decision making.

With these options available for your Lambda functions, we are able to overcome probably the greatest hurdle of serverless and Lambda, which is the 15-minute limit on code execution: a workflow can chain many short-lived functions together and run far longer than any one of them.

This ability allows you to create very powerful fully serverless applications and workflows.

AWS Step Functions operates by reading in your workflow from an Amazon States Language file - a JSON-based structured language used to define your state machine and its various components.

Amazon States Language is a proprietary language that consists of a collection of states. These states in turn can do some type of work, and from there the machine can make the decision to move on to the next state.

Here is an example of what Amazon States Language looks like.
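As a minimal sketch of the format, the following Python snippet builds a small three-state definition as a dictionary and prints it as JSON. The state names and the greeting are hypothetical; the field names (`StartAt`, `States`, `Type`, `Next`, `Result`, `Seconds`) are the ones the language uses.

```python
import json

# A minimal definition built as a Python dict.
# State names ("Hello", "PauseBriefly", "Done") are hypothetical.
definition = {
    "Comment": "A minimal three-state example",
    "StartAt": "Hello",
    "States": {
        "Hello": {
            "Type": "Pass",
            "Result": {"greeting": "Hello from Step Functions"},
            "Next": "PauseBriefly",
        },
        "PauseBriefly": {"Type": "Wait", "Seconds": 5, "Next": "Done"},
        "Done": {"Type": "Succeed"},
    },
}

# Serialize to the JSON text you would paste into the console or deploy.
asl_json = json.dumps(definition, indent=2)
print(asl_json)
```

The machine starts at the state named in `StartAt`, and each state either names its successor with `Next` or ends the execution.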

As you can see, it is very much a JSON-style language. This is helpful because it's a familiar syntax that many developers are already used to writing, but it might be confusing to those who are new.

The good news is that AWS Step Functions provides a visual representation of your state machine right in the console. 

This visual graph updates in real time as you edit your code and provides valuable feedback during creation of your machines. 

Additionally, this visual flow graph is inspectable during runtime and after completion. This feature allows you to get a deeper understanding of what is happening behind the scenes. Each element can be inspected to show the inputs and outputs as they appear.

There are eight types of state that your state machine can be in at any time. Let me go over these individually.

The Pass State is basically a debugging state, or one to be used when first creating your machine. It allows you to pass its input value straight through to its output, as well as add a fixed result.

The Task State. This is where the work actually happens. With a task, you define a resource you wish Step Functions to run as well as a timeout period. For example, you could plug in your Lambda function here to run some code. This state is often used as a sub-state (or action) within other states.

The Choice State - given an input, the state machine chooses the correct output. Basically, an if then operation where you can run further application logic.

Wait - the state machine will pause and can wait until a specific time or until a set amount of time has passed. This might be useful if you wanted an email to fire out at 8am every day, for example.

Succeed - simply the termination of the state machine in a successful fashion. Can be a part of a choice state for example to end the state machine.

Fail - also a termination state for the state machine, in a failed way. Fail states can carry an error name and a cause to explain what went wrong.

Parallel State - Executes a group of states as concurrently as possible and waits for each branch to terminate before moving on. The results of each parallel branch are combined together in an array-like format and will be passed on to the next state.

Map State - allows you to iterate through a list of items and perform tasks on them. You can also define the number of concurrent items being worked on at one time. Think of this like a for loop for processing data.

Using combinations of these states to create your specific state machines allows you to build some very dynamic and impressive serverless solutions that can scale extremely well.
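As a rough illustration of how several of these states combine, the sketch below defines a Task with a timeout and a retry policy, a Choice, a Map, and the two terminal states. The order-processing scenario, the state names, and the Lambda ARNs are all hypothetical. The small helper at the end is not part of Step Functions; it is only a local sanity check that every transition points at a state that exists.

```python
import json

# Hypothetical Lambda ARNs; replace with functions from your own account.
CHECK_ORDER_ARN = "arn:aws:lambda:us-east-1:123456789012:function:CheckOrder"
SHIP_ITEM_ARN = "arn:aws:lambda:us-east-1:123456789012:function:ShipItem"

definition = {
    "StartAt": "CheckOrder",
    "States": {
        # Task: does the work, with a timeout and a retry policy.
        "CheckOrder": {
            "Type": "Task",
            "Resource": CHECK_ORDER_ARN,
            "TimeoutSeconds": 30,
            "Retry": [{
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 2,
                "MaxAttempts": 3,
                "BackoffRate": 2.0,
            }],
            "Next": "IsOrderValid",
        },
        # Choice: if-then branching on the task's output.
        "IsOrderValid": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.valid", "BooleanEquals": True, "Next": "ShipEachItem"}
            ],
            "Default": "RejectOrder",
        },
        # Map: iterate over a list, at most two items at a time.
        "ShipEachItem": {
            "Type": "Map",
            "ItemsPath": "$.items",
            "MaxConcurrency": 2,
            "ItemProcessor": {
                "StartAt": "ShipItem",
                "States": {
                    "ShipItem": {"Type": "Task", "Resource": SHIP_ITEM_ARN, "End": True}
                },
            },
            "Next": "OrderComplete",
        },
        "OrderComplete": {"Type": "Succeed"},
        "RejectOrder": {
            "Type": "Fail",
            "Error": "InvalidOrder",
            "Cause": "The order failed validation.",
        },
    },
}


def missing_targets(states):
    """Return transition targets that do not name a state in the same machine."""
    targets = set()
    for state in states.values():
        if "Next" in state:
            targets.add(state["Next"])
        if "Default" in state:
            targets.add(state["Default"])
        for rule in state.get("Choices", []):
            targets.add(rule["Next"])
    return sorted(targets - set(states))


print(json.dumps(definition, indent=2))
print("Dangling transitions:", missing_targets(definition["States"]))
```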

Here is a high-level example of what a few of these states can look like in action. Imagine we wanted to create a simple app that provides image tagging and creates thumbnails for PNG images.

The state machine might look something like this.

Upon taking in an input image the first step would be to extract any metadata about the image if possible. This would be a task state.

The output from that task can be sent on to the next state, which would check to see if the image format is supported. This state is a choice state.

From there we either find that the image is unsupported and the operation fails or we move onto storing this metadata. Storing would be another task.

We can then send the image off to Amazon Rekognition to generate our tags and create a thumbnail in parallel. This would be a parallel state.

Finally, we would add the Rekognition tags to the image itself, or store them in a database and associate them later.

And then the state machine ends. Neato burrito.
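The walkthrough above could be sketched as a definition along these lines. This is an assumption-laden outline rather than the exact machine from the lecture: every state name, the `$.format` field, and the Lambda function names are made up, and each step is modeled as a Lambda-backed task for simplicity.

```python
import json


def lambda_arn(name):
    """Build a placeholder Lambda ARN; the account, region and names are made up."""
    return f"arn:aws:lambda:us-east-1:123456789012:function:{name}"


def task(name, **transition):
    """A Lambda-backed Task state with the given transition (Next=... or End=True)."""
    return {"Type": "Task", "Resource": lambda_arn(name), **transition}


image_workflow = {
    "StartAt": "ExtractMetadata",
    "States": {
        "ExtractMetadata": task("extract-metadata", Next="CheckFormat"),
        # Choice: only PNG images continue; everything else fails.
        "CheckFormat": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.format", "StringEquals": "PNG", "Next": "StoreMetadata"}
            ],
            "Default": "NotSupported",
        },
        "NotSupported": {
            "Type": "Fail",
            "Error": "UnsupportedFormat",
            "Cause": "Only PNG images are handled.",
        },
        "StoreMetadata": task("store-metadata", Next="ProcessImage"),
        # Parallel: tagging and thumbnail creation run side by side.
        "ProcessImage": {
            "Type": "Parallel",
            "Branches": [
                {
                    "StartAt": "DetectLabels",
                    "States": {"DetectLabels": task("rekognition-tags", End=True)},
                },
                {
                    "StartAt": "CreateThumbnail",
                    "States": {"CreateThumbnail": task("create-thumbnail", End=True)},
                },
            ],
            "Next": "AddTags",
        },
        "AddTags": task("add-tags", End=True),
    },
}

print(json.dumps(image_workflow, indent=2))
```

The Parallel state hands `AddTags` an array with one entry per branch, so the tagging function would read the labels from the first element and the thumbnail details from the second.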

So far I have talked a lot about using Lambda as your interaction medium when performing tasks with Step Functions, but there are actually quite a few services that Step Functions can interact with directly. If you take a look at this chart, you can see that Step Functions has quite a breadth of services available for you to use.

For example, you do not have to use Lambda to add an item to a table in DynamoDB. You can do it directly by calling the DynamoDB action from a task state. Here is what that might look like.
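A minimal sketch, assuming a hypothetical table named `ImageMetadata` and input fields `imageId` and `format`. The `arn:aws:states:::dynamodb:putItem` resource is the Step Functions service integration for DynamoDB; keys ending in `.$` tell Step Functions to read the value from the state input instead of treating it as a literal.

```python
import json

# Task state that writes to DynamoDB directly, with no Lambda function involved.
# The table name and attribute names are hypothetical.
put_item_state = {
    "Type": "Task",
    "Resource": "arn:aws:states:::dynamodb:putItem",
    "Parameters": {
        "TableName": "ImageMetadata",
        "Item": {
            # "S" is DynamoDB's string type; ".$" pulls the value from the input.
            "ImageId": {"S.$": "$.imageId"},
            "Format": {"S.$": "$.format"},
        },
    },
    "End": True,
}

definition = {"StartAt": "StoreMetadata", "States": {"StoreMetadata": put_item_state}}
print(json.dumps(definition, indent=2))
```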

Here are a few other examples of what you can do natively within Step Functions:

  • Run an Amazon Elastic Container Service or AWS Fargate task
  • Submit an AWS Batch job and wait for it to complete
  • Publish a message to an Amazon SNS topic
  • Send a message to an Amazon SQS queue
  • Start an AWS Glue job run
  • Create an Amazon SageMaker job to train a machine learning model or batch transform a data set

One of the most impressive features of AWS Step Functions is its capacity for asynchronous callbacks. If you have a workflow that requires approval from a managing authority, or that relies on a third-party API that takes hours, days, or weeks to complete, Step Functions can pause and wait for the response, which adds flexibility and resilience to your workflows.
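One way to sketch the callback pattern is a task that sends a message to an SQS queue along with a task token, then waits until something reports back with that token. The queue URL, state names, and the one-week timeout below are assumptions. The `approve` helper expects a boto3 Step Functions client (for example `boto3.client("stepfunctions")`) to be passed in, so the snippet itself has no dependencies.

```python
import json

# Task state that pauses until an external system returns the task token.
# The queue URL and the "request" field are hypothetical.
wait_for_approval = {
    "Type": "Task",
    "Resource": "arn:aws:states:::sqs:sendMessage.waitForTaskToken",
    "Parameters": {
        "QueueUrl": "https://sqs.us-east-1.amazonaws.com/123456789012/approvals",
        "MessageBody": {
            "request.$": "$.request",
            # "$$" reads from the context object; this is the token to send back.
            "taskToken.$": "$$.Task.Token",
        },
    },
    "TimeoutSeconds": 7 * 24 * 60 * 60,  # give up after one week (an assumption)
    "Next": "ContinueAfterApproval",
}


def approve(sfn_client, task_token, result):
    """Resume the paused execution by returning the token with a JSON result."""
    sfn_client.send_task_success(taskToken=task_token, output=json.dumps(result))
```

Whoever consumes the queue message, whether a person clicking a button behind an API or a third-party webhook handler, calls something like `approve(client, token, {"approved": True})` when the work is done. A matching `send_task_failure` call reports a rejection or error.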

We also have the ability to nest child state machines within parent state machines. This provides greater benefits the longer you work with Step Functions, because you will find repeatable patterns occurring within your workflows fairly often. For example, you might have a core state machine that needs to be referenced by other tangential services. Being able to nest your state machines will save you a lot of time down the road, and help with encapsulation of that core business logic.
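As a hedged sketch of nesting, a parent machine can start a child through the Step Functions service integration and wait for it to finish. The child state machine name `CoreBusinessLogic`, its ARN, and the `order` field are hypothetical.

```python
import json

# Hypothetical ARN of the child state machine that holds the shared logic.
CHILD_ARN = "arn:aws:states:us-east-1:123456789012:stateMachine:CoreBusinessLogic"

parent_definition = {
    "StartAt": "RunCoreLogic",
    "States": {
        # ".sync" makes the parent wait until the child execution completes.
        "RunCoreLogic": {
            "Type": "Task",
            "Resource": "arn:aws:states:::states:startExecution.sync",
            "Parameters": {
                "StateMachineArn": CHILD_ARN,
                "Input": {"order.$": "$.order"},
            },
            "Next": "Done",
        },
        "Done": {"Type": "Succeed"},
    },
}

print(json.dumps(parent_definition, indent=2))
```

Several parent workflows can point at the same child ARN, which keeps the core logic in one place.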

Now that we have a basic understanding of what AWS Step Functions is, and the pieces that make it up, I think it would be good to see a full example of what you can create with the service. 

This is quite possibly my favorite example, and it's a complete video on demand workflow leveraging AWS Step Functions, AWS Elemental MediaConvert, and AWS Elemental MediaPackage.

This architectural diagram shows a three-part, multi-faceted architecture that deals with the complete lifecycle of a video on-demand service.

All parts of this operation function completely serverlessly and involve multi-phased step function elements to orchestrate the entire process.

Starting with our source files, which might already be set up in our S3 bucket or could be placed there as they come in, we would have raw video. This video is archived in Amazon S3 Glacier while at the same time being pushed into our ‘Ingest’ workflow by a Lambda function.

Let’s take a look at the ingest workflow and see what it does.

Inside here we have a few states that are all pretty simple to understand.

Input validation - Checking the file types to make sure they are supported.

Mediainfo generates signed URLs for the source files and extracts metadata about the video.

DynamoDB Update takes all this relevant information and drops it into DynamoDB.

SNS Choice. A simple flag to determine if we want to be notified about the status of the uploads.

SNS Notification. Using Amazon SNS, it sends a notification about the status of the ingestion process, such as did it pass or fail.

Process Execute starts the processing workflow.

Again, all these tasks are completed automatically without you having to spin up any servers. Step Functions works through each state until completion and then proceeds onto the next one.
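To show how the ingest steps listed above fit together, here is a rough sketch and not the actual definition used by the video on demand solution. The state names follow the list; the Lambda function names, the `enableSns` flag, and the assumption that every step is a Lambda task are hypothetical. The `trace` helper is a toy local walker that understands only this small subset of the language, included so the two paths through the SNS choice can be seen.

```python
def lambda_task(function_name, next_state=None):
    """A Lambda-backed Task state; ends the workflow when no next state is given."""
    state = {
        "Type": "Task",
        "Resource": f"arn:aws:lambda:us-east-1:123456789012:function:{function_name}",
    }
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


ingest = {
    "StartAt": "InputValidate",
    "States": {
        "InputValidate": lambda_task("input-validate", "Mediainfo"),
        "Mediainfo": lambda_task("mediainfo", "DynamoDBUpdate"),
        "DynamoDBUpdate": lambda_task("dynamo-update", "SNSChoice"),
        # Only notify when the (hypothetical) enableSns flag is set.
        "SNSChoice": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.enableSns", "BooleanEquals": True, "Next": "SNSNotification"}
            ],
            "Default": "ProcessExecute",
        },
        "SNSNotification": lambda_task("sns-notification", "ProcessExecute"),
        "ProcessExecute": lambda_task("process-execute"),
    },
}


def trace(definition, data):
    """Walk the definition locally and return the visited state names.

    Toy walker: handles Next, End, and BooleanEquals choices on a top-level "$.field".
    """
    visited = []
    name = definition["StartAt"]
    while name is not None:
        visited.append(name)
        state = definition["States"][name]
        if state["Type"] == "Choice":
            name = state["Default"]
            for rule in state["Choices"]:
                field = rule["Variable"][2:]  # strip the leading "$."
                if data.get(field) == rule["BooleanEquals"]:
                    name = rule["Next"]
                    break
        else:
            name = state.get("Next")  # None when the state ends the workflow
    return visited


print(trace(ingest, {"enableSns": True}))
print(trace(ingest, {"enableSns": False}))
```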

After ingestion, the video is processed and converted into various bitrates and sizes, with icons and all the good stuff that you would expect from a streaming-type service. Next, all the pertinent information is pushed out to be published, where it can then be delivered to the customer.

I won't go into each of the workflows because I think you can see the point here. Every step function workflow you see in this architecture has its own unique job and tasks that it performs. Each can be as complicated or as simple as it requires.

About the Author

Andrew is fanatical about helping business teams gain the maximum ROI possible from adopting, using, and optimizing Public Cloud Services. Having built 70+ Cloud Academy courses, Andrew has helped over 50,000 students master cloud computing by sharing the skills and experiences he gained during 20+ years leading digital teams in code and consulting. Before joining Cloud Academy, Andrew worked for AWS and for AWS technology partners Ooyala and Adobe.