Cloud Academy Team

June 30, 2017

How to Implement a Scheduled Low Cost and High Performance Microservice

Sometimes, we need to execute scheduled tasks with a time- or event-based approach, and we need enough capacity to complete these tasks quickly, without affecting other running services and in a way that is also cost-effective. For developers, one of the biggest benefits of cloud computing is the ability to run their own code, on a pay-per-use basis, without worrying too much about the underlying infrastructure. Cloud services such as AWS Lambda, Azure Functions, or Google Cloud Functions are made precisely for this, allowing developers to focus only on the implementation of their code.

One downside, however, is that they can only be used for fast, low-intensity functions. When you need more computing capacity than these services offer, you'll also want to keep costs under control. In this post, I'll be sharing some different approaches for getting the capacity you need while optimizing your costs.

FaaS

Developers can use AWS Lambda, Azure Functions, or Google Cloud Functions to execute their code. Those services will then automatically take care of the underlying infrastructure, monitoring, and scaling in/out of the resources needed to run the code. While these services work well for microservices, asynchronous tasks, web APIs, and other software scenarios, they unfortunately won't suit every use case that you're likely to encounter.

Your code is executed using an event-based strategy. For example, if you are using AWS Lambda, you can execute your code on specific events such as web API calls with AWS API Gateway, AWS DynamoDB table changes, or AWS SNS notifications. Based on the number of triggered events, which determines how often your function runs, AWS Lambda will automatically scale the infrastructure where your code is running. This lets you handle a huge number of invocations and automatically scale back down when the number of invocations decreases.
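To make the event-based model more concrete, here is a minimal sketch of a Lambda handler reacting to an SNS notification. The handler name and the returned dictionary are illustrative choices; the event layout assumed here (a `Records` list with the message text under `Sns.Message`) is the shape SNS uses when it invokes Lambda.

```python
import json


def lambda_handler(event, context):
    """Handle an SNS-triggered invocation and return the decoded messages."""
    messages = []
    # SNS delivers one or more records; each carries the published
    # message as a string under Records[i].Sns.Message.
    for record in event.get("Records", []):
        body = record["Sns"]["Message"]
        try:
            messages.append(json.loads(body))
        except ValueError:
            # Not JSON: keep the raw string instead of failing the invocation.
            messages.append(body)
    return {"processed": len(messages), "messages": messages}
```

Each invocation handles only the events it was given, which is what allows the platform to run many copies in parallel when the event rate grows.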

The fact that they can only be used for fast and low-intensity functions is one of the biggest drawbacks of such services. For example, at the time of writing, AWS Lambda has hard limits on both memory allocation (1536 MB) and maximum execution duration (300 seconds). Additionally, all of those services can only execute code written in a limited number of programming languages. This means that you would not be able to use these services if you need to run a memory-intensive, time-consuming task.
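One way to see how the duration limit shapes your code is a loop that stops before the invocation times out. The sketch below is only an illustration: `process_items`, `handle`, and `safety_margin_ms` are hypothetical names, while `get_remaining_time_in_millis()` is the method that the Lambda context object exposes in the Python runtime.

```python
def handle(item):
    # Hypothetical unit of work; replace with real logic.
    return item


def process_items(items, context, safety_margin_ms=10000):
    """Process items until done or until the invocation is close to timing out.

    Returns the items that were not processed.
    """
    remaining = list(items)
    while remaining:
        # Stop early if less than the safety margin is left before the timeout.
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            break
        handle(remaining.pop(0))
    return remaining
```

Whatever is returned still has to be processed somewhere else, which is exactly the problem that the rest of this post tries to solve.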

If you would like to go deeper into the serverless world, our learning path, Getting Started with Serverless Computing, is a great introduction. It includes a mix of video courses, hands-on labs, and use cases to help you fully understand the key concepts.

AWS Step Functions

A workaround for the mentioned serverless limits is to use an orchestration service for your serverless functions, such as AWS Step Functions. This kind of service helps developers orchestrate a customizable workflow of different serverless functions. For example, if you use AWS Step Functions, this approach requires that you implement your application as a set of different Lambda functions and then use AWS Step Functions to orchestrate their execution in a user-defined flow.

Even this approach has its limitations. It can be difficult to orchestrate different functions, and it’s not useful if your code cannot be decomposed into smaller, specific functions. Also, you still have the previous Lambda hard limits on each of your Lambda functions. This post on AWS Step Functions is a great resource for learning more about AWS Step Functions and how to use it.
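As a rough illustration of what such a workflow looks like, the sketch below builds a three-step state machine definition in the Amazon States Language. The state names and Lambda ARNs are hypothetical placeholders, not part of any real application.

```python
import json

# Hypothetical Lambda ARNs, for illustration only.
EXTRACT_ARN = "arn:aws:lambda:us-east-1:123456789012:function:extract"
TRANSFORM_ARN = "arn:aws:lambda:us-east-1:123456789012:function:transform"
LOAD_ARN = "arn:aws:lambda:us-east-1:123456789012:function:load"


def build_state_machine():
    """Return a definition that runs three Lambda functions in sequence."""
    return {
        "Comment": "Sequential workflow of three Lambda functions",
        "StartAt": "Extract",
        "States": {
            "Extract": {"Type": "Task", "Resource": EXTRACT_ARN, "Next": "Transform"},
            "Transform": {"Type": "Task", "Resource": TRANSFORM_ARN, "Next": "Load"},
            # The last state ends the execution instead of naming a next state.
            "Load": {"Type": "Task", "Resource": LOAD_ARN, "End": True},
        },
    }


definition_json = json.dumps(build_state_machine(), indent=2)
```

The resulting JSON string is what you would supply as the state machine definition when creating it. Note that every `Task` state here is still a single Lambda invocation, so each step remains subject to the Lambda limits described earlier.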

Existing infrastructure/spot instances

If you need more computing capacity than serverless services offer, you’ll have to use one or more instances from your cloud provider of choice. In doing so, we will still want to optimize costs for this computing capacity. Here are some different approaches that we could use to obtain more capacity while controlling costs.

If you already have some EC2 instances with enough available compute capacity, you can make use of them to run your brand new code. Unfortunately, in this case, you are losing one of the greatest benefits of a serverless approach: not worrying about where your code is executed. For example, you will need to make sure that your code doesn’t affect any of the services that are already running. You will also need to know some networking details of your instances such as VPC, subnets, and security groups. Last but not least, you can no longer only monitor your code. You will also need to monitor your infrastructure and take appropriate actions if a component is no longer working as expected.

AWS gives us the ability to bid on spare Amazon EC2 computing capacity. When your bid is accepted, AWS provides you with the requested number of instances. In this way, we are able to obtain some computing capacity (in the form of EC2 instances) at a lower cost than using standard on-demand instances.
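A bid of this kind could be placed with boto3 roughly as follows. This is a sketch under assumptions: the function name and its arguments are hypothetical, and the EC2 client is passed in (for example `boto3.client('ec2')`) so that the function itself does not depend on credentials or a region.

```python
def request_spot_capacity(ec2_client, max_price, image_id, instance_type, count=1):
    """Place a one-time spot request and return the request IDs."""
    response = ec2_client.request_spot_instances(
        SpotPrice=str(max_price),  # The API expects the price as a string.
        InstanceCount=count,
        Type="one-time",
        LaunchSpecification={
            "ImageId": image_id,
            "InstanceType": instance_type,
        },
    )
    return [
        request["SpotInstanceRequestId"]
        for request in response["SpotInstanceRequests"]
    ]
```

The returned IDs only identify the requests. Whether and when instances are actually provisioned depends on the available capacity and on your bid.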

Unfortunately, even this approach doesn't work for every use case. For example, if your application can't be stopped once started, you will face problems with spot instances. A workaround is to implement your code in an incremental manner, so that your script can be stopped at any point and, when started again, resumes from where it left off. However, implementing or refactoring your code in such a way is not always easy and can require a lot of time. Additionally, once the required instances are provisioned, you can never be sure that they will remain available until your service completes. You also cannot be sure that the requested computing capacity will be provisioned at all: the requested capacity might not be available, or your bid might not be accepted.
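The incremental approach can be sketched with a simple checkpoint file. Everything here is an assumption made for illustration: the function name, the `next_index` key, and the idea that the work can be expressed as an ordered list of items. On a spot instance, the checkpoint would also need to live on storage that survives the instance.

```python
import json
import os


def run_incrementally(items, checkpoint_path, process):
    """Process items in order, persisting progress after each one."""
    start = 0
    if os.path.exists(checkpoint_path):
        # Resume from the index recorded by a previous, interrupted run.
        with open(checkpoint_path) as f:
            start = json.load(f)["next_index"]
    for index in range(start, len(items)):
        process(items[index])
        # Write to a temporary file first, then rename it, so that an
        # interruption cannot leave a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"next_index": index + 1}, f)
        os.replace(tmp_path, checkpoint_path)
```

If the process is interrupted while handling an item, that item is processed again on the next run, so `process` should be safe to repeat.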

Dedicated on-demand instances

Standard on-demand instances seem to be the right solution for running a microservice in a dedicated environment for the required amount of time.

Assume that our microservice is a fairly simple Python script such as this one:

from my_module import Job


if __name__ == '__main__':
    # Run the job only when this file is executed as a script.
    job = Job()
    job.execute_long_process()

This code must be executed once a day on a dedicated EC2 instance, and that instance should be in the running state only from the start to the completion of our script.

The first thing I want to do is to pack my application as a Docker image. In this way, I am absolutely sure that it will run exactly as I expect. This is very simple using the following Dockerfile definition:

FROM python:2.7
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
COPY requirements.txt /usr/src/app/
RUN ["pip", "install", "-r", "requirements.txt"]
COPY main.py /usr/src/app/
CMD ["python", "main.py"]

Then, I create the Docker image running the following command:

docker build -t my-long-service .

I can now safely store my image on my Docker registry of choice. For example, if I am using AWS ECR, I can store my image by executing the following commands:

docker tag my-long-service <account_id>.dkr.ecr.<region>.amazonaws.com/my-long-service
docker push <account_id>.dkr.ecr.<region>.amazonaws.com/my-long-service

Now that my code is ready to be executed, I need something that can start an EC2 instance on a daily basis. Of course, this component should not itself run on an existing or always-on instance, since avoiding such instances is the whole point.

AWS can help us achieve our goal using the following services:

AWS Lambda: We need to implement a simple serverless function aimed at starting an EC2 instance.
CloudWatch Events: A rule scheduled on a daily basis that triggers the previous Lambda function.
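The wiring between these two services could be scripted with boto3 along the following lines. This is a sketch, not the only way to set it up: the function name, the statement ID, and the default schedule (02:00 UTC every day) are assumptions, and the clients are passed in, for example `boto3.client('events')` and `boto3.client('lambda')`.

```python
def schedule_daily(events_client, lambda_client, rule_name, function_arn,
                   schedule="cron(0 2 * * ? *)"):
    """Create a scheduled rule that invokes a Lambda function once a day."""
    # Create (or update) the scheduled rule.
    rule = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=schedule,
        State="ENABLED",
    )
    # Allow CloudWatch Events to invoke the function for this rule.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=rule_name + "-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule["RuleArn"],
    )
    # Attach the function as the target of the rule.
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "1", "Arn": function_arn}],
    )
    return rule["RuleArn"]
```

The same setup can be done from the AWS console. Scripting it mainly helps when you want the schedule to be reproducible across accounts or environments.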

The Lambda function could be implemented in several different ways:

It can start an already configured EC2 instance that has been stopped.
It can spin up a brand new instance every day using a standard AMI. The Lambda function must provide the required configuration as a user data parameter.
It can spin up a brand new instance every day using a pre-configured AMI. An AMI could be created manually or using something like HashiCorp Packer.
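For comparison, the second option (a brand new instance configured through user data) might look like the sketch below. It rests on several assumptions: the function names are hypothetical, the AMI is expected to already provide Docker and the AWS CLI, and the instance needs an IAM role that allows pulling from ECR. The login line mirrors the `aws ecr get-login` call used elsewhere in this post; newer AWS CLI versions use `get-login-password` instead.

```python
def build_user_data(region, image_uri):
    """Return a shell script that runs the container and then shuts down."""
    return "\n".join([
        "#!/bin/bash",
        "# Assumes Docker and the AWS CLI are available on the AMI.",
        'eval "$(aws ecr get-login --region={})"'.format(region),
        "docker pull {}".format(image_uri),
        "docker run {}".format(image_uri),
        "shutdown -h now",
    ])


def launch_job_instance(ec2_client, ami_id, instance_type, user_data):
    """Start one instance that terminates itself when the script shuts it down."""
    response = ec2_client.run_instances(
        ImageId=ami_id,
        InstanceType=instance_type,
        MinCount=1,
        MaxCount=1,
        UserData=user_data,
        # Terminate, rather than stop, when the OS shuts down.
        InstanceInitiatedShutdownBehavior="terminate",
    )
    return response["Instances"][0]["InstanceId"]
```

Because the instance terminates itself, nothing is left behind between runs, at the price of a slower start each day compared with restarting a stopped instance.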

For the purpose of this post, I prefer to use the first approach to keep each component separate. If you prefer to use a different approach, you will not have to make many changes inside the Lambda code. To fulfill the requirements of the first approach, the Lambda function could be something simple, such as the following:

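import boto3

# Placeholders: replace with your own region and instance ID.
REGION = '<region>'
INSTANCE_ID = '<instance_id>'


def lambda_handler(event=None, context=None):
    ec2 = boto3.client('ec2', region_name=REGION)
    ec2.start_instances(InstanceIds=[INSTANCE_ID])
    print('Instance started')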

Once started, we want to be sure that our instance only runs our service and then stops or terminates. To do so, we can use an init script with something like init.d or Upstart. For example, our init script could be written like this:

#!upstart
description "my-long-service"
start on started docker
stop on shutdown
script
  eval "$(aws ecr get-login --region=<region>)"
  docker pull <repo>/<image>
  docker run <repo>/<image>
  shutdown -h now
end script

This script waits until the Docker daemon has started. It then authenticates to the ECR registry, pulls the latest version of the Docker image, and runs a container from it. Once the container exits, the instance is shut down.

Now, everything we need to run a memory-intensive, long-running service on a dedicated instance has been set up. Using this architecture, we do not need any instances that run all day long, and we can make use of AWS to automate the entire process of executing our service.

AWS provides some additional services and capabilities that could be easily integrated into the described flow. For example, in a production environment, I would add the following to the described architecture:

Integrate CloudWatch Logs to monitor our code in real time from CloudWatch.
Integrate some CloudWatch alarms. For example, if we monitor instance memory utilization, we could receive an alert when it hits a certain threshold. We could then take appropriate actions to fix the problem, such as upgrading the EC2 instance type.
Integrate SNS and send an email when the service is completed or when any error occurs.
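The SNS integration, for instance, could be a thin wrapper around the job. The sketch below is one possible shape under stated assumptions: `run_and_notify`, the subjects, and the message text are hypothetical, the SNS client is passed in (for example `boto3.client('sns')`), and the topic is assumed to have an email subscription.

```python
def run_and_notify(sns_client, topic_arn, job):
    """Run the job and publish its outcome to an SNS topic."""
    try:
        job()
    except Exception as error:
        sns_client.publish(
            TopicArn=topic_arn,
            Subject="my-long-service failed",
            Message=str(error),
        )
        # Re-raise so the failure is still visible to the caller and the logs.
        raise
    sns_client.publish(
        TopicArn=topic_arn,
        Subject="my-long-service completed",
        Message="The scheduled job finished successfully.",
    )
```

In the script shown earlier, this would amount to calling something like `run_and_notify(sns, topic_arn, job.execute_long_process)` instead of invoking the job directly.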

I hope this post will help you understand some of the different architectural approaches that you can use to implement a scheduled microservice in the cloud.
All of the above approaches are valid, and you should always choose the best architecture based on your unique needs and expectations.