AWS Lambda and the Serverless Cloud
AWS Lambda evolution and re:Invent 2015 news Tim Wagner - general manager of AWS Lambda and AWS IoT at AWS - talked about how AWS Lambda - and the...Learn More
Among the AWS Lambda announcements at the 2016 re:Invent, Werner Vogels introduced Lambda@Edge (in preview) for running Lambda functions at CloudFront locations. It’s part of the larger movement focused on letting you execute your code wherever you want in your AWS architecture.
I signed up for the preview, and today, I’d like to share my personal experiment with Lambda@Edge for serving dynamically generated, minimized, and compressed HTML pages. We’ll also look at some of the most interesting use cases that Lambda@Edge supports.
Although still in preview mode, you can already see the CloudFront trigger showing up in the AWS Lambda Console, as well as a new Edge Node.js 4.3 runtime.
If you are a curious developer, you have probably already tried Lambda@Edge and experienced the strong limitations imposed by this new programming model. For example, your functions can’t run longer than 50 ms, and you can only access or manipulate the upcoming request (i.e. by the client) or the generated response (i.e. by the origin), depending on which event is being handled. While some of these limitations may disappear with general availability, others are not likely to change.
Even if you already have access to the preview, you couldn’t do much with Lambda@Edge other than header manipulation until last week, when Jeff Barr announced a new feature for generating custom HTTP responses.
Jeff Barr also announced another feature: custom logging statements. In fact, Lambda@Edge can now write custom logs directly to Amazon CloudWatch. This functionality will allow you to generate custom reports about your CloudFront distributions. However, in this article, I will mostly focus on content generation and latency optimization. For now, let’s just say that you can write up to 4KB of custom logs per invocation, simply by calling the usual console methods (log, info, error, and warn).
While the set of use cases for Lambda@Edge is relatively small, I think it offers huge potential.
Since you don’t have access to any remote storage or database and you can’t execute an HTTP call, all of the information required by your Lambda Function must be already available at the Edge location, either by inspecting the request or the response (headers and cookies included).
I think the primary use cases can be organized into three categories:
Let’s see what each category can offer and take a look at the most interesting use cases.
Note: I made up these three categories and you won’t find them on the official AWS documentation.
This category includes all of the use cases where you need to enhance or modify the upcoming request or the generated response with additional headers or content. In this scenario, you will always forward the request to your CloudFront origin (or to the CloudFront cache).
For example, you may want to add custom HTTP headers to an upcoming request to improve your distribution caching layer. You can find a real world scenario by tiny.pictures here. Since the User-Agent header is not white-listed by CloudFront for caching reasons, they have implemented the computing logic to understand whether or not the client supports WebP image format with Lambda@Edge. To achieve this, they inspected the User-Agent header and then added a custom header (which will be white-listed and cached by CloudFront) to the same request before the origin receives it.
Similarly, you may manipulate the response already generated by the origin and implement edge-dependent modifications. For example, you could customize some timezone-related information based on the particular Edge location, or handle special response formatting based on client capabilities, such as converting JSON to XML or vice versa. Note: Serving different data formats based on client capabilities is generally not considered a best practice (i.e. it’s not RESTful at all and SEO experts would discourage it too). Here, I’m considering special scenarios where you need to solve compatibility issues or unexpected corner cases.
Another interesting use case that requires request or response manipulation is A/B testing. You could implement cookie-based A/B testing either by editing the request path or by manipulating the response body. I would recommend the first option, as it makes the chosen variation more explicit at the origin, where you may also take care of creating the initial cookie and tracking events and conversions.
With the very same header-based technique, you can quickly implement URL rewriting, temporary or permanent redirects, or simple cookies initialization. For example, you might alter the request path and convert query string parameters to explicit paths for your origin (i.e. pagination, filtering, navigation, etc.).
Here is a recap of the possible use cases based on request/response manipulation:
The main advantage of generating content directly at the Edge is that requests will not hit the CloudFront origin at all. It means that you don’t have to worry about distributing your server-side code across multiple regions because each user will only interact with the closest Edge location.
Please note that you can already distribute static content this way with S3 and CloudFront.
Lambda@Edge allows you to achieve the same performance with dynamic content as well. Of course, all of the information must be available at the Edge without any additional network calls (i.e. no databases or 3rd-party integrations), therefore the use cases are quite limited. Server-side rendering at the Edge could be a new way to handle simple cases without messing up client-side rendering, which often adds a considerable amount of assets and dependencies (i.e. more HTTP calls!).
For example, you may generate error pages or login/signup forms on the fly in the case of non-authenticated users. Although this is not a best practice for SEO concerns (without redirects), it may be an interesting solution for optimizing the user experience on certain pages of your website. Given the importance of pages related to user onboarding, customer support, and maintenance windows, Lambda@Edge might optimize the UX by reducing latency to the minimum.
Although these scenarios may sound pretty restricted, I think Lambda@Edge has a very underestimated potential when combined with CloudFront’s caching and server-side rendering.
It’s worth mentioning that Lambda@Edge currently allows you to upload only 1MB of code, dependencies included. This is quite limiting if you need complex libraries to manipulate images or render HTML templates. Also, it currently supports only Node.js 4.3.
This category only aims at optimizing the network latency, and it could include a mix of the previous scenarios. In some use cases, you simply don’t want to hit the origin if not needed.
For example, if your origin is a RESTful API that always requires the Authorization header, why would you need to hit the origin when a request does not contain it? You could generate a 401 Unauthorized response at the edge, and therefore reduce response time and origin load.
With the same technique and goal, you could implement any custom validation logic or transformation at the Edge and optimize special API endpoints. Even if the overall cost might be higher, user experience will improve without increasing your load at the origin. Also, the logic you implement with Lambda@Edge could be completely transparent to the client and the origin—provided that you are already using a CloudFront distribution—so that you won’t need to rewrite URLs around and you can easily remove or improve the Lambda@Edge without impacting the system.
Eventually, if AWS allows you to associate AWS@Edge with your API Gateway distributions you may even end up optimizing costs this way, assuming that invoking Lambda@Edge will be cheaper than invoking API Gateway + Lambda. If the pricing is the same with AWS Lambda, you might spend the same while reducing network latency.
I would need to resolve three issues:
My Lambda@Edge Function code looks like the following (complete gist and instructions here):
You could make the dynamic rendering arbitrarily complex to cover specific scenarios. Your rendering could involve data stored in cookies or data that depends on the current time or even the specific Edge location. As a simple proof of concept, I’m happy with the result, and I’m looking forward to improving it and using it in real production use cases.
I will be happy to receive your feedback as well, although I don’t expect this experiment to become a reference code base since things might change consistently with general availability.
We have great expectations for Lambda@Edge, as it’s bringing computing capabilities not only out of your virtual machines but also out of data centers and as close as possible to your end users. The preview phase is opening up interesting scenarios for serverless computing at the edge. Or can we say origin-less already?
At the same time, I expect to see even more possibilities coming soon with general availability. For example, I would love to see Lambda@Edge support for Python and native integration with API Gateway distributions. Also, I think many interesting use cases will require more than 1MB of code, and I hope to see it increased to at least 10MB. I can understand why it is a tough tradeoff between propagation efficiency, storage at the edge, and programming flexibility. On the other hand, I think the constrained memory (128MB) and execution time (50ms) will disappear soon, although I don’t feel like they are such an annoying limitation for the most likely use cases.
I can’t wait to see more statistical proofs regarding performance efficiency and to hear about more advanced or unexpected scenarios where Lambda@Edge can make a difference.
Let us know what you think about Lambda@Edge, and feel free to share your experiments and the use cases that we may have missed.
If you are just starting out on your journey toward mastering AWS cloud computing, then your first stop should be to understand the AWS fundamentals. This will enable you to get a solid foundation to then expand your knowledge across the entire AWS service catalog. It can be both d...
The DevOps Handbook introduces DevOps as a framework for improving the process for converting a business hypothesis into a technology-enabled service that delivers value to the customer. This process is called the value stream. Accelerate finds that applying DevOps principles of flow, f...
The speed at which machine learning (ML) is evolving within the cloud industry is exponentially growing, and public cloud providers such as AWS are releasing more and more services and feature updates to run in parallel with the trend and demand of this technology within organizations t...
AWS re:Inforce 2019 is a two-day conference for security, identity, and compliance learning and community building. This year's keynote, presented by AWS Vice President and CIO, Stephen Schmidt, announced the general availability of AWS Control Tower and the new VPC Traffic Mirroring fe...
Being able to architect your own isolated segment of AWS is a simple process using VPCs; understanding how to architect its related networking components and connectivity architecture is key to making it a powerful service. Many services within Amazon Web Services (AWS) require you t...
AWS is renowned for the rate at which it reinvents, revolutionizes, and meets customer demands and expectations through its continuous cycle of feature and service updates. With hundreds of updates a month, it can be difficult to stay on top of all the changes made available. Here ...
Amazon Web Services (AWS) offers three different ways to pay for EC2 Instances: On-Demand, Reserved Instances, and Spot Instances. This article will focus on effective strategies for purchasing Reserved Instances. While most of the major cloud platforms offer pre-pay and reservation dis...
If you’re building applications on the AWS cloud or looking to get started in cloud computing, certification is a way to build deep knowledge in key services unique to the AWS platform. AWS currently offers 11 certifications that cover major cloud roles including Solutions Architect, De...
The AWS Solutions Architect - Associate Certification (or Sol Arch Associate for short) offers some clear benefits: Increases marketability to employers Provides solid credentials in a growing industry (with projected growth of as much as 70 percent in five years) Market anal...
Moving data to the cloud is one of the cornerstones of any cloud migration. Apache NiFi is an open source tool that enables you to easily move and process data using a graphical user interface (GUI). In this blog post, we will examine a simple way to move data to the cloud using NiFi c...
Amazon DynamoDB is a managed NoSQL service with strong consistency and predictable performance that shields users from the complexities of manual setup. Whether or not you've actually used a NoSQL data store yourself, it's probably a good idea to make sure you fully understand the key ...
As companies increasingly shift workloads to the public cloud, cloud computing has moved from a nice-to-have to a core competency in the enterprise. This shift requires a new set of skills to design, deploy, and manage applications in cloud computing. As the market leader and most ma...