How to Minimize Content Latency with Lambda@Edge

Among the AWS Lambda announcements at the 2016 re:Invent, Werner Vogels introduced Lambda@Edge (in preview) for running Lambda functions at CloudFront locations. It’s part of the larger movement focused on letting you execute your code wherever you want in your AWS architecture.

I signed up for the preview, and today, I’d like to share my personal experiment with Lambda@Edge for serving dynamically generated, minimized, and compressed HTML pages. We’ll also look at some of the most interesting use cases that Lambda@Edge supports.

Lambda@Edge is still evolving

Although still in preview mode, you can already see the CloudFront trigger showing up in the AWS Lambda Console, as well as a new Edge Node.js 4.3 runtime.
If you are a curious developer, you have probably already tried Lambda@Edge and experienced the strong limitations imposed by this new programming model. For example, your functions can’t run longer than 50 ms, and you can only access or manipulate the upcoming request (i.e. by the client) or the generated response (i.e. by the origin), depending on which event is being handled. While some of these limitations may disappear with general availability, others are not likely to change.
Even if you already have access to the preview, you couldn’t do much with Lambda@Edge other than header manipulation until last week, when Jeff Barr announced a new feature for generating custom HTTP responses.

Jeff Barr also announced another feature: custom logging statements. In fact, Lambda@Edge can now write custom logs directly to Amazon CloudWatch. This functionality will allow you to generate custom reports about your CloudFront distributions. However, in this article, I will mostly focus on content generation and latency optimization. For now, let’s just say that you can write up to 4KB of custom logs per invocation, simply by calling the usual console methods (log, info, error, and warn).

How to minimize content latency with Lambda@Edge
Learn how to execute AWS Lambda@Edge on CloudFront Edge locations in response to CloudFront events to generate dynamic content and reduce network latency.

What can you do with Lambda@Edge?

While the set of use cases for Lambda@Edge is relatively small, I think it offers huge potential.
Since you don’t have access to any remote storage or database and you can’t execute an HTTP call, all of the information required by your Lambda Function must be already available at the Edge location, either by inspecting the request or the response (headers and cookies included).

I think the primary use cases can be organized into three categories:

  • Request/Response manipulation
  • Dynamic content generation
  • Pure latency optimization

Let’s see what each category can offer and take a look at the most interesting use cases.
Note: I made up these three categories and you won’t find them on the official AWS documentation.

Request/Response manipulation

This category includes all of the use cases where you need to enhance or modify the upcoming request or the generated response with additional headers or content. In this scenario, you will always forward the request to your CloudFront origin (or to the CloudFront cache).
For example, you may want to add custom HTTP headers to an upcoming request to improve your distribution caching layer. You can find a real world scenario by tiny.pictures here. Since the User-Agent header is not white-listed by CloudFront for caching reasons, they have implemented the computing logic to understand whether or not the client supports WebP image format with Lambda@Edge. To achieve this, they inspected the User-Agent header and then added a custom header (which will be white-listed and cached by CloudFront) to the same request before the origin receives it.
Similarly, you may manipulate the response already generated by the origin and implement edge-dependent modifications. For example, you could customize some timezone-related information based on the particular Edge location, or handle special response formatting based on client capabilities, such as converting JSON to XML or vice versa. Note: Serving different data formats based on client capabilities is generally not considered a best practice (i.e. it’s not RESTful at all and SEO experts would discourage it too). Here, I’m considering special scenarios where you need to solve compatibility issues or unexpected corner cases.

Another interesting use case that requires request or response manipulation is A/B testing. You could implement cookie-based A/B testing either by editing the request path or by manipulating the response body. I would recommend the first option, as it makes the chosen variation more explicit at the origin, where you may also take care of creating the initial cookie and tracking events and conversions.

With the very same header-based technique, you can quickly implement URL rewritingtemporary or permanent redirects, or simple cookies initialization. For example, you might alter the request path and convert query string parameters to explicit paths for your origin (i.e. pagination, filtering, navigation, etc.).

Here is a recap of the possible use cases based on request/response manipulation:

  • Custom HTTP headers (request)
  • Timezone-related information (response)
  • Response format adapters (response)
  • Cookie-based A/B testing (request)
  • URL rewriting (request)
  • Temporary redirects (request)
  • Cookies initialization (response)

Dynamic content generation

The main advantage of generating content directly at the Edge is that requests will not hit the CloudFront origin at all. It means that you don’t have to worry about distributing your server-side code across multiple regions because each user will only interact with the closest Edge location.
Please note that you can already distribute static content this way with S3 and CloudFront.

Lambda@Edge allows you to achieve the same performance with dynamic content as well. Of course, all of the information must be available at the Edge without any additional network calls (i.e. no databases or 3rd-party integrations), therefore the use cases are quite limited. Server-side rendering at the Edge could be a new way to handle simple cases without messing up client-side rendering, which often adds a considerable amount of assets and dependencies (i.e. more HTTP calls!).

For example, you may generate error pages or login/signup forms on the fly in the case of non-authenticated users. Although this is not a best practice for SEO concerns (without redirects), it may be an interesting solution for optimizing the user experience on certain pages of your website. Given the importance of pages related to user onboarding, customer support, and maintenance windows, Lambda@Edge might optimize the UX by reducing latency to the minimum.

In other cases, you could include a special resource in your deployment package and manipulate it on the fly based on the capabilities of the client. For example, you could generate ad-hoc image formats (e.g. SVG vs. PNG) or serve optimized variations of a given JavaScript library (similarly to what Polyfill.io offers for JavaScript polyfills). This pattern might introduce new practices in the web pre-processor era by creating new possibilities for “post-processors” as well.

Although these scenarios may sound pretty restricted, I think Lambda@Edge has a very underestimated potential when combined with CloudFront’s caching and server-side rendering.

It’s worth mentioning that Lambda@Edge currently allows you to upload only 1MB of code, dependencies included. This is quite limiting if you need complex libraries to manipulate images or render HTML templates. Also, it currently supports only Node.js 4.3.

I will demonstrate how to render dynamic, minimized, and compressed HTML in less than 100 lines of JavaScript later in this post (public gist here). While HTML minimization is not strictly needed unless you have huge templates, I think that dynamic rendering and gzip/deflate compression on the fly will be a fairly common scenario with Lambda@Edge.

Pure latency optimization

This category only aims at optimizing the network latency, and it could include a mix of the previous scenarios. In some use cases, you simply don’t want to hit the origin if not needed.

For example, if your origin is a RESTful API that always requires the Authorization header, why would you need to hit the origin when a request does not contain it? You could generate a 401 Unauthorized response at the edge, and therefore reduce response time and origin load.

With the same technique and goal, you could implement any custom validation logic or transformation at the Edge and optimize special API endpoints. Even if the overall cost might be higher, user experience will improve without increasing your load at the origin. Also, the logic you implement with Lambda@Edge could be completely transparent to the client and the origin—provided that you are already using a CloudFront distribution—so that you won’t need to rewrite URLs around and you can easily remove or improve the Lambda@Edge without impacting the system.

Eventually, if AWS allows you to associate AWS@Edge with your API Gateway distributions you may even end up optimizing costs this way, assuming that invoking Lambda@Edge will be cheaper than invoking API Gateway + Lambda. If the pricing is the same with AWS Lambda, you might spend the same while reducing network latency.

A real-world experiment with server-side rendering and compression

I took the time to implement a proof of concept for dynamic content generation at the edge. The idea is to include dynamic HTML template(s) in your deployment package that will be rendered via JavaScript when the requested path matches their filename. I also wanted to prove that I could minimize the resulting HTML body and compress it with gzip or deflate.
I would need to resolve three issues:

  1. How to map requests to local resources (i.e. HTML files): I decided to map URLs into local files so that a request to /test.html will be handled by rendering a local test.html file with Plates. Any request to non-existing files will be forwarded to the distribution origin, practically bypassing Lambda@Edge.
  2. How to minimize and compress the resulting file in JavaScript: I used html-minifier and zlib to achieve minification and compression. While minification can always be executed, not every client supports compressed responses, and I had to check for the Accept-Encoding header. Luckily, the zlib module supports both gzip and deflate compression, and it is included in the Node.js core so it didn’t make my deployment package heavier.
  3. How to shrink everything into 1MB of code: This turned out to be the most challenging step, as my dependencies initially added up to more than 6MB. Sadly, I had to manually hack the html-minifier package and remove a few unused dependencies (i.e. clean-css and uglify-js), even after reducing the node_modules size with modclean. I’m not too proud of this hack, but I will be happy to learn about more robust and elegant solutions.

My Lambda@Edge Function code looks like the following (complete gist and instructions here):

You could make the dynamic rendering arbitrarily complex to cover specific scenarios. Your rendering could involve data stored in cookies or data that depends on the current time or even the specific Edge location. As a simple proof of concept, I’m happy with the result, and I’m looking forward to improving it and using it in real production use cases.
I will be happy to receive your feedback as well, although I don’t expect this experiment to become a reference code base since things might change consistently with general availability.

Conclusions

We have great expectations for Lambda@Edge, as it’s bringing computing capabilities not only out of your virtual machines but also out of data centers and as close as possible to your end users. The preview phase is opening up interesting scenarios for serverless computing at the edge. Or can we say origin-less already?
At the same time, I expect to see even more possibilities coming soon with general availability. For example, I would love to see Lambda@Edge support for Python and native integration with API Gateway distributions. Also, I think many interesting use cases will require more than 1MB of code, and I hope to see it increased to at least 10MB. I can understand why it is a tough tradeoff between propagation efficiency, storage at the edge, and programming flexibility. On the other hand, I think the constrained memory (128MB) and execution time (50ms) will disappear soon, although I don’t feel like they are such an annoying limitation for the most likely use cases.

I can’t wait to see more statistical proofs regarding performance efficiency and to hear about more advanced or unexpected scenarios where Lambda@Edge can make a difference.
Let us know what you think about Lambda@Edge, and feel free to share your experiments and the use cases that we may have missed.

Cloud Academy