In this short course, we’ll cover how to configure Google Cloud CDN (Content Delivery Network) cache updates using expiration time or versioned URLs to ensure that users get the latest content from the cache. We’ll also cover how to reduce the number of cache misses in Cloud CDN by using custom cache keys.
Learning Objectives
- Configure Cloud CDN cache updates using expiration time or versioned URLs
- Reduce the number of cache misses in Cloud CDN by using custom cache keys
Intended Audience
- Google Cloud Platform system administrators and architects
Prerequisites
- Basic Google Cloud Platform experience (or take our Overview of Google Cloud Platform course)
As you probably know, Cloud CDN is a service that caches your web content in Google’s delivery network. Then when a user goes to your website, they retrieve your content from the nearest CDN location rather than from your web server.
Cloud CDN is a great service for serving content to your users faster, but it does complicate things, so you may need to change some settings to optimize the cache. Here’s the first problem. When you update your content, how do you make sure the cache gets updated, too, so your users get the latest content? This might sound like a simple problem, but there are several different approaches to dealing with it.
First, you need to set an appropriate expiration time. If you have content that changes frequently, then you should set a short expiration time so users won’t get stale content from the cache. For example, stock prices change rapidly, so you’d set a very short expiration time for them.
You wouldn’t want to set a short expiration time for everything, though. For example, if you set a short expiration time for your company’s logo, then your users would frequently get cache misses, and the logo would have to be retrieved from the web server, which would defeat the purpose of using a CDN. And if your company changes its logo at some point, it wouldn’t be a big deal that it would take a while before your customers see the new one.
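One common way to express an expiration time is the Cache-Control response header that your web server sends along with the content. The sketch below is only an illustration of that idea: the paths, the durations, and the `cache_control_header` helper are all hypothetical, not part of any Cloud CDN API.

```python
# Hypothetical per-path expiration policy. The paths and durations are
# made up for illustration; choose values that match how often your
# own content changes.
TTL_SECONDS = {
    "/api/stock-prices": 5,                 # changes rapidly: expire quickly
    "/images/logo.png": 7 * 24 * 60 * 60,   # rarely changes: cache for a week
}
DEFAULT_TTL_SECONDS = 60 * 60               # everything else: one hour


def cache_control_header(path: str) -> str:
    """Return a Cache-Control header value for the given request path."""
    ttl = TTL_SECONDS.get(path, DEFAULT_TTL_SECONDS)
    return f"public, max-age={ttl}"


print(cache_control_header("/api/stock-prices"))  # public, max-age=5
print(cache_control_header("/images/logo.png"))   # public, max-age=604800
```

Your web server would attach the returned value to each response, so the stock prices expire from the cache within seconds while the logo stays cached for a week.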
Another approach for dealing with content that changes infrequently is to use versioned URLs. The idea is that if you change the URL of a piece of content whenever you update it, then the new URL won’t be in the cache yet, so the first request for it will be retrieved from your web server, and users will always get the latest version.
Here are three different ways to use versioning. You could add a query string with a version number in it. You could add a version number to the filename. Or you could add a version number in the path.
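The three versioning styles just described might look like this in practice. This is a minimal sketch: the helper names, the `v` query parameter, and the `_v2` and `/v2` naming patterns are assumptions chosen for illustration, and any consistent scheme would work.

```python
from pathlib import PurePosixPath


def version_in_query(path: str, version: str) -> str:
    """Add the version as a query string (assumes path has no query yet)."""
    return f"{path}?v={version}"


def version_in_filename(path: str, version: str) -> str:
    """Add the version to the filename, just before the extension."""
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}_v{version}{p.suffix}"))


def version_in_path(path: str, version: str) -> str:
    """Add the version as a leading directory in the path."""
    return f"/v{version}{path}"


print(version_in_query("/images/logo.png", "2"))     # /images/logo.png?v=2
print(version_in_filename("/images/logo.png", "2"))  # /images/logo_v2.png
print(version_in_path("/images/logo.png", "2"))      # /v2/images/logo.png
```

Whichever style you pick, the pages that reference the content have to be updated to point at the new URL each time the version changes.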
You might be wondering why you can’t just remove stale cache entries directly rather than relying on expiration times and versioning. Well, you can. It’s called invalidation, but you should only use it as a last resort because Google charges for invalidations and also enforces rate limits on how many invalidations you can do at a time.
Now that we’ve covered ways to prevent cache entries from becoming stale, we should also cover the opposite problem—how to reduce the number of cache misses. As I mentioned earlier, if the content that users need is often not in the cache, then response times will be slower. Aside from setting the expiration time appropriately, another way of improving the cache hit ratio is to use custom cache keys.
By default, Cloud CDN creates each cache key using the full URL, but this can be inefficient. For example, many sites serve up the same content regardless of whether the URL contains http or https. If you cache a separate copy of each piece of content for each protocol, then a request over one protocol can’t be served from the copy that was cached for the other, so there would be a lot of unnecessary cache misses. By using a custom cache key that doesn’t include the protocol, you’d only have one copy of each piece of content in the cache for both protocols, so you’d have far fewer cache misses.
Why? Suppose you have a web page that hasn’t been accessed for a while, so the cached version of that page has expired and is no longer in the cache. Then suppose a user browses the http version of that page. There’ll be a cache miss, and the page will be retrieved from the web server into the cache. When another user browses the https version of that page, there will be a cache miss again if the cache keys are based on the full URL. But if you use a custom key without the protocol, then the request for the https version of the page will result in a cache hit.
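The scenario above can be reproduced with a toy cache. This is only a model of the idea, not how Cloud CDN is implemented: the `count_misses` helper, the two key functions, and the example.com URLs are all hypothetical.

```python
from urllib.parse import urlsplit


def full_url_key(url: str) -> str:
    """Cache key based on the full URL (the default behavior described above)."""
    return url


def key_without_protocol(url: str) -> str:
    """Cache key that drops the protocol but keeps host, path, and query."""
    parts = urlsplit(url)
    key = parts.netloc + parts.path
    if parts.query:
        key += "?" + parts.query
    return key


def count_misses(urls, key_fn) -> int:
    """Replay requests against an initially empty cache and count the misses."""
    cache = set()
    misses = 0
    for url in urls:
        key = key_fn(url)
        if key not in cache:
            misses += 1     # not cached: would be fetched from the web server
            cache.add(key)
    return misses


page_requests = [
    "http://example.com/page.html",   # first user, http
    "https://example.com/page.html",  # second user, https
]

print(count_misses(page_requests, full_url_key))          # 2
print(count_misses(page_requests, key_without_protocol))  # 1
```

With full-URL keys both requests miss, while the key without the protocol turns the second request into a hit.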
Similarly, you can create custom cache keys that leave out the host. If you serve identical copies of your website from different hosts, then it would make sense to only have one copy of your content in the cache.
Dealing with the query string can be a bit more complicated. If the content should always be the same for a URL regardless of what’s in the query string, then that’s easy. You can just create a custom cache key that leaves out the query string. But if certain parts of the query string will result in different content being retrieved, then you need to specify which parts of the query string to include in the cache key.
Note that you can leave out any combination of protocol, host, and query string when you create your custom cache keys.
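To make the combinations concrete, here is a sketch of a key-building function with one switch per component plus an optional list of query parameters to keep. The function, its parameter names, and the URLs are assumptions made for this example, and sorting the parameters is a choice made here for simplicity rather than a statement about how Cloud CDN builds its keys.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_key(url, include_protocol=True, include_host=True,
              include_query_string=True, query_string_whitelist=None):
    """Build a cache key from the selected parts of a URL.

    query_string_whitelist, if given, names the only query parameters
    that are allowed to affect the key.
    """
    parts = urlsplit(url)
    pieces = []
    if include_protocol:
        pieces.append(parts.scheme + "://")
    if include_host:
        pieces.append(parts.netloc)
    pieces.append(parts.path)
    if include_query_string and parts.query:
        params = parse_qsl(parts.query, keep_blank_values=True)
        if query_string_whitelist is not None:
            params = [(k, v) for k, v in params if k in query_string_whitelist]
        if params:
            # Sort so that parameter order does not create distinct keys.
            pieces.append("?" + urlencode(sorted(params)))
    return "".join(pieces)


url = "https://a.example.com/item?id=7&utm_source=mail"

print(cache_key(url))                              # full URL
print(cache_key(url, include_query_string=False))  # https://a.example.com/item
print(cache_key(url, include_protocol=False, include_host=False,
                query_string_whitelist={"id"}))    # /item?id=7
```

In the last call only the `id` parameter influences the key, so requests that differ only in a tracking parameter such as `utm_source` would share a single cache entry.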
And that’s it for Cloud CDN configuration.
Guy launched his first training website in 1995 and he's been helping people learn IT technologies ever since. He has been a sysadmin, instructor, sales engineer, IT manager, and entrepreneur. In his most recent venture, he founded and led a cloud-based training infrastructure company that provided virtual labs for some of the largest software vendors in the world. Guy’s passion is making complex technology easy to understand. His activities outside of work have included riding an elephant and skydiving (although not at the same time).