Here at Cloud Academy, we use WordPress to serve our blog and product/public pages, such as the home page, the pricing page, etc.
With WordPress, the marketing and content teams can quickly and easily change the look & feel and the content of the pages, without reinventing the wheel.
State of the art
The first implementation of WordPress infrastructure was to deploy the whole code (core, theme and plugins) on an EFS storage, which basically is an NFS storage, attached to a couple of EC2 instances. Moreover, we installed W3 Total Cache plugin to handle full page cache and to serve static files (such as images, CSS, js, etc.) from a Cloudfront CDN.
Here is a simplified schema of our infrastructure:
Considering this implementation, we found two different problems:
- The first is related to the EFS that doesn’t serve PHP files as fast as needed.
- The second is related to W3 Total cache plugins that, as previously mentioned, also handle the full page cache. This plugin retrieves a page from Redis cache but, despite Redis being a good choice to handle cached files, before getting the file from cache, W3 Total Cache needs to start the whole WordPress framework to understand the right object to get — bringing us back to the first problem.
With this approach, we waste a lot of time in getting the PHP files from the network file system and the cached page from Redis.
To solve the above-mentioned problems, we’ve rethought the whole infrastructure, moving it to a more standard Cloud Academy approach using Docker containers and ECS orchestrator.
Furthermore, we moved the CDN component to act as a full page cache instead of serving just the static assets.
Here is the schema of the new implementation:
As you can see, the whole WordPress code is now built in a docker container (using our standard Jenkins pipeline) and then deployed to an ECS cluster managed by Spotinst.
However, we kept the EFS storage because the uploaded files from WordPress editors must be shared across all ECS containers.
The Docker build
One of our main goals was also to keep the minimum number of files versioned in the git repository. This means that the only versioned files for a WordPress project are related to custom themes and custom plugins.
To build the right docker image, we used the following approach:
Starting from the PHP image, the Dockerfile installs the wp-cli application and then downloads the WordPress core and all public plugins. To choose which plugins must be installed, the script reads a CSV file (that’s versioned) containing the list of all plugins and the relative version. In this way, when we want to install a new plugin, we just add it to the CSV file and then rebuild the Docker image.
As shown in the schema, we moved the Cloudfront CDN to work as the main WordPress entry point. In this way, the “Hit” requests are handled directly by Cloudfront without loading the entire WordPress stack. In addition, Cloudfront gives us the ability to configure different behaviors depending on the requested page. More specifically, we can configure the cache TTL depending on the throughput that a single page has in order to maximize the cache performance (from 3 minutes to 10 minutes).
Unfortunately, CloudFront as a single entry point introduces a problem regarding the WordPress admin: as you can guess, we cannot have the admin pages cached. This is a problem because if we cache the admin pages, the editors can have unexpected behaviors with their sessions mixed together. To solve this issue, we created a new CDN behavior dedicated to the admin section that basically skips the cache, just using all headers and cookies as part of the cache key.
Now, what about the W3 Total Cache plugin? We decided to keep it installed because it will continue to optimize the cache miss from Cloudfront and the minification of static files.
Of course, after the infrastructure refactoring, we made some benchmarks to understand the real benefit gained. This chart compares the performance of the old infrastructure versus the new:
In the chart, the blue, brown and orange lines are related to the old infrastructure and the green, purple and red lines to the new infrastructure.
First of all, notice how the average response time of the new infrastructure (red line) is about half of the old one (blue line). But the very big improvement is on the 95 percentile brown line vs. purple line. This means that, thanks to the new infrastructure, 95% of requests are now served in less than a second.
Another effect that we measured with the ‘ab’ tool is the increase in throughput. We can now handle about twice as many requests than before using the same hardware configuration.
Considering the EFS usage, as you can see in the following image, the throughput that the file system must handle is now significantly lower than the old infrastructure (except for the switch moment, where there was a spike caused by all Cloudfront cache miss requests). This allowed us to decrease the EFS provisioned throughput which, of course, means a cost savings.
In addition, the time needed to scale up in case of a spike is considerably lower because we simply need to add new extra containers to the ECS service to handle new requests.
After this refactoring, which mainly focused on infrastructure, we are fully aware that most of the time the major issues are at the application level. So we are investigating how to refactor the WordPress front end, replacing it with a react app (like the rest of our platform), using either Next.js or Gatsby, in order to completely avoid the load of the WordPress framework except when serving API requests.
New Content: Get Ready for the CISM Cert Exam & Learn About Alibaba, Plus All the AWS, GCP, and Azure Courses You Know You Can Count On
This month our team of intrepid certification specialists released five learning paths, seven courses, 19 hands-on labs, and three lab challenges! One particularly interesting new learning path is Certified Information Security Manager (CISM) Foundations. After completing this learn...
New Content: AWS Terraform, Java Programming Lab Challenges, Azure DP-900 & DP-300 Certification Exam Prep, Plus Plenty More Amazon, Google, Microsoft, and Big Data Courses
This month our Content Team continues building the catalog of courses for everyone learning about AWS, GCP, and Microsoft Azure. In addition, this month’s updates include several Java programming lab challenges and a couple of courses on big data. In total, we released five new learning...
New Content: AWS Data Analytics – Specialty Certification, Azure AI-900 Certification, Plus New Learning Paths, Courses, Labs, and More
This month our Content Team released two big certification Learning Paths: the AWS Certified Data Analytics - Speciality, and the Azure AI Fundamentals AI-900. In total, we released four new Learning Paths, 16 courses, 24 assessments, and 11 labs. New content on Cloud Academy At any ...
New Content: Azure DP-100 Certification, Alibaba Cloud Certified Associate Prep, 13 Security Labs, and Much More
This past month our Content Team served up a heaping spoonful of new and updated content. Not only did our experts release the brand new Azure DP-100 Certification Learning Path, but they also created 18 new hands-on labs — and so much more! New content on Cloud Academy At any time, y...
Docker Image Security: Get it in Your Sights
For organizations and individuals alike, the adoption of Docker is increasing exponentially with no signs of slowing down. Why is this? Because Docker provides a whole host of features that make it easy to create, deploy, and manage your applications. This useful technology is especiall...
Constant Content: Cloud Academy’s Q3 2020 Roadmap
Hello — Andy Larkin here, VP of Content at Cloud Academy. I am pleased to release our roadmap for the next three months of 2020 — August through October. Let me walk you through the content we have planned for you and how this content can help you gain skills, get certified, and...
New Content: Alibaba, Azure AZ-303 and AZ-304, Site Reliability Engineering (SRE) Foundation, Python 3 Programming, 16 Hands-on Labs, and Much More
This month our Content Team did an amazing job at publishing and updating a ton of new content. Not only did our experts release the brand new AZ-303 and AZ-304 Certification Learning Paths, but they also created 16 new hands-on labs — and so much more! New content on Cloud Academy At...
New Content: AWS, Azure, Typescript, Java, Docker, 13 New Labs, and Much More
This month, our Content Team released a whopping 13 new labs in real cloud environments! If you haven't tried out our labs, you might not understand why we think that number is so impressive. Our labs are not “simulated” experiences — they are real cloud environments using accounts on A...
New Content: AZ-500 and AZ-400 Updates, 3 Google Professional Exam Preps, Practical ML Learning Path, C# Programming, and More
This month, our Content Team released tons of new content and labs in real cloud environments. Not only that, but we introduced our very first highly interactive "Office Hours" webinar. This webinar, Acing the AWS Solutions Architect Associate Certification, started with a quick overvie...
DevOps: Why Is It Important to Decouple Deployment From Release?
Deployment and release In enterprise organizations, releases are the final step of a long process that, historically, could take months — or even worse — years. Small companies and startups aren’t immune to this. Minimum viable product (MVP) over MVP and fast iterations could lead to t...
DevOps Principles: My Journey as a Software Engineer
I spent the last month reading The DevOps Handbook, a great book regarding DevOps principles, and how tech organizations evolved and succeeded in applying them. As a software engineer, you may think that DevOps is a bunch of people that deploy your code on production, and who are alw...
Linux and DevOps: The Most Suitable Distribution
Modern Linux and DevOps have much in common from a philosophy perspective. Both are focused on functionality, scalability, as well as on the constant possibility of growth and improvement. While Windows may still be the most widely used operating system, and by extension the most common...