Elasticsearch vs. CloudSearch: AWS Cloud Search Choices

Elasticsearch vs. CloudSearch: What’s the main difference?

Let’s compare two AWS-based search tools: Elasticsearch and CloudSearch. Both services use proven technologies. Elasticsearch is more popular, is open source, and offers a flexible API for customization; CloudSearch is fully managed and brings the benefits of a managed service, such as (near) plug-and-play startup and automatic patching and updating.

Want to jump-start your learning on search engines and analytics in general? Get some direct Elasticsearch knowledge in our Analytics Fundamentals for AWS course.

In part one of this series, we described what search engines are, how they solve the problem of accessing content stretched across large websites, and how Amazon CloudSearch provides a solution for a cloud environment. AWS CloudSearch is certainly a powerful and appealing service from Amazon. However, there are also other popular players in the search engine market, and Elasticsearch ranks right behind Solr as the most popular search and analytics engine. We’ll explore the battle of the Amazon search providers: Elasticsearch vs. CloudSearch.

Elasticsearch vs. CloudSearch: Provisioning

Both Elasticsearch and CloudSearch are provided by Amazon as AWS services. However, Elasticsearch is an independent product developed by elastic.co, which means you can set up Elasticsearch independently by downloading and extracting the tarball, or through a yum/apt-get install.

Amazon CloudSearch, on the other hand, is fully managed by AWS, which, once you choose your instance type, handles the complete provisioning. Users are able to select High-Availability (AZ level), replication, and partitioning options through the AWS Management Console or AWS CLI.

Elasticsearch vs. CloudSearch: Upgrading

Elasticsearch is easy to upgrade. The process can be as simple as replacing the lib folder of the older version with that of the new version.

Updates of Amazon CloudSearch are pushed by AWS, relieving users of the responsibility. However, this can result in delayed upgrades of new releases, though with delay comes stability.

Elasticsearch vs. CloudSearch: Data Import/Export

Before existing data can be searched, it must be imported into the search engine. In Elasticsearch, plugins called “rivers” push data into a cluster. Many popular river plugins are available, such as elasticsearch-river-mongodb, elasticsearch-river-couchdb, and elasticsearch-jdbc. However, for various reasons, river plugins are being deprecated.

Logstash Forwarders are normally used to push logs from application or database servers to Elasticsearch, making them available for searching or for plotting graphs in Kibana. More recently, Logstash and its input plugins have taken center stage as the replacement for rivers for pushing data into Elasticsearch. Recently developed input plugins include couchdb_changes, twitter, and rabbitmq.

In Amazon CloudSearch, data and documents (in either XML or JSON format) are pushed in batches. Data can also be pushed to S3, with the data path given to index the documents.
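
As a rough sketch of what such a batch can look like, the snippet below builds a JSON document batch of `add` and `delete` operations and checks it against the 5 MB per-batch limit. The document IDs and field names (`title`, `year`) are hypothetical, and the limit is assumed here to mean 5 * 1024 * 1024 bytes.

```python
import json

# Assumption: "5 MB" is interpreted as 5 * 1024 * 1024 bytes.
MAX_BATCH_BYTES = 5 * 1024 * 1024


def build_batch(adds, delete_ids=()):
    """Return a CloudSearch-style JSON batch as UTF-8 bytes.

    adds       -- mapping of document id -> dict of field values
    delete_ids -- ids of documents to remove from the index
    """
    operations = [
        {"type": "add", "id": doc_id, "fields": fields}
        for doc_id, fields in adds.items()
    ]
    operations += [{"type": "delete", "id": doc_id} for doc_id in delete_ids]

    payload = json.dumps(operations).encode("utf-8")
    if len(payload) > MAX_BATCH_BYTES:
        raise ValueError("batch exceeds 5 MB; split it into smaller batches")
    return payload


# Hypothetical documents, for illustration only.
batch = build_batch(
    {"movie-1": {"title": "Example Title", "year": 2015}},
    delete_ids=["movie-0"],
)
print(batch.decode("utf-8"))
```

With boto3, bytes like these could be passed to the `cloudsearchdomain` client's `upload_documents` call with `contentType="application/json"`.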

Elasticsearch vs. CloudSearch: Data and Index Backup

In Elasticsearch, data is backed up (and restored) using the Snapshot and Restore module. Usually, users are required to define a shared mount path. In the cloud, they can instead opt for Amazon S3, HDFS, or Azure storage. Curator is a tool that acts as a cron job manager that users can set to automate the backup process.
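
For illustration, here is a minimal sketch of the two REST calls involved when S3 is the snapshot target: registering a repository, then taking a snapshot. It only builds the requests rather than sending them. The repository name, bucket, and snapshot name are hypothetical, and an S3 repository assumes the S3 repository plugin is installed on the cluster.

```python
import json


def register_repo_request(repo, bucket):
    """Build the PUT request that registers an S3 snapshot repository."""
    body = {"type": "s3", "settings": {"bucket": bucket}}
    return "PUT", f"/_snapshot/{repo}", json.dumps(body)


def create_snapshot_request(repo, snapshot):
    """Build the PUT request that snapshots the cluster into the repository."""
    # wait_for_completion makes the call block until the snapshot finishes.
    path = f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true"
    return "PUT", path, ""


# Hypothetical names, for illustration only.
for method, path, body in (
    register_repo_request("my_backup", "my-es-snapshots"),
    create_snapshot_request("my_backup", "snapshot_1"),
):
    print(method, path, body)
```

A restore goes through the same API, with a POST to the snapshot's `_restore` endpoint. A scheduler such as Curator or cron would issue the snapshot call on a timer.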

In Amazon CloudSearch, the service itself takes care of the whole backup process, once again sparing users the bother. Unlike Elasticsearch, where users must manually run the restore activity from backed up indexes, CloudSearch does it automatically.

Elasticsearch vs. CloudSearch: Security and User Management

Elasticsearch provides a plugin called Shield to handle authentication and authorization. Shield also provides features like encryption, role-based access control, IP filtering, and auditing. However, Shield is a licensed product that must be purchased.

You can also integrate your Active Directory (AD) server to control access locally.

Amazon CloudSearch provides IAM-based access control (and all the granularity this affords, such as restricting access to specific domains). CloudSearch also supports HTTPS for all requests.
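
To make the idea of domain-level granularity concrete, the sketch below generates an identity-based IAM policy that allows only search and suggest requests against a single domain. The region, account ID, and domain name are placeholders, and the policy is an illustration rather than a recommended production configuration.

```python
import json


def search_only_policy(region, account_id, domain):
    """Return an IAM policy document allowing search/suggest on one domain."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Document uploads (cloudsearch:document) are deliberately omitted.
                "Action": ["cloudsearch:search", "cloudsearch:suggest"],
                "Resource": (
                    f"arn:aws:cloudsearch:{region}:{account_id}:domain/{domain}"
                ),
            }
        ],
    }


# Placeholder values, for illustration only.
policy = search_only_policy("us-east-1", "111122223333", "movies")
print(json.dumps(policy, indent=2))
```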

Elasticsearch vs. CloudSearch: Cluster Management

In Elasticsearch, adding or removing nodes within a cluster must be done manually. If the cluster instances are upgraded (that is, scaled vertically), you’ll need to run through the setup process from scratch, and old data must be backed up and restored to the new cluster. In the case of horizontal scaling, where servers are added to or removed from the cluster, cluster rebalancing and resharding are mandatory. These, too, are manual processes, and users need to be very careful while carrying them out.

Amazon CloudSearch, on the other hand, has built-in scaling and upgrade tools. When a server in a CloudSearch service reaches its threshold, it automatically upgrades to the next larger instance type. And when the capacity goes beyond the largest available instance types, the index is partitioned into multiple instances.

Elasticsearch vs. CloudSearch: Monitoring

In Elasticsearch, there are cluster monitoring tools like Marvel which allow a user to send RESTful queries to check cluster health. Another product called Watcher provides an alerting mechanism. These tools are all provided by Elasticsearch itself. Users can, of course, also bring their own monitoring tools, like SPM or the New Relic plugin for Elasticsearch to keep an eye on their clusters.
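
As a small illustration of the kind of RESTful health check mentioned above, this sketch interprets the JSON body returned by `GET /_cluster/health`. The sample response is hand-written and trimmed to the fields used here, and the "needs attention" rule is an assumption made for the example, not an Elasticsearch convention.

```python
import json

# Status values reported by the cluster health API, ordered by severity.
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def summarize_health(raw_json):
    """Summarize a `GET /_cluster/health` response body."""
    health = json.loads(raw_json)
    status = health["status"]
    return {
        "status": status,
        # Assumption for this sketch: anything other than green is flagged.
        "needs_attention": SEVERITY[status] > 0,
        "unassigned_shards": health.get("unassigned_shards", 0),
    }


# Hand-written sample response, trimmed to the fields used above.
sample = '{"cluster_name": "demo", "status": "yellow", "unassigned_shards": 5}'
print(summarize_health(sample))
```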

Amazon CloudSearch is fully integrated with Amazon CloudWatch, which can monitor metrics like SuccessfulRequests, SearchableDocuments, IndexUtilization, and Partitions. Like Watcher in Elasticsearch, AWS Simple Notification Service (SNS) can be integrated with CloudSearch for alerting.

Elasticsearch vs. CloudSearch: High Availability

As they’re both built for running search engines in the cloud, Elasticsearch and CloudSearch are designed for high availability (HA).

Elasticsearch is built for distributed computing where the cluster grows horizontally. The indexes are split into shards and replication factors provide shard redundancy. Whenever a node fails, the replicated shards are used to replace lost data.
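
The relationship between shards and replicas can be sketched in a few lines. The settings body below uses the `number_of_shards` and `number_of_replicas` index settings; the specific values are arbitrary examples.

```python
def index_settings(primary_shards, replicas):
    """Build the settings body used when creating an index."""
    return {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas,
        }
    }


def total_shard_copies(primary_shards, replicas):
    """Each primary shard is stored once plus `replicas` extra copies."""
    return primary_shards * (1 + replicas)


# Example values: 5 primary shards, each with 1 replica.
print(index_settings(5, 1))
print(total_shard_copies(5, 1))  # 10 shard copies spread across the cluster
```

With one replica per primary, every shard exists on two nodes, so the loss of a single node leaves a complete copy of the index available.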

Elasticsearch employs a technique called zen discovery, where all the nodes communicate with each other through an “elected” master. In case the master node fails, another node takes over as master.

A similar architecture is followed in CloudSearch to handle failure and provide HA. CloudSearch also has an optional feature for multi-AZ replication within a single region to provide HA and Availability Zone failover.

Elasticsearch vs. CloudSearch: Search and Indexing

In Elasticsearch, searching happens on both index and types using a search API. The search API also includes Faceting and Filtering for searching data.
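
As a hedged sketch of a filtered search request, the function below builds the path and body for a full-text query combined with an exact-match filter and facet-style counts. It uses the `bool` query and `aggs` syntax of more recent Elasticsearch versions, where aggregations took over the role of facets; the index and field names (`title`, `category`, `year`) are hypothetical.

```python
import json


def build_search(index, text, category):
    """Return the path and body for a filtered full-text search."""
    body = {
        "query": {
            "bool": {
                # Scored full-text match on a hypothetical `title` field.
                "must": {"match": {"title": text}},
                # Unscored exact-match filter on a hypothetical `category` field.
                "filter": {"term": {"category": category}},
            }
        },
        # Facet-style counts: number of matching documents per year.
        "aggs": {"by_year": {"terms": {"field": "year"}}},
    }
    return f"/{index}/_search", body


path, body = build_search("movies", "star", "scifi")
print("POST", path)
print(json.dumps(body, indent=2))
```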

In CloudSearch, users create a search domain that includes sub-services to upload documents. A search service provides the means to search indexed data.

In Elasticsearch, many built-in libraries are provided for analyzers, tokenizers, and filters for indexing. Amazon CloudSearch, on the other hand, provides a much simpler configuration service for all indexing operations and relevance ranking.

Elasticsearch vs. CloudSearch: Client Libraries

There are many clients available for Elasticsearch. Official clients include the Java API, .NET, Ruby, Groovy, PHP, Perl, Python, and JavaScript. Elasticsearch also supports RESTful APIs.

Amazon CloudSearch supports many SDKs along with RESTful API calls. The most popular SDKs are in Java, Ruby, Python, .NET, PHP, and Node.js.

Elasticsearch vs. CloudSearch: Cost

Because Elasticsearch requires manual setup, the true cost of deployment must include infrastructure costs, licensing for any non-open-source software tools and the OS, and the Elasticsearch binary. It may also require significant operational expenditure to cover skilled Elasticsearch admins and a monitoring team.

Amazon CloudSearch is priced according to the search instance size. Here’s an example:
[Figure: CloudSearch pricing by search instance size]
With Multi-AZ enabled, the cost of redundant search instances will also be added. If an index is partitioned, the cost of each new search instance in each AZ is also added to the cost.

CloudSearch Document Batch Uploads

Document batch upload costs are $0.10 per 1,000 Batch Upload Requests (the maximum size for each batch is 5 MB).

CloudSearch IndexDocuments Requests

Re-indexing is required when a new field is added to an index. The charge for a re-indexing request is $0.98 per GB of data stored in your search domain.
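
To see how the upload and re-indexing charges add up, here is a small worked example using the two rates quoted above. The monthly volumes are hypothetical, and the rates are the ones stated in this article, which may differ from current AWS pricing.

```python
from decimal import Decimal

BATCH_UPLOAD_RATE = Decimal("0.10")  # USD per 1,000 batch upload requests
REINDEX_RATE = Decimal("0.98")       # USD per GB stored in the search domain


def batch_upload_cost(requests):
    """Cost of a number of batch upload requests."""
    return BATCH_UPLOAD_RATE * Decimal(requests) / Decimal(1000)


def reindex_cost(stored_gb):
    """Cost of one re-indexing request over `stored_gb` of stored data."""
    return REINDEX_RATE * Decimal(stored_gb)


# Hypothetical month: 250,000 upload requests and one re-index of 40 GB.
total = batch_upload_cost(250_000) + reindex_cost(40)
print(f"${total:.2f}")  # $64.20 ($25.00 for uploads + $39.20 for re-indexing)
```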

CloudSearch Data Transfer

Inbound data transfers are free between Amazon CloudSearch and other AWS Services. There are charges for outbound data transfers:

  • Data transferred between Amazon CloudSearch and AWS services in different regions will be charged as Internet Data Transfers on both sides of the transfer.
  • Traffic sent between Amazon CloudSearch and Amazon EC2 instances in the same region is only billed for the Data Transfer in and out of the Amazon EC2 instances. Standard Amazon EC2 Regional Data Transfer charges apply.

[Figure: Elasticsearch vs. CloudSearch cost]

Elasticsearch vs. CloudSearch: How to get the most benefit from your choice

Both Elasticsearch and Amazon CloudSearch are built on proven technologies and are the choice of many demanding organizations. Because of its flexibility and active developer community, Elasticsearch is more popular. But Amazon CloudSearch scores when it comes to operational efficiency.

As a side note, because of Elasticsearch’s popularity, AWS also offers it as a managed service (Amazon Elasticsearch Service), which, in many ways, provides the best of both worlds. Elastic.co also provides Elasticsearch as a cloud service, called Found.

Your search solution is an important part of the overall architecture plan for your application or product. Keep your momentum going with AWS and all its changing parts by checking out a Solutions Architect Certification learning path. The learning paths are constantly updated, give you hands-on practice, and provide guidance as you try to organize the big projects in your career.


Written by Joe Nemer

Joe is a Technical Researcher at Cloud Academy and works to help readers connect concepts in ways they haven't thought of before. Side interests include all sorts of waves — ocean waves, sine waves, just not goodbye waves.

