Riak CS: a Cloud Storage Solution Compatible with Amazon S3

Riak CS is an open source cloud storage technology compatible with Amazon S3 and OpenStack Swift. Discover why more and more companies are using it.

Riak CS may not be the best-known cloud storage technology right now, but it’s definitely worthy of our attention. This post isn’t meant to be an end-to-end installation and configuration guide. Instead, it aims to familiarize you with Riak CS’s function and features and to explain why you might want to use it instead of the alternatives.

What is Riak CS?

Riak CS (“CS” stands for Cloud Storage) is object storage management software that’s built on top of Riak, Basho’s distributed database. It can be used to store any type of data, such as images, video, documents, and database backups. Riak CS stores objects as key/value pairs in namespaces called buckets. It’s open source and can be easily downloaded.
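To make the bucket and key/value idea concrete, here is a minimal in-memory sketch of that data model. It is only an illustration: the `ObjectStore` class, its method names, and the sample bucket and key names are hypothetical and are not part of Riak CS or its API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ObjectStore:
    """Toy model of object storage: bucket name -> {key -> value bytes}."""
    buckets: Dict[str, Dict[str, bytes]] = field(default_factory=dict)

    def create_bucket(self, bucket: str) -> None:
        # Creating a bucket that already exists is a no-op in this sketch.
        self.buckets.setdefault(bucket, {})

    def put(self, bucket: str, key: str, value: bytes) -> None:
        # Writing to an existing key replaces the previous value.
        self.buckets[bucket][key] = value

    def get(self, bucket: str, key: str) -> bytes:
        return self.buckets[bucket][key]

    def list_keys(self, bucket: str) -> List[str]:
        return sorted(self.buckets[bucket])


# Hypothetical usage: the same key can live in two buckets without clashing.
store = ObjectStore()
store.create_bucket("backups")
store.create_bucket("images")
store.put("backups", "latest", b"database dump bytes")
store.put("images", "latest", b"image bytes")
```

The point of the sketch is that a bucket acts as a namespace: an object is addressed by the pair (bucket, key), not by the key alone.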

Why use Riak CS?

With the increasing adoption of cloud technologies, storage must not only exist in ever-increasing capacity, but must also be reliable, easy to maintain, distributed, scalable, and cheap. But Riak CS isn’t the only storage option available for handling large volumes of data. Why not, for instance, stick with local solutions like SAN or NAS?

The traditional approaches to storage were designed for structured data, but today the major sources of data are machines (like sensors and smartphones). This data is unstructured and requires a more robust storage solution to handle the greater variety. Earlier storage designs were also not very fault tolerant, and keeping them reliable took considerable effort.

Besides being better at handling unstructured data, Riak CS tries to address the major drawbacks of traditional storage solutions by avoiding single-point-of-failure architectures and by offering greater fault tolerance, more robust management, easier scaling, and lower costs.

But what about other cloud solutions, and especially AWS’s dominant Simple Storage Service? What can Riak CS possibly offer that we can’t already get from S3?

This one is a bit trickier.

Amazon S3 is a pay-as-you-go service that’s as reliable as just about anything else out there, and it’s cost effective. But because it’s run by a public cloud provider, you lose some control over uptime (even though AWS’s record is very good). Moreover, there will be cases where you are simply reluctant to store sensitive data outside your data center.

Riak CS gives you the flexibility to configure the entire setup within your data center, behind your organization’s firewall. If done right, this can provide better security and more control over your storage operations. Therefore, Riak CS can be a preferred choice even over AWS S3 for customers looking for…

  • Complete control over storage design and configuration.
  • Storage protected behind the organization’s firewall.
  • Control over uptime and quality of service.
  • Customized solutions implemented in ways similar to cloud drives (like Dropbox).
  • Huge unstructured data stores that can dynamically (and economically) scale.
  • Low latency.
  • High read/write availability.

[Figure: Riak CS architecture]

How Riak CS works

OK. Given that there are going to be use cases where Riak CS can outperform other solutions in its class, we should still ask ourselves: why Riak CS and not Riak itself? Both are built for storage, and both are highly available and scalable.

To answer that, we need to understand some key structural differences between Riak and Riak CS.

  • Riak CS is used to store very large objects, into the terabyte size range, whereas Riak excels at quickly storing and retrieving smaller objects.
  • Riak is a database, and it’s never recommended to expose a database directly to a network without authentication or authorization, something Riak currently lacks. Riak CS, on the other hand, is designed for web users and therefore supports both authentication and authorization.
  • Compatibility with major players in the storage market is critically important for full integration. Riak offers only its native HTTP and Protocol Buffers APIs, while Riak CS is compatible with Amazon’s S3 and OpenStack’s Swift APIs.
  • Data consistency is vital for cloud storage solutions. Even when writes are being requested in parallel from all ends of a cluster, the data must remain consistent, especially if you’re relying on user-level authentication. Riak, compared to Riak CS, doesn’t provide a particularly high level of consistency.
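To illustrate what S3-style authentication looks like on the wire, the sketch below builds an `Authorization` header using Amazon's S3 Signature Version 2 scheme (an HMAC-SHA1 over a canonical request string). It assumes the endpoint accepts Version 2 signatures and that the request carries no `x-amz-*` headers. The function name `sign_s3_v2`, the credentials, and the resource path are made up for illustration.

```python
import base64
import hashlib
import hmac


def sign_s3_v2(access_key, secret_key, method, resource, date,
               content_md5="", content_type=""):
    """Return an S3 Signature Version 2 Authorization header value.

    Simplified sketch: no x-amz-* headers are canonicalized here.
    """
    # StringToSign = Verb \n Content-MD5 \n Content-Type \n Date \n Resource
    string_to_sign = "\n".join(
        [method, content_md5, content_type, date, resource]
    )
    digest = hmac.new(
        secret_key.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha1,
    ).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"AWS {access_key}:{signature}"


# Hypothetical credentials and object path, for illustration only.
header = sign_s3_v2(
    access_key="EXAMPLE_ACCESS_KEY",
    secret_key="EXAMPLE_SECRET_KEY",
    method="GET",
    resource="/backups/db.sql.gz",
    date="Thu, 01 Jan 2015 00:00:00 GMT",
)
```

In practice an existing S3 client library would do this signing for you. The sketch only shows why an S3-compatible store can authenticate each request without exposing the underlying database directly.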

Riak CS Features

Now that we’re a bit more familiar with some of Riak CS’s ideal use cases, let’s focus briefly on some specific features to help inform your enterprise deployment decision.

  • The Riak CS API is compatible with the Amazon S3 API.
  • Riak CS doesn’t use a master-slave model, so every node can handle every kind of request.
  • With its Per Tenant Visibility capability, it’s easier to track per-tenant usage.
  • Riak CS cluster nodes can scale dynamically without any downtime.
  • With Riak CS’s enterprise edition, the data can be replicated across different data centers for greater reliability.
  • You can store individual images, text, video, documents, database backups, software binaries and other content up to 5GB as a single, easily retrievable object.
  • Cost effective.
  • Easy setup.
  • Easy maintenance.
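One way to serve a large file as a single retrievable object on top of a store tuned for small values is to split it into fixed-size blocks and keep a manifest listing them. The sketch below illustrates that general idea only. The 1 MiB block size, the block id scheme, and the function names are assumptions for illustration and should not be read as Riak CS's actual internals.

```python
import hashlib

BLOCK_SIZE = 1024 * 1024  # assumed block size for this sketch (1 MiB)


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into blocks; return (manifest, blocks by id)."""
    blocks = {}
    block_ids = []
    for offset in range(0, len(data), block_size):
        chunk = data[offset:offset + block_size]
        # Hypothetical id scheme: block index plus a content hash.
        index = offset // block_size
        block_id = f"{index}:{hashlib.sha1(chunk).hexdigest()}"
        blocks[block_id] = chunk
        block_ids.append(block_id)
    manifest = {
        "size": len(data),
        "block_size": block_size,
        "blocks": block_ids,  # ordered, so the object can be rebuilt
    }
    return manifest, blocks


def reassemble(manifest, blocks) -> bytes:
    """Rebuild the original object by reading blocks in manifest order."""
    return b"".join(blocks[block_id] for block_id in manifest["blocks"])


# Hypothetical usage with a 2,560,000-byte payload.
payload = bytes(range(256)) * 10_000
manifest, blocks = split_into_blocks(payload)
restored = reassemble(manifest, blocks)
```

The client still sees one object under one key. Only the storage layer needs to know about the blocks and the manifest.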

Riak CS is making noise in its market and has been adopted by some serious customers. Perhaps it’s time for a closer look.

To learn more about the storage services provided by AWS, Cloud Academy’s AWS Storage Fundamentals is your go-to training course for an in-depth understanding of AWS storage features and of when and why you might use each service within your own environment.

Cloud Academy