Moving data to the cloud is one of the cornerstones of any cloud migration. Apache NiFi is an open source tool that enables you to easily move and process data using a graphical user interface (GUI). In this blog post, we will examine a simple way to move data to the cloud using NiFi complete with practical steps. Calculated Systems offers a cloud-first version of NiFi that you can use to follow along.
Cloud Object Storage
There are many ways to store data on the cloud, but the easiest are the object stores. All three major cloud providers have them:
These are an ideal starting point for files, as you can typically land them without much forethought or capacity planning. Additionally, these object stores are extremely robust, featuring multiple levels of durability and availability.
For the purposes of this tutorial, we will start with the most common object store: Amazon Simple Storage Service (Amazon S3).
Before we get started moving data, let’s establish some basic terminology:
The access key ID and secret access key are very important to setting up your data transfer. You can download them as a .CSV file or save them somewhere safe.
IMPORTANT: Be sure to record your secret access key as this is the only time it can be viewed.
NiFi can provide access to AWS in more than one way: through an overarching credential service, or through properties set on a specific processor. The credential service is ideal when you have multiple processors all relying on the same keys. For the scope of this tutorial we will not be using the service, but it is the better choice when moving into a production setting.
For the purposes of this sample flow, let’s replicate NiFi’s own configuration directory to S3. To accomplish this, we need two additional processors: ListFile and FetchFile. Connect and configure them as shown below.
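If it helps to see the same idea in code, the list, fetch, and upload steps of this flow can be sketched in Python. This is only an illustration of what the flow does, not how NiFi implements it. The function names `list_files` and `replicate_to_s3`, the `conf/` prefix, and the bucket name in the comment are hypothetical, and the client is assumed to expose an `upload_file(Filename, Bucket, Key)` method like the one on a boto3 S3 client.

```python
import os
from typing import Iterator, List, Tuple


def list_files(root: str, prefix: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (local_path, object_key) pairs for every file under root.

    This plays the role of the listing step: it only discovers files
    and decides what each one should be called in the bucket.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # walk subdirectories in a stable order
        for name in sorted(filenames):
            local_path = os.path.join(dirpath, name)
            rel = os.path.relpath(local_path, root)
            # Object keys use forward slashes regardless of the local OS.
            yield local_path, prefix + rel.replace(os.sep, "/")


def replicate_to_s3(s3_client, root: str, bucket: str, prefix: str = "") -> List[str]:
    """Upload every file under root and return the object keys written.

    s3_client is assumed to behave like boto3.client("s3").
    """
    uploaded = []
    for local_path, key in list_files(root, prefix):
        s3_client.upload_file(local_path, bucket, key)
        uploaded.append(key)
    return uploaded


# Hypothetical usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   replicate_to_s3(boto3.client("s3"), "/path/to/nifi/conf", "my-demo-bucket", "conf/")
```

Keeping the listing separate from the upload mirrors the split between the two processors: one step decides which files exist, the other moves their contents.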
If you return to your bucket, you should see your files listed. Note: You may have to refresh the page depending on your browser/settings.
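If you would rather check the bucket from a script than from the browser, something like the following sketch could work. It assumes a client that behaves like a boto3 S3 client and uses its `list_objects_v2` call, following the continuation token when the listing spans more than one page. The helper names `list_bucket_keys` and `missing_keys` are made up for this example.

```python
from typing import Iterable, List, Set


def list_bucket_keys(s3_client, bucket: str, prefix: str = "") -> Set[str]:
    """Return every object key in the bucket that starts with prefix."""
    keys: Set[str] = set()
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = s3_client.list_objects_v2(**kwargs)
        # "Contents" is absent from the response when nothing matches.
        for obj in page.get("Contents", []):
            keys.add(obj["Key"])
        if not page.get("IsTruncated"):
            return keys
        # More results remain, so ask for the next page.
        kwargs["ContinuationToken"] = page["NextContinuationToken"]


def missing_keys(s3_client, bucket: str, expected: Iterable[str], prefix: str = "") -> List[str]:
    """Return the expected keys that are not yet present in the bucket."""
    return sorted(set(expected) - list_bucket_keys(s3_client, bucket, prefix))
```

An empty list from `missing_keys` means every file you expected has landed.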
As an optional step, you may wish to revoke the access keys you gave to this NiFi demo. It is general best practice to remove unused keys when you are done with them. To revoke the keys, go to the AWS Console.
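The console is the simplest route, but the same cleanup can also be scripted. The sketch below is one possible approach, assuming a client that behaves like a boto3 IAM client and using its `update_access_key` and `delete_access_key` calls. The function name `revoke_access_key` and the user name and key ID in the comment are placeholders, not values from this tutorial.

```python
def revoke_access_key(iam_client, user_name: str, access_key_id: str, delete: bool = False) -> str:
    """Deactivate an access key and optionally delete it afterwards.

    iam_client is assumed to behave like boto3.client("iam").
    Deactivating first is the cautious option because it can be undone;
    deleting a key cannot.
    """
    iam_client.update_access_key(
        UserName=user_name,
        AccessKeyId=access_key_id,
        Status="Inactive",
    )
    if delete:
        iam_client.delete_access_key(
            UserName=user_name,
            AccessKeyId=access_key_id,
        )
        return "deleted"
    return "inactive"


# Hypothetical usage, assuming boto3 is installed and you have IAM permissions:
#   import boto3
#   revoke_access_key(boto3.client("iam"), "nifi-demo-user", "AKIA...", delete=True)
```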