
Roles and Data Bags


The course is part of this learning path

Cloud Configuration Management Tools with Ansible, Puppet, and Chef
Duration: 2h 12m


In this lesson, we will cover how to use roles to make it easier to manage nodes. Also, we will discuss how to use data bags to store JSON data on the Chef Server, so that nodes can access it.

We will start by explaining what roles are and where they can be found. You will learn how to specify them in one of two formats: Ruby DSL syntax or JSON syntax. For this lesson, we will use the JSON syntax.

We will walk through how to create roles for each node. Then we will explain how to configure the role for the node.

After the roles are configured, we will move on to data bags. We will cover what data bags are and what they do. Then we will use knife to create and upload several data bags.

Finally, we will recap what you have just learned in the lesson.


Welcome back! In this lesson I’ll cover how to use roles to make it easier to manage nodes. And I’ll also cover how to use data bags to store JSON data on the Chef Server, so that nodes can access it.

Roles are a way to specify the purpose of a node, without even needing to have a node. As an example, it’s not uncommon in a web application tech stack to have a front-end server role, a back-end server role, and a database role. Any server that’s part of, say, the back-end role is responsible for running the web application code, whereas the front-end is responsible for running the web server and serving as a proxy for the back-end.

So these roles simply describe the purpose of a node. And you’re not limited to a single role. Maybe all servers have a common role applied before having their specific role applied. Because each role carries its own runlist, roles make it easier to see which recipes a node will run.

Roles are defined in the roles directory under the chef repo. And you can specify them in one of two formats. You can use the Ruby DSL syntax, which you’ve seen before in the knife.rb file: a property followed by its value. Or you can use the JSON syntax. You can mix and match formats for different roles. When you use the chef generate command to create a repo, it creates a roles directory and populates it with an example JSON file, as well as a README file.

I’ll use the JSON file for my example; however, you can use the Ruby DSL if you like.
What I’ll do is create a role named webapp that will represent just the Ubuntu 14.04 node.

I’m going to copy the content from the example.json role and paste it into a new file, and I’ll name the file webapp. I’ll change the name property to webapp. And I’ll edit the description too. And now, I need to set the runlist, which will be the default recipe for the learn_chef_cookbook. Then I’ll save this file.
Next, I need to upload it to the Chef Server, and I can do that with the “role” subcommand of knife. So here it is, I’ll paste in the command “knife role from file” and then pass in the filename and location. It returns quickly, and then I can see the results in the UI. So looking at the details, you can see that the runlist specifies the learn_chef_cookbook.
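
As a rough illustration of the role file described above, here is a minimal Ruby sketch that builds the webapp role and prints it as JSON. The field set mirrors what is typically found in a JSON role file; the description text and the roles/webapp.json path are assumptions, not values shown in the lesson.

```ruby
require 'json'

# Role definition with the fields typically found in a JSON role file.
# The description is a placeholder, not the text used in the lesson.
role = {
  'name' => 'webapp',
  'description' => 'Web application server role (placeholder description)',
  'json_class' => 'Chef::Role',
  'chef_type' => 'role',
  'default_attributes' => {},
  'override_attributes' => {},
  # recipe[cookbook] with no recipe name refers to the cookbook's default recipe.
  'run_list' => ['recipe[learn_chef_cookbook]']
}

role_json = JSON.pretty_generate(role)
puts role_json

# Saved as roles/webapp.json (assumed path), it would be uploaded with:
#   knife role from file roles/webapp.json
```

Generating the file from a hash is only one option; editing a copy of example.json by hand, as in the lesson, produces the same structure.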

Now I need to edit the runlist for the Ubuntu node so that it uses this new role in place of its current runlist. You’re not limited to one or the other; a node’s runlist can contain both roles and recipes.

So I’ll use the same command from earlier, which is the “knife node run_list set” command. And I’ll specify the node and the runlist of the new role. The syntax should be familiar: it’s the word role, and then in square brackets, the name of the role.

Now, if I go back to the web UI, you can see that the node has a role in its runlist, and I can expand it to show the recipes that make up the role.

Before, when I configured the nodes, I targeted them by name. Well now, since the Ubuntu node is part of the webapp role, I can target it by role. To do that, instead of using the search syntax of name:nodename you can use role:rolename. Since the node is already configured, nothing new happens if I run this; however, you’ll get to see that it knows the Ubuntu 14.04 server is part of the webapp role.
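
The runlist change and the role-based targeting described above might look roughly like the commands this Ruby sketch prints. The node name ubuntu-node is hypothetical, and using knife ssh to run chef-client is an assumption about the command used earlier in the course; any SSH authentication options are omitted because they depend on your setup.

```ruby
require 'shellwords'

node_name = 'ubuntu-node' # hypothetical node name
role_name = 'webapp'

# Runlist items are written as type[name], for example role[webapp].
run_list_item = "role[#{role_name}]"

commands = [
  # Replace the node's current runlist with the role.
  ['knife', 'node', 'run_list', 'set', node_name, run_list_item],
  # Target every node in the role with a role:rolename search (assumed knife ssh usage).
  ['knife', 'ssh', "role:#{role_name}", 'sudo chef-client']
]

# Shellwords.join escapes brackets and spaces so each line is safe to paste into a shell.
commands.each { |argv| puts Shellwords.join(argv) }
```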

So this is an extremely useful mechanism to manage servers by their purpose. You can create roles, each with its own runlist, and then add the roles to nodes as needed.

Alright, knowing how roles work checks off the first objective for this lesson. So now it’s time to cover data bags.

If you recall from way back towards the start of the course I said that data bags are global variables that store JSON data. Having the ability to store disparate data on the Chef Server allows the nodes to access it as needed through your recipes and templates.

Creating and uploading data bags is done with knife and is very similar to how I created the webapp role.

So, I’m going to create a users directory in the data_bags directory, and store a couple of JSON files. Each file represents one user. Since the data bag allows you to store JSON, you can store whatever data you need. I do recommend that if you plan on using it for secrets such as API keys, you use Chef Vault instead. I won’t be going into Vault in this course; however, it makes dealing with data you need encrypted easier than working with encrypted data bag items.

So inside the Chef Repo there’s a directory named data_bags, and I’ll create a new subdirectory named users. Now, I’ll create a JSON file for the first user, which will be me. So I’ll name it “ben.json” and then I’ll fill out the object. It needs an ID, and this is how you reference the specific item, when you want to fetch this value later. So, I’ll use my first name, since in this example I won’t have many records, and therefore I don’t need to worry about duplicate IDs.

The ID is your only requirement, and any other properties specified here are up to you. I’m going to place all of mine under a property named “value.” Next I’ll add a first and last name property, and fill them out with my first and last name. Great!
With this done, I’ll move on to copying the file, renaming it “andy.json” and then editing the values inside to reflect the name of one of my fellow instructors, Andrew Larkin. So, I’ll save these changes and now, I’m ready to create the data bag, and upload these items.
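
To make the shape of these items concrete, here is a minimal Ruby sketch that builds both user items and prints the JSON each file would hold. The key names first_name and last_name are assumptions; the lesson only says that a first and last name property sit under “value.”

```ruby
require 'json'

# One hash per data bag item. Only "id" is required by Chef; the
# first_name/last_name key names are assumed for illustration.
users = [
  { 'id' => 'ben',  'value' => { 'first_name' => 'Ben',    'last_name' => 'Lambert' } },
  { 'id' => 'andy', 'value' => { 'first_name' => 'Andrew', 'last_name' => 'Larkin' } }
]

# Map each item to the file it would live in under the chef repo.
item_files = users.to_h do |user|
  ["data_bags/users/#{user['id']}.json", JSON.pretty_generate(user)]
end

item_files.each do |path, json|
  puts "# #{path}"
  puts json
end
```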

The first command I need to issue is the “knife data bag create” command, and I need to pass in the name of the data bag, which is going to be “users.” This will create a data bag on the Chef Server.

Now that it’s created, I can start storing stuff inside. For that I can issue the “knife data bag from file” command, and I’ll pass it the data bag that I want to store this item in, and the path to and name of my file. This command runs pretty quickly, especially since the JSON file doesn’t have all that much content in it.

And I need to run it again for Andy’s file. So I’ll just change the filename, and run this. And there it is.

So now if I issue the command “knife data bag list” I can see the different data bags. I only have the one, which is the users data bag I just created. And if I want to see what items it contains, I can issue the “knife data bag show” command and pass in the name of the data bag I want to see, which is “users.” And it returns two records, “ben” and “andy.” And if I rerun that same command and add the ID for the item I want to see, it’ll show me the contents of the item.
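
Put together, the data bag steps above could be sketched as the sequence of knife commands this Ruby snippet prints. The file paths assume the items were saved under data_bags/users in the chef repo and that knife is run from the repo root.

```ruby
require 'shellwords'

bag = 'users'
# Assumed locations of the two item files, relative to the chef repo root.
item_paths = ['data_bags/users/ben.json', 'data_bags/users/andy.json']

commands = []
commands << ['knife', 'data', 'bag', 'create', bag]        # create the empty bag
item_paths.each do |path|
  commands << ['knife', 'data', 'bag', 'from', 'file', bag, path] # upload one item
end
commands << ['knife', 'data', 'bag', 'list']               # list all data bags
commands << ['knife', 'data', 'bag', 'show', bag]          # list item IDs in the bag
commands << ['knife', 'data', 'bag', 'show', bag, 'ben']   # show one item's contents

commands.each { |argv| puts Shellwords.join(argv) }
```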

If I switch back to the management UI, you can see the values are there on the Chef Server. If you want to use these items in your cookbooks, you can use the data_bag and data_bag_item methods from the Ruby DSL. You may recall I covered those in a previous lesson. By fetching the JSON data with those methods, you can use it in your cookbooks. Here’s some rough sample code showing how to fetch the users data bag, loop over the item names, and pass each one into the data_bag_item method to fetch the actual item.
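
The on-screen sample is not part of this transcript, so what follows is only a sketch of the same idea. Inside a recipe, Chef’s DSL provides data_bag and data_bag_item; the stand-in methods and the fake_bags helper below are hypothetical and exist only so the snippet runs outside Chef. The first_name and last_name keys are also assumed.

```ruby
# Stand-ins so the sketch runs outside Chef. Inside a recipe the real DSL
# methods already exist, so this whole block is skipped.
unless respond_to?(:data_bag, true)
  # Hypothetical in-memory copy of the users data bag.
  def fake_bags
    {
      'users' => {
        'ben'  => { 'id' => 'ben',  'value' => { 'first_name' => 'Ben',    'last_name' => 'Lambert' } },
        'andy' => { 'id' => 'andy', 'value' => { 'first_name' => 'Andrew', 'last_name' => 'Larkin' } }
      }
    }
  end

  # Like Chef's data_bag: returns the IDs of the items in the bag.
  def data_bag(bag)
    fake_bags.fetch(bag).keys
  end

  # Like Chef's data_bag_item: returns the contents of one item.
  def data_bag_item(bag, id)
    fake_bags.fetch(bag).fetch(id)
  end
end

# Fetch the item IDs, then fetch each item and read its values.
full_names = data_bag('users').map do |id|
  user = data_bag_item('users', id)
  "#{user['value']['first_name']} #{user['value']['last_name']}"
end

full_names.each { |name| puts name }
```

In a real recipe you would likely pass these values to a resource or template rather than print them.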

Okay, armed with your understanding of data bags, you should be able to use them in your recipes to fetch items from the Chef Server. And that checks off the second objective for this lesson, which was to teach you about data bags!

Alright, let’s see how much you recall from this lesson.

In your opinion how do roles make managing nodes easier?
This is an open-ended question, and there are a lot of potentially great answers. The reason I think roles make node management easier is that they allow you to specify the desired configuration based on the node’s purpose, and then instead of assigning a runlist of recipes to the node, you can assign it roles. Then you can run commands, including the chef-client, on all the nodes in a given role.

Here’s another question, although it’s one that I can’t answer. Given your understanding of roles, can you start to picture some of the different roles and combinations of roles that might be used in any environments that you manage?
When I was using Chef to manage a large AWS-hosted web application stack, I was able to use roles to target servers that were in the front-end vs. back-end application servers, among others. And being able to configure things based on their role made it easy to deploy updates to groups of nodes. At that point, you don’t need to think about individual nodes, because you can focus on their purpose.

Here’s another question: What sort of data can you picture being stored in a data bag?
That’s another open question, and it’ll depend on your needs. I’ve used data bags in the past to store generic settings for applications, as well as using encrypted items to store API keys. Hopefully you have some ideas for how you might use them, if at all.

Alright, let’s wrap up this lesson here. In the next lesson I’ll summarize what I’ve covered throughout the course, and talk about next steps.


About the Author


Ben Lambert is the Director of Engineering and was previously the lead author for DevOps and Microsoft Azure training content at Cloud Academy. His courses and learning paths covered Cloud Ecosystem technologies such as DC/OS, configuration management tools, and containers. As a software engineer, Ben’s experience includes building highly available web and mobile apps.

When he’s not building the first platform to run and measure enterprise transformation initiatives at Cloud Academy, he’s hiking, camping, or creating video games.
