Using Attributes to Constrain Marathon Applications

Introduction

Marathon supports the use of one or more constraints to influence which nodes an application can be scheduled on. A common use of constraints is to provide fault-tolerance by spreading application instances onto different server racks or availability zones. Another use is to ensure certain applications run on the same node for dependency or performance reasons.

Constraints have three parts:

  1. Field: The hostname or any attribute of an agent node
  2. Operator: Describes the condition to be met by the field values
  3. Parameter (Optional): A pattern to match using the operator

The supported operators are:

  • UNIQUE: Enforce uniqueness in a field across all of the application instances. For example, to ensure only one application instance per node.
  • CLUSTER: Run all application instances on nodes matching the field parameter value. For example, to run all application instances on a specific rack.
  • GROUP_BY: Distribute application instances evenly across all known values of a field, or across a specified number of values. For example, to evenly distribute application instances across all known availability zones, or across two availability zones.
  • LIKE: Run application instances on nodes whose field value matches a regular expression parameter. For example, to start application instances on racks 1, 2, or 3.
  • UNLIKE: Run application instances on nodes whose field value doesn't match a regular expression parameter. For example, to start application instances on any racks other than 1, 2, or 3.
  • MAX_PER: Limit the number of application instances per field value. For example, to limit the number of application instances in any one availability zone to two or fewer.
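
To make the three parts concrete, the following sketch prints one example constraint per operator in the JSON array form used in a Marathon application definition. The constraint helper function is hypothetical and exists only for illustration. rack-id is the attribute used in this Lab, and the parameter values are examples rather than required values.

```shell
# Hypothetical helper: print a constraint as a JSON array.
# Arguments: field, operator, and an optional parameter.
constraint() {
  if [ "$#" -eq 3 ]; then
    printf '["%s", "%s", "%s"]\n' "$1" "$2" "$3"
  else
    printf '["%s", "%s"]\n' "$1" "$2"
  fi
}

constraint hostname UNIQUE               # at most one instance per node
constraint rack-id CLUSTER rack-1        # all instances on rack-1
constraint rack-id GROUP_BY              # spread across all known racks
constraint rack-id GROUP_BY 2            # spread across two racks
constraint rack-id LIKE 'rack-[1-3]'     # only racks 1, 2, or 3
constraint rack-id UNLIKE 'rack-[1-3]'   # any rack except 1, 2, or 3
constraint rack-id MAX_PER 2             # at most two instances per rack
```

Each printed array is one entry of the "constraints" list in an application definition. Note that the parameter is always written as a string, even when it is a number.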

This Lab Step will walk you through scheduling applications using constraints on the attributes you assigned to agents.

 

Instructions

1. Create a Marathon application definition with a constraint to only allow scheduling on racks 1 or 3:

cat <<EOF > rack_1_3.json
{
  "id": "/hello-world",
  "cmd": "while [ true ]; do echo -n 'Hello Marathon: '; date; sleep 5; done",
  "cpus": 1,
  "mem": 10.0,
  "instances": 2,
  "constraints": [["rack-id", "LIKE", "rack-[1,3]"]]
}
EOF

Note that the application definition requires 1 full CPU per application instance and requests two instances. The application doesn't do anything interesting; it is just a simple application for practicing scheduling with constraints.
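
The LIKE parameter is a regular expression, so it can help to check which rack-id values it accepts before deploying. The sketch below approximates the match locally with grep. It assumes that Marathon requires the whole attribute value to match the pattern, and like_matches is a hypothetical helper name.

```shell
# Hypothetical helper: report whether a rack-id value satisfies the
# LIKE pattern from the application definition. grep -x requires the
# whole value to match (an assumption about Marathon's behavior).
like_matches() {
  if printf '%s\n' "$1" | grep -Eqx 'rack-[1,3]'; then
    echo "$1: eligible"
  else
    echo "$1: filtered out"
  fi
}

for rack in rack-1 rack-2 rack-3; do
  like_matches "$rack"
done
```

Inside square brackets the comma is a literal character, so [1,3] matches 1, a comma, or 3. For the rack values in this Lab it behaves the same as [13]. A range covering racks 1 through 3 would be written [1-3].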

 

2. Add the application to Marathon:

dcos marathon app add rack_1_3.json

 

3. Return to the DC/OS GUI, and navigate to Services.

The STATUS is Waiting, with only one of the two requested instances running.

 

4. To diagnose why the request isn't being fulfilled, click on hello-world and open the Debug tab:

There are two visualizations to help you understand the criteria Marathon is trying to fulfill in scheduling the application instances:

  1. Summary bars: The bars flow from left to right with > arrows indicating direction. The leftmost bar represents the first criterion Marathon tries to satisfy. In this case, it is looking for agents in a private agent Role, and 2 of the 3 agents (2/3) qualify. Next is Constraints, which you specified in the application definition. Of the two remaining nodes, only one (1/2) satisfies the Constraints. Lastly, the one remaining eligible node does not meet the CPU requirement (0/1). This is because one application instance is already running on it, and since the agent has only 1 CPU, none is left. Therefore, the second application instance cannot be scheduled.
  2. Details matrix: Each agent shows the criteria that it satisfies with a green checkmark and the criteria that it does not satisfy with a red x. The agent with the ROLE not satisfied is the public agent, since Marathon will schedule on private agents by default. It is easy to see one private agent satisfies the CONSTRAINT but not the CPU, while the other private agent is the opposite.
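
The filtering described above can be sketched as a small funnel over a table of agents. This is only a simplified model of the three criteria discussed here, not Marathon's actual matching logic. The agent names, the rack assignments, the public agent's values, and the funnel function are all hypothetical, chosen to reproduce the counts shown in the Summary bars.

```shell
# Hypothetical agent table: name, role, rack-id, free CPUs.
# private-1 is on rack-1 but its only CPU is already in use.
agents='public-1 public none 1
private-1 private rack-1 0
private-2 private rack-2 1'

# Apply the criteria in order: Role, then Constraints, then CPU.
# Argument: CPUs required per application instance.
funnel() {
  awk -v need_cpu="$1" '
    { total++ }
    $2 == "private" {
      role++
      if ($3 ~ /^rack-[1,3]$/) {
        cons++
        if ($4 + 0 >= need_cpu + 0) cpu++
      }
    }
    END {
      printf "Role %d/%d > Constraints %d/%d > CPU %d/%d\n",
             role, total, cons, role, cpu, cons
    }'
}

printf '%s\n' "$agents" | funnel 1
# prints: Role 2/3 > Constraints 1/2 > CPU 0/1
```

Each stage only considers the agents that passed the previous one, which is why the denominator shrinks from left to right.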

 

5. Return to your SSH shell, and delete the application:

dcos marathon app remove hello-world

 

6. Create a new application definition that distributes application instances evenly across racks but limits the number of instances to two per rack:

cat <<EOF > rack_distribute.json
{
  "id": "/hello-world2",
  "cmd": "while [ true ]; do echo -n 'Hello Marathon: '; date; sleep 5; done",
  "cpus": 0.1,
  "mem": 10.0,
  "instances": 1,
  "constraints": [
    ["rack-id", "GROUP_BY"],
    ["rack-id", "MAX_PER", "2"]
  ]
}
EOF

The CPU requirement is dropped to 0.1 to allow for multiple application instances per agent. Because the GROUP_BY constraint doesn't include a parameter, Marathon will distribute application instances across all known racks. In this case, there are two known racks, rack-1 and rack-2. You start with only one instance and will watch how Marathon schedules additional instances as you scale the application.

 

7. Add the application to Marathon:

dcos marathon app add rack_distribute.json

 

8. In the DC/OS GUI, navigate to Services > hello-world2 and note which host the single application instance is running on.

 

9. From the SSH shell, scale the application up to 2 instances:

dcos marathon app update hello-world2 instances=2

 

10. Return to the DC/OS GUI, and observe that the second application instance is Running on the other host.

 

11. Scale the application to 3, and then to 4 instances and observe which host is chosen after each scaling event in the DC/OS GUI.

In the end, you should have 2 application instances per agent.
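
The scaling commands follow the same pattern as the earlier update to 2 instances. The sketch below wraps that command in a hypothetical scale_to function. DCOS_BIN is also hypothetical: it is an override used here only so the commands can be printed as a dry run instead of being sent to the cluster.

```shell
# Hypothetical helper: scale hello-world2 to the given instance count.
# DCOS_BIN defaults to the real dcos CLI; set it to echo for a dry run.
scale_to() {
  "${DCOS_BIN:-dcos}" marathon app update hello-world2 "instances=$1"
}

# Dry run: print the commands for 3 and then 4 instances.
for n in 3 4; do
  (DCOS_BIN=echo; scale_to "$n")
done
```

To scale for real, call scale_to 3 from the SSH shell, check the DC/OS GUI, and then call scale_to 4.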

 

12. Attempt to scale the application to 5 instances.

Because of the MAX_PER constraint, each rack is only allowed up to two instances. Therefore, the fifth instance cannot be scheduled.
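
The combined effect of GROUP_BY and MAX_PER over the whole scaling sequence can be modeled with a short simulation. This is a simplified sketch under stated assumptions, not Marathon's scheduler: it assumes two racks, places each new instance on the rack that currently has the fewest instances, and refuses any rack that already holds the maximum. The place_instances function is hypothetical, and which rack receives the first instance on a real cluster may differ.

```shell
# Hypothetical simulation. Arguments: requested instances, MAX_PER value.
place_instances() {
  want=$1
  max=$2
  r1=0
  r2=0
  i=1
  while [ "$i" -le "$want" ]; do
    if [ "$r1" -le "$r2" ] && [ "$r1" -lt "$max" ]; then
      r1=$((r1 + 1))
      echo "instance $i -> rack-1"
    elif [ "$r2" -lt "$max" ]; then
      r2=$((r2 + 1))
      echo "instance $i -> rack-2"
    else
      echo "instance $i -> waiting (MAX_PER reached on every rack)"
    fi
    i=$((i + 1))
  done
}

place_instances 5 2
```

In this model the first four instances alternate between the two racks, and the fifth stays waiting because both racks already hold two instances.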

 

13. Return to the DC/OS GUI, and navigate to the Debug tab.

Try moving your mouse over the Summary bars and Details matrix entries to uncover more information about what resources were requested and what resource offers were received.

 

Challenge (Optional)

If you have time remaining in your Lab session, try to write constraints for the following and scale the application up from one instance to test your constraints:

  • Agents with capabilities 1 and 3 and on racks 1 or 2
  • Agents not on rack 2 with capabilities 4 or 5
  • Each rack has only one application instance

 

Summary

In this Lab Step, you explored Marathon constraints to control where application instances are scheduled. You wrote a couple of application definitions with constraints. You also used the DC/OS GUI to gain a better understanding of how Marathon schedules application instances.