Zero downtime app deploys via Blue-Green deployment and the DCOS CLI

Oct 16, 2015

Brenden Matthews

D2iQ

 
Mesosphere's Datacenter Operating System (DCOS) is already an impressive piece of technology, and now it's even more powerful. We've added features to the DCOS command line interface (CLI) that will make Blue-Green deployment easier.
 
What's Blue-Green deployment? It's a way to safely deploy applications that are serving live traffic by running two versions of an application side by side: BLUE (the current version) and GREEN (the new version). To deploy the new version, you bring it up alongside the current one, drain all traffic, requests, and pending operations from the current version, switch to the new version, and then turn off the old version. Blue-Green deployment eliminates application downtime and lets you quickly roll back to the BLUE version if necessary.
 
For an overview of the process, here's a great article by Martin Fowler.
 
In a production environment, you would typically script this process and integrate it into your existing deployment system. Below we provide an example of the steps necessary to perform a safe deployment on the DCOS.
 
Requirements
 
  • A Marathon-based app with health checks that accurately reflect the health of the application.
  • The app must expose a metric endpoint to determine whether the app has any pending operations. For example, the application could expose a global atomic counter of the number of currently queued DB transactions.
  • The jq command-line JSON processor.
  • If you are using open source Mesos, configure the DCOS CLI.
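
For the first requirement, health checks are declared in the Marathon app definition itself. The following is a minimal sketch that writes a hypothetical green-myapp.json. The command, resource sizes, health check path, and timing values are placeholder assumptions, not values from a real app, so adapt them to your own service.

```shell
#!/usr/bin/env bash
# Write a hypothetical Marathon app definition with an HTTP health check.
# The cmd, resources, path, and timings are illustrative assumptions.
# The quoted 'EOF' keeps $PORT0 literal so Marathon expands it, not this shell.
cat > green-myapp.json <<'EOF'
{
  "id": "/green-myapp",
  "cmd": "python3 -m http.server $PORT0",
  "cpus": 0.1,
  "mem": 64,
  "instances": 0,
  "ports": [0],
  "healthChecks": [
    {
      "protocol": "HTTP",
      "path": "/",
      "portIndex": 0,
      "gracePeriodSeconds": 30,
      "intervalSeconds": 10,
      "timeoutSeconds": 5,
      "maxConsecutiveFailures": 3
    }
  ]
}
EOF
```

Starting with 0 instances matches the procedure below, where the app is scaled up explicitly in a later step.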
 
Procedure
 
We will replace the current app version (BLUE) with a new version (GREEN).
 
1. Launch the new version of the app on Marathon. Give the app a unique name, for example by adding the Git commit ID to it. In this example, we identify the new version of the app by adding GREEN to its name.
 
    # launch green
    dcos marathon app add green-myapp.json
 
Note: If you were using the API instead of the DCOS CLI, the command above would be much longer, as shown below:
 
    curl -H "Content-Type: application/json" -X POST -d @green-myapp.json <hosturl>/marathon/v2/apps
 
2. Scale up the GREEN app by 1 or more instances. Initially (starting from 0 instances), set the number of app instances to the minimum required to serve traffic. Remember, no traffic will arrive yet, because the new instances are not registered with the load balancer.
 
    # scale green
    dcos marathon app update /green-myapp instances=1
 
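3. Wait until all tasks from the GREEN app have passed health checks. This step requires jq. The command below prints any failing health check results; empty output means no check is currently failing.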
 
    # wait until healthy
    dcos marathon app show /green-myapp | jq '.tasks[].healthCheckResults[] | select(.alive == false)'
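
In a scripted deployment, the wait in this step is usually a polling loop. The sketch below shows one way to write it. The helper names wait_until and green_is_healthy are hypothetical, and the health test assumes the app JSON printed by the CLI includes Marathon's instances and tasksHealthy fields.

```shell
#!/usr/bin/env bash
# Poll a command until it succeeds or the attempt budget runs out.
# usage: wait_until <max_attempts> <sleep_seconds> <command> [args...]
wait_until() {
  local attempts=$1 delay=$2
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Succeeds only when every requested instance reports healthy.
# Assumption: the app JSON carries "instances" and "tasksHealthy".
green_is_healthy() {
  dcos marathon app show /green-myapp \
    | jq -e '.instances > 0 and .tasksHealthy == .instances' > /dev/null
}

# Example: check every 5 seconds, give up after 60 attempts.
# wait_until 60 5 green_is_healthy || echo "GREEN is not healthy, roll back" >&2
```

Comparing the healthy count against the requested instance count also catches tasks that have not reported any health check result yet, which a filter on failing results alone would miss.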
 
4. Use the code snippet above to check that all instances of GREEN are still healthy. Abort the deployment and begin rollback if you see unexpected behavior.
 
5. Add the new task instances from the GREEN app to the load balancer pool.
 
6. Pick one or more task instances from the current (BLUE) version of the app.
 
    # pick tasks from blue
    dcos marathon task list /blue-myapp
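
A script needs the task IDs and addresses in machine-readable form. One possible approach, sketched below, reads the app JSON and prints one line per task. The helper name pick_tasks is hypothetical, and the filter assumes each entry in the tasks array has id, host, and ports fields.

```shell
#!/usr/bin/env bash
# Print "<task_id> <host>:<port>" for up to N tasks, reading Marathon
# app JSON on stdin. Assumes each task has id, host, and ports fields.
# usage: dcos marathon app show /blue-myapp | pick_tasks 1
pick_tasks() {
  local count=${1:-1}
  jq -r '.tasks[] | "\(.id) \(.host):\(.ports[0])"' | head -n "$count"
}
```

The task ID is what the kill request later in the procedure expects, and the host and port pair is what you would remove from the load balancer pool.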
 
7. Update the load balancer configuration to remove the task instances above from the BLUE app pool.
 
8. Wait until the task instances from the BLUE app have 0 pending operations. Use the metrics endpoint in the application to determine the number of pending operations.
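
How you read the pending-operation count depends entirely on your application. As a rough sketch, suppose each instance serves the count as a plain integer at a hypothetical /metrics/pending path. The path, the response format, and the helper names pending_ops and wait_for_drain are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Fetch the pending-operation count from one task instance.
# The /metrics/pending path and plain-integer response are hypothetical.
pending_ops() {
  curl -fsS "http://$1/metrics/pending"
}

# Poll until the instance at <host:port> reports 0 pending operations.
# usage: wait_for_drain <host:port> [max_attempts] [sleep_seconds]
wait_for_drain() {
  local target=$1 attempts=${2:-60} delay=${3:-5}
  local i count
  for ((i = 1; i <= attempts; i++)); do
    # Treat a failed request as "not drained yet" rather than as zero.
    count=$(pending_ops "$target") || count=""
    if [ "$count" = "0" ]; then
      return 0
    fi
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Example: wait_for_drain 10.0.0.1:31001 || echo "still busy, not killing" >&2
```

A non-zero exit means the instance never drained within the attempt budget, so the script should leave that task running instead of killing it.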
 
9. Once all operations on the BLUE tasks are complete, kill the tasks and scale down the BLUE app using the API. In the snippet below, <hosturl> is the hostname of your cluster's master node prefixed with http://, and <task_id> is the ID of a drained task.
 
    # kill and scale blue tasks (the URL is quoted so the shell does not treat ? as a glob)
    echo "{\"ids\":[\"<task_id>\"]}" | curl -H "Content-Type: application/json" -X POST -d @- "<hosturl>/marathon/v2/tasks/delete?scale=true"
 
This Marathon operation will remove specific instances (the ones with 0 pending operations) and prevent them from being restarted.
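
When several tasks drain at once, hand-escaping the request body gets error-prone. The sketch below builds the same body from a list of task IDs and sends it to the endpoint shown above. The helper names ids_payload and kill_and_scale are hypothetical, and the builder assumes task IDs contain no characters that need JSON escaping.

```shell
#!/usr/bin/env bash
# Build the {"ids":[...]} request body from one or more task IDs.
# Assumes task IDs contain no characters that need JSON escaping.
ids_payload() {
  local body="" id
  for id in "$@"; do
    # Prefix a comma for every ID after the first.
    body+="${body:+,}\"$id\""
  done
  printf '{"ids":[%s]}' "$body"
}

# Kill the given BLUE tasks and scale the app down by the same number.
# usage: kill_and_scale <hosturl> <task_id> [task_id...]
kill_and_scale() {
  local hosturl=$1
  shift
  ids_payload "$@" \
    | curl -fsS -H "Content-Type: application/json" -X POST -d @- \
        "$hosturl/marathon/v2/tasks/delete?scale=true"
}

# Example: kill_and_scale http://<hosturl> <task_id_1> <task_id_2>
```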
 
10. Repeat steps 2-9 until there are no more BLUE tasks.
 
11. Remove the BLUE app from Marathon.
 
    # remove blue
    dcos marathon app remove /blue-myapp
 
