
Introducing Marathon 1.5

Jörg Schad and Johannes Unterstein discuss the new features in Marathon 1.5.

We are excited to announce the release of Marathon 1.5.1. Marathon is the container orchestrator for Apache Mesos and DC/OS. Marathon 1.5 is part of DC/OS 1.10 and is also available for download as a standalone binary. Alongside bug fixes and performance and scalability improvements, this release adds support for file-based secrets, support for multiple networks, a backup and restore mechanism, and a plugin interface to customize offer matching.

Backup & Restore

Marathon 1.5 adds built-in backup and restore functionality. The complete current state of Marathon, which is kept in the persistent data store, can be backed up to an external file or to an external storage provider. Restoring from a backup brings Marathon back to the exact state it was in when the backup was created.

For detailed information, please see the related Marathon docs page.
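As a rough illustration of how a backup might be triggered, the sketch below only builds the request URL and does not send anything. It assumes that backup and restore are requested through `backup` and `restore` query parameters on a `DELETE /v2/leader` call, which also makes the current leader abdicate; please verify this against the docs for your Marathon version. The host name and file path are made up.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def leader_request_url(marathon_base, backup=None, restore=None):
    """Build the URL for a DELETE /v2/leader request.

    Assumption: Marathon reads the `backup` and `restore` query parameters
    on this endpoint. Nothing is sent over the network here.
    """
    params = {}
    if backup is not None:
        params["backup"] = backup
    if restore is not None:
        params["restore"] = restore
    url = marathon_base.rstrip("/") + "/v2/leader"
    if params:
        url += "?" + urlencode(params)
    return url


# Hypothetical host and backup location, for illustration only.
backup_url = leader_request_url(
    "http://marathon.example:8080",
    backup="file:///var/backups/marathon/state.bak",
)
print("DELETE", backup_url)
```

The URI is percent-encoded by `urlencode`, so it survives being passed as a query parameter.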

Unreachable Strategy

Recent changes in Apache Mesos introduced the ability to handle the temporary unavailability of an agent. In this case, the (Marathon) tasks running on that agent are placed in the TASK_UNREACHABLE state. This behavior allows a node to disconnect from and reconnect to the cluster without having its tasks replaced. To give a task the chance to reconnect to the cluster, the default configuration waits 75 seconds before restarting that task. Prior to the TASK_UNREACHABLE state, Marathon would usually restart the task in less than a second.

To make this behavior flexible, it is now possible to configure unreachableStrategy for apps and pods to replace unreachable apps or pods either instantly or after a custom timeout (during which the task might already have become reachable again).
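A minimal sketch of what such a configuration could look like in an app definition, built here as a Python dict and printed as JSON. It assumes the `unreachableStrategy` object takes the fields `inactiveAfterSeconds` and `expungeAfterSeconds`, and that the expunge timeout may not be shorter than the inactive timeout; the app id, command, and timeout values are made up.

```python
import json


def with_unreachable_strategy(app, inactive_after_seconds, expunge_after_seconds):
    """Return a copy of `app` with an unreachableStrategy attached.

    Assumption: Marathon expects expungeAfterSeconds >= inactiveAfterSeconds,
    so this helper rejects other combinations up front.
    """
    if inactive_after_seconds < 0 or expunge_after_seconds < inactive_after_seconds:
        raise ValueError("need 0 <= inactiveAfterSeconds <= expungeAfterSeconds")
    updated = dict(app)
    updated["unreachableStrategy"] = {
        "inactiveAfterSeconds": inactive_after_seconds,
        "expungeAfterSeconds": expunge_after_seconds,
    }
    return updated


# Hypothetical app definition, for illustration only.
app = {"id": "/example/web", "cmd": "sleep 3600", "cpus": 0.1, "mem": 32, "instances": 2}

# Replace unreachable instances right away.
replace_immediately = with_unreachable_strategy(app, 0, 0)

# Give the agent five minutes to come back before starting a replacement.
wait_five_minutes = with_unreachable_strategy(app, 300, 600)

print(json.dumps(wait_five_minutes, indent=2))
```

The first variant suits stateless services that should recover quickly; the second suits tasks for which a duplicate instance is more costly than a short outage.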

Networking

Marathon 1.5 introduces several networking improvements to better support multiple container networks. To that end, the field networkNames has been added to the app container’s ContainerPortMapping and to the pod Endpoint.

Additionally, container port discovery has been improved: a pod or app can now specify which container network(s) a port name, protocol, etc. is associated with. Discovery labels are now generated for the container networks associated with ports.
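To make this more concrete, here is a sketch of an app definition that joins two container networks and ties a port mapping to one of them via networkNames. It assumes the 1.5 app format with a top-level `networks` list and `container.portMappings`; the network names, image, and app id are made up, and the small helper is only an illustrative sanity check, not something Marathon provides.

```python
import json

# Hypothetical app attached to two container networks.
app = {
    "id": "/example/multi-net",
    "cpus": 0.1,
    "mem": 64,
    "networks": [
        {"mode": "container", "name": "frontend-net"},
        {"mode": "container", "name": "backend-net"},
    ],
    "container": {
        "type": "MESOS",
        "docker": {"image": "nginx"},
        "portMappings": [
            {
                "containerPort": 80,
                "name": "http",
                "protocol": "tcp",
                # Associate this port with only one of the two networks.
                "networkNames": ["frontend-net"],
            }
        ],
    },
}


def undeclared_network_names(app_definition):
    """Return networkNames used in port mappings but not declared under `networks`."""
    declared = {n["name"] for n in app_definition.get("networks", []) if "name" in n}
    missing = set()
    for mapping in app_definition.get("container", {}).get("portMappings", []):
        for name in mapping.get("networkNames", []):
            if name not in declared:
                missing.add(name)
    return sorted(missing)


print(json.dumps(app, indent=2))
print("undeclared networks:", undeclared_network_names(app))
```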

Unfortunately, this introduces some breaking changes. The following deprecated fields will no longer be generated for app JSON:

  • ipAddress
  • container.docker.portMappings
  • container.docker.network
  • ports
  • uris

Marathon will continue to accept old app JSON containing these fields, as it did in 1.4. However, applications that use deprecated fields will be normalized into a canonical representation, so external tooling can no longer rely on these fields and needs to be adjusted.
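Tooling that still produces or consumes the old format could be audited with a small check like the one below. This is only a sketch: the list of paths mirrors the deprecated fields named above (with uris written in lowercase, as it appears in app JSON), and the sample app is made up.

```python
# Paths of the deprecated fields listed above, as nested JSON keys.
DEPRECATED_PATHS = [
    ("ipAddress",),
    ("container", "docker", "portMappings"),
    ("container", "docker", "network"),
    ("ports",),
    ("uris",),
]


def find_deprecated_fields(app):
    """Return the dotted paths of deprecated fields present in an app definition."""
    found = []
    for path in DEPRECATED_PATHS:
        node = app
        for key in path:
            if not isinstance(node, dict) or key not in node:
                break
            node = node[key]
        else:
            # The inner loop finished without a break, so the whole path exists.
            found.append(".".join(path))
    return found


# Hypothetical 1.4-style app definition.
legacy_app = {
    "id": "/example/legacy",
    "ports": [0],
    "container": {"type": "DOCKER", "docker": {"image": "nginx", "network": "BRIDGE"}},
}

print(find_deprecated_fields(legacy_app))
```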

See the networking documentation for details concerning the new API.

File-based Secrets

Marathon provides a pluggable interface to integrate secret store providers such as Vault. With Marathon 1.5, this interface has been extended to support file-based secrets, which can be mounted into the Mesos sandbox.

Please note that there is not yet an OSS implementation of this interface. For detailed information, please see the related Marathon docs page.
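For orientation, an app definition using a file-based secret might look roughly like the sketch below. It assumes a top-level `secrets` map and a container volume that references a secret by name through a `secret` key; the secret name, source path, and app id are made up, and the helper is only an illustrative consistency check. A secrets plugin still has to be installed for such a definition to work.

```python
import json

# Hypothetical app that mounts a secret as a file in its sandbox.
app = {
    "id": "/example/secret-app",
    "cmd": "cat db-password && sleep 3600",
    "cpus": 0.1,
    "mem": 32,
    "container": {
        "type": "MESOS",
        "volumes": [
            # The secret's content is expected to appear at this relative path.
            {"containerPath": "db-password", "secret": "dbPassword"}
        ],
    },
    "secrets": {
        # Maps the local name to a location in the secret store (made-up path).
        "dbPassword": {"source": "/example/db/password"}
    },
}


def undeclared_secret_volumes(app_definition):
    """Return secret names referenced by volumes but missing from `secrets`."""
    declared = set(app_definition.get("secrets", {}))
    volumes = app_definition.get("container", {}).get("volumes", [])
    return sorted(
        {v["secret"] for v in volumes if "secret" in v and v["secret"] not in declared}
    )


print(json.dumps(app, indent=2))
print("undeclared secrets:", undeclared_secret_volumes(app))
```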

Customizable offer matching

Marathon now has a pluggable interface for custom logic during offer matching. Such plugins can be used to provide custom filters for offers, e.g., for these use cases:

  • Analytics. If a task fails, for example, 5 times within 5 minutes, we can assume that it will fail again and reject new offers for it.
  • Binding to agents. For example, agents can be marked as belonging to a primary or secondary group, and a task can be marked with a group name. A plugin can then schedule task deployment to primary agents and, if all primary agents are busy, fall back to secondary agents.
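The decision logic behind these two use cases can be sketched in a few lines. The code below is not the Marathon plugin interface, which lives on the JVM; it is a standalone Python illustration with made-up types (`Offer`, `Task`) and thresholds, meant only to show the kind of filtering such a plugin could perform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    agent_id: str
    group: str  # "primary" or "secondary" (hypothetical agent marker)
    free_cpus: float


@dataclass(frozen=True)
class Task:
    task_id: str
    cpus: float


def too_many_recent_failures(failure_times, now, max_failures=5, window_seconds=300):
    """Analytics filter: True if the task failed max_failures times within the window."""
    recent = [t for t in failure_times if 0 <= now - t <= window_seconds]
    return len(recent) >= max_failures


def pick_offer(task, offers):
    """Binding filter: prefer a fitting primary offer, fall back to secondary."""
    fitting = [o for o in offers if o.free_cpus >= task.cpus]
    for group in ("primary", "secondary"):
        for offer in fitting:
            if offer.group == group:
                return offer
    return None  # no agent in either group can host the task


offers = [
    Offer("agent-1", "primary", free_cpus=0.2),   # primary, but too busy
    Offer("agent-2", "secondary", free_cpus=2.0),
]
task = Task("web.1", cpus=1.0)

print(pick_offer(task, offers))
print(too_many_recent_failures([0, 60, 120, 180, 240], now=250))
```

In this example the primary agent lacks capacity, so the task falls back to the secondary agent, and five failures inside the five-minute window cause further offers to be rejected.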

Outlook for 1.5.2

With Marathon 1.5.2, Mesos maintenance primitives will become fully usable. These maintenance primitives allow operators to configure scheduled maintenance or unavailability windows for Mesos agents.
In Marathon 1.5.2, it will be possible to opt in to this feature so that Marathon respects possible unavailability. During the configurable draining time, no new tasks will be started on the affected agents.

We hope you enjoy Marathon 1.5. If you want more details, check out the release notes! If you discover any issues or want to provide feedback, please use Jira or the mailing list.