Questions Customers Have About YAKS (Yet Another Kubernetes Service)

Understanding challenges facing Kubernetes operators and users helps you get the most out of your implementation.

Editor’s note – This blog post reflects our Kubernetes efforts in 2017. To learn more about Kubernetes on DC/OS, see Kubernetes-as-a-Service Now Available in DC/OS 1.11.

Kubernetes is now officially a commodity. As of last week, all the top vendors and providers now offer a solution around Kubernetes. The Cloud Native Computing Foundation (CNCF) recently began certifying Kubernetes distributions to ensure conformance. To the benefit of customers, the majority of the market is focused on pure play Kubernetes which, again, makes the API and CLI an interchangeable commodity that can be used anywhere. This consensus also means that users can now focus on implementing the solution they are trying to deliver rather than figuring out how to get up and running with a specific implementation of Kubernetes. I look forward to hearing how users are tackling this next set of hard problems this week during KubeCon US 2017.

Over the past two years I have talked to thousands of Kubernetes users across the globe, and they have consistently wanted to focus more on their planned higher-level services than on orchestration. Customers want to know things like “will this work for a planned IoT business that relies heavily on multiple real-time analytics solutions?” or “how can I scale across data centers so my customers don’t experience an outage?”, and they are not purists about how they get there. Kubernetes is a tool to them. A great tool. A tool that is now ubiquitous but is still only part of the solution they are trying to deliver to their own customers.

The questions users have are hard because the solutions they are trying to develop require multiple, and an ever growing number of, modern open source services. There are common threads to the questions I’ve heard from Kubernetes users. Here is a summary of a few topics I’ve heard from users:

  1. How do I run multiple interconnected Kubernetes clusters for high availability? The number one question you will hear from people starting to build a solution on top of Kubernetes is how they will make it resilient for customers. Unfortunately, federation in Kubernetes today is in the same realm as “Kubernetes on Kubernetes” and “OpenStack on Kubernetes.” It is merely an engineering fantasy without any tether to the reality of customer demands. What are most people asking for with regard to federation? They have trouble verbalizing it, but I’d break it down into three things:

    a) Kubernetes Cluster Management – This isn’t the same as a Kubernetes Dashboard of Dashboards. Users want to manage the clusters and not the hosted objects regardless of cluster location, lifespan or intended use. They want a way to easily spin those clusters up anywhere, interconnect them if they must, and manage their entire lifecycle. Most importantly they want these clusters to be largely identical across environments.

    b) Granular Monitoring and Aggregated Logging – Users want robust monitoring on both the hosted Kubernetes objects, via their application performance management (APM) solution, and for the underlying Kubernetes infrastructure itself. There are third party tools like DataDog, Sysdig and open source Prometheus from the CNCF that accomplish this goal.

    c) Deployment Workflow to Many Environments – Customers are asking for a workflow for loosely coupled, eventually consistent, homogeneous clusters instead of tight orchestration among clusters in a distributed system. The difference is you won’t have “bursting” or a single cluster spanning multiple data centers. Users want to know that if an availability zone or environment goes down, their higher-level service will still be up and running because it is also hosted on an identical cluster somewhere else. Their customers won’t experience an outage. They also want to know that deployments will work across clusters, and these clusters must be homogeneous for that to work.

    Currently Kubernetes federation is complex and doesn’t really get to the heart of most of the issues listed above. To federate clusters right now, you host a federation control plane (its own API Server, Scheduler and Controller Manager) on one of the Kubernetes clusters. Even with the new OSS tool (kubefed) to help, that’s not easy. Once you have it working, many of the kubectl commands are not supported and the entire architecture is fragile. The current federation approach provides some of the deployment workflow, when it works, but addressing only part of the above concerns is not enough.

  2. How do I use Kubernetes as a component for my single application? Probably one of the most surprising things you hear from users is they are using Kubernetes to support a single application and not as a generalized container service. Around 80% of the users I have talked to fall into this bucket. Many of the case studies I see from large organizations have many siloed teams working on different services using Kubernetes. That is very different from the original intent of using Kubernetes to run the entire data center like Google.
  3. How do I integrate my load balancer and identity management? The single most cited reason people I have talked to gave for using a particular Kubernetes distribution, over simply using the vanilla version, is out-of-the-box load balancer and identity management integrations. All the other features, such as single vendor user interfaces, additional storage features and the like, never come up. There needs to be better integration with load balancers and identity management solutions if customers are going to use the pure upstream version that is compatible across clouds.
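
The deployment workflow described in 1(c) can be illustrated with a small sketch. It is not a Kubernetes or DC/OS API: the Cluster class and the rollout and serving functions are hypothetical stand-ins, and a cluster is simulated in memory. The point is only to show the shape of the idea: the same release is applied to each homogeneous cluster independently, an unreachable cluster is simply retried later, and the service stays available from whichever clusters already run it.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical in-memory stand-in for one independent cluster."""
    name: str
    reachable: bool = True
    applied: dict = field(default_factory=dict)  # app name -> version

    def apply(self, app: str, version: str) -> None:
        # Applying the same version twice is harmless (idempotent).
        if not self.reachable:
            raise ConnectionError(f"{self.name} is unreachable")
        self.applied[app] = version


def rollout(clusters, app, version):
    """Apply one release to every cluster, with no cross-cluster coordination.

    Returns the names of clusters that could not be updated, so a later
    pass can retry them (eventual consistency).
    """
    pending = []
    for cluster in clusters:
        try:
            cluster.apply(app, version)
        except ConnectionError:
            pending.append(cluster.name)
    return pending


def serving(clusters, app, version):
    """Names of reachable clusters that currently run the given version."""
    return [c.name for c in clusters
            if c.reachable and c.applied.get(app) == version]


if __name__ == "__main__":
    fleet = [Cluster("us-east"), Cluster("eu-west", reachable=False)]
    print("retry later:", rollout(fleet, "analytics", "v2"))
    fleet[1].reachable = True  # the second environment comes back
    print("retry later:", rollout(fleet, "analytics", "v2"))
    print("serving:", serving(fleet, "analytics", "v2"))
```

Because each cluster is updated on its own, losing one environment delays that cluster’s update but never blocks the others, which is the trade users are asking for when they prefer identical clusters over one cluster stretched across data centers.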

Kubernetes, Fast Data Services and Machine Learning with DC/OS

People incorrectly perceive Apache Mesos and Kubernetes as competing technologies in the container orchestration space.

The Mesos architecture separates the scheduling of Kubernetes, TensorFlow, Cassandra and other services from the underlying resource management. The benefits are that organizations can automate the feeding of those clusters as they run out of resources, update them with no downtime, and spin up new clusters as demand requires.
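
To make that separation concrete, here is a heavily simplified sketch of offer-based, two-level scheduling. It is not the Mesos API: Offer, Framework and offer_round are hypothetical names, only CPUs are modeled, and frameworks are offered resources in a fixed order. The resource-management layer only tracks what is free and offers it; each service’s own scheduler decides how much to accept.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Free capacity on one agent, as advertised by the resource layer."""
    agent: str
    cpus: float


class Framework:
    """Hypothetical service scheduler (for example Kubernetes or Cassandra)."""

    def __init__(self, name, cpus_needed):
        self.name = name
        self.cpus_needed = cpus_needed
        self.placements = []  # (agent, cpus) pairs this framework accepted

    def consider(self, offer):
        """Return how many CPUs to take from the offer (0 means decline)."""
        take = min(self.cpus_needed, offer.cpus)
        if take > 0:
            self.cpus_needed -= take
            self.placements.append((offer.agent, take))
        return take


def offer_round(free_cpus, frameworks):
    """Offer each agent's free CPUs to the frameworks in turn.

    The resource layer never decides placement itself; it only updates
    its bookkeeping with whatever each framework accepted.
    """
    for agent in free_cpus:
        for framework in frameworks:
            if free_cpus[agent] <= 0:
                break
            taken = framework.consider(Offer(agent, free_cpus[agent]))
            free_cpus[agent] -= taken
    return free_cpus
```

Calling offer_round({"agent-1": 4.0, "agent-2": 4.0}, [Framework("kubernetes", 3.0), Framework("cassandra", 2.0)]) lets both services claim capacity from the same pool without either one knowing about the other. Adding an agent only means adding an entry to the free pool.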

Mesosphere DC/OS provides a proven technology (Apache Mesos) with more than 7 years in production in small, medium and large scale deployments. DC/OS’s maturity allows customers to go into production easily and quickly with confidence.

DC/OS has many advantages when building out solutions that include Kubernetes:

  • Create New Clusters Anywhere – DC/OS allows users to spin up identical Kubernetes clusters, and supporting data service components, as needed in any environment. Each team can have its own isolated cluster or a single team can spin up multiple clusters for different life cycles. Clustered services can be updated to newer versions with zero downtime while capacity can be added as needed.
  • Connected Service Components – modern applications rely on an ever growing list of open source data services, for example distributed databases like Cassandra and MongoDB, and stream processing like Kafka and Spark. Getting the most out of these services often requires machine learning from additional frameworks like TensorFlow and PyTorch. DC/OS supports a variety of frameworks that can be spun up, connected, and accessed immediately. The service catalog is enabled by a powerful SDK that lets OSS communities, customers, and DC/OS users integrate their solutions on the platform with minimal code and deploy them with a single click. A full list of DC/OS packages can be found in the service catalog https://universe.dcos.io/#/packages.
  • Load Balancing for Thousands of Nodes – Securely exposing services in a containerized distributed environment is difficult due to its dynamic nature. Recently, Mesosphere released Edge-LB that allows many siloed teams to have access to a load balancer that was built with scalable microservices and containers in mind. With Edge-LB, teams don’t need to worry about collisions due to identically named services or about scaling the load balancer itself because Edge-LB is hosted on DC/OS and is as scalable as every other framework. It conforms to and expands the Container Network Interface (CNI) standard so plugins like Calico can utilize it.

Learn more about Kubernetes

Written by the founders of Kubernetes, “Kubernetes: Up & Running” is the definitive guide to containerized application development. This free eBook excerpt will help you understand how to build, deploy, and maintain more reliable, scalable distributed systems.