@JacobAMason · Last active November 30, 2017

Microservice Architecture with Docker

Microservice architecture is a good solution for companies trying to maximize their resources, both human and machine, to develop clean, maintainable, scalable, and available code. To take advantage of this architecture, we must first understand it and then learn how to master it. Microservice architecture is a solid, modern approach, and Docker is the best way to get started with it.

Microservice Architecture

Microservice Architecture is a modern architecture designed explicitly for independence, scalability, and reliability through system failure. In their definitive exposé on microservices [1], James Lewis and Martin Fowler describe the “common characteristics” that such an architecture should possess. They define microservices as a style of architecture for “developing a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms, often an HTTP resource API” [1]. This single-sentence definition is succinct: microservices are just miniature services that make up an application. However, the power of the microservice architecture is found in the specific way these services are connected and how they are separated.

The Key Features of Microservice Architecture

The first key feature that a microservice architecture provides is interoperability. Because each microservice communicates through a high-level API such as HTTP, instead of something lower level like Java's RMI, the service is completely independent. Designers of these individual services are free to make their own lower-level decisions, from the programming language used to write the service to the physical hardware running it. A Raspberry Pi could take part in a microservice architecture alongside a large cloud cluster, because the HTTP-based API allows communication between these two disparate platforms.
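To make this concrete, here is a minimal sketch of what such a language-agnostic service boundary might look like. The framework (Flask), the route, and the in-memory data are illustrative assumptions, not part of any cited design; the point is only that any client able to speak HTTP and JSON can consume the endpoint, whatever language or hardware it runs on.

```python
# A minimal, illustrative HTTP microservice. Flask, the route name, and the
# sample data are assumptions made for this example only.
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical in-memory data; a real service would own its own data store.
INVENTORY = {"widget-1": {"name": "Widget", "count": 42}}

@app.route("/inventory/<item_id>")
def get_item(item_id):
    item = INVENTORY.get(item_id)
    if item is None:
        return jsonify({"error": "not found"}), 404
    return jsonify(item)

if __name__ == "__main__":
    # Listen on all interfaces so other services (or containers) can reach it.
    app.run(host="0.0.0.0", port=5000)
```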

The second powerful feature of microservice architecture is the way services are segregated. A major decision any architecture faces is how to divide responsibilities: which components will be responsible for which requirements. Lewis and Fowler remark that when management breaks down application requirements for individual teams, it tends to break them up by technology (e.g., UI, database, security) [1]. They further suggest that such an architecture will inevitably follow Conway's Law, which says, “Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure.” This might be convenient for short-term applications, but it is less ideal in the long term, and most software teams want their applications to work in the long term. Instead of separating responsibilities by technical necessity, the microservice architecture divides services by their “business capability” [1]. This results in team independence: the UI team, for example, no longer depends on another team to design and implement the database for its application. A team must be full-stack and, as a result, has much more autonomy in making design and implementation decisions.

Benefits of Microservice Architecture

Clearly, the main theme of microservice architecture is independence. The benefits of design and implementation independence are numerous.

The first benefit has already been mentioned: teams responsible for their own full-stack processes are free to choose the implementation tools, libraries, and languages¹ that they want, and the final products can be distributed across many types of physical hardware, including cloud computing resources.

Secondly, the coupling between modules is reduced drastically. Since services communicate with each other through a high-level API, they have no build-time dependencies on each other. As long as API evolution doesn't introduce breaking changes, a service can gain new functionality and receive maintenance without ever interfering with other services.
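As a small, hedged illustration of non-breaking API evolution, consider a consumer of the hypothetical inventory endpoint sketched above. The service can add a new field to its JSON response without disturbing existing callers, because they simply ignore what they do not read (the requests library and the service hostname are assumptions for the example):

```python
import requests  # assumed HTTP client; the hostname below is hypothetical

resp = requests.get("http://inventory-service:5000/inventory/widget-1").json()

# Additive, backward-compatible evolution: the service begins returning an
# extra "reserved" field. Old clients that only read "count" keep working,
# while newer clients can opt in to the new data.
count = resp["count"]
reserved = resp.get("reserved", 0)  # tolerate older service versions as well
```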

Another benefit of microservices, which is also a necessity due to structure, is decentralized data. In a monolithic architecture, it is not uncommon for all application data to reside in a single database. In the microservice style, each service has its own data store, and these data stores are very rarely shared. With the rise in understanding and popularity of NoSQL, this segregation of data allows individual services to store their data using a tool that best fits their data, whether it be SQL, NoSQL, or something different entirely, like Elasticsearch.

Yet another benefit of this style of architecture is that it can be designed for failure. To clarify, I will use the example of Chaos Monkey, developed by Netflix. In 2015, Time magazine reported that 37% of North American internet traffic “during peak evening hours” consisted of Netflix streaming video [2]. Because the demand for Netflix video is so great, and because the content Netflix distributes must be consumed in real time, Netflix must have a very reliable, low-latency system. To meet this need, Netflix relies on Amazon's AWS cloud computing platform. While AWS provides scalability and distributed replication, Netflix's applications must still be designed to work together in such an environment. Netflix's concern in this scenario is that a single service failure could bring down its network. Netflix says that its “best defense against major unexpected failures is to fail often. By frequently causing failures, we force our services to be built in a way that is more resilient” [3]. And Netflix does cause failures, literally: the Chaos Monkey tool's task is to randomly select a running service and terminate it. Chaos Monkey terminates only small parts of the system at a time so that no customers are impacted, but if the system were not designed to keep working through these intentional failures, it would quickly be brought to its knees. Since the whole is the sum of its parts, each service must be able to compensate for the failures of other services, as well as accept new services spun up to take the place of those which were defective.
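A sketch of what designing for failure can look like at the code level is shown below: the caller sets a short timeout, retries, and falls back to a harmless default when a dependency is unreachable. The service name, URL, and retry policy are invented for illustration and are not taken from Netflix's actual implementation.

```python
import requests
from requests.exceptions import RequestException

def fetch_recommendations(user_id, retries=2):
    """Call a hypothetical recommendations service, tolerating its failure.

    If the service is down (terminated by something like Chaos Monkey, say),
    return a static fallback instead of failing the whole request.
    """
    url = f"http://recommendations:5000/users/{user_id}/recommendations"
    for _ in range(retries + 1):
        try:
            resp = requests.get(url, timeout=0.5)  # fail fast rather than hang
            if resp.ok:
                return resp.json()
        except RequestException:
            pass  # transient failure; retry, then fall back below
    return {"recommendations": [], "fallback": True}
```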

Alternatives to Microservice Architecture

As we can see, microservice architecture brings about very positive characteristics in the systems that follow the pattern. Microservices are perceived as a step forward in software architecture, but it is worth considering which architectural styles, and which of their benefits, a transition to microservices might leave behind.

Microservice architecture falls under the umbrella of Service-Oriented Architecture (SOA). Some equate the two, claiming that “microservice” is just a buzzword invented to describe a preexisting architectural pattern. Lewis and Fowler say in [1] that this isn't too far from the mark, but that SOA can also imply the use of a design structure called the Enterprise Service Bus (ESB), or as Jim Webber has referred to it, the “Egregious Spaghetti Box” [4]. While the ESB pattern was originally designed to be much closer to the microservice pattern we see today, vendors began selling hub-and-spoke Enterprise Application Integration middleware as ESBs and, according to Matt McLarty of InfoWorld, “customers ate them up” [5]. The resulting structure more closely resembled that of a monolithic architecture.

Monolithic architecture might sound old and unusable, but it has its place. A monolith produces a single executable output designed to run on one physical or virtual machine. A monolithic style is acceptable when the application is small and the solution is evident. In fact, monolithic architecture is often a preferable pattern when designing a new system. Fowler contends in [6] that successful microservice architectures have grown out of a monolith, and that projects which began as pure microservices tend to fall apart. Microservices require a component separation based on business value, and since the business value of components is shaped by customers, it is easy to get the separation of responsibilities wrong. Fowler reasons that the cost of using microservices from the start of a project is often too great: the effort of maintaining a set of microservices will inhibit progress. Instead, he says teams should consider a “monolith-first strategy, where you should build a new application as a monolith initially, even if you think it's likely that it will benefit from a microservices architecture” [6]. Fowler outlines a few variations on the monolith-first strategy for moving from a monolith to microservices: design the monolith with components in mind and then refactor those components out; use the Strangler Pattern to “peel off” components, implementing each new component as a microservice; or discard the monolith entirely and rewrite it as microservices from scratch (the benefit here is that the original monolith design does not have to anticipate a later refactoring, which can help a company be first to market).
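The sketch below is one hedged interpretation of the Strangler Pattern inside a hypothetical monolith: a single route has been peeled off and is proxied to a new microservice, while the remaining routes are still served in-process. The framework, route names, and service hostname are assumptions for illustration only.

```python
from flask import Flask, jsonify
import requests

app = Flask(__name__)

@app.route("/billing/<invoice_id>")
def billing(invoice_id):
    # Already extracted: forward the request to the new billing microservice.
    upstream = requests.get(f"http://billing-service:5000/invoices/{invoice_id}")
    return upstream.content, upstream.status_code, {"Content-Type": "application/json"}

@app.route("/orders/<order_id>")
def orders(order_id):
    # Not yet extracted: still handled inside the monolith itself.
    return jsonify({"order": order_id, "status": "shipped"})
```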

Monolithic architecture can be used as a stepping stone to microservices, but it is often incorrectly thought of as the only alternative to a microservice architecture. Often, any style of architecture that does not follow the microservice pattern is called a monolith, and this is a misconception. As an alternative to both monolithic and microservice architectures, it is possible to use a cookie-cutter architecture pattern as documented by Paul Hammant in [7]. Instead of scaling each service independently, the entire system can be scaled at once, replicating the processes again and again across machines as demand increases. It is also possible to design components that interact over a common, language-specific interface (as opposed to HTTP) inside of a single process and then scale that process to many machines. Neither of these variations of SOA is a true microservice architecture or a monolith: the components are independently built processes or services (so not a monolith), but they are not independently scaled or deployed (so not microservices). Cookie-cutter architecture can be easier to use and understand, so it serves as an alternative to both microservices and monoliths. For a depiction of how monoliths differ from microservices, see Figure 1.

Figure 1: Difference between how Monoliths and Microservices scale [1]

Microservice Architecture Criticism

Though microservices have many benefits and the alternatives seem like inferior solutions, there are still some drawbacks to adopting the microservice architecture. First, there is some duplication of effort. Benjamin Wootton, CTO of Contino, points out in [8] that changing requirements which span services are difficult to implement. A new service could be created to fulfill the system-wide change, but that would introduce synchronous coupling², which is undesirable. A common library shared between the running services could avoid this coupling, but since microservice systems are often polyglot, this solution is frequently not possible. The only remaining option that avoids synchronous coupling is to duplicate the work across services. This is not a problem with monolithic applications.

Next is the issue of testability and asynchronicity. Unlike monoliths and some SOA designs, microservices communicate primarily through asynchronous calls. This results in a more complicated system, especially when testing a service that relies on other services, because it is hard to mock the production environment. Wootton says that instead of placing a large emphasis on testing, developers focus on monitoring their services in production to find problems and ensure correctness [8]. Eugene Dvorkin points out in [9] that monitoring itself can be a difficult problem for a distributed system; it can be hard to aggregate metrics and error logs across running services. Fortunately, Dvorkin provides an answer to these problems, telling his readers to use prebuilt solutions like Logstash and Riemann.
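A brief sketch of the testing point: rather than standing up the real dependency, a test can stub the HTTP call so the service under test is exercised in isolation. The module and function names below refer to the hypothetical fetch_recommendations helper sketched earlier and are assumptions, not an established project layout.

```python
import unittest
from unittest import mock

import requests

import myservice  # hypothetical module containing fetch_recommendations


class RecommendationsTest(unittest.TestCase):
    @mock.patch("myservice.requests.get",
                side_effect=requests.exceptions.ConnectionError)
    def test_falls_back_when_dependency_is_down(self, mock_get):
        # The stubbed HTTP call fails, so the helper should use its fallback.
        result = myservice.fetch_recommendations("user-1")
        self.assertEqual(result["fallback"], True)


if __name__ == "__main__":
    unittest.main()
```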

However, the largest drawback, by far, is the complexity of microservices and the operational cost that results directly from it. Microservice architecture requires skilled DevOps and systems networking teams to deploy the very complex final product that such an architectural style produces, and its polyglot nature only complicates the process further. Wootton says the operations team must be “embedded within your development team” as developers are constructing a full-stack service using whichever tools they desire [8]. The costs of deploying, monitoring, and provisioning hardware for all the services also add up and can slow the entire development process [9]. This is the caveat of the microservice architecture: it is certainly possible to build a great distributed system, but developers must have the know-how and the tooling to maintain it. A key tool used by most successful adopters of the microservice architecture to bring their projects to life is the Linux container.

Docker and Linux Containers

According to Red Hat, “A Linux® container is a set of processes that are isolated from the rest of the system, running from a distinct image that provides all files necessary to support the processes” [10]. The author's opinion is that Docker is probably the best container option available, is superior to VMs in nearly every way, and can be orchestrated across multiple physical systems.

Key Docker Concepts

The first thing newcomers to Docker must grasp is that a container is not a Virtual Machine (VM). Whenever virtualization is used to run a piece of software, the host operating system must run a hypervisor and then run the guest operating system within the context of that hypervisor. In this setup, the host machine is responsible for emulating every part of the guest machine, including the kernel. This is an acceptable option for a single developer using a single VM. The problem is that VMs do not scale well: because the guest operating system must be simulated over and over again for each service, such an infrastructure spends a considerable amount of computing resources on this overhead. Containers solve this problem by sharing the kernel. Each new container can be thought of as a separate process running in isolation from the other processes on the system. See Figures 2 and 3 for a comparison of how containers and VMs are deployed on a system.
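The difference is easy to see from the command line. The commands below are ordinary Docker CLI invocations (the alpine image is just a small example); each one starts a short-lived container that is nothing more than an isolated process sharing the host's kernel, rather than a booted guest operating system.

```sh
# Run a single command in an isolated container and remove it when it exits.
docker run --rm alpine echo "hello from a container"

# List processes as seen from inside a container: only the container's own
# process tree is visible, because it is isolated from the rest of the host.
docker run --rm alpine ps aux
```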

Figure 2: A representation of deployed Docker containers [19]

Figure 3: A representation of deployed VMs [19]

The second concept is that of images. Docker uses images to assist in container composition (building a new container from an existing container). Images are versioned and are the artifact distributed both to developers and to production servers. Docker, as a piece of software, is made of two primary components: the Docker daemon (sometimes called the Docker Engine) and the Docker CLI. The CLI is used to instruct the daemon; the daemon keeps track of all running containers and is responsible for their creation and destruction. The daemon can also build images from template files called “Dockerfiles.” This is the value that Docker adds to Linux containers that so many find useful.
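Below is a minimal, illustrative Dockerfile for the hypothetical inventory service sketched earlier; the base image, file names, and port are assumptions made for the example.

```dockerfile
# Illustrative Dockerfile for the hypothetical inventory service sketched above.
# Base image providing the Python runtime.
FROM python:3.6-slim
WORKDIR /app
# Install dependencies first so this layer is cached between source changes.
COPY requirements.txt .
RUN pip install -r requirements.txt
# Copy the service's own source code into the image.
COPY . .
EXPOSE 5000
# The single process this container runs when it starts.
CMD ["python", "inventory.py"]
```

Running `docker build -t inventory-service:1.0 .` asks the daemon to build a versioned image from this template, and `docker run -d -p 5000:5000 inventory-service:1.0` starts a container from it.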

The Benefits of Docker

Docker provides many benefits due both to the nature of containers and the templating script that Docker uses to construct images. Docker is lightweight and fast, can be used for Continuous Delivery (CD) and Continuous Integration (CI), and can be deployed nearly anywhere.

Docker gets its speed from its container underpinnings. Because containers do not require an extensive hypervisor to simulate a full OS, they are very quick. Docker containers can start in a matter of seconds, which is fantastic compared to the very long start times of VMs.

Docker is not only used for the production environment, but also for the local environment running on a developer's laptop. A developer can create an application in the context of an image and then ship that image directly to production. In fact, because the image is built from a template (the Dockerfile), the image can be versioned along with the source code that is built inside the image. This has an important implication: if the instructions to build the environment always travel with the source code, the code can be tested in the same environment it will run in when it reaches production. Once configured, CD and CI tooling can therefore deploy applications without user intervention. Most importantly, the deployment process is identical for every containerized application, regardless of its internal contents.
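As a hedged sketch of how a CI pipeline might exercise this property, the steps below build the image from the versioned Dockerfile, run the test suite inside that exact image, and push it to a registry; the registry host, tag scheme, and test command are assumptions, not taken from any particular project.

```sh
# Illustrative CI steps; registry host, tag scheme, and test runner are assumed.
docker build -t registry.example.com/inventory-service:"$GIT_COMMIT" .
docker run --rm registry.example.com/inventory-service:"$GIT_COMMIT" python -m pytest
docker push registry.example.com/inventory-service:"$GIT_COMMIT"
```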

Not only can Docker deploy any containerized application to the production server, it can deploy that image to nearly any server in the world³. In a manner similar to virtualization, Docker provides a foundation upon which any Docker image can be run as a container: the Docker Engine. Because it does not emulate the kernel, the Docker Engine can deploy images to almost any hardware device, as long as it provides the Linux kernel⁴. This set of compatible devices includes cloud computers. Many cloud computing platforms, such as those from Amazon and Google, support running Docker images by providing scalable instances with the Docker Engine already prepared.

Alternatives to Docker

With the advent of cloud solutions to the container management problem that Docker sought to solve, cloud providers and others have developed a few alternatives to Docker.

First, there is the alternative to containers: the VM. The drawbacks of a VM have already been discussed above, but there are still ways that VMs can perform at scale. Vagrant is nearly a suitable alternative to Docker, but it is not a container management platform. Vagrant admits that containers are faster, but counters that Docker can only run Linux images, and only for certain distributions [11]; Vagrant claims it is better for building a development box when that development box is Windows. The biggest difference is that Vagrant still creates and manages VMs, which is not as lightweight or as fast as Docker's containers. The author personally finds Vagrant lacking, even though he used Vagrant before he ever knew what a microservice was. Vagrant's few helpful utilities and selling points are quickly being subsumed by Docker, especially after Docker acquired Unikernel Systems and formed a partnership with Microsoft in 2016 [12].

Second, there are alternative containerization frameworks besides Docker, though they are not in direct competition since they serve slightly different purposes. While the Docker daemon is capable of handling containers on a single machine, developers wanted to use containers in production, where they run across many machines; coordinating this is known as container orchestration. Several tools were developed for managing these containers, one of the first being Marathon on Apache Mesos, followed by Kubernetes. Docker then created Docker Swarm, which was eventually made part of the Docker Engine. These orchestration systems, while technically Docker alternatives, all run Docker containers, because the Docker container format has been accepted as the basis of the industry standard under the Open Container Initiative (OCI) [13]. Google, one of the members of the OCI, adopted containers on the Google Cloud Platform and, in order to offer them as a service, designed Kubernetes as an orchestration system, which it subsequently released as open-source software. This used to be an attractive feature, as Docker Swarm was not always open-source. Another container orchestration framework developed to handle Docker containers is Apache Mesos. Mesos is patterned after Google's Borg and Facebook's Tupperware, and Docker is just one of the many deployment environments it supports; Mesos uses an orchestration framework called Marathon to interface with the Docker Engine. To summarize, both Kubernetes and Apache Mesos running Marathon are really alternatives to Docker Swarm, not necessarily alternatives to the Docker Engine [14].
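For a concrete taste of orchestration, the commands below use Docker Swarm mode, which has shipped as part of the Docker Engine since version 1.12; the service name, image, and replica counts are illustrative only, and Kubernetes or Marathon would express the same intent in their own configuration formats.

```sh
# Turn this Engine into a (single-node) swarm manager.
docker swarm init

# Run three replicas of the service behind a published port.
docker service create --name inventory --replicas 3 -p 5000:5000 inventory-service:1.0

# Scale the service up as demand grows.
docker service scale inventory=10
```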

There is, however, an alternative to the Docker ecosystem. In 2014, many security holes were found in Docker, and though they were closed, this led some to develop a container ecosystem with a focus on security called rkt (pronounced “rocket”). The biggest security hazard that comes with using Docker is a privilege-escalation attack: because the Docker daemon runs as root, a vulnerability in a container could allow an attacker to escape the container and run in the context of the daemon (root). To properly lock down Docker, developers must use something like SELinux, but since SELinux involves lengthy setup as well as a fairly solid understanding of Linux, many developers simply do not do it. rkt does not rely on a long-running root daemon to create containers, so this particular vulnerability is not present. There are a couple of other benefits to using rkt, for example HTTPS image downloads with no registry necessary, but as with Vagrant, Docker is quickly closing the gap on its remaining shortcomings compared to rkt. Two advantages rkt used to have over Docker were that rkt was designed as a modular system while Docker was monolithic, and that rkt used an open container format while Docker's was proprietary; Docker has since become more modular and open because key pieces of its functionality were split out as part of the OCI [15].

Using Docker to Support Microservice Architecture

With an understanding of microservice architecture and Docker, the pieces of the puzzle come together. Recall the biggest criticisms of microservices: they are complex and therefore costly to manage once deployed, and they require support for many different languages and libraries. Containers allow many different languages and libraries to be deployed to any platform, and Docker helps manage those containers so that individual developers or development teams can dictate the production environment for their own service, rather than a DevOps team shouldering all the production management responsibilities (and the higher costs that come with them). It should be apparent by now that Docker is an excellent tool for creating and maintaining a microservice-style system. To illustrate Docker's effectiveness at meeting the needs of the microservice pattern, here is a simple example of how this is actually done, followed by case studies of successful implementations.

The most common components of any modern application are a web server, some underlying business logic, and a database. This could easily be represented with a three-tier architecture, but we can assume some additional complexity: many API endpoints, a need to consume outside data, the ability to send emails or push notifications, and storage of disparate data that requires both SQL and NoSQL. In essence, this could still be a very large application. Following Martin Fowler's advice, we would first design the application as a monolith. The three-tier architecture already gives some degree of separation between components. Once we understood how best to separate the responsibilities of our fictional application, we could use the Strangler Pattern to move functionality out of the monolith and into microservices. This is where Docker comes in: each new microservice would run inside a container and, as such, would have a versionable Dockerfile used to build the container not only for production, but also for CI testing and for onboarding every other developer who wishes to contribute to that microservice. To scale these microservices up, we could use Kubernetes, Mesos, or the built-in Docker Swarm. Individual teams that assumed responsibility for every part of their microservice would create and maintain the Docker image.
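A hedged sketch of how such a system might be described for local development and testing is a Compose file like the one below; the service names, build contexts, image, and ports are assumptions for the fictional application above, and in production the same images would be handed to Swarm, Kubernetes, or Mesos.

```yaml
# Illustrative docker-compose.yml for the fictional application; names,
# build contexts, and ports are assumptions made for the example.
version: "3"
services:
  web:
    build: ./web          # UI service, built from its own Dockerfile
    ports:
      - "80:5000"
  orders:
    build: ./orders       # a microservice peeled off the monolith
    depends_on:
      - orders-db
  orders-db:
    image: postgres:9.6   # each service owns its own data store
```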

There are several companies who have found the Microservices with Docker pattern to be very effective.

  1. PayPal, the online payment system co-founded by Elon Musk, now of Tesla and SpaceX fame, uses its own private servers for hosting data and everything else the company requires. This is understandable given the sensitive information it handles. PayPal also acquired Venmo (through its purchase of Braintree), and as a result handles well over 200 transactions per second. Peak times, such as holiday shopping seasons, can put even more variable strain on such a system. Today, PayPal uses Docker to “scale quickly, deploy faster, and one day even provide local desktop-based development environments with Docker.” [16]
  2. ADP, the human capital management provider, has over 35 million users. Most people associate ADP with payroll, but it offers far more functionality than that, depending on the size of the company and what HR services it requires. Like PayPal, ADP works with very sensitive data, especially Social Security numbers. It holds over 55 million Social Security numbers, and its payroll systems pay out nearly 2 trillion dollars a year. ADP is so ubiquitous that about ten percent of the United States Gross National Product passes through ADP's systems. ADP relies on Docker because Docker images can be digitally signed, so ADP can be confident that the code running on its servers has been verified. ADP also uses Docker Swarm to scale up and to create “an evolutionary path forward to micro services” by hybridizing its containers, using the Strangler Pattern to peel off functionality into containers. [17]
  3. General Electric is another company adopting Docker, in its Appliances division. Unlike ADP and PayPal, GE has been around for well over a century and maintains a considerable amount of legacy software and data. Before using Docker, it took at least 6 weeks to move a completed application to deployment. GE tried to use Puppet to stand up machines, but found it unwieldy. Because Docker allows deployment to any environment (not just one configured by tools like Puppet), GE was able to migrate over 60% of its datacenter within only 4 months. GE also saved resources by moving from VMs alone (using VMware) to containers, averaging 14 applications per deployment container, which meant it could build the new containerized system and continue running the remaining legacy applications on the same physical hardware. [18]

Whether for a new company, an old company, or a company handling sensitive data, Docker works well as a drop-in solution for microservice architecture. The best part about Docker is that it is FOSS: anyone can use it, from the biggest tech corporations down to an individual user on a personal laptop. ■

Footnotes
  1. A system environment that uses different languages for different services is called a polyglot environment.
  2. Synchronous coupling is used here to mean that a service A which depends on the result of another service B is blocked while waiting for service B to respond, due to the nature of the change introduced into A.
  3. At the time of writing, Docker's slogan is “Build, Ship, and Run Any App, Anywhere”—a fairly succinct description of the product.
  4. This isn't entirely true. Docker can run on Windows and OS X (now macOS), but not as efficiently as running on Linux natively.
References

[1] “Microservices,” martinfowler.com. [Online]. Available: https://martinfowler.com/articles/microservices.html. [Accessed: 01-Nov-2017].

[2] “Netflix Accounts for More Than One-Third of Internet Traffic,” Time. [Online]. Available: http://time.com/3901378/netflix-internet-traffic/. [Accessed: 03-Nov-2017].

[3] J. Brodkin, “Netflix attacks own network with ‘Chaos Monkey’—and now you can too,” Ars Technica, 30-Jul-2012. [Online]. Available: https://arstechnica.com/information-technology/2012/07/netflix-attacks-own-network-with-chaos-monkey-and-now-you-can-too/. [Accessed: 03-Nov-2017].

[4] “Does My Bus Look Big in This?,” InfoQ. [Online]. Available: https://www.infoq.com/presentations/soa-without-esb. [Accessed: 04-Nov-2017].

[5] M. McLarty, “Learn from SOA: 5 lessons for the microservices era,” InfoWorld, 09-Jun-2016. [Online]. Available: https://www.infoworld.com/article/3080611/application-development/learning-from-soa-5-lessons-for-the-microservices-era.html. [Accessed: 04-Nov-2017].

[6] M. Fowler, “MonolithFirst,” martinfowler.com. [Online]. Available: https://martinfowler.com/bliki/MonolithFirst.html. [Accessed: 04-Nov-2017].

[7] P. Hammant, “So you think monolith is the only alternative to microservices.” [Online]. Available: https://paulhammant.com/2015/05/02/so-you-think-monolith-is-the-only-alternative-to-microservices/. [Accessed: 04-Nov-2017].

[8] B. Wootton, “Microservices - Not a free lunch!,” High Scalability. [Online]. Available: http://highscalability.com/blog/2014/4/8/microservices-not-a-free-lunch.html. [Accessed: 04-Nov-2017].

[9] E. Dvorkin, “Seven micro-services architecture problems and solutions,” Art of Software Engineering.

[10] “What’s a Linux container?” [Online]. Available: https://www.redhat.com/en/topics/containers/whats-a-linux-container. [Accessed: 04-Nov-2017].

[11] “Vagrant vs. Docker,” Vagrant by HashiCorp. [Online]. Available: https://www.vagrantup.com/intro/vs/docker.html. [Accessed: 04-Nov-2017].

[12] “Company,” Docker, 14-May-2015. [Online]. Available: https://www.docker.com/company. [Accessed: 05-Nov-2017].

[13] “Home,” Open Container Initiative.

[14] A. Abdelrazik, “Docker vs. Kubernetes vs. Apache Mesos: Why What You Think You Know is Probably Wrong.”

[15] “Docker vs Rkt (Rocket) - Which one to choose?,” Bobcares, 28-Jul-2016.

[16] “PayPal Uses Docker To Containerize Existing Apps, Save Money and Boost Security | Docker.” [Online]. Available: https://www.docker.com/customers/paypal-uses-docker-containerize-existing-apps-save-money-and-boost-security. [Accessed: 15-Nov-2017].

[17] “Docker Enterprise Edition Delivers Security And Scale For ADP | Docker.” [Online]. Available: https://www.docker.com/customers/docker-enterprise-edition-delivers-security-and-scale-adp. [Accessed: 15-Nov-2017].

[18] “GE Uses Docker to Enable Self-Service For Their Developers | Docker.” [Online]. Available: https://www.docker.com/customers/ge-uses-docker-enable-self-service-their-developers. [Accessed: 15-Nov-2017].

[19] “What is a Container,” Docker, 29-Jan-2017. [Online]. Available: https://www.docker.com/what-container. [Accessed: 17-Nov-2017].
