Archive for the ‘Technical’ Category

Docker explained in layman’s terms

Tuesday, April 14th, 2015

There is quite a lot of good documentation available about Docker, including the official docs, but much of it starts explaining LXC, cgroups and UnionFS (Linux kernel features) in the very first paragraph, thus scaring off many readers who are not Linux geeks.

In this article I will try to explain Docker in layman’s terms. No prior knowledge of virtual machines, cloud deployments or DevOps practices is assumed to understand this post.

Docker?

Docker is an open source platform which can be used to package, distribute and run your applications. Docker provides an easy and efficient way to encapsulate an application (e.g. a Java web application) and any infrastructure required to run it (e.g. Red Hat Linux, the Apache web server, the Tomcat application server, a MySQL database etc.) as a single “Docker image” which can then be shared through a central “Docker registry“. The image can then be used to launch a “Docker container” which makes the contained application available from the host where the container is running.

Docker provides some convenient tools to build Docker images in a simple and efficient way. A Docker container, on the other hand, is a kind of lightweight virtual machine with a considerably smaller memory and disk space footprint than a full-blown virtual machine.

By enabling fast, convenient and automated deployments, Docker shortens the cycle between writing code, testing it and getting it live in production. At the same time, by providing a lightweight container to run the application, Docker enables very efficient utilization of hardware and CPU resources.

Docker is open source and can be installed on your notebook or on any server where you want to build and run your Docker images and containers (provided you meet the minimum system requirements).

Docker deployment workflow from 1000 feet

Before we look into the individual components of the Docker ecosystem, let us look at one way a Docker workflow can fit into the software development life cycle.

In order to support this workflow, we need a CI tool and a configuration management tool. I picked Bamboo and Ansible, though the workflow would remain much the same with other tools or modes of deployment. Below is a possible workflow:

  1. Code is committed to a Git repository.
  2. A job is triggered in Bamboo to build the application from the source code and run unit/integration tests.
  3. If the tests are successful, Bamboo builds a Docker image and pushes it to a “Docker registry”. (The Dockerfile, which provides the template for building the image, is typically committed as part of the codebase; a minimal sketch of such a build step follows this list.)
  4. Bamboo runs an Ansible playbook to deploy the image to the QA servers.
  5. If the QA tests pass as well, Ansible can be used to deploy and start the container in production.
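
As a rough sketch of what the build-and-push step (step 3) might run on the Bamboo agent – the registry address, image name and tag below are just placeholders:

# Build an image from the Dockerfile in the checked-out codebase
docker build -t registry.example.com/myteam/my-webapp:1.0.42 .

# Push the tagged image to the (hypothetical) Docker registry
docker push registry.example.com/myteam/my-webapp:1.0.42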

Docker ecosystem

Docker client and server

Docker is a client-server application. The Docker client talks to a Docker server (also called the daemon), which in turn does all the work. Docker ships with a command line client binary called docker as well as a full RESTful API. The Docker client and server can run on the same host or on different hosts.
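
For example, the same client binary can talk to the local daemon or to a daemon on another host (the remote address below is just a placeholder; the remote daemon must be configured to listen on that TCP port):

# Show the version of both the client and the daemon it talks to
docker version

# Point the client at a daemon running on a different host
docker -H tcp://docker-host.example.com:2375 info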

Docker images

Images are the building blocks of the Docker world. They are the build part of Docker’s lifecycle. They are built step by step from a series of instructions which are typically described in a simple text configuration file called a “Dockerfile”. In other words, Docker provides a simple, text-based way of declaring the infrastructure and environment dependencies of an application.
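
As a rough illustration – the base image, file names and port below are placeholders for whatever your application actually needs – a Dockerfile for a simple Java web application might look like this:

# Start from an existing base image with Tomcat installed
FROM tomcat:7

# Copy the packaged application into Tomcat's deployment directory
COPY target/my-webapp.war /usr/local/tomcat/webapps/

# Document the port the application listens on
EXPOSE 8080

# Command to run when a container is launched from this image
CMD ["catalina.sh", "run"]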

Docker images are highly portable across hosts and environments. Just as compiled Java code can run on any operating system where a JVM is installed, a Docker image can be run in a Docker container on any host that runs Docker.

Docker registry

Docker registries are the distribution component of Docker. Docker stores the images you build in a registry. There are two types of registries: public and private. Docker Inc. operates the public registry for images, called Docker Hub. You can create an account on Docker Hub to store and share your images, and you also have the option of keeping your images on Docker Hub private.

It is also possible to create your own registry behind your corporate firewall.
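
As a sketch, such a private registry can itself be run as a container, and images are pushed to it by tagging them with the registry’s address (the names and port below are illustrative):

# Run the open source registry image, listening on port 5000
docker run -d -p 5000:5000 --name my-registry registry

# Tag a local image with the registry address and push it there
docker tag my-webapp localhost:5000/my-webapp
docker push localhost:5000/my-webapp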

Docker container

If images are the building or packing aspect of Docker, containers are the runtime or execution aspect. Containers are launched from images and may contain one or more running processes.

Docker borrows the concept of the standard shipping container, used to transport goods globally, as a model for its containers. But instead of goods, Docker containers ship software. Each container holds a software image – its cargo – and, like its physical counterpart, allows a set of operations to be performed on it: it can be created, started, stopped, restarted and destroyed.
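
For illustration, the basic lifecycle with the command line client looks roughly like this (the image and container names are placeholders):

# Create and start a container from an image, mapping a container port to the host
docker run -d -p 8080:8080 --name my-webapp-container my-webapp

# Stop, start again, restart and finally destroy the container
docker stop my-webapp-container
docker start my-webapp-container
docker restart my-webapp-container
docker rm -f my-webapp-container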

As with shipping containers, Docker doesn’t care about the contents of the container when performing these actions; it makes no difference whether a container holds a web server, a database or an application server. Each container is loaded the same way as any other container.

Docker also doesn’t care where you ship your container: you can build on your laptop, upload to a registry, then download to a physical or virtual server, test, deploy to a cluster of a dozen Amazon EC2 hosts, and run.

Benefits of Docker

  1. Efficient hardware utilization: When compared to hypervisor-based virtual machines, Docker containers use less memory, fewer CPU cycles and less disk space. This enables more efficient utilization of hardware resources: you can run more containers than virtual machines on a given host, resulting in higher density.
  2. Security: Docker isolates many aspects of the underlying host from an application running in a container without root privileges. Also a container has a smaller attack surface than a complete virtual machine.
  3. Consistency and portability: Docker provides a way to standardize application environments and enables portability of the application across environments.
  4. Fast, efficient deployment cycle: Docker reduces the cycle time between code being written and code being tested, deployed and used.
    The quote below is taken from the official documentation:

    Your developers write code locally and share their development stack via Docker with their colleagues. When they are ready, they push their code and the stack they are developing onto a test environment and execute any required tests. From the testing environment, you can then push the Docker images into production and deploy your code.

  5. Ease of use: Docker’s low overhead and quick startup times make running multiple containers less tedious.
  6. Encourages SOA: Docker is a natural fit for microservices and service-oriented architectures, since each container typically runs a single process or application and you can run multiple containers with little overhead on the same system.
  7. Segregation of duties: In the Docker ecosystem, developers care about making the application work inside the container, and operations teams care about managing those containers.

Containers Vs. Virtual machines

[Image: Docker containers vs. virtual machines]

  • Virtual machines have a full OS with its own memory management, device drivers, daemons, etc. Containers share the host’s OS and are therefore lighter weight.
  • Because containers are lightweight, starting a container takes approximately a second, whereas booting a VM can take several minutes.
  • Containers can generally run only the same or a similar operating system as the host operating system; e.g., you can run Red Hat in a container on an Ubuntu host, but you can’t run Windows on an Ubuntu host. (In practice, running a different OS type is rarely a real requirement.)
  • With VMs, in theory, vulnerabilities in a particular OS version can’t be leveraged to compromise other VMs running on the same physical host. Since containers share the host’s kernel, admins and software vendors need to take special care to avoid security issues arising from adjacent containers.
    The counter-argument is that lightweight containers lack the larger attack surface of the full OS needed by a VM, combined with the potential exposures of the hypervisor itself.

References

  1. Docker user guide
  2. Security risks and vulnerabilities of Docker
  3. Containers Vs. VMs
  4. Docker: Using Linux Containers to Support Portable Application Deployment
  5. Contain yourself: The layman’s guide to Docker
  6. Deploying Java applications with Docker

 

O’Reilly Software Architecture Conference

Sunday, April 12th, 2015

O’Reilly recently conducted a Software Architecture Conference in Boston. It included a two-day training on microservices by Sam Newman (author of the recently published book Building Microservices) and two further days of talks by various speakers.

Martin Fowler gave a talk on Agile Architecture; the video is available from the conference website.

microxchg.io => Microservice conference in Berlin

Friday, April 10th, 2015

Recently, I attended a microservices conference in Berlin, where some of the trend-setters in this field, such as James Lewis (ThoughtWorks), Adrian Cockburn (author and agile development expert) and Chris Richardson (founder of Cloud Foundry), gave interesting presentations. Also among the speakers was Sam Newman (ThoughtWorks), who has written a new book on the same topic.

Two-speed architecture

Wednesday, March 18th, 2015

I first came across the term “two-speed architecture” through an article from McKinsey Insights. I found it quite interesting as the topic applies to some of the projects and processes which I am currently involved in.

Some parts of this post contain distilled information from McKinsey’s articles on this topic, but beyond that I have added some examples based on how two-speed practices are currently being implemented in several organizations. At the end of the post, I have provided a comparison which describes the dichotomy between the traditional world and the new world, which may (hopefully) harmoniously coexist in a “two-speed IT enterprise”.

Introduction

Two-speed architecture is a relatively new term used to describe the co-existence of a fast-speed, customer-centric front end running alongside a slow-speed, transaction-focused legacy back end. A two-speed IT architecture is aimed at helping companies develop their customer-facing capabilities at high speed while decoupling legacy systems, for which release cycles of new functionality stay at a slower pace.

Organizations which plan to modernize their application and system portfolio may find themselves following a two-speed architecture throughout the transitional period, which could last from a few months to several years.

Microservices can act as a key enabling force for a two-speed architecture, though they are not a necessity. For example, new microservices covering only a small amount of functionality, such as looking up the next product a consumer would most likely purchase, should be deployable in an hour rather than in several weeks.

Main Drivers

Many mature old-economy companies face significant market pressure to take advantage of cutting-edge technologies to bring out new products and services, improve time to market, provide a richer user experience and improve operational efficiency through automation.

The legacy IT architecture and organization – which, for example, runs the supply-chain and operations systems responsible for executing online product orders – lacks the speed and flexibility to address the above needs. In order to address these issues, many companies need an IT architecture that can operate at different speeds.

Some examples include:

  • A traditional insurance company which aims to take advantage of the latest GPS-based services to track customers’ driving habits and provide personalized insurance policies – here the tracking service may consist of one or more microservices which use semi-anonymous data to the extent permitted by the rules and regulations. These microservices will typically use their own data stores and may exchange anonymous or semi-anonymous data with other services in a way that is probably not imaginable in the legacy systems.
  • A traditional retailer which aims to offer a new sales channel through mobile devices, or to provide a multi-channel experience to users – these new mobile apps will be written in the latest front-end technologies and may take advantage of big data and advanced analytics to offer services such as buying suggestions. They may also integrate with social media to take advantage of the marketing possibilities offered, e.g. by enabling users to post their latest purchase on Facebook. This new ecosystem can have completely different requirements and capabilities when it comes to agility, security and time to market compared to the same retailer’s traditional ERP systems.
  • An old-economy financial giant can write a set of (micro)services which access and integrate customer data from different departments of the bank (data that is typically managed in silos rather than in a central repository), and possibly even from third-party risk management solutions, to process a customer’s credit or loan application. Such an exercise, which may have taken several days to weeks in the past using a combination of digital and manual steps, can nowadays be done fully digitally in real time. But such a new microservice probably also has to be more agile, with shorter release cycles, while the individual legacy services that are glued together have different release cycles which cannot be coordinated for organizational reasons.
  • An insurance company can launch an app which tracks the user’s geographic location. When the user leaves the country, the app recognizes this and can push the information to a microservice. Another microservice can then recommend travel insurance to the user. These new microservices can have tremendous business value, and at the same time can be independently developed, tested and deployed in a more dynamic release cycle.

In a nutshell

A two-speed architecture makes sense when a company with a portfolio of established “legacy applications” – which demand high availability and high security and follow a slow release cycle – is faced with the challenge of developing, testing and deploying customer-facing front-end applications and services in a highly agile way to meet market requirements. A two-speed architecture is possible when a company makes a conscious choice to commit itself to “two-speed IT”, with the aim of addressing customer requirements and market competition using dynamic, shorter release cycles and relaxed, customer-driven process and governance.

The following comparison shows some of the key differences between the two worlds that can harmoniously co-exist in a “two-speed enterprise”. For each area, “Legacy” describes legacy applications and services, and “New” describes the new (micro)services.

  • Approach
    Legacy: Waterfall / Waterfall-Scrum hybrids / “pseudo-Scrum”
    New: XP / Scrum / Kanban – willing to adopt aggressive and tactical DevOps practices from time to time to deliver software at short notice
  • Governance
    Legacy: Plan driven, approval based
    New: Empirical, continuous, process based
  • Release cycle
    Legacy: One release per month or per quarter
    New: Several releases per month, sometimes several releases per week
  • Release process
    Legacy: Manually intensive releases, often with extended downtime
    New: Fully automated releases, sometimes delivered using container-based packaging and deployment tools like Docker, with minimal service disruption
  • Testing
    Legacy: Unit tests written by developers, often followed by a dedicated QA team performing testing in a lengthy QA phase; emphasis on full test coverage using traditional test-coverage metrics, which is not transparent to all stakeholders
    New: Relatively fewer unit tests and highly automated end-to-end integration tests; emphasis on monitoring and recovery capabilities using modern DevOps tools and practices – POs, Ops and developers can view the status and health of services in a highly transparent manner using monitoring pages in a web browser
  • Operation
    Legacy: Emphasis on robustness – highly risk averse and willing to postpone releases if confidence is missing
    New: Emphasis on time to market – willing to take calculated risks to bring new features out as early as possible and gain valuable end-user feedback (fail fast, fail often)
  • Architecture
    Legacy: The default mode of thinking is a layered architecture, resulting in the classic 3-tier architecture of UI, middleware and DB
    New: Architecture is API driven and thinking is dominated by the disparate services required to solve a domain problem
  • Culture
    Legacy: IT-centric / process-centric
    New: Business-centric

Followers

Insurer Allianz is known to have made “two-speed architecture” a core part of its enterprise IT strategy. The Australian Department of Defence is also known to have taken such a stance. According to consultancies such as McKinsey and BCG, several retailers, banks and telecoms are said to have aligned their IT processes and architecture for two-speed operation, but their names seem to be a closely guarded secret.

References

  1. http://www.mckinsey.com/insights/business_technology/a_two_speed_it_architecture_for_the_digital_enterprise
  2. http://www.mckinsey.com/insights/business_technology/running_your_company_at_two_speeds
  3. http://www.wired.com/2014/12/reach-two-speed-it-apis/
  4. https://www.bcgperspectives.com/content/articles/it_performance_it_strategy_two_speed_it/
  5. Coursera provides a free course on the broader concept of “Two Speed IT”, aimed at IT strategy managers
  6. API centric development

Microservices – μService architecture

Sunday, October 19th, 2014

Microservices are the hottest topic in enterprise software development nowadays. They represent an application architecture pattern, or trend, that has emerged over the last two to four years based on several enabling factors such as polyglot development, cloud deployment and increased deployment automation.

Breaking up a legacy, monolithic, portal-server based application into multiple service-oriented, independently deployable and easily maintainable services built around business capabilities can be an interesting yet daunting challenge.

Articles

Martin Fowler’s defining article

Presentation by James Lewis of Thoughtworks

Cracking Microservices practices

μService not a free lunch

Microservices: Decomposing Applications for Deployability and Scalability

Agile coding in enterprise IT: Code small and local

 12 factor apps

Microservice in practice

Karma Inc.

Failing at Microservices

 

Implementation

http://blog.xebia.com/2014/10/27/dropwizard/

 

Tags

https://twitter.com/hashtag/microservice

Java 8 to be released today

Tuesday, March 18th, 2014

Today (18.03.2014) is the official release date for Java 8, and the download page from Oracle now lists Java 8 as the latest release. I have been trying out various new features of Java 8 for more than six months now; I downloaded the early access build and have been using it with IntelliJ IDEA 12.1.6.

Support for Java 8 in IntelliJ has been reasonable so far, even though there have been some minor issues, one of which I reported and got fixed. It involved the error suggestion for checked exceptions thrown inside a lambda expression: the IDE wrongly suggested adding the exception to the enclosing method definition, which results in non-compilable code. Just like an anonymous inner class, a lambda must catch all its checked exceptions within the lambda block. Also, code completion using custom templates inside a lambda block does not seem to work either.
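
As a small illustration of the checked-exception rule (Thread.sleep is used here simply because it throws a checked exception), the exception has to be handled inside the lambda, since Runnable.run() does not declare it:

public class LambdaCheckedException {
    public static void main(String[] args) {
        // Runnable.run() declares no checked exceptions, so the lambda must
        // handle InterruptedException itself; adding "throws" to main()
        // would not make this code compile.
        Runnable task = () -> {
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        new Thread(task).start();
    }
}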

There are many examples to be found on the internet that explain very well what a lambda expression is and how it can make life easier for Java developers in certain cases. Oracle’s tutorial is worth checking out, as it also explains the need for lambdas by starting with pre-Java 8 code and transforming it step by step using lambda expressions.

Why lambda? A mature language facing a mid-life crisis

At 18, the Java platform is not a novelty anymore. Since its first release in 1995, the platform and the language have grown to achieve unprecedented success, immense popularity and widespread adoption in almost all walks of software development. According to Oracle’s own claim, 3 billion devices run Java, including computers, printers, routers, mobile devices, tablets and a wide range of other networked appliances (to me it sounds like one of those claims which will never be verified!).

[Image: Oracle’s “3 billion devices run Java” claim]

 

Though it might seem like things have never been better for Java, there are skeptics who believe that both the platform and the language have already seen their best times – those who regard Java as a cluttered, poorly designed and overreaching language that is slow to adapt, or incapable of adapting, to the latest challenges in the industry. This doubt comes at a time when Java is facing increasing competition from dynamic languages like Ruby and functional languages like Scala and Clojure.

Competition from Ruby and Scala

While Ruby was first developed at around the same time as the first ever JDK was released, it never grew to be as popular a programming language as Java. Though highly expressive, with its concise and some say beautiful syntax, Ruby’s dynamic type system kept it largely away from large-scale enterprise projects, where the static typing of Java was highly valued for maintenance reasons. But the embrace of Ruby on Rails by many companies in the middle of the last decade gave the language a fresh new life. Ruby on Rails, or Rails as it is popularly known, was adopted by many web-based start-ups who valued its productivity and shorter time to market. In the following years, it became one of the top languages used for web development and scripting. Twitter was developed almost entirely using Ruby and Rails and ran on it until fairly recently, when it switched to Scala.

Lately, Scala has also emerged as a serious alternative to Java. It has some of the treasured qualities of Java, such as a static type system and an object-oriented programming style, but at the same time it boasts many other powerful features that come with a functional programming style. Its syntax is more concise and its code less cluttered compared to Java, as the designers of Scala took care to avoid the boilerplate mess that Java is known for. Twitter and LinkedIn are some of the most well-known adopters of Scala. Also, Akka and Play! – two increasingly popular frameworks – are written in Scala, though they support Java as well. Scala has the advantage that it runs on the JVM and mixes seamlessly with Java (one can import JDK libraries in Scala code).

Here comes the closure

Many popular languages support closures, but Java was not one of them until Java 8. To get around this limitation, Java developers had to depend on a helper interface and an anonymous instance of that interface implementing the desired behavior. This resulted in quite a bit of boilerplate code, in the form of anonymous inner classes, in the client code. With Java 8, this is eased a bit.

What would be written using an anonymous inner class in Java 7 can be expressed as an anonymous function (a lambda) in Java 8:

Java 7:
public class HelloInnerWorld {

    public static void main(String[] args) {
        new Thread(new Runnable() {
            @Override
            public void run() {
                System.out.println("Hello World");
            }
        }).start();
    }
}
Java 8:
public class HelloLambdaWorld {
    public static void main(String[] args) {
        new Thread(() -> System.out.println("Hello World")).start();
    }
}

I will explain in another post how the above code changes – again resulting in boilerplate code – when the code in the closure throws a checked exception.

Algorithms, Part I by Robert Sedgewick of Princeton Uni. in Coursera

Monday, March 17th, 2014

I just finished week 1 of an online course on algorithms on Coursera. In this course, around two hours of video lectures are published each week, on Fridays. The lectures are split into smaller videos of 10-15 minutes in length, and each of them contains one or two quizzes. For each week, one programming assignment and several exercises are also published, which have to be submitted before a deadline and are graded automatically.

The video lectures are very simple and easy to follow, and Professor Robert Sedgewick of Princeton University does an awesome job of explaining the fundamental concepts of algorithms in a simple and effective manner. Robert Sedgewick is the co-author of “Algorithms” – one of the most popular books on algorithms.

In the past, I have used various algorithms like quicksort and binary search, and data structures such as queues and stacks, from Java’s collections library, but this course offers an excellent opportunity to understand the implementation details of those fundamental algorithms and to use that understanding to make more informed decisions when choosing algorithms and data structures in the future.

In the first week, the focus is on “Union-Find” and “Analysis of Algorithms”. In the first part, “Union-Find”, the dynamic connectivity problem – which has a wide range of applications, from social-network graphs to the electrical conductivity of materials – is explained. Different approaches to finding whether two elements in a set are connected, and to connecting two elements in a set, are explained using simple examples.
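
As a rough sketch of one of those approaches – weighted quick-union, roughly as presented in the lectures; the class and method names here are my own:

public class WeightedQuickUnion {
    private final int[] parent; // parent[i] = parent of element i
    private final int[] size;   // size[i] = number of elements in the tree rooted at i

    public WeightedQuickUnion(int n) {
        parent = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i; // every element starts in its own component
            size[i] = 1;
        }
    }

    // Follow parent links until reaching the root of the component
    private int root(int i) {
        while (i != parent[i]) {
            i = parent[i];
        }
        return i;
    }

    // Two elements are connected if they share the same root
    public boolean connected(int p, int q) {
        return root(p) == root(q);
    }

    // Connect two elements by attaching the smaller tree under the larger one
    public void union(int p, int q) {
        int rootP = root(p);
        int rootQ = root(q);
        if (rootP == rootQ) return;
        if (size[rootP] < size[rootQ]) {
            parent[rootP] = rootQ;
            size[rootQ] += size[rootP];
        } else {
            parent[rootQ] = rootP;
            size[rootP] += size[rootQ];
        }
    }
}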

In the second part, “Analysis of Algorithms”, the reasons for analyzing algorithms are explained. The primary practical reason to analyze an algorithm is to avoid performance bugs, i.e. situations where a programmer’s lack of understanding of performance characteristics results in poor performance for the client of the application. A scientific method to study and compare the performance of algorithms, as proposed by the legendary computer scientist Donald Knuth, is also briefly explained. Later on, a structured way to understand and hypothesize about the “order of growth” of an algorithm is presented.
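
One simple empirical way to form such a hypothesis is a doubling experiment: time a routine for doubling input sizes and look at the ratio of consecutive running times (a ratio of about 2 suggests linear growth, about 4 quadratic, about 8 cubic). The sketch below assumes a hypothetical sortRandomArray routine as the code under test:

import java.util.Arrays;
import java.util.Random;

public class DoublingTest {
    // Hypothetical routine under test: build and sort a random array of size n
    static void sortRandomArray(int n) {
        Random random = new Random(42);
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = random.nextInt();
        }
        Arrays.sort(a);
    }

    public static void main(String[] args) {
        double previous = 0;
        for (int n = 1000; n <= 1024000; n *= 2) {
            long start = System.nanoTime();
            sortRandomArray(n);
            double elapsed = (System.nanoTime() - start) / 1e9;
            double ratio = previous == 0 ? 0 : elapsed / previous;
            System.out.printf("n=%d  time=%.4fs  ratio=%.2f%n", n, elapsed, ratio);
            previous = elapsed;
        }
    }
}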

The programming exercise for the week involved finding the percolation threshold for an N x N grid using Monte Carlo simulation. The program had to be written in Java, and a utility class with an implementation of weighted quick-union was already provided.