Virtualization has been a widely used principle in IT infrastructure for some time; the latest trend is even greater isolation of individual services. This is achieved with containers, and the most prominent tool for managing them is Docker. But where can you run Docker containers? What do infrastructure providers such as IBM Bluemix and Amazon Web Services offer? That is the subject of this article.
Containers (also called OS partitions) are a relatively old principle; UNIX systems have offered them from the beginning with the chroot and jail commands. They are the most efficient form of virtualization, since containers share the kernel with the host OS. This has its limitations: it is not possible to load custom kernel modules, which would be needed, for example, for custom virtual network cards (commonly used for various encrypted tunnels). But today's web applications and backends do not need this. What they do need is streamlined management of the server infrastructure, so that it scales to large numbers of instances.
A new method of administration
In the classical model, when a new library needs to be installed, it is installed directly on the given server (whether virtual, container, or physical), and this has to be repeated for every running instance. Docker (and, in a similar spirit, configuration-management tools such as Chef and Puppet) solves this problem with a kind of initialization script from which the container image is built. The admin then never touches the running instance; whenever something needs to change, the initialization script is adjusted, the instance is killed, and a new one is started from the revised script.
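In Docker's case, that initialization script is a Dockerfile. A minimal sketch, where the base image, the library, and the application paths are all illustrative:

```dockerfile
# Build the image from an official base instead of patching a live server
FROM debian:jessie

# The "new library" is installed at build time, never on a running instance
RUN apt-get update && apt-get install -y libxml2 \
    && rm -rf /var/lib/apt/lists/*

# The application itself is baked into the image
COPY ./app /opt/app
CMD ["/opt/app/run.sh"]
```

To change anything, you edit the Dockerfile, rebuild the image with `docker build`, kill the old container, and start a fresh one from the new image.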
With this principle it is possible to deploy a large number of instances, and what is more, these scripts are shared across the community for individual services. A prerequisite is a considerable decomposition of services: one container should carry exactly one activity. That means one container for the database, another for the web application (e.g. phpMyAdmin), with the containers connected to each other; each has its own initialization script.
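This one-service-per-container split can be expressed, for example, with Docker Compose; the service names, password, and port below are illustrative:

```yaml
# docker-compose.yml: one container per service, wired together
version: '2'
services:
  db:
    image: mysql:5.7
    environment:
      MYSQL_ROOT_PASSWORD: example   # illustrative only
  web:
    image: phpmyadmin/phpmyadmin
    ports:
      - "8080:80"                    # only the web front end is exposed
    environment:
      PMA_HOST: db                   # reaches MySQL over the private network
```

Each service can then be rebuilt, killed, and restarted independently of the other.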
Docker is interesting for two reasons. First, because of the strong support among infrastructure providers such as IBM Bluemix and Amazon AWS. Second, because of the large community around https://hub.docker.com/, which collects Docker scripts for various services; many open source projects (e.g. MySQL, Redmine, Redis) maintain an official repo there. It is therefore a more convenient way to install them than building from source, and much faster than the official repositories of Linux distribution maintainers (whose acceptance cycles are so slow they are almost unusable for new services).
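Installing from Docker Hub then amounts to a single pull; the image tag and container name are illustrative:

```shell
# Fetch the official image instead of compiling from source
docker pull redis:3.2

# Start it; configuration is passed at run time, not installed on the host
docker run -d --name cache redis:3.2
```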
Which components do the individual containers consist of?
Network connectivity must be handled on two levels: (optionally) with the outside world via public IP addresses, and (mainly) with other containers. For security reasons, communication between containers should not go over the public network through open ports. Docker has a feature for this: links. A link lets a container connect to an existing one, so that, for example, the MySQL database has no public IP address or open ports of its own; the phpMyAdmin container connects to it over a link and is the one that communicates with the outside.
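A sketch of the link principle with the MySQL/phpMyAdmin pair; container names and the password are illustrative:

```shell
# MySQL: no public ports are published at all
docker run -d --name db -e MYSQL_ROOT_PASSWORD=secret mysql:5.7

# phpMyAdmin reaches it via the legacy --link; only port 8080 is public
docker run -d --name pma --link db:db -e PMA_HOST=db \
    -p 8080:80 phpmyadmin/phpmyadmin
```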
However, links have been considered legacy since the introduction of Docker networking. Most Docker repositories on Docker Hub are still written for them, though, so they will remain necessary for some time. Docker networking is built on top of system bridges, so it is possible to assemble a private infrastructure however you like (but it is not as straightforward as using links).
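The same pair on a user-defined network instead of a link; the network name is illustrative:

```shell
# A user-defined bridge; containers on it resolve each other by name
docker network create --driver bridge backend

docker run -d --name db --network backend \
    -e MYSQL_ROOT_PASSWORD=secret mysql:5.7
docker run -d --name pma --network backend \
    -e PMA_HOST=db -p 8080:80 phpmyadmin/phpmyadmin
```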
Amazon supports both legacy links (even in the web GUI) and the newer networking (CLI only). IBM Bluemix does not support legacy links, but it does support networking via the CLI.
Since containers must be killed and reinstalled for every configuration change, their storage is only temporary. How do you solve this for, say, a three-tier web application? The database (as mentioned above) does not belong in the container with the application; it runs as a separate container or as an external service. For unstructured data not stored in the database (various uploaded files, etc.), Docker offers Volumes: persistent storage on the host server that can be mounted into individual containers.
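Mounting a persistent Volume looks like this; the volume name, mount path, and image are illustrative:

```shell
# A named volume lives on the host and survives any container that uses it
docker volume create uploads

# Mount it into the application container; killing and rebuilding the
# container loses nothing stored under /var/www/uploads
docker run -d --name web -v uploads:/var/www/uploads myapp
```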
IBM Bluemix does not support attaching Volumes at run time, so everything must be declared in the initializing Dockerfile, which is very impractical. Amazon supports adding volumes even through the GUI.
The pricing models of the two services differ; both have a basic free tier, which we will not deal with here. With IBM, you pay per container (orderable with various parameters). With Amazon, by contrast, containers themselves are not charged: they are an extra service on top of the existing EC2 compute instances (which are paid, of course), so the number of containers is not limited.
Both services price differently by location; for comparison I took the EU region (which is among the more expensive).
The smallest reasonable Amazon EC2 instance, a t2.micro with 1 GB RAM and a shared CPU, costs USD 0.014/hour, or about CZK 260 per month. On top of that, storage costs USD 0.11 per GB per month, so an instance with 20 GB of storage comes to about CZK 315 in total.
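The monthly figure follows from simple arithmetic. The exchange rate of roughly CZK 25.8/USD is an assumption inferred from the article's numbers, and a month is taken as 720 hours; small differences against the CZK 315 above come from rounding:

```shell
# EC2 t2.micro: USD 0.014/hour for 720 hours, plus USD 0.11/GB/month storage,
# converted at an assumed rate of CZK 25.8 per USD
compute=$(awk 'BEGIN { printf "%.0f", 0.014 * 720 * 25.8 }')
storage=$(awk 'BEGIN { printf "%.0f", 0.11 * 20 * 25.8 }')
total=$(awk 'BEGIN { printf "%.0f", (0.014 * 720 + 0.11 * 20) * 25.8 }')
echo "compute=$compute storage=$storage total=$total"
# prints: compute=260 storage=57 total=317
```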
A container with similar parameters from IBM Bluemix costs CZK 423, plus CZK 286 for 20 GB of storage, i.e. CZK 707 in total.
Both services have made tremendous progress over the past half year, so the conclusions of this article may no longer apply in a few months. For now, Amazon is the better value, both in its portfolio of supported features and in pricing. What matters most, though, is that several such services exist, and thanks to standard Docker containers it is possible to move freely between them.