After publishing my first post about Docker, 3 Reasons to use Docker, I got some questions. I decided it would be beneficial to everyone to publish my answers here.
Q: You say that Ansible can take up to 20x longer to provision, but why?
From skimming some random Dockerfiles, they look pretty similar to playbooks. To me the main difference with Docker seems to be that it offers isolation by way of containers.
Docker uses a build cache to speed up builds significantly. Every command in a Dockerfile is executed in an intermediate container and its result is stored as a separate layer. Layers are built on top of each other.
Docker scans the Dockerfile and tries to execute the steps one after another; before executing a step, it checks whether the resulting layer is already in the cache. On a cache hit, the build step is skipped, which from the user's perspective is almost instant.
When you write your Dockerfile so that the things that change most often, such as the application source code, come at the bottom, you get nearly instant builds, because all the layers above stay cached.
You can learn more about caching in Docker in this article.
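As an illustration, a cache-friendly ordering might look like the following sketch (the base image, package names and paths are hypothetical, not from the original post):

```dockerfile
# Rarely-changing steps first: base image and system dependencies.
FROM python:2.7
RUN apt-get update && apt-get install -y libpq-dev

# The dependency list changes less often than the source code,
# so install requirements in their own cached layer.
ADD requirements.txt /app/requirements.txt
RUN pip install -r /app/requirements.txt

# Frequently-changing application source goes last, so every
# layer above survives a code-only change.
ADD . /app
```

With this layout, editing a source file invalidates only the final `ADD . /app` layer; the dependency installation is replayed from cache.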
Another way to get amazingly fast Docker image builds is to use a good base image, which you specify in the FROM instruction. You then only make the necessary changes instead of rebuilding everything from scratch, so the build is quicker. This is especially beneficial on a host without a cache, such as a Continuous Integration server.
Summing up, building Docker images with a Dockerfile is faster than provisioning with Ansible because of the build cache and good base images. Moreover, you can eliminate provisioning entirely by using ready-made, preconfigured images such as postgres.
$ docker run --name some-postgres -d postgres
No need to install Postgres at all - it's ready to run.
Q: Also you mention that docker allows multiple apps to run on one server.
I'm wondering how granular one should get... for example, if I had a Rails project that used postgres, should I split them into separate containers? If not, why not? How do you make that decision? And if you split them up, how do they communicate?
It depends on your use case, but you should probably split different components into separate containers; it will give you more flexibility.
Docker is very lightweight and running containers is cheap, especially if you keep them in RAM - it's possible to spawn a new container for every HTTP request, although that's not very practical.
At work I develop using a set of five different types of containers linked together.
In production, some of them are replaced by real machines or even clusters of machines - however, the application-level settings don't change.
Here you can read more about linking containers.
It's possible because everything communicates over the network. When you specify links in the docker run command, Docker bridges the containers and injects information about the IPs and ports of the linked children into the parent container's environment.
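As a minimal sketch of linking (container names are hypothetical, and "myapp" is an assumed application image, not one from the original post):

```shell
# Start a database container first.
docker run --name db -d postgres

# --link db:db bridges the containers and injects variables such as
# DB_PORT_5432_TCP_ADDR and DB_PORT_5432_TCP_PORT into the app container.
docker run --name app -d --link db:db myapp
```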
This way, in my app's settings file, I can read those values from the environment. In Python it would be:
import os
VARIABLE = os.environ.get('VARIABLE')
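For instance, if a postgres container is linked under the alias db, legacy links expose variables such as DB_PORT_5432_TCP_ADDR. A sketch of reading them with local fallbacks (the variable names are assumptions derived from the db alias, not from the original post):

```python
import os

# Hypothetical variable names produced by `--link db:db`;
# fall back to local defaults when the variables are absent.
DB_HOST = os.environ.get('DB_PORT_5432_TCP_ADDR', 'localhost')
DB_PORT = int(os.environ.get('DB_PORT_5432_TCP_PORT', '5432'))
```

The same settings file then works both inside a linked container and on a developer machine running Postgres locally.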
There is a tool that greatly simplifies working with Docker containers, linking included. It's called fig and you can read more about it here.
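For a flavour of what fig looks like, here is a minimal fig.yml sketch for a web app linked to postgres (the service names and port mapping are assumptions for illustration):

```yaml
db:
  image: postgres
web:
  build: .
  links:
    - db
  ports:
    - "8000:8000"
```

A single `fig up` then builds the web image, starts both containers and wires up the link.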
Q: Finally, what does the deploy process look like for dockerized apps stored in a git repo?
It depends on what your production environment looks like.
An example deploy process may look like this:
- Build the image using docker build . in the code directory.
- Test the image.
- Push the new image to a registry: docker push myorg/myimage.
- Notify the remote app server to pull the image from the registry and run it (you can also do this directly using a configuration management tool).
- Swap ports in an HTTP proxy.
- Stop the old container.
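The steps above might be sketched as follows; the image name myorg/myimage comes from the example, while the host, container names and ports are placeholders:

```shell
# Build and push the new image.
docker build -t myorg/myimage .
docker push myorg/myimage

# Pull and start the new container on the app server.
ssh app-server 'docker pull myorg/myimage \
  && docker run -d --name myapp-new -p 8001:8000 myorg/myimage'

# After swapping the HTTP proxy from port 8000 to 8001,
# stop the old container.
ssh app-server 'docker stop myapp-old'
```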
Elastic Beanstalk is a powerful beast and will do most of the deployment for you, providing features such as autoscaling, rolling updates, zero-downtime deployments and more.
Dokku is a very simple platform as a service, similar to Heroku.
That would be all. If you have more questions, you can always ask them in the comments, by email, or find me on Twitter.