For the past few weeks I have been trying out Docker, and I found it useful for conveying the need for lightweight containers in dev/test.
Although it works like git, it offers nice extensions on and around LXC. LXC itself has an extremely simple CLI to use and run with (as a user, I remember being excited by Solaris containers a long time ago). Docker makes it much more powerful by adding versioning and reusability, imho.
I used it on Azure without issues. Ever since Andre mentioned Spark's Docker-friendly release, trying it had been on my to-do list for a long time. The intent was to run the perf benchmark using the MemeTracker dataset; I will get it onto a full-fledged cluster one of these days.
Update – 10 June 2014 – MS Open Tech announced native support for Docker on Azure – http://msopentech.com/blog/2014/06/09/docker-on-microsoft-azure/
Everything mentioned in the repo worked without issues; I just cloned the Docker scripts directly. The only change was in the cloning step, where I used the following statement:
git clone http://github.com/amplab/docker-scripts.git
The challenge with any new data system is learning the basics: importing and exporting data, easy querying, monitoring, and finding root causes. That will require some work in a real project, somewhere down the road. In between, I got distracted by Docker's use of Go.