Guesswork and Gambling

My first blog post is about running Docker in production. This blog in fact. Why pay for a WordPress hosting solution when I could make life so much harder for myself for your amusement?

I started with a GCE 2GB 1 vCPU instance. 2GB should be enough, right? Off the bat, I’m introduced to one of the most curious hangovers from our transition from Operating System to Cloud: the static reservation. Guesswork and gambling: guess how much you need and then pray you don’t need more (before I’ve even written my first post, I’m already fretting about the complexities of horizontal scaling).

The bottom line though is that my blog can afford to be down for short periods. The fact that many financial institutions still consider 24/7 online access a luxury… I mean really, how important is it for my vapid spewings to be highly available? No, what’s most important to me is the integrity of my data and being able to scale my compute, networking and storage bandwidth with minimal disruption in the unlikely event that this site becomes popular.

So I start by installing Docker into an Ubuntu 16.04 instance. I note that the install procedure has changed again. It’s improved – nice that it now uses apt. Once it’s installed and I’ve run hello-world, do I want to configure anything special? I decide not. This is not going to be the end of a CI pipeline, it’s going to run one thing for a long time. In fact, in this particular case, the utility of Docker is really the dependency management and provisioning aspect. The fact that I can grab a compose file and just run something.

Having found this article on Docker Compose and WordPress, I decide to just go for it and install docker-compose. Naturally I used apt. Ha! Fool! The version apt gave me is incompatible with the yml file in the demo, so I screw around with curl to pull down a more recent version from GitHub. Second problem was that the GCE VM instance doesn’t include python-ipaddress, which is an essential dependency of docker-compose. Of course, this is why we go straight to apt for stuff.

So having run the compose file, docker ps shows me that I have a WordPress and a MySQL running. OK. Some more faffing and I realise the reason my browser can’t see it is because the GCE firewall rules only open port 80 for HTTP traffic. Simple to modify.

OK, so now I can create some test content and presumably it’s been eaten by MySQL. Next step is to think about how to separate data and compute. Why would I want to do this? Because I want to be able to manage the lifecycle of one independently of the other – particularly if I want to scale one horizontally or vertically. Docker volumes alone are not going to help me here, because the only disk I can map in is the boot disk of the VM. I either need a separate disk for the MySQL data or a data container on a separate VM. Nice to see that GCE offers SQL instances with multi-region high availability and dynamically expandable disk, but this is my money and I don’t need it. Of course the beauty of Cloud is that I should be able to test and verify all of these things.

I create a 20GB SSD disk and add it to my running instance where it appears as /dev/sdb. Partition and format the disk using fdisk and mkfs, update /etc/fstab and mount to a local directory. Reconfigure the compose yml file to point to my new mount and restart. All beautifully simple. I also choose to move the compose configuration to the new disk as I also want it to be preserved. What did strike me as curious is that disk throughput is proportional to size in that if I need more bandwidth, I have to make it bigger.
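To put rough numbers on that size-to-throughput relationship, here is a minimal Python sketch. The per-GB rate is an assumed, illustrative figure rather than something taken from my own measurements, so check it against the current GCE documentation, and the function names are made up for this post. It also ignores the per-instance caps that apply on top.

```python
import math

# Assumed sustained throughput per provisioned GB for an SSD persistent disk.
# Illustrative figure only; check the current GCE documentation.
ASSUMED_MB_PER_SEC_PER_GB = 0.48


def estimated_throughput(size_gb, rate_per_gb=ASSUMED_MB_PER_SEC_PER_GB):
    """Estimated throughput in MB/s for a disk of size_gb, ignoring instance caps."""
    return size_gb * rate_per_gb


def size_for_throughput(target_mb_s, rate_per_gb=ASSUMED_MB_PER_SEC_PER_GB):
    """Smallest whole-GB disk size whose estimated throughput meets target_mb_s."""
    return math.ceil(target_mb_s / rate_per_gb)


print(estimated_throughput(20))  # my 20GB data disk
print(size_for_throughput(100))  # size for a hypothetical 100 MB/s target
```

Under that assumption, the only knob for more bandwidth is the size argument, which is exactly the oddity described above.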

So before I go all in on this, what do I want to test for?

  1. Expand compute capacity with a small outage
  2. Expand disk space and/or disk throughput with a small outage
  3. Snapshot the data disk and bring it back

Vertical scaling is going to be a lot easier than horizontal at this stage, which should be a simple power down, reconfigure, power up and compose up, right? Let’s double the capacity – 2 vCPU and 4GB. Pleasantly surprised that compose automatically brought my app back on reboot. The one thing I find curious though is that GCE gives me CPU, disk and network monitoring out of the box, but not memory. Seems like if I want host memory monitoring, I need to sign up for Stackdriver. Hmm. docker stats is actually quite neat in giving me a high-level summary of my consumption from the guest perspective, but without graphing over time, it’s of limited use. Piping vmstat to a file just seems very Not Cloud.
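For what it’s worth, the Not Cloud approach is only a few lines. This is a minimal sketch, assuming a Linux guest whose /proc/meminfo exposes MemTotal and MemAvailable; the function names and the memory.log path are hypothetical, and it is no substitute for a proper monitoring service.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: numeric value}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def used_percent(info):
    """Percentage of memory in use, based on MemTotal and MemAvailable."""
    total = info["MemTotal"]
    available = info["MemAvailable"]
    return 100.0 * (total - available) / total


def log_sample(log_path="memory.log", meminfo_path="/proc/meminfo"):
    """Append one timestamped reading to log_path (a hypothetical file)."""
    with open(meminfo_path) as f:
        info = parse_meminfo(f.read())
    with open(log_path, "a") as out:
        out.write("%d %.1f\n" % (time.time(), used_percent(info)))
```

Calling log_sample() from cron every minute would give a file that can be graphed later, which is still a long way from a dashboard.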

Expanding disk space isn’t as easy as creating a snapshot and creating a new, larger disk from it, because the partition on the new disk keeps its original size. However, manually copying the data over from one disk to another works just fine and can happen out of band. Since throughput is tied to disk size, I’d have to do this same operation for either. It doesn’t seem like I can easily automate snapshots, so a cron job in the VM that drives the GCE APIs could be a fun little project.
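A minimal sketch of what that cron-driven snapshot job might look like, assuming the gcloud CLI is installed on the VM and its service account is allowed to create snapshots. It shells out to gcloud rather than calling the API directly; the disk name, zone and function names are hypothetical.

```python
import datetime
import subprocess


def snapshot_name(disk, when):
    """Build a dated snapshot name such as 'blog-data-20170131-0300'."""
    return "%s-%s" % (disk, when.strftime("%Y%m%d-%H%M"))


def build_snapshot_command(disk, zone, name):
    """Return the gcloud invocation that snapshots one persistent disk."""
    return [
        "gcloud", "compute", "disks", "snapshot", disk,
        "--zone", zone,
        "--snapshot-names", name,
    ]


def take_snapshot(disk, zone):
    """Snapshot the disk now; raises CalledProcessError if gcloud fails."""
    now = datetime.datetime.now(datetime.timezone.utc)
    name = snapshot_name(disk, now)
    subprocess.run(build_snapshot_command(disk, zone, name), check=True)
    return name
```

A script that calls take_snapshot("blog-data", "europe-west1-b") and is scheduled from cron would cover the basic case. Pruning old snapshots is left out of this sketch.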

So, here it is. Overall a confidence-inspiring and relatively simple exercise. You are my customer.
