This morning I cut barneyb.com and all its associated properties over from my old CentOS 5 box at cari.net to a new Amazon Linux "box" in Amazon Web Services' us-east-1 region. The migration was pretty painless. I followed the "replace hardware with cloud resources" approach that I advocate and have spoken about at various venues. The process looks like this:
1. launch a virgin EC2 instance (I used the console and based it on ami-7f418316).
2. create a data volume and attach it to the instance.
3. allocate an Elastic IP and associate it with the instance.
4. set up an A record for the Elastic IP.
5. build a setup script which will configure the instance as needed. I feel it's important to use a script for this so that if your instance dies for some reason you can create a new one without too much fuss. It's not strictly necessary, but part of the cloud mantra is "don't repair, replace" because new resources are so inexpensive. Don't forget to store the script on your volume, not the root drive or an ephemeral store. Here's one useful snippet for modifying /etc/sudoers that took me a little digging to figure out:
bash -c "chmod 660 /etc/sudoers;sed -i -e 's/^\# \(%wheel.*NOPASSWD.*\)/\1/' /etc/sudoers;chmod 440 /etc/sudoers"
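If you want to see what that sed expression actually does before pointing it at the real /etc/sudoers, you can run it against a scratch copy (the /tmp path here is a stand-in, not the real file):

```shell
# Make a scratch file that mimics the commented-out wheel line
# shipped in a stock /etc/sudoers.
cat > /tmp/sudoers.test <<'EOF'
root ALL=(ALL) ALL
# %wheel ALL=(ALL) NOPASSWD: ALL
EOF

# Same substitution as the snippet above: strip the leading "# "
# from the %wheel ... NOPASSWD line, which enables passwordless
# sudo for members of the wheel group.
sed -i -e 's/^\# \(%wheel.*NOPASSWD.*\)/\1/' /tmp/sudoers.test

cat /tmp/sudoers.test
```

The chmod dance in the one-liner is needed because /etc/sudoers ships read-only (mode 440), so sed can't rewrite it in place until it's temporarily made writable.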
6. rsync all the various data files from the current server to the new one (everything goes on the volume; symlink – via your setup script – where necessary). Again, use a script.
7. once you're happy that your scripts work, kill your instance,
8. launch a new virgin EC2 instance,
9. attach your data volume,
10. associate your Elastic IP,
11. run your setup script,
12. if anything didn't turn out the way you wanted, fix it, and go back to step 8.
13. shut down all the state-mutating daemons on the old box.
14. shut down all the daemons on the new instance.
15. set up a downtime message in Apache on the old box. I used these directives:
RewriteEngine On
RewriteRule ^/.+/.* /index.html [R]
DocumentRoot /var/www/downtime
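The page itself can be as bare-bones as you like; something along these lines would do (a scratch directory is used here for demonstration, where the real box would use /var/www/downtime):

```shell
# Build a minimal downtime page for the DocumentRoot above.
# On the real box DOCROOT would be /var/www/downtime.
DOCROOT=$(mktemp -d)
cat > "$DOCROOT/index.html" <<'EOF'
<html><body>
<h1>Down for maintenance</h1>
<p>We're moving servers; back in a few minutes.</p>
</body></html>
EOF
```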
16. run the rsync script.
17. turn on all the daemons on your new instance.
18. add /etc/hosts records to the old box and update DNS with the Elastic IP.
19. change Apache on the old box to proxy to the new instance (so people will get the new site without having to wait for DNS to flush):
ProxyPreserveHost On
ProxyPass / http://www.barneyb.com/
ProxyPassReverse / http://www.barneyb.com/
These directives are why you need the records in /etc/hosts; otherwise you'll be in an endless proxy loop. You'll need to tweak them slightly for your SSL vhost. The ProxyPreserveHost directive is important so that the new instance still gets the original Host header, allowing it to serve from the proper virtual host. This lets you proxy all your traffic with a single directive and still have it split by host on the new box.
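Concretely, the old box's /etc/hosts would gain a line along these lines, so that when its Apache resolves the proxied hostname it reaches the new instance rather than itself (the address here is a stand-in for your actual Elastic IP):

```
# /etc/hosts on the old box: point the proxied hostnames at the
# new instance so ProxyPass doesn't loop back to this machine.
203.0.113.10  www.barneyb.com barneyb.com
```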
The net result was a nearly painless transition. There was a bit of downtime during the rsync copy (I had to sync about 4GB of data), but only a few minutes. Once the new box was populated and ready to go, the proxy rules let everyone keep using the sites, even before DNS had fully propagated. Now, a few hours later, the only traffic still going to my old box is from Baiduspider/2.0 (+http://www.baidu.com/search/spider.html), whatever that is. Hopefully it'll update its DNS cache like a well-behaved spider should, though apparently not according to my TTLs. Hmph.
Steps 1-12 (the setup) took me about 4 hours to do for my box. Just for reference, I host a couple Magnolia-backed sites, about 10 WordPress sites (including this one), a WordPressMU site, and a whole pile of CFML apps (all running within a single Railo). I also host MySQL on the same box which everything uses for storage. Steps 13-19 took about an hour, most of that being waiting for the rsync and then running through all the DNS changes (about 20 domains with between 1 and 10 records each).
And now I have extra RAM. Which is a good thing. I'm sure a few little bits and pieces will turn up broken over the next few days, but I'm quite happy with both the process and the result.
Nice write up!
What was your main reason for switching to the cloud?
Cheaper? More resources? Better scaling? Cause all the cool kids are doing it?
I believe Baiduspider is a Chinese search engine bot.
They smash our sites, second only to the Googlebot.
Thanks Mike. Primary reason was resources per dollar. For just under 10% more a month I was able to nearly double my CPU and RAM allotment from my previous provider. Have to pay metered bandwidth now rather than a fixed-price cap, but no great issue there. Proximity was also a win, since I store so many payloads on S3, and they're now available via AWS's internal fiber instead of 3,000 miles of public internet.
There are a lot of wins with the cloud, but for me, hosting my personal server, most of them are fairly irrelevant (if not problematic). For example, a large system will absolutely benefit more from cloud-based scaling than it will suffer from slightly less reliable "servers"; for a single-box personal setup, however, that trade is actually worse than physical hardware. Those risks are manageable, though, and once they're under control, even a non-redundant, non-scaling setup is viable in the cloud, provided the bottom edge of the price/performance curve is cost effective for you.
Bottom line: you can't run a reasonable dedicated box on EC2 for less than about $80/month, so if you only need half those resources, it's quite possible to get them for half the price somewhere else. Once you get to the point where cloud resources are cost effective, however, I think it's pretty much a no-brainer to go that way unless you have some really specific needs for your infrastructure.