Moving to another cloud

We are in the process of migrating one of our backend data-processing servers from a legacy hosting company in NYC to Contegix.  What’s unusual about this transition is that we’re moving the machine onto Contegix’s new cloud platform rather than to a traditional server.  We’ve noticed a few things already.  When we copied over a huge backup of our databases, it transferred across the network from NYC to St. Louis at 93Mbps, which is not frigging bad!  As I write this, we’re loading over 100GB of data into a MySQL server on our new Contegix cloud machine at ~30K blocks/second (as measured by vmstat), which means that this thing has lightning-fast I/O… not surprising, since the storage is on an EqualLogic SAN (Update: we later saw this increase to ~70K blocks/second).
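For a rough sense of scale, the figures above can be converted into megabytes per second. This is only a back-of-the-envelope sketch: it assumes vmstat is reporting 1024-byte blocks (the usual case on Linux), that the 100 figure is in decimal gigabytes, and the function names are made up for illustration.

```python
# Back-of-the-envelope conversions for the throughput numbers quoted above.
# Assumptions (not stated in the post): vmstat blocks are 1024 bytes,
# and sizes use decimal units (1 MB = 1e6 bytes, 1 GB = 1000 MB).

BLOCK_BYTES = 1024  # assumed vmstat block size


def mbps_to_mb_per_s(megabits_per_s: float) -> float:
    """Convert a network rate in megabits/second to megabytes/second."""
    return megabits_per_s / 8


def blocks_to_mb_per_s(blocks_per_s: float, block_bytes: int = BLOCK_BYTES) -> float:
    """Convert a vmstat block rate to megabytes/second."""
    return blocks_per_s * block_bytes / 1e6


def hours_to_move(gigabytes: float, mb_per_s: float) -> float:
    """Hours needed to move `gigabytes` of data at a steady `mb_per_s`."""
    return gigabytes * 1000 / mb_per_s / 3600


if __name__ == "__main__":
    print(f"network:    {mbps_to_mb_per_s(93):.1f} MB/s")
    print(f"disk (30K): {blocks_to_mb_per_s(30_000):.1f} MB/s")
    print(f"disk (70K): {blocks_to_mb_per_s(70_000):.1f} MB/s")
    rate = blocks_to_mb_per_s(30_000)
    print(f"100 GB at the 30K rate: {hours_to_move(100, rate):.2f} hours")
```

Under those assumptions the network copy works out to a little under 12 MB/s and the disk writes to roughly 30 to 70 MB/s. The load-time figure is only a lower bound, since a MySQL import also writes indexes and logs on top of the raw data.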

The differences between this cloud platform and EC2 (which we still use for some other needs) are striking.  The application that we will host on this new VM sometimes needs a lot of memory.  With Contegix, we can grow that all the way up to 128GB with 32 cores.  Amazon doesn’t even come close to that: their max is 15GB.  Beyond that, you have to figure out how to distribute your application over a bunch of hosts.  But sometimes you just need 20GB of memory and all the problems go away.  Plus, we don’t have to compete for these resources; they’re guaranteed to us.

I also like the fact that the machine doesn’t disappear into oblivion when it reboots, which is a feature (?) of EC2 instances.  We can grow our storage past the point that I care to think about on this platform as well.  Plus, we get all the Contegix support that we want if we choose to do crazy things with this host.

The virtualization technology is VMware ESX, which is darn cool stuff (having just set it up on an integration server here a week or so ago, I have to say that I like what I have seen so far).  We’ve already seen our VM get hot-migrated to another physical box in order to maximize the resources available to us.  Things got slow for a little bit, but then they got lightning fast.  I think we were copying data into the machine at that point, and we saw no impact on open connections, etc.  Don’t ask me why, but I’m still surprised that this works reliably.

So far so good.  We’ll report back with more later.