The Paradox Of The Mail Server On The Cloud

Providing your web application with a mail service that works flawlessly is probably essential for your business. You need to send activation emails to users, password reset emails, newsletters and probably a whole bunch of other emails that have to do with interactions with your application.

When there were only physical servers and static IP addresses, everything worked perfectly. But now, when your application is in the cloud, setting up a working mail server next to your application is ridiculously impossible. If your application is successful and you would like to send emails to your millions of satisfied users, your options come down to:

  1. Use a physical hosted server.
  2. Use a 3rd party email service.
  3. Set up a mail server in the cloud and accept that some or most of your emails will be marked as spam.

For us cloud oriented developers, option 1 is as useful as somebody suggesting you use a cassette tape recorder to put your favorite songs on. It’s old, unreliable, and can’t scale. Option 2 is very costly if your business is successful, and most of these services can’t handle the volume of mail you need to send if you have a large scale user base. Option 3 will make your email communication efforts with your users almost non-existent, which means you can’t afford it either. So your only option is to compromise somewhere.

Why is sending email from the cloud so difficult?

In order for your mail server to operate successfully and be trusted by mail services around the world, you need to abide by the following rules:

  1. Don’t be an open relay.
  2. Implement (and follow) SPF policy (and DKIM if possible).
  3. Have a PTR record that resolves back exactly to your mail server hostname.
  4. Don’t let your public IP address be listed in any RBLs.

Rule #1 is easily implemented in any mail server configuration, and there are also a number of online tools to test whether you’re an open relay. Rule #2 is also pretty easy to implement, assuming you control your DNS zone files and know your way around them.

The problem of mail on the cloud begins with rules #3 and #4. A PTR record, which is a reverse DNS entry, must be present and correct for your mail server not to be considered spammy. If your mail server sits at a given IP address and goes by a given hostname, the PTR query for that address (well, for its reversed in-addr.arpa form) must return exactly that hostname. The PTR record can only be changed by the owner of the IP address, or by a delegation of the owner’s authority to you. Amazon Web Services do not let you control PTR records, so there goes the option for a mail server on EC2.
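
To make the PTR rule concrete, here is a minimal sketch of how the reverse lookup name is built and how a forward-confirmed check could look. The function names ptr_name and check_fcrdns and the address 192.0.2.10 are placeholders for illustration only, and the second function assumes dig is installed and the network is reachable.

```shell
# Build the reverse-DNS (PTR) query name for an IPv4 address.
# ptr_name is a hypothetical helper, not part of any standard tool.
ptr_name() {
    echo "$1" | awk -F. '{ print $4 "." $3 "." $2 "." $1 ".in-addr.arpa" }'
}

# Forward-confirmed reverse DNS: the PTR name must resolve back to the same IP.
# Needs dig and network access, so it is only defined here, not run.
check_fcrdns() {
    ip=$1
    name=$(dig +short -x "$ip" | head -n 1)
    [ -n "$name" ] || return 1
    dig +short A "$name" | grep -qxF "$ip"
}
```

For example, ptr_name 192.0.2.10 prints 10.2.0.192.in-addr.arpa, which is the name a receiving mail server would look up. Note that this sketch only checks that the name resolves back to the address; rule #3 also wants the name to be exactly your mail server’s hostname.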

Other clouds let you control the PTR records for the IP addresses they assigned to you. But they fail on Rule #4. While your specific IP address might not be blacklisted in RBLs, the entire block that it belongs to might be blacklisted, because these IP addresses are assigned dynamically and therefore are always suspected as spammy by these lists. This is the case with Rackspace Cloud for example, and is the only thing left to be solved before you can run a mail server there. And although they’re trying to get their address block de-listed, this problem still persists.
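
Checking an address against a DNS-based blacklist works much like a PTR lookup: the octets are reversed and the list’s zone is appended. The sketch below assumes a placeholder zone, rbl.example.org, which is not a real list; rbl_query_name and is_listed are hypothetical helper names, and is_listed assumes dig is available.

```shell
# Build the DNS name to query for an IPv4 address in a DNS-based blacklist.
# rbl.example.org (used in the example below) is a placeholder zone.
rbl_query_name() {
    ip=$1
    zone=$2
    echo "$ip" | awk -F. -v zone="$zone" '{ print $4 "." $3 "." $2 "." $1 "." zone }'
}

# A listed address usually resolves to something in 127.0.0.0/8;
# an empty answer means the list has no entry for the address.
# Needs dig and network access, so it is only defined here, not run.
is_listed() {
    answer=$(dig +short A "$(rbl_query_name "$1" "$2")")
    case $answer in
        127.*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

Running rbl_query_name 192.0.2.10 rbl.example.org prints 10.2.0.192.rbl.example.org. Whether a whole block is listed depends on the individual list, so a single clean address proves little about its neighbours.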

Other clouds I’ve examined in this space are GoGrid and Joyent. GoGrid want you to fill out a questionnaire, and only then do they open up port 25 for you. This sounds absurd, and goes against the whole on-demand nature of the cloud (and I also personally don’t trust ServePath, the company that operates GoGrid). Joyent’s offering seems to disregard the option of hosting a mail server with them, and I couldn’t get a response from them on this matter.

So unless Rackspace Cloud solve their IP block blacklisting problem, or AWS offer a PTR setting option (plus no blacklisting as well), we’re left with the need to compromise.

The only feasible solution right now — seems like it’s back to physical hosting.

Cron Script To Snapshot Any Attached EBS Volume

If you would like to cron snapshots of any volume attached to an instance, you can use the following script. It uses the EC2 command line tools to see which volumes are currently attached to this instance, and takes a snapshot of each. Make sure to replace all the variables at the top of the script with your own values.


export JAVA_HOME=/usr/java/default
export EC2_HOME=/vol/snap/ec2-api-tools-1.3-26369

INSTANCE_ID=`curl -s`
echo "Instance ID is $INSTANCE_ID"
VOLUMES=`$EC2_HOME/bin/ec2-describe-volumes | grep "ATTACHMENT" | grep "$INSTANCE_ID" | awk '{print $2}'`
echo "Volumes are: $VOLUMES"

for VOLUME in $VOLUMES; do
        echo "Snapping Volume $VOLUME"
        DEVICE=`$EC2_HOME/bin/ec2-describe-volumes $VOLUME | grep "ATTACHMENT" | grep "$INSTANCE_ID" | awk '{print $4}'`
        echo "Device is $DEVICE"
        MOUNTPOINT=`df | grep "$DEVICE" | awk '{print $6}'`
        echo "Mountpoint is $MOUNTPOINT"

        # Snapshot
        SNAPSHOT_ID=`$EC2_HOME/bin/ec2-create-snapshot $VOLUME`

        echo "Snapshotted: $SNAPSHOT_ID"
done


If you’re wondering why $MOUNTPOINT is important (it’s not used here after all), it’s because you might want to freeze your filesystem if it’s XFS, so you could safely take a snapshot of a MySQL database for example. So you could easily wrap the snapshot create command with this:

        # freeze
        xfs_freeze -f $MOUNTPOINT

        # Snapshot
        SNAPSHOT_ID=`$EC2_HOME/bin/ec2-create-snapshot $VOLUME`

        # unfreeze
        xfs_freeze -u $MOUNTPOINT
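
One caveat with the wrapper above: if the snapshot command fails, the filesystem should still be unfrozen. A minimal sketch of that, assuming xfs_freeze and ec2-create-snapshot are on the PATH; snapshot_frozen, FREEZE_CMD and SNAPSHOT_CMD are hypothetical names introduced here so the logic can be dry-run with stand-in commands.

```shell
# Commands are overridable so the flow can be tested without XFS or EC2.
FREEZE_CMD=${FREEZE_CMD:-xfs_freeze}
SNAPSHOT_CMD=${SNAPSHOT_CMD:-ec2-create-snapshot}

# Freeze the filesystem, take the snapshot, and always unfreeze afterwards.
# Returns the exit status of the snapshot command.
snapshot_frozen() {
    mountpoint=$1
    volume=$2
    $FREEZE_CMD -f "$mountpoint" || return 1
    if $SNAPSHOT_CMD "$volume"; then
        status=0
    else
        status=$?
    fi
    # Unfreeze no matter how the snapshot went.
    $FREEZE_CMD -u "$mountpoint"
    return $status
}
```

Inside the loop of the cron script, this could be called as SNAPSHOT_ID=`snapshot_frozen "$MOUNTPOINT" "$VOLUME"`.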

And if you are indeed using this script to snapshot a volume with MySQL on it, you also need to flush tables with read lock, and gather information on master and slave positions. For this task you can use Eric Hammond’s script, and incorporate it into the cron script. (You can read more about MySQL and XFS on EC2 on the AWS site).

Detaching Infrastructure From Physical Hosts: Fantasy vs. Reality

Cloud computing has brought along the promise of easy-to-scale-and-yet-affordable computer clusters. There are various clouds out there that provide Infrastructure as a Service, such as Amazon EC2, Google App Engine, Mosso, and the newcomer Sites to name a few. I personally have experience as a developer only with Amazon EC2, and I am a devoted fan and user of the entire AWS stack. Nonetheless, I believe that what I have to say here is relevant to all other platforms.

While the cloud and IaaS model do have many significant advantages over traditional physical hosting, there is one major annoyance still to overcome in this space, and that is: your virtual host is still connected to a physical machine. That machine is non-redundant, it doesn’t have any hot backup, and there’s no way to fail over from it transparently and hassle-free once it’s malfunctioning. And this is why, from time to time, I get this email from Amazon:


We have noticed that one or more of your instances are running on a host degraded due to hardware failure.


The host needs to undergo maintenance and will be taken down at XX:XX GMT on XXXX-XX-XX. Your instances will be terminated at this point.

The risk of your instances failing is increased at this point. We cannot determine the health of any applications running on the instances. We recommend that you launch replacement instances and start migrating to them.

Feel free to terminate the instances with the ec2-terminate-instance API when you are done with them.

Let us know if you have any questions.


The Amazon EC2 Team

At this stage, this is one of the greatest shortcomings of EC2 from my point of view. As a customer of EC2, I don’t want to care if a host has a hardware failure. Why can’t my instance just be mirrored somewhere else, consistent hot-backup style, and upon failure of host hardware be transparently switched to the backup host? I don’t mind paying the extra buck for this service.

In my vision, in a true IaaS cloud there is no connection between the virtual machine and the physical host. The virtual machine is truly floating in the cloud, unbound to the physical realm by means of some consistent mirroring across physical hosts.

And you might be thinking “you can implement this on your own on the existing infrastructure that EC2 offers”, and “you should be prepared for any instance going poof”. And you are correct: with the current offering of EC2, this is the case. You always have to be prepared for an instance failure (in the last month, I had 2 physical host failures out of about 20, that’s about 10% a month (!!) ), and you always have to build your architecture so that a single host failure can fail over gracefully. But were my vision a reality, I wouldn’t have to worry about these things, and wouldn’t have to spend time and money on the overhead that they incur.

I am not certain that this is the situation in the other clouds, but if it is not, it might come with the price of less flexibility, which is a major part of EC2 that I am not willing to give up. If that flexibility can be maintained, I would love to see my vision become a reality on EC2.

Network Latency Inside And Across Amazon EC2 Availability Zones

I couldn’t find any info out there comparing network latency across EC2 Availability Zones and inside any single Availability Zone. So I took 6 instances (2 in each US zone), ran some tests using a simple ping, and measured 10 Round Trip Times (RTT). Here are the results.

Single Availability Zone Latency

Availability Zone    Minimum RTT    Maximum RTT    Average RTT
us-east-1a           0.215ms        0.348ms        0.263ms
us-east-1b           0.200ms        0.327ms        0.259ms
us-east-1c           0.342ms        0.556ms        0.410ms

It seems that at the time of my testing, zone us-east-1c had the worst RTT between 2 instances in it: its average was roughly 1.6 times that of the other 2 zones (0.410ms vs. about 0.26ms).

Cross Availability Zone Latency

Availability Zones                   Minimum RTT    Maximum RTT    Average RTT
Between us-east-1a and us-east-1b    0.885ms        1.110ms        0.937ms
Between us-east-1a and us-east-1c    0.937ms        1.080ms        1.031ms
Between us-east-1b and us-east-1c    1.060ms        1.250ms        1.126ms

It’s worth noting that in cross availability zone traffic, the first ping was usually off the chart, so I disregarded it. For example, it could be anywhere between 300ms and 400ms, and the rest would fall down to ~0.300. Probably some lazy routing techniques by Amazon’s routers.


Two conclusions from these numbers:

  1. Zones are created different! At least at the time of the testing, the average RTT between machines in a cluster on us-east-1c was roughly 1.6 times that of a cluster on us-east-1b (0.410ms vs. 0.259ms).
  2. Cross Availability Zone latency was roughly 3 to 4 times higher than inner zone latency on average, and up to 6 times higher when comparing the worst cross-zone RTT (1.250ms) with the best inner-zone RTT (0.200ms). For a network intensive application, better keep your instances crowded in the same zone.
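
For anyone who wants to repeat the measurement, here is a minimal sketch of how ping output can be reduced to minimum, maximum and average RTT while dropping the first reply. It is not the exact procedure used for the tables above; rtt_stats is a hypothetical helper, and it assumes the common ping output format where each reply line contains time=<value> ms.

```shell
# Read ping output on stdin and print min/max/avg RTT in milliseconds,
# ignoring the first reply (which tends to be an outlier).
rtt_stats() {
    awk -F'time=' '/time=/ {
        n++
        if (n == 1) next                 # skip the first reply
        split($2, parts, " ")            # "0.263 ms" -> "0.263"
        t = parts[1] + 0
        if (count == 0 || t < min) min = t
        if (count == 0 || t > max) max = t
        sum += t
        count++
    }
    END {
        if (count > 0)
            printf "min=%.3f max=%.3f avg=%.3f\n", min, max, sum / count
    }'
}
```

A run against another instance could then look like ping -c 11 <other-instance> | rtt_stats, which sends 11 pings so that 10 remain after the first is dropped.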

I should probably also make a throughput comparison between and across Availability Zones. I promise to share if I get to test it.

How to delete those old EC2 EBS snapshots

EBS snapshots are a very powerful feature of Amazon EC2. An EBS volume is a readily available, elastic block storage device that can be attached, detached and re-attached to any instance in its availability zone. There are numerous advantages to using EBS over the local block storage devices of an instance, and one of the most important of them is the ability to take a snapshot of the data on the volume.

Since snapshots are incremental by nature, after an initial snapshot of a volume, the following snapshots are quick and easy. Moreover, snapshots are always processed by Amazon’s processing power and not by the CPU of your instance, and are stored redundantly on S3. This is why using these snapshots in your backup methodology is a great idea (provided that you freeze/unfreeze your filesystem during the snapshot call, using LVM or XFS for example).

But, and this is a really annoying but, snapshots are “easy come hard to go”. They are so convenient to use and so reliable, that it’s natural to use a cronned script to make a daily (or hell, hourly!) backup of your volume. But then, those snapshots keep piling up, and the only way to delete a snapshot is to make a separate API call for each specific snapshot. If you have 5 volumes you back up hourly, that’s 120 snapshots a day, and you reach the 500 snapshots limit within a little over 4 days. Not very reliable now, huh?
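
The arithmetic behind the limit is easy to check: 5 volumes at 24 snapshots a day make 120 snapshots a day, so a 500 snapshot limit lasts 500 / 120, or roughly 4.2 days. A small sketch, with days_until_limit as a hypothetical helper name:

```shell
# Days until a snapshot limit is hit, given the number of volumes,
# snapshots per volume per day, and the limit itself.
days_until_limit() {
    awk -v volumes="$1" -v per_day="$2" -v limit="$3" \
        'BEGIN { printf "%.1f\n", limit / (volumes * per_day) }'
}
```

Calling days_until_limit 5 24 500 prints 4.2, while a daily schedule (days_until_limit 5 1 500) gives 100.0 days.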

I have been searching for a while for an option to bulk delete snapshots. The EC2 API is missing this feature, and the excellent ElasticFox add-on doesn’t make up for it. You just can’t bulk delete snapshots.

That is, until now :). I asked in the AWS Forum if there is anything that can be done about this problem. They replied it’s a good idea, but if I really wanted it to be implemented quickly, I should build my own solution using the API. So I took them up on the offer, and came up with a PHP command line tool that tries to emulate an “ec2-delete-old-snapshots” command, until one is added to the API.

The tool is available on Google Code for checkout. It uses the PHP EC2 library which I bundled in (hope I didn’t violate any license terms, please alert me if I did).

Usage is easy:

php ec2-delete-old-snapshots.php -v vol-id [-v vol-id ...] -o days

If you wanted to delete EC2 snapshots older than 7 days for 2 volumes you have, you would use:

php ec2-delete-old-snapshots.php -v vol-aabbccdd -v vol-bbccddee -o 7
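
The selection step behind such a tool can be sketched in shell as well. This is not how the PHP tool is implemented; it is an illustration that assumes ec2-describe-snapshots prints one line per snapshot in the column order SNAPSHOT, snapshot ID, volume ID, status, start time, with the start time in ISO 8601 form. old_snapshots is a hypothetical helper name.

```shell
# Print the IDs of snapshots of one volume that started before a cutoff.
# Reads ec2-describe-snapshots style lines on stdin; the column layout
# (SNAPSHOT, snapshot id, volume id, status, start time) is an assumption.
old_snapshots() {
    volume=$1
    cutoff=$2   # ISO 8601 date or timestamp, e.g. 2009-01-08
    awk -v vol="$volume" -v cutoff="$cutoff" '
        # ISO 8601 timestamps in the same timezone sort as plain strings.
        $1 == "SNAPSHOT" && $3 == vol && ($5 "") < (cutoff "") { print $2 }'
}
```

The resulting IDs could then be fed one at a time to ec2-delete-snapshot, for example through xargs -n 1. It is worth printing the list and reviewing it before deleting anything.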

Hope this helps all you people out there who need such a thing. I will be happy to receive feedback (and bug fixes) if you start using this.

MMM (MySQL Master-Master Replication) on EC2

Maintaining a MySQL high availability cluster is one of the first missions encountered when scaling web applications. Very quickly your application gets to the point where the one database machine you have is not enough to handle the load, and you need to make sure that when failure happens (and it always happens), your cluster is ready to fail over gracefully.

Some basic MySQL replication paradigms

MySQL master-slave replication was one of the first architectures used for failovers. The rationale is that if a master fails, a slave can be promoted to master, and start handling the writes. For this you could use several combinations of IP tools and monitoring software, for example, iproute and nagios, or heartbeat and mon.

However, master-slave architecture for MySQL replication has several flaws, the most notable being:

  • The need to manually take care of bringing the relegated master back to life, as a slave to the now-promoted master (this can be scripted, but automating it usually involves many pitfalls).
  • The possibility of failover during a crash, which can result in the same transaction being committed both on the old master and the new master. Good luck then, when trying to bring back the master as a slave. You’ll most likely get some duplicate key failure because of auto increments on the last transaction when starting replication again, and then the whole database on the relegated master is useless.
  • The inability to switch roles quickly. Say the master is on a better machine than the slave, and now there was a failover. How can you easily restore the situation the way it was before, with the master being on the better machine? Double the headache.

Along came master-master architecture, which in essence keeps two live masters at all times, with one being a hot standby for the other, so that switching between them is painless. (Baron Schwartz has a very interesting post about why referring to master-master replication as a “hot” standby could be dangerous, but this is out of the scope of this post). One of the important things that lies at the heart of this paradigm is that every master works in its own scope with regard to auto-increment keys, thanks to the configuration settings auto_increment_increment and auto_increment_offset. For example, say you have two masters, call them db1 and db2, then db1 works on the odd auto-increments, and db2 on the even auto-increments. Thus the problem of duplication on auto increment keys is avoided.
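
To see why the two settings keep the masters out of each other’s way, the sketch below prints the key sequence each server would hand out. It only illustrates the arithmetic (offset plus a multiple of the increment); auto_increment_ids is a hypothetical helper, and the values assume db1 uses auto_increment_offset=1 and db2 uses auto_increment_offset=2, both with auto_increment_increment=2.

```shell
# Print the first COUNT auto-increment values a server would generate
# with the given auto_increment_offset and auto_increment_increment.
auto_increment_ids() {
    offset=$1
    increment=$2
    count=$3
    i=0
    ids=""
    while [ "$i" -lt "$count" ]; do
        ids="$ids $((offset + i * increment))"
        i=$((i + 1))
    done
    # Unquoted on purpose: drops the leading space.
    echo $ids
}

auto_increment_ids 1 2 4   # db1: 1 3 5 7
auto_increment_ids 2 2 4   # db2: 2 4 6 8
```

The two sequences never overlap, so a row inserted on db1 can never collide with one inserted on db2, even if both accept writes for a moment during a failover.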

Managing master-master replication with MMM

Master-master replication is easily managed by a great piece of Perl code called MMM. Written initially by Alexey Kovyrin, and now maintained and installed daily in production environments by Percona, MMM is a management daemon for master-master clusters. It’s a convenient and reliable cluster management tool, which simplifies switching roles between machines in the cluster, takes care of their monitoring and health issues, and prevents you from making stupid mistakes (like making an offline server an active master…).

And now comes the hard, EC2 part. Managing high availability MySQL clusters is always based on the ability to control the machines’ IP addresses in the internal network. You set up a floating internal IP address for each of the masters, configure your router to handle these addresses, and you’re done. When the time comes to fail over, the passive master sends ARP and takes over the active master’s IP address, and everything switches smoothly. It’s just that on EC2, neither the router nor any internal IP address can be controlled by you (I was told FlexiScale gives you the option to control IP addresses on the internal network, but I never got to testing it).

So how can we use MMM on EC2 for master-master replication?

One way is to try using EC2’s Elastic IP feature. The problem with it is that currently, moving an Elastic IP address from one instance to another takes several minutes. Imagine a failover from active master to passive master in which you would have to wait several minutes for the application to respond again. Not acceptable.

Another way is to use name resolving instead of IP addresses. This has major drawbacks, especially the unreliable nature of the DNS. But it seems to work if you can set up a name resolver that serves your application alone, and use the MMM ns_agent contribution I wrote. You can check out the MMM source, and install the ns_agent module according to contrib/ns_agent/README.

I am happy to say that I currently have a testing cluster set up on EC2 using this feature, and up until now it has worked as expected, with the exception of several false-positive master switches due to routing issues (ping failed). Any questions or comments on the issue are welcome, and you can also post to the development group.