The Paradox Of The Mail Server On The Cloud

Providing your web application with a mail service that works flawlessly is probably essential for your business. You need to send activation emails, password reset emails, newsletters, and probably a whole bunch of other emails related to interactions with your application.

When there were only physical servers and static IP addresses, everything worked perfectly. But now, with your application in the cloud, setting up a working mail server next to it is next to impossible. If your application is successful and you would like to send emails to your millions of satisfied users, your options come down to:

  1. Use a physical hosted server.
  2. Use a 3rd party email service.
  3. Set up a mail server in the cloud and accept that some (or most) of your emails will be marked as spam.

For us cloud-oriented developers, option 1 is as useful as somebody suggesting you use a cassette tape to record your favorite songs: it’s old, unreliable, and can’t scale. Option 2 gets very costly if your business is successful, and most of these services can’t handle the volume of mail you need to send once you have a large-scale user base. Option 3 will make your email communication with your users almost non-existent, so you can’t afford it either. Your only option is to compromise somewhere.

Why is sending email from the cloud so difficult?

In order for your mail server to operate successfully and be trusted by mail services around the world, you need to abide by the following rules:

  1. Don’t be an open relay.
  2. Implement (and follow) SPF policy (and DKIM if possible).
  3. Have a PTR record that resolves back exactly to your mail server hostname.
  4. Don’t let your public IP address be listed in any RBLs.

Rule #1 is easily implemented in any mail server configuration, and there are a number of online tools to test whether you’re an open relay. Rule #2 is also pretty easy to implement, assuming you control your DNS zone files and know your way around them.
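For illustration, here is what a minimal setup could look like in a BIND-style zone file, using the made-up host mail.example.com at 1.2.3.4 (adjust the SPF policy to whatever actually sends mail for your domain):

    ; hypothetical zone entries for example.com
    mail.example.com.  IN  A    1.2.3.4
    example.com.       IN  MX   10 mail.example.com.
    example.com.       IN  TXT  "v=spf1 a:mail.example.com -all"  ; only mail.example.com may send for this domain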

The problem of mail on the cloud begins with rules #3 and #4. A PTR record, which is a reverse DNS entry, must be present and correct for your mail server not to be considered spammy. If your mail server is at 1.2.3.4 and is called mail.example.com, the PTR query for 1.2.3.4 (well, for 4.3.2.1.in-addr.arpa) must return mail.example.com. The PTR record can only be changed by the owner of the IP address, or by delegation of that authority to you. Amazon Web Services does not let you control PTR records, so there goes the option of a mail server on EC2.
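You can check that forward and reverse DNS agree using dig; for the hypothetical mail.example.com above, a healthy setup would look like this:

    $ dig +short -x 1.2.3.4
    mail.example.com.
    $ dig +short mail.example.com
    1.2.3.4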

Other clouds let you control the PTR records for the IP addresses they assign to you, but they fail on rule #4. While your specific IP address might not be blacklisted in RBLs, the entire block it belongs to might be, because these IP addresses are assigned dynamically and are therefore always suspected as spammy by these lists. This is the case with Rackspace Cloud, for example, and it is the only thing left to solve before you can run a mail server there. And although they’re trying to get their address block de-listed, the problem still persists.

Other clouds I’ve examined in this space are GoGrid and Joyent. GoGrid wants you to fill out a questionnaire, and only then will they open up port 25 for you. This sounds absurd and goes against the whole on-demand nature of the cloud (and I also personally don’t trust ServePath, the company that operates GoGrid). Joyent’s offering seems to disregard the option of hosting a mail server with them, and I couldn’t get their response on this matter.

So unless Rackspace Cloud solves its IP block blacklisting problem, or AWS offers a way to set PTR records (and keeps its address blocks off the blacklists), we’re left with the need to compromise.

The only feasible solution right now, it seems, is back to physical hosting.

MMM (MySQL Master-Master Replication) on EC2

Maintaining a MySQL high availability cluster is one of the first missions you encounter when scaling web applications. Very quickly your application gets to the point where the one database machine you have can’t handle the load, and you need to make sure that when failure happens (and it always happens), your cluster is ready to fail over gracefully.

Some basic MySQL replication paradigms

MySQL master-slave replication was one of the first architectures used for failover. The rationale is that if the master fails, a slave can be promoted to master and start handling the writes. For this you could use several combinations of IP tools and monitoring software, for example iproute and Nagios, or Heartbeat and mon.

However, the master-slave architecture for MySQL replication has several flaws, the most notable being:

  • The need to manually bring the relegated master back to life as a slave to the now-promoted master (this can be scripted, but automating it is full of pitfalls).
  • The possibility of a failover during a crash, which can result in the same transaction being committed on both the old master and the new one. Good luck bringing the old master back as a slave then: you’ll most likely get a duplicate key failure on the auto-increment values of the last transaction when replication starts again, and then the whole database on the relegated master is useless.
  • The inability to switch roles quickly. Say the master is on a better machine than the slave, and a failover occurs. How can you easily restore the situation as it was before, with the master on the better machine? Double the headache.

Along came the master-master architecture, which keeps two live masters at all times, one acting as a hot standby for the other, with painless switching between them. (Baron Schwartz has a very interesting post about why referring to master-master replication as a “hot” standby can be dangerous, but that is out of the scope of this post.) One of the key ideas at the bottom of this paradigm is that each master works in its own scope with regard to auto-increment keys, thanks to the configuration settings auto_increment_increment and auto_increment_offset. For example, say you have two masters, db1 and db2: db1 works on the odd auto-increment values and db2 on the even ones. Thus the problem of duplicate auto-increment keys is avoided.
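As a sketch, the relevant my.cnf fragments for such a pair could look like this (server ids and names are placeholders):

    # db1's my.cnf -- gets the odd auto-increment values
    [mysqld]
    server-id                = 1
    log-bin                  = mysql-bin
    auto_increment_increment = 2
    auto_increment_offset    = 1

    # db2's my.cnf -- gets the even auto-increment values
    [mysqld]
    server-id                = 2
    log-bin                  = mysql-bin
    auto_increment_increment = 2
    auto_increment_offset    = 2

With an increment of 2 and offsets of 1 and 2, db1 generates keys 1, 3, 5, … and db2 generates 2, 4, 6, …, so the two sequences never collide.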

Managing master-master replication with MMM

Master-master replication is easily managed by a great piece of Perl code called MMM. Written initially by Alexey Kovyrin, and now maintained (and installed daily in production environments) by Percona, MMM is a management daemon for master-master clusters. It’s a convenient and reliable cluster management tool, which simplifies switching roles between machines in the cluster, takes care of their monitoring and health issues, and prevents you from making stupid mistakes (like turning an offline server into an active master…).
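To give a feel for it, day-to-day operations run through the mmm_control utility; the commands below are from memory, so check mmm_control’s own help output for the authoritative syntax (host names here are placeholders):

    $ mmm_control show                  # current roles and host states
    $ mmm_control move_role writer db2  # make db2 the active writer
    $ mmm_control set_offline db1       # take db1 down for maintenance
    $ mmm_control set_online db1        # bring it back as the standby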

And now comes the hard, EC2 part. Managing high availability MySQL clusters has always relied on the ability to control machines’ IP addresses on the internal network. You set up a floating internal IP address for each of the masters, configure your router to handle these addresses, and you’re done. When the time comes to fail over, the passive master sends ARP and takes over the active master’s IP address, and everything switches smoothly. It’s just that on EC2, you control neither the router nor any internal IP address (I was told FlexiScale gives you the option to control IP addresses on the internal network, but I never got around to testing it).
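On a normal network, that takeover boils down to something like the following (interface name and addresses are placeholders); this is precisely the step EC2 doesn’t let you perform:

    # run on the passive master during failover
    ip addr add 10.0.0.100/24 dev eth0   # claim the floating internal IP
    arping -A -c 3 -I eth0 10.0.0.100    # gratuitous ARP so neighbors update their caches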

So how can we use MMM on EC2 for master-master replication?

One way is to try EC2’s Elastic IP feature. The problem is that currently, moving an Elastic IP address from one instance to another takes several minutes. Imagine a failover from active master to passive master in which you have to wait several minutes for the application to respond again: not acceptable.
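(The move itself is a one-liner with the EC2 API tools, something like the call below with placeholder values; it’s the propagation afterwards, not the call, that takes minutes.)

    $ ec2-associate-address -i i-12345678 75.101.155.119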

Another way is to use name resolution instead of IP addresses. This has major drawbacks, especially the unreliable nature of DNS, but it seems to work if you set up a name resolver that serves your application alone and use the MMM ns_agent contribution I wrote. You can check out the MMM source and install the ns_agent module according to contrib/ns_agent/README.
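The rough idea is that the application connects to a writer hostname with a very low TTL, and the agent repoints that record on failover. A hypothetical zone fragment, with made-up names and addresses, might look like:

    ; served to the application by its private resolver
    writer.db.internal.  10  IN  A  10.251.1.2   ; 10-second TTL, repointed on failover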

I am happy to say that I currently have a testing cluster set up on EC2 using this feature, and so far it has worked as expected, with the exception of several false-positive master switches due to routing issues (failed pings). Any questions or comments on the issue are welcome, and you can also post to the development group.