<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Life Scaling &#187; failure</title>
	<atom:link href="http://orensol.com/tag/failure/feed/" rel="self" type="application/rss+xml" />
	<link>http://orensol.com</link>
	<description>Oren Solomianik's Blog</description>
	<lastBuildDate>Mon, 21 Jun 2010 08:10:03 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Detaching Infrastructure From Physical Hosts: Fantasy vs. Reality</title>
		<link>http://orensol.com/2009/06/17/detaching-infrastructure-from-physical-hosts-fantasy-vs-reality/</link>
		<comments>http://orensol.com/2009/06/17/detaching-infrastructure-from-physical-hosts-fantasy-vs-reality/#comments</comments>
		<pubDate>Wed, 17 Jun 2009 16:53:47 +0000</pubDate>
		<dc:creator>Oren</dc:creator>
				<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[EC2]]></category>
		<category><![CDATA[Scaling]]></category>
		<category><![CDATA[Backup]]></category>
		<category><![CDATA[failure]]></category>
		<category><![CDATA[Host]]></category>
		<category><![CDATA[IaaS]]></category>
		<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[Instance]]></category>
		<category><![CDATA[Mirroring]]></category>
		<category><![CDATA[Physical]]></category>
		<category><![CDATA[Virtual]]></category>

		<guid isPermaLink="false">http://orensol.com/?p=246</guid>
		<description><![CDATA[<p class="wp-caption-text">Image via http://www.flickr.com/photos/martinlatter/</p>
<p>Cloud computing has brought along the promise of easy-to-scale-and-yet-affordable computer clusters. There are various clouds out there that provide Infrastructure as a Service, such as <a href="http://aws.amazon.com/ec2/" target="_blank">Amazon EC2</a>, <a href="http://code.google.com/appengine/" target="_blank">Google App Engine</a>, <a href="http://www.mosso.com/" target="_blank">Mosso</a>, and the newcomer <a href="http://www.salesforce.com/platform/sites/" target="_blank">Force.com Sites</a> to name a few. I personally have experience [...]]]></description>
			<content:encoded><![CDATA[<div id="attachment_249" class="wp-caption alignright" style="width: 310px"><img class="size-full wp-image-249" src="http://orensol.com/files/2009/06/299981441_7f00c6af77.jpg" alt="Dead Harddrive" width="300" height="206" /><p class="wp-caption-text">Image via http://www.flickr.com/photos/martinlatter/</p></div>
<p>Cloud computing has brought along the promise of easy-to-scale-and-yet-affordable computer clusters. There are various clouds out there that provide Infrastructure as a Service, such as <a href="http://aws.amazon.com/ec2/" target="_blank">Amazon EC2</a>, <a href="http://code.google.com/appengine/" target="_blank">Google App Engine</a>, <a href="http://www.mosso.com/" target="_blank">Mosso</a>, and the newcomer <a href="http://www.salesforce.com/platform/sites/" target="_blank">Force.com Sites</a> to name a few. I personally have experience as a developer only with Amazon EC2, and I am a devoted fan and user of the entire AWS stack. Nonetheless, I believe that what I have to say here is relevant to all other platforms.</p>
<p>While the cloud and <a href="http://en.wikipedia.org/wiki/Infrastructure_as_a_Service" target="_blank">IaaS</a> model do indeed have many significant advantages over traditional physical hosting, there is one major annoyance still to overcome in this space: your virtual host is still tied to a physical machine. That machine is non-redundant, it doesn&#8217;t have any hot backup, and there&#8217;s no transparent, hassle-free way to fail over from it once it&#8217;s malfunctioning. This is why, from time to time, I get this email from Amazon:</p>
<blockquote><p>Hello,</p>
<p>We have noticed that one or more of your instances are running on a host degraded due to hardware failure.</p>
<p>i-XXXXXXXX</p>
<p>The host needs to undergo maintenance and will be taken down at XX:XX GMT on XXXX-XX-XX. Your instances will be terminated at this point.</p>
<p>The risk of your instances failing is increased at this point. We cannot determine the health of any applications running on the instances. We recommend that you launch replacement instances and start migrating to them.</p>
<p>Feel free to terminate the instances with the ec2-terminate-instance API when you are done with them.</p>
<p>Let us know if you have any questions.</p>
<p>Sincerely,</p>
<p>The Amazon EC2 Team</p></blockquote>
<p>At this stage, this is one of the greatest shortcomings of EC2 from my point of view. As a customer of EC2, I don&#8217;t want to care whether a host has a hardware failure. Why can&#8217;t my instance just be mirrored somewhere else, consistent hot-backup style, and be transparently switched to the backup host when the host hardware fails? I wouldn&#8217;t mind paying extra for this service.</p>
<p>In my vision, in a true IaaS cloud there is no connection between the virtual machine and the physical host. The virtual machine is truly floating in the cloud, unbound to the physical realm by means of some consistent mirroring across physical hosts.</p>
<p>You might be thinking &#8220;you can implement this on your own on the existing infrastructure that EC2 offers&#8221;, and &#8220;you should be prepared for any instance going poof&#8221;. And you are correct: with the current offering of EC2, this is the case. You always have to be prepared for an instance failure (in the last month, 2 of my roughly 20 physical hosts failed, a monthly failure rate of about 10% (!!)), and you always have to build your architecture so that it can fail over gracefully when a single host fails. But were my vision a reality, I wouldn&#8217;t have to worry about these things, and wouldn&#8217;t have to spend time and money on the overhead they incur.</p>
<p>I am not certain whether this is the situation in the other clouds, but if it is not, that might come at the price of less flexibility, which is a major part of EC2 and one I am not willing to give up. If that flexibility can be maintained, I would love to see my vision become a reality on EC2.</p>
]]></content:encoded>
			<wfw:commentRss>http://orensol.com/2009/06/17/detaching-infrastructure-from-physical-hosts-fantasy-vs-reality/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Hardware Failure Apocalypse</title>
		<link>http://orensol.com/2009/02/24/hardware-failure-apocalypse/</link>
		<comments>http://orensol.com/2009/02/24/hardware-failure-apocalypse/#comments</comments>
		<pubDate>Tue, 24 Feb 2009 16:39:38 +0000</pubDate>
		<dc:creator>Oren</dc:creator>
				<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[General]]></category>
		<category><![CDATA[failure]]></category>
		<category><![CDATA[hardware]]></category>
		<category><![CDATA[init]]></category>
		<category><![CDATA[kernel]]></category>
		<category><![CDATA[Lenovo R61]]></category>
		<category><![CDATA[panic]]></category>
		<category><![CDATA[RAM]]></category>
		<category><![CDATA[Ubuntu]]></category>

		<guid isPermaLink="false">http://orensol.com/?p=131</guid>
		<description><![CDATA[<p>I might know a thing or two about handling servers, configs, deployments and cloud architecture. But when it comes to hardware failure on my own workstation, I become a complete layman.</p>
<p>It&#8217;s the first time my <a href="http://reviews.cnet.com/laptops/lenovo-thinkpad-r61/4505-3121_7-32442904.html" target="_blank">Lenovo R61</a> has failed me. It&#8217;s running a mighty <a href="http://www.ubuntu.com/" target="_blank">Ubuntu 8.04</a>, with all the components a hacker [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright" src="http://i.i.com.com/cnwk.1d/sc/32442904-2-440-overview-1.gif" alt="" width="264" height="198" />I might know a thing or two about handling servers, configs, deployments and cloud architecture. But when it comes to hardware failure on my own workstation, I become a complete layman.</p>
<p>It&#8217;s the first time my <a href="http://reviews.cnet.com/laptops/lenovo-thinkpad-r61/4505-3121_7-32442904.html" target="_blank">Lenovo R61</a> has failed me. It&#8217;s running a mighty <a href="http://www.ubuntu.com/" target="_blank">Ubuntu 8.04</a>, with all the components a hacker needs (from a complete LAMP stack, through <a href="http://www.eclipse.org/pdt/" target="_blank">PDT</a> and a customized version of <a href="https://launchpad.net/~clazzes.org/+archive/ppa" target="_blank">svn 1.5.1</a>, to <a href="http://www.inkscape.org/" target="_blank">Inkscape</a> and <a href="http://xvidcap.sourceforge.net/">xvidcap</a>&#8230;), and it&#8217;s the first time that, after the system froze and I rebooted, I just gazed at the terminal at startup and shrieked:</p>
<blockquote><p>Kernel panic &#8211; not syncing: Attempted to kill init!</p></blockquote>
<p>It came with a whole bunch of other error messages, each time at a different stage in the boot sequence. This behavior, combined with the fact that the system just froze and I hadn&#8217;t made any dramatic changes, makes me think it&#8217;s bad RAM or another hardware component (like <a href="http://www.linuxquestions.org/questions/linux-kernel-70/kernel-panic-not-syncing-attempted-to-kill-init-313273/" target="_blank">here</a>, and the disk is of course a candidate), but <a href="https://answers.launchpad.net/ubuntu/+question/3694" target="_blank">sometimes</a> it seems people get over it by re-installing a kernel.</p>
<p>I don&#8217;t know which I prefer, hardware or software failure. I guess RAM failure is the best case: just swap in new RAM. Disk failure might mean data loss, which I am sure I don&#8217;t want to handle. Recompiling the kernel can be a tedious task, but it is preferable to losing data and having to re-install the whole system.</p>
<p>And what I asked myself, as I rode <a href="http://www.asso-scooter.org/IMG/jpg/Dink_125.jpg" target="_blank">my bike</a> back home today, was &#8220;why can&#8217;t I just launch a new instance in the cloud from the newest working snapshot of my system? Why is hardware failure in the cloud so easy to deal with, while hardware failure in the office isn&#8217;t?&#8221; And I had a vision of everyone working on machines similar to mainframe terminals, running only the basics and having the OS and all the data just sit in the cloud.</p>
<p><a href="http://www.thinkgos.com/index.html" target="_blank">That day isn&#8217;t far off</a>. But tomorrow it&#8217;s back to the lab to (hopefully) have my RAM replaced.</p>
]]></content:encoded>
			<wfw:commentRss>http://orensol.com/2009/02/24/hardware-failure-apocalypse/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>