Cloud Operations – 1. The Frontal Lobe

In my previous post on cloud operations, the image shows a sense/respond loop between the APM (Application Performance Monitor) and the rest of the system. This is the frontal lobe of cloud operations: its job is to analyze the information coming in from the APM and translate it into an appropriate action. This is one key area that still needs a lot of work (and invention), but it will be a key differentiator between cloud-based applications and traditional applications.

The reason is the inherent elasticity of the cloud. You can always get more (more capacity, more storage), but it will cost you. If you have ever been part of an IT performance war room then you know that capacity is the magic elixir that fixes everything. The cloud makes that elixir so simple to obtain that it can be mistaken for a panacea. Sure, you can go and allocate another dozen web/app servers if the current systems aren’t keeping up with demand, but once you do that you’ll need to pay for the extra capacity. It becomes an immediate additional expense, so just because you can do it doesn’t mean you should. These cost-aware decisions will be a new role for operations, and will require a new type of SLA management: “cost-aware SLA management”. Currently most SLAs focus on downtime (e.g. 99.9% availability), and some focus on performance (x second response time), but they ignore the costs associated with maintaining the SLA. Once costs become more immediate and visible someone is going to have to manage them, and operations will be tagged with the job.

The problem is that APMs provide far too much information for humans to manage. There will need to be some sort of intelligent analysis that distills the raw data coming from the various APM systems into actionable information. I believe the only way to achieve that is through behavioral analysis of applications and predictive analytics (I have been writing about this here). That is the only way to obtain the benefits the cloud can provide: intelligent systems that can make some decisions on their own (e.g. increase the number of servers to meet demand, within a predefined policy), and that provide distilled, actionable information for operations when they can’t.
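
To make the idea of acting "within a predefined policy" a little more concrete, here is a minimal sketch of one pass through the sense/respond loop. It is only an illustration under my own assumptions: the `ScalingPolicy` fields, the `decide` function, and the three outcomes are hypothetical names, not the API of any real APM or cloud management product, and the predicted response time is assumed to come from whatever analytics sits in front of it.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical limits set by operations (all names are illustrative)."""
    target_response_s: float     # SLA response-time target, in seconds
    cost_per_server_hour: float  # price of running one server for an hour
    max_hourly_budget: float     # spending ceiling for the whole server pool
    max_servers: int             # hard cap on pool size


def decide(current_servers: int, predicted_response_s: float,
           policy: ScalingPolicy) -> str:
    """Return 'hold', 'scale_up', or 'escalate' for one sense/respond cycle."""
    if predicted_response_s <= policy.target_response_s:
        return "hold"  # SLA is predicted to be met, so spend nothing extra

    new_count = current_servers + 1
    new_cost = new_count * policy.cost_per_server_hour
    if new_count <= policy.max_servers and new_cost <= policy.max_hourly_budget:
        return "scale_up"  # within policy: the system may act on its own

    return "escalate"  # outside policy: hand the decision to operations


policy = ScalingPolicy(target_response_s=2.0, cost_per_server_hour=0.5,
                       max_hourly_budget=6.0, max_servers=12)
print(decide(8, 1.4, policy))   # hold
print(decide(8, 3.1, policy))   # scale_up (9 servers would cost 4.5 per hour)
print(decide(12, 3.1, policy))  # escalate (13 servers exceeds max_servers)
```

The point of the sketch is the third branch: when more capacity would break the cost policy, the system stops adding servers and passes a distilled decision to a human.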


2 Responses to “Cloud Operations – 1. The Frontal Lobe”

  1. Cloud Operations 2 – The Parietal Lobe « Jacob Ukelson's Blog Says:

    « Cloud Operations – 1. The Frontal Lobe

  2. Steve Says:

    I like your blog entry.
    Gathering metrics from cloud deployments and triggering actions to make changes to those deployments is something that is built right into the heart of RightScale’s Cloud Management Platform. It may not be quite as refined as “behavioral analysis of application and predictive analytics”, but the feature is there and proven.

