The up.time IT Systems Management Blog

Posts Tagged ‘cloud’

2010 – The Year of Cloud Experimentation – Part 1 of 2

Monday, November 30th, 2009

At uptime software, we’ve been quite bullish on Cloud’s potential but feel it still has some distance to cover before it lives up to the hype. In fact, I wrote a blog post in January looking at a hypothetical company and the costs involved in moving an entire infrastructure into the Cloud (using Amazon EC2). The results were not impressive: Cloud computing was too expensive (in this example) to gain the critical mass it needs to catch on. It’s amazing how much has changed in the ten months since that post, as we have learned more about how the Cloud can be best utilized. Recently, the media has driven the Cloud excitement and IT managers are now thinking about how the Cloud, in one form or another, can be used in their environments to drive performance and efficiencies.

The real question is this: in what capacity will organizations adopt Cloud over the next few years? With that in mind, we see the coming year as one of exploration and experimentation. The first step is for companies to quantify what Cloud means to their business. Is it as banal as remote storage used for DR purposes, or something as evolved as dynamic compute with secure private/public networking?

Let’s take a look at the “IT Spectrum,” which is loosely aligned with IT maturity and size of organization.

In this diagram, the left represents most small businesses, which house their own servers and have a small number of IT staff. As a small business matures, it may evaluate SaaS-type applications (like Salesforce.com) or push some servers out to an MSP. Businesses that mature or grow further may have additional servers in remote hosted datacenters, such as web servers or remote disaster recovery storage. At the right-most point in the spectrum, businesses and enterprises have opted to completely outsource their IT and minimize the number of IT staff they employ.

Understanding the spectrum’s components is important. They represent a “menu” of options that businesses can use to leverage virtualization and cloud technologies to reduce costs (either labor or infrastructure). This “menu” is most likely how IT managers will choose to evaluate the relevance of Cloud to cost savings and enhanced service delivery. For example, with VMware’s new VBlock offering and the ongoing relationship with Terremark, entire stacks of infrastructure can be pushed into off-premises locations and operated in a mission-critical environment. So, whether it’s just dipping a toe into the Cloud waters (like hosting a server in Amazon EC2 or the Rackspace Cloud to deliver a decoupled application) or leveraging the VBlock to move entire mission-critical infrastructures, there are many options to consider. Keep in mind that issues such as backup management, lifecycle management, and systems management need to be addressed in all cases.

How is the experimentation starting?

[ more next week in Part 2 ]

Microsoft finally draws their line in the clouds

Monday, November 23rd, 2009

As many of you are likely aware, last week Ray Ozzie announced that Azure (Microsoft’s cloud service) would go into full production on January 1st, 2010. Azure is interesting because Microsoft wants to keep the paradigm of desktop OSes as a key part of the architecture with “the cloud” as an adjunct in what they call the “three screens and a cloud” vision. This vision is important, because it makes the cloud real for consumers and makes it more understandable and accessible to the general populace. Project “Dallas” also re-affirms Microsoft’s commitment to cloud computing as a whole; Microsoft unveiled just enough details to make the project interesting, i.e. data-as-a-service.

For all the “evil empire” slag that Microsoft gets, people tend to forget, or ignore, what happens when Microsoft embraces a technology and tries to dominate that market – the technology just gets easier to adopt and becomes more real.

This is an important milestone in the development of the entire “cloud story”. Let’s be clear – Microsoft, due to their size and market position, does not have the need to innovate or invent new paradigms. All they have to do, and what they are good at, is step into nascent markets that are at the edge of becoming mature enough to explode. This is generally a moment of truth for any incumbents, as Microsoft can and does take advantage of their massive resources in an all-out war for dominance. Once they ‘put their toes in the water’, they slowly wage a war of attrition on the incumbents, and buy all the best players and minds, until eventually their technology is pervasive. We have seen this strategy in effect to great success over the years. Remember the browser wars, Database (SQL?), ERP, CRM, Content Management (SharePoint), Audio Devices (Zune), Console Gaming (Xbox), and the list goes on.

So what’s the moral of the story? When Microsoft wades into the game, it’s a very strong sign that it’s time to get with the program and adopt this emerging paradigm.

451 Group – Cloud Codex

Friday, November 20th, 2009

The 451 Group just recently posted their CloudScape summary, or Cloud Codex. It can be obtained from here (http://www.451group.com/cloudscape/cloudscape_report_detail.php?icid=869). I consider this report to be a very thorough summary of the current Cloud landscape and the various issues that surround Cloud architectures and deployments. I’d like to summarize a few salient points from the report:

The report defines various kinds of Cloud computing (closed private cloud, community private cloud, hosted private cloud, enterprise public cloud, and commodity public cloud) and then goes on to define the four pillars that support these clouds: management, automation, security, and storage.  These pillars then sit on top of the underlying hardware (network gear, x86 servers, mass storage, and virtualization software).

Of particular interest to us is cloud management and automation, specifically cloud monitoring and analytics on the one hand, and provisioning and orchestration on the other. Application performance in the cloud is going to become an issue and you’ll need management tooling that can quickly drill down into the application stack, virtualization layer, and physical infrastructure to identify performance issues. Analytics then becomes important for understanding correlations across cloud infrastructure (and possibly private infrastructure as well).

On the automation side, as applications become increasingly elastic, cloud management tooling is going to have to understand dynamic changes in infrastructure and be able to adjust the number of elements being monitored in real-time.  The tooling will also have to be able to trigger orchestration events in the cloud to react to certain kinds of load (or outage) scenarios.
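
The two requirements above can be sketched in a few lines. This is only an illustration under stated assumptions, not how up.time or any particular product works: the function names, the host names, and the CPU thresholds are all hypothetical. The first function works out which elements to start and stop monitoring when the live inventory changes; the second picks an orchestration event based on load.

```python
from __future__ import annotations


def reconcile_monitored(monitored: set[str], discovered: set[str]) -> tuple[set[str], set[str]]:
    """Return (to_add, to_remove) so the monitored set matches the live inventory."""
    to_add = discovered - monitored      # new instances that appeared in the cloud
    to_remove = monitored - discovered   # instances that were torn down
    return to_add, to_remove


def scaling_action(avg_cpu_percent: float,
                   scale_out_at: float = 80.0,
                   scale_in_at: float = 20.0) -> str:
    """Pick a (hypothetical) orchestration event from an average CPU reading."""
    if avg_cpu_percent >= scale_out_at:
        return "scale_out"
    if avg_cpu_percent <= scale_in_at:
        return "scale_in"
    return "hold"


if __name__ == "__main__":
    # Hypothetical inventory: web-1 was terminated, web-3 was just provisioned.
    add, remove = reconcile_monitored({"web-1", "web-2"}, {"web-2", "web-3"})
    print("start monitoring:", sorted(add))
    print("stop monitoring:", sorted(remove))
    print("action at 91% CPU:", scaling_action(91.0))
```

In practice the discovered set would come from the cloud provider's inventory and the thresholds would be tuned per application; both are placeholders here.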

As cloud evolves, so will we. These are exciting times.

Alex

P.S. Here’s my plug, of course, check out our cloud monitoring information. Lots more to come.

The Cloud goes beyond Virtualization

Thursday, November 12th, 2009

There is an article over at The Cloud Option discussing how virtualization is not Cloud. It is summed up very well in this statement:

“Cloud/IaaS goes beyond virtualization by providing extra services for dynamically allocating infrastructure resources to match the peaks and valleys of application demand.”

I think that when people discuss the public/private cloud, this is an often understated point.  Simply virtualizing your existing infrastructure with your favourite hypervisor does not mean you have implemented a private cloud within your datacenter.  Cloud is about enablement, not virtualization.  As ‘The Cloud Option’ says, virtualization is a valuable first step, but it is not Cloud.

From my perspective, Cloud is all about the ability to deploy and manage business services without the involvement of an infrastructure team.  If you develop application X for the Cloud, given the right permissions, you should be able to provision the application into production without ever involving someone from the IT department responsible for providing the Cloud resource.

Once provisioned, you should be able to manage, maintain and scale application X without ever involving IT. Virtualization alone is never going to give you this. Cloud is about tools, and given the infrastructure requirements to deliver today’s applications and services, it’s about simple tools performing complex tasks behind the scenes. I go back to the article at ‘The Cloud Option’ and, as they suggest, at a minimum the Cloud provider (internal or external) must bring: Self Service, Resource Metering & Accountability, Image Management and Network Policy Enforcement.

up.time provides great visibility into the physical and virtual assets that are part of your Cloud strategy, as well as into traditionally deployed applications, by providing deep Cloud monitoring and Cloud management. In conjunction with our vOrchestrator integration, up.time can also provide resource automation for the scaling and provisioning of applications into the Cloud.

I think Terremark is heading down the right path with their Cloud offering, providing a complete solution to their customers with self management from the application to the virtual network and its security features. I also think that as enterprises look to push their applications and data onto the Cloud, network capabilities are going to become the real differentiator between Cloud offerings. We are at the point where virtualization at the server level is a known and pretty commoditized good. However, at the network layer there are all kinds of opportunities to provide value as part of the overall Cloud offering. From basic firewalling and load balancing to application-aware layer 7 switching and deep packet manipulation, these are all capabilities that will allow Cloud providers like Terremark to differentiate themselves from one another.

Just how disruptive is Cloud technology?

Monday, November 9th, 2009

Let’s understand for a moment just how disruptive Cloud and virtualization technologies are to OTHER technologies. Ignore, for a moment, all the changes required to business processes, maintenance processes, infrastructure deployment models and all the other stuff people have been beating to death over the past 2 months.

Just how pervasive and challenging is Cloud technology to entrenched technology? Well, for one, people are redesigning and re-thinking how we use TCP/IP in order to enable Long Distance VMotion. That’s right, in order to be able to forklift virtual instances and massive data over the internet, companies like NetEx have figured out how to make TCP/IP, the old building block of the interwebs, even better, dubbing their new UDP over IP translation technology “HyperIP”. HyperIP optimizes TCP/IP so that you can move a full VMware instance over the wire up to 10X faster than usual. (Let’s not even talk about how people will monitor this new disruptive technology, but you can bet it’s only the agile players who are even aware of the new challenges in this space.)

The potential for this technology is 100% clear, and it is probably somewhere in a lab being coveted by the people at VMware as “my precious”, especially in the context of their desire to get remote DRS as a solidified feature in the vSphere platform. If VMware manages to get this integrated as part of remote DRS and they start forklifting instances to, from, and across the Savvis and Terremark clouds, this will be a giant leap towards making unified compute and private/public clouds “as real as it gets”. This doesn’t even take into account the latest ‘turnkey’ private cloud solutions unveiled by VMware, known as VBlocks.

The clouds just zapped TCP/IP, what’s next?

Cloud Computing - The Clouds Are Brewing, Are You Ready for the Storm?

Tuesday, October 27th, 2009

I recently watched some “unknown guy,” you know, that “unknown techie” person Larry Ellison, rant about the cloud for at least 5 minutes. I found it interesting for a couple of reasons:

1) He isn’t wrong that the cloud, in essence, is based on traditional hardware infrastructure placed essentially into the net, and that a lot of people are abusing the terminology for commercial means.

2) He has a huge interest in Netsuite, which is a SaaS-based cloud CRM provider, and Oracle. Both organizations are doing a lot in the background around Cloud. Don’t believe me? Visit the Netsuite website or http://www.oracle.com/us/technologies/cloud/index.htm.

Companies and luminaries in leadership positions will always say one thing in particular during periods of challenging competition or changing market landscapes. When you dig a bit deeper, these companies usually try to deny the newcomers as long as they can while hedging their bets in the background to protect their leadership position against the ever dangerous “game changer”.  This gives them time to position themselves as a prime player when the time comes.

Cloud computing is coming hard and fast. It’s a game changer.

Although the underlying technology components are the same, the ability to connect them over a public carrier network has increased its potential effect exponentially. The obvious truth is that the current catalyst for Cloud, and for the resurgence of centralized compute from a technology perspective, is the decrease in cost for network bandwidth. I recently downloaded an 8 Gigabyte file in less than an hour over a home-based broadband connection (21 Mbps). This is unbelievable when put in the context of connectivity not that long ago (ok, maybe I’m old) – remember 28.8K baud modems?
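
As a quick sanity check on that download, here is a back-of-the-envelope calculation. It rests on a few simplifying assumptions that are mine and not measurements: a decimal gigabyte (10^9 bytes), the link running at its full rated speed with no protocol overhead, and the old modem treated as 28.8 kilobits per second.

```python
def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Idealized transfer time: payload bits divided by line rate."""
    return size_bytes * 8 / link_bits_per_second


GB = 10**9  # decimal gigabyte (assumption)

broadband_s = transfer_seconds(8 * GB, 21e6)   # 21 Mbps home broadband
modem_s = transfer_seconds(8 * GB, 28_800)     # 28.8 kbit/s dial-up modem

print(f"21 Mbps broadband: {broadband_s / 60:.0f} minutes")
print(f"28.8K modem:       {modem_s / 86_400:.1f} days")
```

Under those assumptions the broadband transfer takes about 51 minutes, consistent with "less than an hour", while the same file over the modem would take roughly 25.7 days.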

It’s no wonder that Cloud-based services, like Microsoft Live Mesh, SugarSync, and Salesforce CRM, are able to provide ever richer and broader services “over the wire”.

Edge bandwidth to wireless devices right now is reaching upwards of 5 to 8 Mbps in 3G HSDPA areas, making the new generation of netbooks, smartphones, and hybrid smartphone/notebook technologies prime candidates to join the Cloud computing and social networking phenomenon. If Larry is ranting now, wait ’till billions of smartphones join the Cloud. I call dibs on the term “SWARM COMPUTING” for the surge of all these consumer grade devices to the Cloud. So, when Larry Ellison wants to rant about it, he can call my mobile.

In other words, the clouds are brewing. Make sure you grab an umbrella; there’s going to be a storm.

P.S. – If you haven’t seen Larry’s tirade against “The Cloud”, click here and enjoy the fireworks.

Large Scale Cloud Computing Adoption

Monday, October 19th, 2009

There is a very well written article over at ulitzer.com regarding the US Federal Government’s IT spend plan for FY11 and their investigation into leveraging cloud computing as a cost-cutting measure for federal IT spend. It breaks the analysis down into 3 options: Public, Hybrid and Private cloud. In their analysis, the public cloud comes out at a BCR (Benefit/Cost Ratio) of 15.4, with the hybrid and private cloud coming out at 6.8 and 5.7 respectively. I found these results rather surprising considering the scope of what their analysis entails.

We aren’t talking about migrating a few workloads to the cloud, but thousands and thousands of servers’ worth of federal workloads. When defining the public cloud versus the hybrid/private solution and the assumptions, they state that the public cloud option is a migration of ‘low-sensitivity’ data onto existing public clouds. Based on the ever increasing compliance requirements and demand for data privacy and integrity, I would think that the low-sensitivity workloads would not comprise the lion’s share of the workloads being examined, thereby tilting the balance toward the hybrid and/or private cloud offering.

When migrating to the cloud, today’s organizations have many terabytes, or petabytes in the case of the US Federal Government with its thousands of workloads, of data that have to be migrated in order to move the complete workload to the cloud. Moving and synchronizing petabytes of storage while maintaining service continuity through the migration is a non-trivial task.

While the analysis within the article is sound, I think that there are significant hurdles still in place from a large scale public cloud adoption standpoint that are not taken into consideration to the extent that they deserve. Everyone wants the public cloud computing model to be successful; after all, the benefits stand to be great. I think that the public cloud, from a security and connectivity standpoint, is not quite there yet for large scale initiatives. I think that the real successes will come from the creation and adoption of private clouds, with a slow, informed migration of workloads to the public cloud as we iron out all of the security, networking and compliance requirements.

Maybe it would make sense to have the public cloud providers offer their own hybrid approach where you deploy your own private cloud and they manage it for you.  You get to leverage the benefits of their processes and technologies developed for managing the public cloud, with the benefits that come with a private cloud.

Comment on The Wrong Cloud

Tuesday, April 28th, 2009

Maya Design recently published an article, accompanied by a 4 page whitepaper, on cloud computing, arguing that what is being worked on today is in fact the wrong approach to cloud computing. I found the article and whitepaper echoing a lot of my own sentiment about the current state of cloud computing. From my perspective, the internet is the only real example of “true” cloud computing. Salesforce.com, Google, and others, while referred to as cloud services, are not cloud computing but SaaS, which I see as mutually exclusive. To me, cloud is the ability to run arbitrary workloads on ‘the’ cloud, with absolute interoperability. The internet, being the communications cloud, is based on a standard communication mechanism (IP), allowing anyone who speaks IP to communicate over it.

This is not the case with cloud computing.  There are several offerings, APIs, VM target types, OSes, lions, tigers and bears, oh my!  As an industry, we’ve essentially rebranded what we’re already doing and called it cloud computing to make it sexy.  Larry Ellison put it perfectly.  Private cloud is not all that different than using ESX the way we have for years now, or by using grid technologies and application design networking to distribute workload across all of our infrastructure.  Don’t get me wrong, I think that there is a great opportunity for the concept of cloud computing, I just think that we’re taking the wrong approach to the fundamentals of cloud computing.

Turning the tables on up.time MDC (multi-datacenter)

Friday, April 24th, 2009

A brief introduction: I’m Dave Leith. I work in the Technical Solutions dept. at uptime software and have been involved heavily in the Client Services and Sales sides of uptime for the past few years.

As Alex has mentioned on a few occasions in his blog, up.time MDC (multi-datacenter) functionality has helped a number of our users achieve enterprise-wide visibility into their global IT Services monitoring and performance health by combining many distinct up.time installations, called Local Datacenters (LDC), into one global Enterprise Management Server (EMS). This mapping of one EMS to many LDCs is the typical approach to utilizing our loosely-coupled MDC architecture.

This week I worked with a client who has turned the tables on our MDC functionality. Instead of providing a single EMS to their users, they decided to set up several EMSs all hooked into a collection of LDCs, a many-to-many relationship as seen below. Why did they do this? In this client’s case, as with many global organizations, each of their datacenters serves a set of applications to their globally dispersed lines of business (LOB). By providing one EMS to each line of business, the administrators for that LOB can build a dedicated dashboard for their end users showing the global availability of their own IT Services without having to worry about impacting the other LOBs. Each LOB is completely isolated from the others so that they can work freely while still having access to all of the same raw data from the LDCs that the other LOBs do. From a capacity planner’s point of view, this allows capacity reporting against a sandbox of historical performance data that only includes their own servers’ and applications’ performance data, making trending and forward planning much more straightforward.
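
To make the many-to-many relationship concrete, here is a small sketch of it as a plain data structure. This is not up.time configuration or an up.time API; the EMS and LDC names are made up purely for illustration. Each line of business gets its own EMS, several EMSs draw on the same LDCs, and inverting the mapping shows which EMSs read from each LDC.

```python
from collections import defaultdict

# Hypothetical layout: one EMS per line of business, each hooked into several LDCs.
EMS_TO_LDCS = {
    "ems-finance": ["ldc-toronto", "ldc-london"],
    "ems-retail": ["ldc-toronto", "ldc-london", "ldc-singapore"],
}


def ldc_to_ems(ems_to_ldcs: dict) -> dict:
    """Invert the mapping to list which EMSs read from each LDC."""
    inverse = defaultdict(list)
    for ems, ldcs in ems_to_ldcs.items():
        for ldc in ldcs:
            inverse[ldc].append(ems)
    return dict(inverse)


if __name__ == "__main__":
    for ldc, ems_list in ldc_to_ems(EMS_TO_LDCS).items():
        print(f"{ldc} feeds {', '.join(ems_list)}")
```

The typical one-to-many setup is just the special case where the outer dictionary has a single EMS key.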

There are a number of other interesting applications for this many to many MDC relationship which I’ll talk about in future posts.

Cost of cloud computing, expensive!

Wednesday, January 28th, 2009

With a large number of initiatives around cloud computing, I was interested in determining whether moving something like a lab environment into an outsourced environment would be cost effective at current prices. Now, I realize that current ‘cloud’ offerings are really geared to dealing with temporary spikes in compute load rather than moving an entire infrastructure out of a corporate data center; however, mirroring a lab environment is perhaps a plausible use of the cloud.

This demonstration was simply to determine the monthly cost of hosting a lab environment in Amazon’s EC2 and then comparing it to the fully loaded cost of having a lab environment in house. 

The service that I ran the experiment on was Amazon’s EC2 and their storage service (S3) for persistent data management. EC2 allows you to provision various types of x86 servers of differing compute capabilities and you are billed by instance hour of time. There is no restriction on how compute intensive your instance is. Their cost matrix for Linux instances and S3 storage can be viewed here and the Windows pricing is here. The Windows pricing also includes options for SQL Server (and authentication services).

The experiments I ran were for five systems of various configurations running our application (up.time). This included Linux running MySQL, Linux running Oracle, Windows running SQL Server and other combinations. The databases were stored on Amazon’s EBS (Elastic Block Store) storage for persistence reasons. The applications were run for two weeks under simulated load for monitoring 1,000 systems to get an idea of network and storage bandwidth.

After two weeks, the compute costs, I/O costs, and persistent storage costs were tallied and then scaled to mirror the monthly cost of a sample lab environment.

Amazon EC2 Costs for 300 lab instances. There are 744 hours in a typical month (24*31).

Instance Type           Num   Cost/Instance Hour   Compute Cost/Month
Windows                 100   $0.125               $9,300.00
Windows + SQL Server     50   $1.100               $40,920.00
Linux                   150   $0.100               $11,160.00
Windows (SQL/xlarge)      2   $2.400               $3,571.20
Total Compute Cost/Month                           $64,951.20

Storage                 Amount            Rate                  Storage Cost/Month
Storage                 5.6 TB (usable)   $0.10 per GB/month    $573.44
I/O                     30B               $0.10 per 1MM I/Os    $300.00

Network                 Amount            Rate                  Network Cost/Month
I/O                     20 GB             $0.10 per GB/month    $2.00

Total EC2 Cost/Month    $65,826.64
Total EC2 Cost/Year     $789,919.68
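
The arithmetic behind the EC2 figures can be reproduced with a short script. It is a sketch that uses only the numbers in the table; the variable names are mine. Two assumptions are worth calling out: a terabyte is taken as 1,024 GB (which is what reproduces the $573.44 storage line), and the I/O charge is entered as the stated $300.00 line item rather than recomputed from the I/O count.

```python
HOURS_PER_MONTH = 24 * 31  # 744 hours, as in the table

# (instance type, number of instances, cost per instance hour in USD)
INSTANCES = [
    ("Windows", 100, 0.125),
    ("Windows + SQL Server", 50, 1.100),
    ("Linux", 150, 0.100),
    ("Windows (SQL/xlarge)", 2, 2.400),
]


def monthly_compute_cost(count: int, rate_per_hour: float,
                         hours: int = HOURS_PER_MONTH) -> float:
    """Always-on cost of `count` instances for one month."""
    return count * rate_per_hour * hours


compute_total = sum(monthly_compute_cost(n, rate) for _, n, rate in INSTANCES)
storage = 5.6 * 1024 * 0.10   # 5.6 TB usable at $0.10 per GB/month (1 TB = 1,024 GB assumed)
io_charge = 300.00            # taken as stated in the table, not recomputed
network = 20 * 0.10           # 20 GB at $0.10 per GB

monthly_total = compute_total + storage + io_charge + network
annual_total = monthly_total * 12

print(f"Compute/month: ${compute_total:,.2f}")
print(f"EC2/month:     ${monthly_total:,.2f}")
print(f"EC2/year:      ${annual_total:,.2f}")
```

The annual total it prints matches the $789,919.68 yearly figure in the table.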

Now, if I calculate actual lab costs that mirror this environment, here’s what we get (I’ve deliberately excluded our non-x86 platforms such as POWER and SPARC). I’ve included the retail costs for Microsoft SQL Server and Oracle even though as an ISV we wouldn’t pay nearly as much. The EC2 cost for Windows systems is considerably higher than Linux, and this is because of the software licensing costs blended into the instance hour calculation.

In the cases of leasing hardware, the number is more or less a constant cost as new gear is purchased and older gear is bought out. For software costs, they’ve been amortized over three years.

Gear                              Number      Rate              Cost Per Month
Dell 1950                         28
Dell 2950                         2
HP DL585                          2
10TB iSCSI                        1                             $10,000
Dell/HP/Equallogic Support                                      $300
HVAC/Power                                                      $1,000
Floor Space                       500 sq ft   $24/sq ft/year    $1,000
VMware ESX                        9                             $1,250
Annual Support (VMware)                                         $1,250
Internet                                                        $1,200
Network Infrastructure                                          $556
Total Infrastructure Cost/Month                                 $16,556

Software Cost
SQL Server 2008                                                 $2,083
Oracle 10g/11g                                                  $2,083

Labour Cost/Month                                               $4,166
Total In-House Cost/Month                                       $24,888.89
Total In-House Annual Cost                                      $298,666.67

I’m torn about including labour, as instance management overhead is the same in both scenarios; however, the actual network and compute infrastructure, when in-house, does require some amount of headcount. In this case, I’ve added 0.5 of a resource (fully loaded cost).

So, the difference between an EC2 lab environment and an in-house environment is ($789,919.68 – $298,666.67) = $491,253.01. This is quite a substantial difference for an always-on environment.

I am curious as to how many enterprises have truly dynamic workloads that could take advantage of a cloud (either internal or external) to truly derive the cost benefits of cloud computing.

Certainly, at first blush, a straight migration of servers is a costly proposition.
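
To put the gap in perspective, the sketch below takes the two annual totals from above and asks a rough "what if" question: what fraction of the month could the EC2 instances run before they cost as much as the in-house lab? This leans on a strong simplifying assumption of mine, namely that every EC2 charge scales linearly with running hours (storage, for one, does not) while the in-house cost is fixed, so treat the result as an order-of-magnitude guide only.

```python
EC2_ANNUAL = 789_919.68        # total EC2 cost per year, from the post
IN_HOUSE_ANNUAL = 298_666.67   # total in-house cost per year, from the post
HOURS_PER_MONTH = 24 * 31      # 744

difference = EC2_ANNUAL - IN_HOUSE_ANNUAL
cost_ratio = EC2_ANNUAL / IN_HOUSE_ANNUAL

# Assumption: EC2 cost is proportional to hours used; in-house cost is fixed.
break_even_fraction = IN_HOUSE_ANNUAL / EC2_ANNUAL
break_even_hours = break_even_fraction * HOURS_PER_MONTH

print(f"Annual difference: ${difference:,.2f}")
print(f"EC2 costs {cost_ratio:.1f}x the in-house lab when always on")
print(f"Break-even at about {break_even_fraction:.0%} utilization "
      f"(~{break_even_hours:.0f} hours per month)")
```

Under that assumption the break-even point is roughly 38% utilization, which is one way to read the question about how many workloads are dynamic enough to benefit.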
—————————–
Quick update: We have just launched uptimeCloud – the simple way to manage cost in the cloud. This new SaaS product will provide real-time, dynamic cloud cost monitoring, cloud cost forecasting, and cloud capacity management. For more, visit http://www.uptimecloud.com