This is my last post in a series of pontifications on subjects from my presentations this year on technologies changing print. The last area that I would like to dive a bit deeper into is Cloud Computing. Recently I presented at the Marketing Innovations Summit put on by QuantumDigital in Austin. When I brought up the term, the audience asked the logical question: What does that mean? What a great place to start!
The conference attendees took a tour of the QuantumDigital production facilities. I was impressed when they talked about how none of their production IT equipment is on premises. The equipment is co-located at a Tier 1 hosting facility nearby. They connect their buildings to the datacenter with Gigabit Ethernet fiber. That is one definition of cloud computing. While there are many interpretations and definitions for Cloud Computing, let’s take a look at what Wikipedia currently says on the subject:
“Cloud computing is a paradigm of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet. Users need not have knowledge of, expertise in, or control over the technology infrastructure in the "cloud" that supports them.”
Generally, I think that is a pretty good explanation. But why “Cloud”? The term comes from the notation used over the years in network or schematic diagrams, where a cloud symbol labeled “Internet” often appears. The idea is that things (data) go into and out of the public interconnections, and we neither know nor care what happens there. As long as it is reliable, we are happy. The key idea is that we do not need to be overly concerned with those details during system design. That is the chief advantage of Cloud Computing.
The other advantage is cost. While a number of people would look at monthly costs and say, “I can spend less by doing it myself,” I would challenge that idea. Outsourcing IT compute power to the clouds can lower costs in two ways. First, by co-locating and outsourcing server management entirely, organizations can eliminate many costs associated with internal resources. These can be the obvious ones, like capital and lease costs. They can also be human resources, both internal and external. At Trekk we have outsourced many functions and ended up with a lower total cost of ownership (TCO).
The second way is that you can purchase just the computer resources you need when you need them. You do not need to scale your resources for the rainy day needs. Many of the cloud systems can be easily scaled via virtual server technology. You can add entire servers rapidly when demand increases and remove servers when the demand has passed. Not only servers, but virtual engines are available that can expand and contract and run across various datacenters worldwide. Examples of this include Google AppEngine, Amazon’s Elastic Compute Cloud and Salesforce’s Force.com.
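To make the "just what you need, when you need it" point concrete, here is a minimal sketch comparing server-hours when you size for the peak against server-hours when you scale hour by hour. The demand figures, the per-server capacity and the function names are all made up for illustration; real pricing and scaling behavior vary by vendor.

```python
import math


def servers_needed(demand, capacity_per_server):
    """Smallest whole number of servers that covers the demand (at least one)."""
    return max(1, math.ceil(demand / capacity_per_server))


def server_hours(hourly_demand, capacity_per_server):
    """Return (fixed, elastic) server-hours for a list of hourly demand figures."""
    per_hour = [servers_needed(d, capacity_per_server) for d in hourly_demand]
    fixed = max(per_hour) * len(per_hour)  # sized for the peak, running all day
    elastic = sum(per_hour)                # servers added and removed each hour
    return fixed, elastic


# Hypothetical day: quiet for 22 hours, then a two-hour spike.
demand = [100] * 22 + [900] * 2
fixed, elastic = server_hours(demand, capacity_per_server=100)
print(fixed, elastic)  # 216 40
```

In this invented scenario the peak-sized setup runs more than five times the server-hours of the elastic one. The gap shrinks as demand gets flatter, so the savings depend heavily on how spiky your workload is.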
These vendors excel at the metrics used by datacenter professionals. As defined by The Green Grid, Power Usage Effectiveness (PUE) is the ratio of the total power used by the data center to the power delivered to IT equipment, so lower is better. An average data center has a PUE of about 2.5, while world-class centers are more like 1.3. Server CPU utilization measures how much of the total CPU cycles over a given period of time is actually used for desired work. Often corporate servers are at 10 to 20% utilization, whereas well managed virtual environments can deliver closer to 80 or 90%.
As you can see, well managed environments can be 4 to 5 times more cost effective than poorly managed ones. Even if your internal systems are only half as efficient as the vendors', this alone is a tremendous reason to outsource.
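The two metrics above are simple ratios, and a rough sketch shows how the quoted figures compare. The kilowatt values and function names are assumptions chosen to reproduce the PUE numbers in the text; they are not measurements from any real facility.

```python
def pue(total_facility_kw, it_equipment_kw):
    """Power Usage Effectiveness: total facility power over IT equipment power."""
    return total_facility_kw / it_equipment_kw


def overhead_fraction(pue_value):
    """Share of total power that never reaches the IT equipment."""
    return 1 - 1 / pue_value


def utilization_gain(low, high):
    """How many times more useful work a server does at the higher utilization."""
    return high / low


# Hypothetical facilities, each delivering 100 kW to IT equipment.
average = pue(total_facility_kw=250, it_equipment_kw=100)      # 2.5
world_class = pue(total_facility_kw=130, it_equipment_kw=100)  # 1.3

print(f"overhead at PUE {average}: {overhead_fraction(average):.0%}")
print(f"overhead at PUE {world_class}: {overhead_fraction(world_class):.0%}")
print(f"utilization gain, 20% to 80%: {utilization_gain(0.20, 0.80):.1f}x")
print(f"utilization gain, 20% to 90%: {utilization_gain(0.20, 0.90):.1f}x")
```

Going from 20% utilization to 80 or 90% gives a factor of 4 to 4.5 on its own, before any power savings are counted.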
One of the keys to Google's success is its parallel programming model called MapReduce. This allows programmers to build software that can take advantage of parallelism without regard to the details. With MapReduce, a typical search on Google may invoke the resources of as many as 1000 machines in parallel. While Google's MapReduce library is proprietary, the algorithm has been discussed publicly and an open source implementation called Hadoop is available.
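The shape of the MapReduce model can be shown with a toy word count. This is a single-process illustration only, not Google's library or the Hadoop API, and every function name here is invented for the example. In a real system the map and reduce calls would be spread across many machines and the framework would handle the grouping step.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would do between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each map_phase call is independent, which is what makes it parallelizable.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["the cloud", "The print cloud", "print"])
print(counts)  # {'the': 2, 'cloud': 2, 'print': 2}
```

The programmer writes only the map and reduce functions. Because neither depends on where or in what order it runs, the same code could in principle be farmed out to one machine or a thousand.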
Recent estimates put the number of servers hosted by Google at around 1,000,000! In the last year, Google has been more open about its technology, in part to help push forward its innovations in green initiatives. Google and other large-scale vendors have found they can provide much of their very large data center resources to us mere mortals. For example, Google delivers a database solution called BigTable as part of AppEngine. Other cloud vendor database offerings include Amazon SimpleDB, Salesforce.com's Force Database and Microsoft’s SQL Azure Database. These allow developers to use large-scale, high-performance database services in their applications instead of a local or networked internal database server.
It sounds intimidating, but there are simple places to start. Trekk has utilized cloud vendors for storage of web assets. Web applications often reference image, video and other resource files. These are not actually part of an HTML file, but are referred to by filename or URL. These files do not need to be on the same server. Often document servers, image servers and streaming video servers are part of a solution. One such cloud offering is Amazon Simple Storage Service. Amazon S3 allows us to store files ranging from 1 byte to 5 gigabytes. They can be managed via REST or SOAP programming interfaces. This service is especially useful for storing very large files. Downloads to end user browsers use Amazon’s bandwidth and not ours. We just pay for usage and do not need to plan our infrastructure for high-demand events.
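As a rough sketch of what "referred to by URL" looks like in practice, the snippet below builds an image tag whose source points at a storage bucket rather than the web server. The bucket name and file path are hypothetical, and the example assumes the object is publicly readable at the standard virtual-hosted S3 address; uploading and permissions are left out.

```python
from html import escape
from urllib.parse import quote


def asset_url(bucket, key):
    """Build a virtual-hosted-style S3 URL for a publicly readable object."""
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"


def image_tag(bucket, key, alt):
    """Return an HTML img tag that loads the image from the bucket."""
    return f'<img src="{asset_url(bucket, key)}" alt="{escape(alt)}">'


# "example-web-assets" is a made-up bucket name used only for illustration.
print(image_tag("example-web-assets", "images/press room.jpg", "Press room"))
```

The page itself still comes from your own server; only the heavy files are fetched from the storage service, which is where the bandwidth savings come from.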
Why is there all of this cloud hype now? Is that why you are making a face? Haven’t SaaS, virtual servers and co-location offerings been around for years now? Yes. But the maturity of products, the number of competitors and lower prices have crossed a tipping point. Large online services have been at the forefront of very large data centers, distributed processing and virtual systems. Pricing alone will compel most businesses to look at migrating part, if not all, of their server infrastructure. Trekk has been moving this way for a number of years. I can see a day, probably sooner than we think, when most IT infrastructure will consist of workstations, laptops and mobile devices connected via broadband. Where they connect to is elsewhere - somewhere in the clouds.
Posted by JA Stewart at 09/29/2009 02:15:05 PM