Harnessing the Cloud

Media storage is the final area where you can make performance enhancements. As you dig into file types, you'll discover that PHP files are different from HTML files, which in turn are different from binary files such as images. Any file can be compressed or decompressed, but each type carries its own overhead. In particular, images are usually already compressed, so they gain far less "boost" from further compression than text files like HTML do. It therefore makes sense that certain types of files can be optimized in certain ways, or rather, that servers and storage facilities can be tuned to serve particular content types, such as images, video, or plain HTML.
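The difference is easy to demonstrate with a quick sketch: gzip shrinks repetitive text dramatically, while data that carries no redundancy barely budges (random bytes stand in here for an already-compressed format such as JPEG):

```python
import gzip
import os

# Repetitive markup, standing in for an HTML or PHP source file.
text = b"<p>Hello, world!</p>\n" * 500

# Random bytes approximate an already-compressed file such as a JPEG,
# whose data has little redundancy left for gzip to exploit.
binary = os.urandom(len(text))

text_ratio = len(gzip.compress(text)) / len(text)
binary_ratio = len(gzip.compress(binary)) / len(binary)

print(f"text:   compresses to {text_ratio:.0%} of original size")
print(f"binary: compresses to {binary_ratio:.0%} of original size")
```

On a typical run the text shrinks to a few percent of its original size, while the "binary" data stays at or above its original size once gzip's own overhead is added.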

One option to consider is cloud-based computing. I'm not a fan of using the cloud for everything. There is a movement to push everything, including applications, into "the cloud," but latency and a lack of redundancy often introduce problems of their own.

Nonetheless, the cloud can be very effective for content types that don't need the computing power and immediacy that WordPress does. For this, a solution such as Amazon S3 (Simple Storage Service) cloud storage provides ample image storage. Installing the Amazon S3 for WordPress plugin (available at http://wordpress.org/extend/plugins/tantan-s3/) effectively replaces local media storage with S3 media storage.

Another option is to store content at the system level. You could create a "virtual directory" out of your standard WordPress wp-content/uploads/ directory that maps to the Amazon S3 service. There will be latency, and it won't be speedy, but then you are serving images, not executing and compiling PHP. There's some wiggle room here.
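As one illustrative way to create such a mapping, a FUSE tool like s3fs-fuse can present an S3 bucket as a local directory. The bucket name and paths below are placeholders, not values from this book:

```shell
# Hypothetical example: mount an S3 bucket ("my-media-bucket") over the
# WordPress uploads directory using s3fs-fuse. Credentials are read from
# ~/.passwd-s3fs (format: ACCESS_KEY:SECRET_KEY, chmod 600).
s3fs my-media-bucket /var/www/wp-content/uploads \
    -o passwd_file="$HOME/.passwd-s3fs" \
    -o use_cache=/tmp/s3fs-cache   # a local cache softens the latency hit
```

Every read and write then goes through to S3 transparently, which is exactly where the latency mentioned above comes from.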

Cloud Computing Does Not Spell the End for Common Sense IT Management

In 2008, during an Internet-wide debacle stemming from a prolonged Amazon S3 outage, I wrote this extremely snarky post on my blog. The post was the result of my experience in enterprise IT and as the Director of Technology for a WordPress-powered blog network. I share it here to emphasize the importance of redundancy and to make the point that no single one of these solutions, by itself, will save you. Reliance on any third party for any mission-critical functionality is a recipe for disaster.

Sometimes I think I might be the only one who retains common sense. Really. At least in the area of IT Management. Though we had our share of growing pains at b5media, the knowledge gained from working in an enterprise environment at Northrop Grumman was only accentuated by my tenure as the Director of Technology at b5media.

Unfortunately, some common best-use practices in developing infrastructure are often put aside by those with shiny object syndrome surrounding "cloud computing."

Let me explain.

You may have noticed a severe hampering of many Internet services over the weekend. The culprit was a rare yet heavy-duty outage of Amazon S3 (Simple Storage Service) cloud storage. S3 is used by many companies, including Twitter, WordPress.com, FriendFeed, and SmugMug, to name a few. Even more individuals use S3 for online data backup or for small projects requiring always-on virtual disk space. Startups often use S3 due to the always-on storage, de facto CDN, and the inexpensive nature of the service ... it really is cheap!

And that's good. I'm a fan of using the cheapest, most reliable service for anything. Whatever gets you to the next level quickest and with as little outlay of dollars as possible is good in my book, for the same reason I'm a fan of prototyping ideas in Ruby on Rails (but to be clear, after the prototype, build on something more reliable and capable of handling multi-threaded processes, kthxbai).

However, sound IT management practice says that there should never be a single point of failure. Ever. Take a step back and map out the infrastructure. If you see any place where there's only one of those connecting lines between major resource A and major resource B, start looking there for bottlenecks and potential company-sinking aggravation.

Thus was the case for many companies using S3. Depending on how S3 was used, and whether the companies had failover to other caches, some were affected more than others. Twitter, for instance, uses S3 for avatar storage but had no other "cold cache" for that data, rendering a service without user images. Bad, but not deadly.

SmugMug shrugged the whole thing off (which is a far cry from the disastrous admission that "hot cache" was used very little when Amazon went down back in February), which I thought was a bit odd. Their entire company revolves around photos hosted on Amazon S3, and they simply shrugged off an 8-hour outage as okay "because everyone goes down once in a while." Yeah, and occasionally people get mugged on dark city streets, but as long as it's not me, it's okay! Maybe it was the fact that the outage occurred on a Sunday. Who knows? To me, this sort of outage rates as a 9.5/10 on the critical scale. Their entire business is wrapped up in S3 storage with no failover. For perspective, one 8-hour outage in July constitutes 98.9 percent uptime for the month, a far cry from the five nines (99.999 percent) that is the minimal mitigation of risk in enterprise mission-critical services.
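The uptime figures in the post are easy to verify; here is the arithmetic, using the month length and outage duration given above:

```python
# Uptime arithmetic behind the "98.9 percent" figure: an 8-hour outage
# in a 31-day month (744 hours), compared against the five-nines target.
hours_in_july = 31 * 24          # 744 hours
outage_hours = 8

uptime = (hours_in_july - outage_hours) / hours_in_july
print(f"uptime for the month: {uptime:.1%}")   # 98.9%

# Five nines (99.999%) allows only about 27 seconds of downtime
# over the same month.
allowed_seconds = hours_in_july * (1 - 0.99999) * 3600
print(f"five-nines downtime budget: {allowed_seconds:.0f} seconds")
```

In other words, that single Sunday outage burned roughly a thousand times the downtime budget a five-nines service would allow.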

WordPress.com, as always, comes through as a shining example of a company that economically benefits from using S3 as a cold cache rather than as primary storage or "warm cache."

Let me stop and provide some definitions. Warm (or hot) cache is always preferable to cold cache: it is data that has been loaded into memory or another quickly accessible location, typically memory. Cold cache is file-based storage of cached data. It is accessed less frequently, because access occurs only when the warm cache data has expired or doesn't exist.
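The warm/cold relationship boils down to a two-tier lookup. This is an illustrative toy, not any company's actual implementation: the first dict stands in for memory (warm), the second for file or S3 storage (cold):

```python
warm_cache = {}                        # fast tier: memory stand-in
cold_cache = {"avatar:42": b"jpeg"}    # slow tier: disk/S3 stand-in

def get(key):
    # 1. Check the warm cache first: the cheapest possible hit.
    if key in warm_cache:
        return warm_cache[key]
    # 2. Fall back to the cold cache only on a warm miss.
    if key in cold_cache:
        value = cold_cache[key]
        warm_cache[key] = value        # promote, so the next hit is warm
        return value
    # 3. Full miss: the data doesn't exist anywhere.
    return None

print(get("avatar:42"))   # first call hits the cold tier and promotes
print(get("avatar:42"))   # second call is served entirely from memory
```

The point of the post follows directly from this shape: if the cold tier is a third party like S3 and there is no other copy, step 2 is a single point of failure.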

WordPress.com has multiple levels of caching because they are smart and understand the basic premise of eliminating single points of failure. Image data is primarily accessed over their server cluster via a CDN; however, S3 is used as a cold cache. When S3 collapsed over the weekend, WordPress.com, from my checking, remained unaffected.

This is the basic principle of IT enterprise computing that is lost on so much of the "web world." If companies have built and scaled (particularly if they have scaled!) and rely on S3 with no failover, shame on them. Does it give Amazon a black eye? Absolutely. However, at the end of the day, SmugMug, WordPress.com, FriendFeed, Twitter, and all the other companies utilizing S3 answer to their customers and do not have the luxury of pointing the finger at Amazon. If their business is negatively affected, they have no one to blame but themselves. The companies who understood this planned accordingly and were not negatively affected by the S3 outage. Those who didn't were left, well, holding the bag.

A third option for high-capacity sites is a third-party content delivery network (CDN). CDNs are cloud-based storage networks most often used to serve high-bandwidth content such as video. They tend to be very reliable, but it's still recommended that you keep a cold cache locally to fall back on in the case of a failure or outage. Think of a CDN as a dedicated private Internet: global, extremely fast, and free of the bandwidth problems you might experience on the regular Internet everyone else uses.

Many CDN companies exist; the most common include Limelight, Akamai, and Level 3 Communications. With a plugin like MyCDN (http://wordpress.org/extend/plugins/my-cdn/), you can "seed" the CDN once with all of your existing images, stylesheets, and so on, and then point WordPress in the right direction to retrieve the data now hosted in the cloud. The plugin handles redirecting all internal links so images won't break.
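Conceptually, that link redirection boils down to swapping the origin host for the CDN host in asset URLs while leaving page URLs alone. A minimal sketch with hypothetical hostnames (the actual plugin works inside WordPress's PHP filter hooks and is more involved):

```python
# Hypothetical hosts: asset URLs move to the seeded CDN; page URLs stay
# on the origin, since PHP still has to execute there.
ORIGIN_ASSETS = "http://example.com/wp-content/"
CDN_ASSETS = "http://cdn.example.com/wp-content/"

def rewrite_asset_urls(html: str) -> str:
    """Point every wp-content asset reference at the CDN host."""
    return html.replace(ORIGIN_ASSETS, CDN_ASSETS)

post = ('<a href="http://example.com/about/">'
        '<img src="http://example.com/wp-content/uploads/logo.png"></a>')
print(rewrite_asset_urls(post))
```

The image now loads from the CDN, while the `/about/` link still targets the origin server where WordPress runs.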
