Java and Amazon EC2
For the first test, I’m building an application that will run on Amazon EC2. I’m using Eclipse with Amazon’s own AWS Toolkit for Eclipse. This is a plugin for Eclipse that lets you create an EC2 instance right from within Eclipse, which I did. However, when weighing the impact on my programming work, I’m not going to factor that in, since allocating instances is in the realm of IT as opposed to coding. In a large shop, it’s usually the production team that deploys the apps, not the programmers. Instead, the programmers would be working on a local installation.

The AWS Toolkit for Eclipse includes a sample application called Travel Log. Buried inside this application are functions like this:

public static InputStream loadOriginalPhoto(Photo photo) throws IOException {
    S3StorageManager mgr = new S3StorageManager();
    TravelLogStorageObject obj = new TravelLogStorageObject();
    obj.setBucketName(uniqueBucketName);
    obj.setStoragePath(photo.getId() + FULLSIZE_SUFFIX);
    return mgr.loadInputStream(obj);
}

This is a function for loading an image from Amazon’s Simple Storage Service, or S3 for short. Notice the S3StorageManager object. That’s actually part of the sample code, not part of the AWS API. But the code for that class, in turn, does call AWS-specific code that lives in various packages under com.amazonaws.services.s3. In other words, the example has AWS-specific code in it. That’s not a problem, of course, but it does tell me that the example is locking us into AWS. In turn, that forces us to decide:
- Do we use AWS-specific code and take advantage of various AWS features but get locked into AWS?
- Or do we avoid lock-in, but give up the ability to take advantage of those AWS features? (One common hedge, sketched below, is to confine the AWS-specific calls behind an interface of our own.)
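Here’s a minimal sketch of that hedge. The interface name (PhotoStorage) and the suffix value are hypothetical, not part of the AWS SDK; S3StorageManager, TravelLogStorageObject, and Photo are the Travel Log sample’s classes and are assumed to be on the classpath.

import java.io.IOException;
import java.io.InputStream;

// An interface owned by our application; callers never see AWS types.
public interface PhotoStorage {
    InputStream loadOriginalPhoto(Photo photo) throws IOException;
}

// All the AWS-specific code lives in this one implementation. Moving to
// another vendor later means writing a second implementation of
// PhotoStorage, not rewriting every caller.
class S3PhotoStorage implements PhotoStorage {
    private static final String FULLSIZE_SUFFIX = "-fullsize"; // hypothetical value
    private final String bucketName;

    S3PhotoStorage(String bucketName) {
        this.bucketName = bucketName;
    }

    @Override
    public InputStream loadOriginalPhoto(Photo photo) throws IOException {
        // Same body as the sample's function, now confined to one class.
        S3StorageManager mgr = new S3StorageManager();
        TravelLogStorageObject obj = new TravelLogStorageObject();
        obj.setBucketName(bucketName);
        obj.setStoragePath(photo.getId() + FULLSIZE_SUFFIX);
        return mgr.loadInputStream(obj);
    }
}

This isn’t free (you still have to design the interface around the features you actually use), but it turns the lock-in decision into a seam rather than a rewrite.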
A Brief Pause for Some Debriefing
As you can see, we’re barely into these tests and already there’s an issue. But let’s put it into perspective for a moment. Several companies out there provide vendor-agnostic tools for managing your clouds. One of their claims to fame is that you can use their management console to launch instances on different platforms (Amazon, Rackspace, and others) without having to make adjustments for the particular platform. Some of these companies even have their own images, which are actually made up of a list of vendor-specific images. You launch one image and pick whichever vendor you want, and their system works behind the scenes to pick the real image specific to the vendor. That’s a big sell to the business managers, because it gives them a warm squishy feeling that they’re avoiding vendor lock-in. They even read articles (some written by yours truly) that explain how the IT staff can use a standardized API shared across vendors to create their scripts.

However, in that situation we’re not talking software and Web applications: we’re talking management scripts for deploying instances. And that’s a huge difference. But there’s hope: new cloud standards also include developer-oriented APIs. For example, OpenStack (which was created in part by Rackspace) includes APIs for things like uploading objects (such as images) into containers. Even so, that’s of little help to us for our current project on Amazon.

Back to our scenario: the VP has been sold on this whole notion that everything is portable, while we programmers were left out of the decision. We use the tools available to us (including the AWS SDK), and suddenly find ourselves in a bind.
Quick, Back to Our Code Before It Gets Rewritten
Clearly Amazon’s API forces us into a vendor-lock-in situation. But if we recognize that fact going in, and if we’re okay with it (which you might not be), then the APIs are available. Let’s review some of the code in order to get a better sense of its complexity. I’m warning you here that my goal isn’t to teach you how to use the API; instead, I want to show you what it entails, so you can decide whether it makes your life easier or not. In essence, it comes down to this: do you like using third-party classes that fully encapsulate a RESTful API? Amazon has built their API using a REST interface (although some people have criticized it for abusing the REST verbs, arguing that it doesn’t technically qualify as “RESTful”). Instead of having you call into the API yourself, Amazon has built a rather large and cumbersome set of classes around these calls. Using the classes isn’t very difficult, but their presence does add some overhead.

What’s interesting, though, is that the sample included in the toolkit, called Travel Log, features a wrapper of classes that sits atop the existing Java API. I mentioned these wrapper calls earlier in the article. But if you want to use the Java API itself, I encourage you to look at the samples that come with the SDK, separate from the AWS Toolkit for Eclipse. These demonstrate the basic set of Java classes provided for us by Amazon. For example, there’s a sample for uploading a file to an S3 account that uses only the API classes. When you cut out all the comments and exception handling, and get to just the code that actually does the work, you see this:

AmazonS3 s3 = new AmazonS3Client(new PropertiesCredentials(
    S3Sample.class.getResourceAsStream("AwsCredentials.properties")));
String bucketName = "bucketname";
String key = "MyObjectKey";
s3.createBucket(bucketName);
s3.putObject(new PutObjectRequest(bucketName, key, createSampleFile()));

The first statement reads your credentials from a properties file and creates an AWS client object. The final two lines do the work of creating a bucket to put the object in, and then uploading the object itself. That’s really not so bad, which makes me wonder why the sample app has such a huge number of classes written on top of these, when these aren’t awful to call.

Ultimately, all these calls work in sync to piece together a REST call to the Amazon servers. You can see info on the REST call for the createBucket function here. Inside those docs, you can see a simple PUT verb taking place, along with credentials. Posted along with the call is an XML file containing information about the bucket. That XML is rather important here, because inside it is an element called CreateBucketConfiguration.

That brings me to the notion of standard APIs. Some people consider Amazon to have essentially built the standard for clouds. The question, then, is whether any other vendor even supports an API call for creating a container of sorts by including XML that has a tag called CreateBucketConfiguration. If so, then it’s possible your code will port to those other cloud vendors. If not, then you—and your code—are stuck with Amazon, regardless of what the executives want. In that case, hope you’re already boiling up some coffee for that code-rewriting marathon.

As it happens, all might not be lost. There are organizations trying to embrace standards—including some standards based on Amazon’s model. Look at this page for Google’s cloud storage. If you search through the page, you’ll see that they support the same XML and call. In fact, Google is working to implement the same API. Now I’m going to be completely honest here: I haven’t tried porting an Amazon Web Services app to Google’s cloud.
Frankly, I have no idea if it would port, but I’m skeptical (and would love to hear if any of you have had any such luck). Rackspace has its own API, for example, which it’s tried to unleash as a standard called OpenStack. And Eucalyptus claims its API features good compatibility with Amazon’s API (there’s even one little item called CreateBucketConfiguration that appears in some of its sample code). While a handful of companies are trying to standardize on Amazon’s API, there’s a big question of whether they’ll succeed: there are many cloud vendors out there, and many competing standards. In other words, the chances of switching to a fully compatible cloud don’t seem that great. While you might not be completely locked into Amazon, you are locked into a small subset of cloud vendors.
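To make the CreateBucketConfiguration point concrete, here’s roughly the raw request that s3.createBucket() assembles for you. Treat it as an illustrative sketch only: a real request must also carry Amazon’s signed Authorization header (omitted here), so running it as-is will simply earn you an error response. The bucket name and the EU location constraint are placeholders.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class RawCreateBucketSketch {
    public static void main(String[] args) throws Exception {
        // The XML document posted along with the call; note the
        // CreateBucketConfiguration element discussed above.
        String xml =
            "<CreateBucketConfiguration xmlns=\"http://s3.amazonaws.com/doc/2006-03-01/\">"
          + "<LocationConstraint>EU</LocationConstraint>"
          + "</CreateBucketConfiguration>";

        URL url = new URL("http://bucketname.s3.amazonaws.com/");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("PUT");   // the simple PUT verb from the docs
        conn.setDoOutput(true);
        // A real call adds the signed Authorization header at this point.
        OutputStream out = conn.getOutputStream();
        out.write(xml.getBytes("UTF-8"));
        out.close();
        System.out.println("HTTP response: " + conn.getResponseCode());
        conn.disconnect();
    }
}

Every AWS SDK, whatever the language, is ultimately a wrapper that builds and signs requests like this one.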
Score So Far
We haven’t even gotten to Microsoft’s Azure yet, but Amazon’s API hasn’t proven too difficult to use. On top of that, there’s a chance you might not end up fully locked into Amazon, thanks to Google.
Taming the Microsoft Beast
Now that we’ve dug down into Amazon’s API and determined some of the implications, let’s head over to Microsoft. When you sign up with Azure, you get an interface that has similar functionality to Amazon’s platform. You can allocate new instances of servers, and you can even choose from some Linux varieties to run on these servers.

If you look at the Azure API docs for different languages, you’ll see many mentions of messaging (say that five times fast), whereby your different server instances can communicate with each other. The general idea is that you might share computing resources among different servers so that you can accomplish, for example, massive parallel computations that would otherwise be difficult (if not impossible) to do on a network in your own organization. The messages are implemented in what Microsoft calls a Queue service. Here’s a quick tutorial if you’re interested in trying it out. Also, so we don’t go comparing apples to oranges, I’ll point out that Amazon also has a Queue service (you can read about it here). They also have a similar service called Simple Notification Service, which you can read about here. Note that Amazon’s messaging services all use the same RESTful interface, as I already described.

But Microsoft does offer a storage API, one that—like Amazon’s—is also RESTful at heart. The company calls it Blob storage. Here’s the page that describes the REST API for adding an item to storage on Azure. A sample call looks like this:

http://myaccount.blob.core.windows.net/mycontainer/myblob

This is, of course, different from Amazon’s API. From a programming perspective, however, you probably won’t be making the REST calls directly. Like Amazon, Microsoft offers a whole set of tools for writing your cloud code. These tools cover several languages, including Java. The concept is similar to Amazon’s in that you can create containers and put files (called blobs) into the containers. When you include the provided Java libraries, you can easily make calls just as you can with Amazon.
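To see what the messaging side looks like from Java, here’s a minimal sketch of the Queue service pattern I described above. Treat it as an illustration: the connection string is a placeholder, and the package and method names assume the same era of the Azure SDK for Java as the blob snippet below (they’ve shifted between SDK versions).

import com.microsoft.windowsazure.services.core.storage.CloudStorageAccount;
import com.microsoft.windowsazure.services.queue.client.CloudQueue;
import com.microsoft.windowsazure.services.queue.client.CloudQueueClient;
import com.microsoft.windowsazure.services.queue.client.CloudQueueMessage;

public class QueueSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder credentials; substitute your real account name and key.
        CloudStorageAccount account = CloudStorageAccount.parse(
            "DefaultEndpointsProtocol=http;AccountName=myaccount;AccountKey=mykey");

        CloudQueueClient queueClient = account.createCloudQueueClient();
        CloudQueue queue = queueClient.getQueueReference("myqueue");
        queue.createIfNotExist();  // same request-then-create pattern as blobs

        // One server instance posts a message; another can retrieve it.
        queue.addMessage(new CloudQueueMessage("work item for instance B"));
    }
}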
Here, for example, are two lines of code for creating a container, which I copied directly from the docs found here:

CloudBlobContainer container = blobClient.getContainerReference("mycontainer");
container.createIfNotExist();

This is a bit of an odd approach: first you request a reference to the container, and then you call a special createIfNotExist function, which creates the container only if it doesn’t already exist. That means the first line returns a valid object even if the container doesn’t exist. The docs also include a couple of lines for uploading a blob, which is simply a matter of calling a function to request a reference to a new blob, before calling an upload function, as in the sketch below.
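For completeness, here’s a sketch of that upload step, with the same caveats: the names come from the same-era SDK, the file path is a placeholder, and container is the CloudBlobContainer obtained in the snippet above.

import java.io.File;
import java.io.FileInputStream;
import com.microsoft.windowsazure.services.blob.client.CloudBlockBlob;

// Request a reference to a new blob, then upload the file's contents into it.
File source = new File("photo.jpg");
CloudBlockBlob blob = container.getBlockBlobReference("myblob");
blob.upload(new FileInputStream(source), source.length());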