Cloud Connected

The Beauty of Cloud CMS Chaining

Chaining is a common technique that has been widely adopted by modern JavaScript libraries to link method calls together.

The goal of chaining is to produce elegant, concise code that is easy to understand and maintain. For example, if you are a jQuery developer, you probably write code like this on a daily basis:

$('#mydiv').empty().html('Hello World!').css('font-size','10px');

However, most popular JavaScript libraries only support “static” chaining, e.g. DOM object manipulation. If the methods to be chained make Ajax calls, you have to resort to callbacks, which require a strict and verbose syntax.

Again, let us take a look at a jQuery example that makes two serial Ajax calls using nested callbacks.

$.ajax({
  url: 'service/endpoint1',
  success: function(data1) {
    // we have successfully made our first ajax call
    // and we are ready for our second ajax call
    $.ajax({
      url: 'service/endpoint2',
      success: function(data2) {
        // we have successfully chained two ajax calls
      },
      error: function(error2) {
        // handle error from the second ajax call
      }
    });
  },
  error: function(error1) {
    // handle error from the first ajax call
  }
});

The callback approach used in the above example looks clean and does exactly what we expect. However, the challenge is that when we need to chain a large number of Ajax calls, the code grows significantly. With many levels of nesting, it becomes very unpleasant to read and maintain, and you will really miss the simplicity that chaining brings.

Before we introduce Cloud CMS chaining, let us look at another common use case: making parallel Ajax calls. Firing off concurrent Ajax calls is easy; the challenge is knowing when all of the parallel calls have finished so that the code that processes the returned results can run.

A simple and effective approach is to maintain a counter that tracks the number of finished Ajax calls.

function processResults(result1, result2) {
  // process the returns from the two parallel ajax calls
}

var count = 0, result1, result2;

$.ajax({
  url: 'service/endpoint1',
  success: function(data1) {
    result1 = data1;
    count++;
    if (count == 2) {
      processResults(result1, result2);
    }
  },
  error: function(error1) {
    // handle error from the first ajax call
  }
});

$.ajax({
  url: 'service/endpoint2',
  success: function(data2) {
    result2 = data2;
    count++;
    if (count == 2) {
      processResults(result1, result2);
    }
  },
  error: function(error2) {
    // handle error from the second ajax call
  }
});

In the above example, we make sure both Ajax calls have returned before executing the processResults function. Just as with serial calls, you would definitely prefer chaining over using callbacks or explicitly managing the state of the Ajax calls.
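For illustration, the counter technique can also be wrapped in a small reusable helper. This is a hypothetical sketch; the join name and shape are ours, not part of jQuery or Cloud CMS:

// join(total, done) returns a factory of success handlers; "done" fires once
// all "total" handlers have been called, receiving the collected results.
function join(total, done) {
   var count = 0;
   var results = [];
   return function(index) {
      return function(data) {
         results[index] = data;
         count++;
         if (count === total) {
            done(results);
         }
      };
   };
}

var collect = join(2, function(results) {
   // both ajax calls have returned
   processResults(results[0], results[1]);
});

$.ajax({ url: 'service/endpoint1', success: collect(0) });
$.ajax({ url: 'service/endpoint2', success: collect(1) });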

Now let us talk about the dynamic chaining that Cloud CMS introduces.

Cloud CMS is a content platform that helps you build cloud-connected applications. When you build a Cloud CMS application, your application interacts with the Cloud CMS platform through its REST APIs. To provide a pleasant programming experience for developers, it is critical for Cloud CMS to have a driver that times and coordinates multiple REST/Ajax calls in an easy manner. That is why Cloud CMS provides a JavaScript driver that comes with support for dynamic chaining.

The driver provides a Chain class that allows you to instantiate, extend or end a chain. A chain can carry an underlying proxied object which performs proxied Ajax calls to the REST services that Cloud CMS provides.

A chain can also be extended with a different proxied object if needed. The driver provides a set of “Chainable” classes that can be instantiated as the proxied objects for chaining. For example, the Repository class provides methods for operations on the repository and its sub-level objects, such as updating the repository or creating a new branch.

So when we make a call to

repository.readBranch('master');

it will make a proxied GET call to retrieve details of the master branch of the repository. 

The call will look like

http://localhost/proxy/repositories/aaa718b9bd29f76f443b/branches/master?metadata=true&full=true&cb=1338678288323

where aaa718b9bd29f76f443b is the repository id.

Please note that the Cloud CMS JavaScript driver doesn’t reinvent the way Ajax calls are managed.

Under the hood, it still uses the callback approach to time the Ajax calls and the counter approach to keep track of the parallel calls. It simply provides utilities and simple APIs that shield developers from those details.
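To make that concrete, here is a highly simplified illustration of how a chain can queue asynchronous steps and advance through them with callbacks. This is a sketch of the general technique only, not the actual driver code; the SimpleChain name and its "done" callback are ours:

// A minimal sketch of callback-driven chaining (not the Cloud CMS driver code).
function SimpleChain() {
   this.queue = [];
   this.running = false;
}

SimpleChain.prototype.then = function(fn) {
   var self = this;
   this.queue.push(fn);
   if (!this.running) {
      this.running = true;
      // defer so that every then() in the current tick is queued before we start
      setTimeout(function() { self.next(); }, 0);
   }
   // returning "this" is what makes the calls chainable
   return this;
};

SimpleChain.prototype.next = function() {
   var self = this;
   var fn = this.queue.shift();
   if (!fn) { this.running = false; return; }
   // each queued function receives a callback that advances the chain
   fn(function() { self.next(); });
};

// usage: two serial ajax calls without any nesting
new SimpleChain().then(function(done) {
   $.ajax({ url: 'service/endpoint1', success: function(data1) { done(); } });
}).then(function(done) {
   $.ajax({ url: 'service/endpoint2', success: function(data2) { done(); } });
});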

Let us start with a simple example that manages the life cycle of a simple chain which doesn’t make any Ajax calls.

 // Create a new chain
 Chain().then(function() {
    var data = 'Mary ';
    // Extend the chain
    this.subchain().then(function() {
       data += 'has a little ';
    }).then(function() {
       data += 'lamb.';
       // We should have "Mary has a little lamb." at the end of the chain.
    });
 });

Now let us take a step further and try to use a chain with proxied objects to create a new node under the master branch of a given repository.

new Gitana({
   "clientId": "SOMEID",
   "clientSecret": "SOMESECRET"
}).authenticate({
   "username": "SOMEUSERNAME",
   "password": "SOMEPASSWORD"
}).readRepository("SOMEREPOSITORYID").readBranch('master').createNode().then(function() {
   // we have successfully created a new node
});

In the above example, we first create a new client with the correct credentials. Once authenticated, the driver creates a new chain with a proxied Platform object. The Platform object is then used to read the repository with the given ID.

The chain is then extended and the underlying proxied object is switched to a proxied Repository object populated from the previous Ajax response. The chain keeps going by reading the master branch (switching the proxied object to a Branch) and then creating a new node under it (switching the proxied object to a Node).
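In other words, the proxied object switches at each step of the chain, roughly like this (the comments are added purely for illustration):

new Gitana({
   "clientId": "SOMEID",
   "clientSecret": "SOMESECRET"
}).authenticate({
   "username": "SOMEUSERNAME",
   "password": "SOMEPASSWORD"
})                                    // the chain carries a proxied Platform object
.readRepository("SOMEREPOSITORYID")   // switches to a proxied Repository object
.readBranch('master')                 // switches to a proxied Branch object
.createNode()                         // switches to a proxied Node object
.then(function() {
   // "this" is now the newly created node
});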

If we want to extend the example to create three new nodes in serial, it will look like this:

new Gitana({
   "clientId": "SOMEID",
   "clientSecret": "SOMESECRET"
}).authenticate({
   "username": "SOMEUSERNAME",
   "password": "SOMEPASSWORD"
}).readRepository("SOMEREPOSITORYID").readBranch('master').then(function() {
    this.createNode();
    this.createNode();
    this.createNode();
    this.then(function() {
     // we have successfully created three new nodes by making serial calls
    });
});

Now if we want to create the new nodes in parallel, we can do something like this:

new Gitana({
   "clientId": "SOMEID",
   "clientSecret": "SOMESECRET"
}).authenticate({
   "username": "SOMEUSERNAME",
   "password": "SOMEPASSWORD"
}).readRepository("SOMEREPOSITORYID").readBranch('master').then(function() {
   var f = function(){
      this.createNode();
   };
   this.then([f,f,f]).then(function() {
      // we have successfully created three new nodes by making parallel calls
   });	
});

As you can see from the above examples, the driver significantly simplifies the code with chaining. It makes the code easier to read and less error prone. It preserves the natural feel of synchronous chaining while handling the actual asynchronous Ajax calls under the hood.

Before we wrap up this blog, let us take a look at a more complex example that mixes serial and parallel Ajax calls. It also shows how to manually switch the underlying proxied object using the subchain method.

new Gitana({
   "clientId": "SOMEID",
   "clientSecret": "SOMESECRET"
}).authenticate({
   "username": "SOMEUSERNAME",
   "password": "SOMEPASSWORD"
}).readRepository("SOMEREPOSITORYID").readBranch('master').then(function() {
   var node1, node2, node3;
   var f1 = function() {
      this.createNode().then(function() {
         node1 = this;
      });
   };
   var f2 = function() {
      this.createNode().then(function() {
         node2 = this;
      });
   };
   var f3 = function() {
      this.createNode().then(function() {
         node3 = this;
      });
   };            
   this.then([f1,f2,f3]).then(function() {
      // we have successfully created three new nodes by making parallel calls
      // we now associate node2 to node1 ( node1 ==> node2)
      this.subchain(node1).associate(node2).then(function() {
         // we have successfully associated node2 with node1
      });
      // At this point, node2 has already been associated with node1
      // we then associate node2 to node3 ( node3 ==> node2)
      this.subchain(node2).associateOf(node3).then(function() {
         // we have successfully associated node2 with node3
         // The chain will end at this point.
      });
   });
});

For live chaining examples, please check out our online JavaScript Samples.

Introduction to Changeset Versioning

Cloud CMS provides you with content repositories that are powered by a “changeset” versioning model.  This is a powerful versioning model that you won’t find in most conventional CMS products.  It’s one of the reasons why Cloud CMS is such a great platform for collaboration!

Document-level Versioning

A lot of legacy CMS products feature document-level versioning.  With document-level versioning, when you make a change to a document, the system simply increments a version counter.  You end up with multiple versions of your document.

It might look something like the following:

We all have or had an awesome grandparent who knew how to cook something good. For a recipe stored in a Microsoft Word file, the document-versioning model works pretty well!

Problems with Document-level Versioning

That said, there are some major drawbacks.

  1. Desktop Documents Only.  Document-level versioning is really only good for desktop documents (like Microsoft Office files) where everything (all of your nested images, fonts, etc.) is contained within a single file.

    That’s why Dropbox uses file-level versioning.  It makes sense for people who work almost exclusively with desktop documents.
     
  2. No way to handle Sets of Changes.  If you’re working on mobile applications, web sites, or just about any non back-office projects, your content will be spread over multiple files.

    Think about a web site.  A web site might have hundreds or thousands of files - things like HTML, CSS, JS, image files and much more.  When you publish a web site, you really want to version the full set of files all at once so that you can push, pull and roll back updates to your web site.
     
  3. Bottlenecks.  If you’ve ever worked with Microsoft Sharepoint or any document-versioning CMS, then you’re aware of the bottlenecks that get introduced when two people want to work on something at the same time.  Either they both make changes (and you have to manually merge them together) or one person locks the file and the other person sits on their hands.

    Most products that feature document-level versioning do so simply because it’s easy to implement.  However, it leaves your business users with extremely limited tools for collaboration.  This makes collaboration frustrating, as it cuts off people’s initiative, creativity and productivity.
     
  4. No ability to scale.  Okay, so let’s suppose now that you want to scale your content ingestion and production capabilities out to the broader world.  You might want to pull in content from Twitter, Facebook or Quora in real-time.  And let a broad community collaborate together…

    Nah, forget it.  With document-level versioning, that’d be like giving everyone a phone and telling them to call each other.

    And then only giving them one phone line.

Changeset Versioning

Fortunately, this problem has been solved.  The solution comes out of the source control world and it is known as distributed “changeset versioning”.

If you’ve ever used Git, Mercurial or any modern source control software, then you’re already familiar with the concept.  It’s been around for a while and has become extremely popular since it enables folks to work unimpeded, fully distributed and without any of the headaches of file locking and so forth.

It should be noted that Cloud CMS is the only Content Management System to offer changeset versioning.  We’re it.  Why?  I suppose because it is hard to implement.

And maybe because everyone else is busy chasing the desktop document problem.  However, if you’ve ever tried to build a web or mobile app or tried consuming social content from Twitter, Facebook, LinkedIn, etc… well, then you know it’s all about JSON, XML, object relationships, lots of composite documents, highly concurrent writes and reads and so on!

Only your sales person will believe that a document-versioning system could be used for that purpose!

Changeset Versioning: The Basics

This article by no means intends to provide a Master’s thesis on how changeset versioning works.  However, let’s delve into the basics!

Let’s start with writing, editing and deleting content.  

When you write content into the Cloud CMS repository, your content gets stored on a “changeset”.  A changeset is a lot like a transparency (from the old transparency projector days).  This is a see-through sheet of plastic that you write on with one of those Sharpie pens.  The projector projects whatever you write up onto the screen.

The cool thing about transparencies is that you can layer them, one on top of the other.  What ends up getting projected is the composite of everything layered together.

So when you write content, the repository basically gets a new transparency and puts your content onto it.

If you make a change, it gets out another transparency, writes your change and layers it on top.

It also does this if you delete something.  It gets out a new transparency, masks (or covers up) your content so that it appears deleted.  

However, your content isn’t really deleted.  It is safe and tucked away somewhere in the stack of transparencies.  It’s just been hidden by the top-most transparency!

You can write as many things onto a changeset (transparency) as you want.  Cloud CMS manages the changesets for you, keeps them in a nice stack and lets you roll back changes if you make a mistake anywhere along the way.

Changeset Versioning: Branches and Merges

As noted, Cloud CMS manages your changesets for you.  The “stack” of changesets is known as a Branch.  As you add more changesets to the branch, the length of the branch gets longer (just like the stack of transparencies gets thicker).

A read operation simply pulls information out of the repository.  A write or a delete adds a new changeset.  Consider the branch shown below.  The read operation just peeks at the branch looking down from the top.  The write operation adds a new changeset.
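As a conceptual sketch (this is not how Cloud CMS is implemented internally), you can think of a read as peeking down the stack of transparencies until it hits the top-most layer that mentions the document, and of a write or delete as simply adding a new layer on top:

// A conceptual model of a branch as a stack of changesets (bottom-most first).
// Each changeset records what it wrote and what it masked (deleted).
var branch = [
   { id: 'v1', writes: { 'recipe.json': { title: 'Lamb stew' } }, deleted: [] },
   { id: 'v2', writes: { 'recipe.json': { title: "Grandma's lamb stew" } }, deleted: [] }
];

function read(branch, path) {
   // peek down from the top-most changeset; the first layer that mentions
   // the path wins, whether it wrote the document or masked it as deleted
   for (var i = branch.length - 1; i >= 0; i--) {
      if (branch[i].deleted.indexOf(path) !== -1) {
         return null; // hidden by a delete, but older versions remain below
      }
      if (branch[i].writes[path]) {
         return branch[i].writes[path];
      }
   }
   return null;
}

function write(branch, path, doc) {
   // a write never touches older layers; it just adds a new changeset on top
   var changeset = { id: 'v' + (branch.length + 1), writes: {}, deleted: [] };
   changeset.writes[path] = doc;
   branch.push(changeset);
}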

With just a single branch, you can still get into the situation where two people want to change the same file at the same time.  Cloud CMS lets you lock the object and all that kind of thing if you want.  Or, you can create new branches so that everyone can work together at the same time and on the same things.

It kind of looks like this:

Here we have two workspaces.  Each workspace has its own branch which was stemmed off of the Master Branch at changeset V5.  The first user works on Branch A and the second user works on Branch B.  Both Branch A and Branch B have a common ancestor (changeset V5 in the Master Branch).

This allows both users to do whatever they want without stepping on each other’s toes. They can update documents, delete things and create new content.  At any time, they can push and pull changes between their workspace and any other workspace.  This gives them a way to preview what other people are working on and merge their work into their own branches.  They can also merge back to the Master Branch.

Cloud CMS provides an elegant merge algorithm that walks the changeset history tree from the common ancestor on up.  It uses a JSON differencing algorithm to detect conflicts at the JSON property level (as opposed to the document level).  And it provides content model and scriptable policy validation for the merged result.
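To give a feel for what property-level (as opposed to document-level) conflict handling means, here is a rough sketch of a three-way JSON merge against a common ancestor.  The real Cloud CMS merge also walks the changeset history tree and applies content model and scriptable policy validation, which this sketch ignores:

// Merge two edited versions of a JSON document against their common ancestor,
// property by property; a conflict arises only when both sides changed the
// same property to different values.
function mergeProperties(ancestor, ours, theirs) {
   var merged = {};
   var conflicts = [];
   var keys = {};
   [ancestor, ours, theirs].forEach(function(doc) {
      Object.keys(doc).forEach(function(k) { keys[k] = true; });
   });
   Object.keys(keys).forEach(function(k) {
      var oursChanged = JSON.stringify(ours[k]) !== JSON.stringify(ancestor[k]);
      var theirsChanged = JSON.stringify(theirs[k]) !== JSON.stringify(ancestor[k]);
      if (oursChanged && theirsChanged &&
          JSON.stringify(ours[k]) !== JSON.stringify(theirs[k])) {
         conflicts.push(k);          // both sides changed it differently
      } else if (oursChanged) {
         merged[k] = ours[k];        // only our side changed it
      } else if (theirsChanged) {
         merged[k] = theirs[k];      // only their side changed it
      } else {
         merged[k] = ancestor[k];    // unchanged on both sides
      }
   });
   return { merged: merged, conflicts: conflicts };
}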

The result is a highly collaborative experience that encourages your users to experiment and take a shot at contributing without the worry of blocking others or screwing up the master content.

In a future blog, we’ll cover the details of how branching and merging works.  Our approach is one that did not seek to reinvent the wheel but rather ride on top of the wonderful innovation that has already occurred over the last decade within source control tools like Mercurial, Git and Bazaar.

OAuth2, Clients and Authentication Grants

One of the things that I really like about our approach to server authorization is that we’ve elected to get completely behind the OAuth2 specification.

Cloud CMS provides support for all of the OAuth2 flows.  We provide an authorization and resource server so that you can separate concerns and perform the full three-legged “auth code” flow.  Or you can simplify things and use something like a “password” or “implicit” flow depending on the security environment of your application.
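To give a feel for what the “password” flow looks like on the wire, here is a hedged sketch of the token request with jQuery; the token endpoint URL shown is an assumption rather than something taken from the Cloud CMS documentation, and the client id/secret travel via HTTP Basic auth as the OAuth2 spec prescribes:

// OAuth2 "password" grant: exchange a username/password for an access token.
// The endpoint URL below is assumed for illustration only.
$.ajax({
   url: 'https://api.cloudcms.com/oauth/token',
   type: 'POST',
   headers: {
      'Authorization': 'Basic ' + btoa('SOMECLIENTID:SOMECLIENTSECRET')
   },
   data: {
      grant_type: 'password',
      username: 'SOMEUSERNAME',
      password: 'SOMEPASSWORD'
   },
   success: function(token) {
      // token.access_token (and token.refresh_token) can now be used for API calls
   }
});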

For environments like HTML5/JS, we continue to recommend (rather strongly) that you employ a full untrusted “auth code” handshake.

When we started out, we briefly dabbled with OAuth 1.0 before realizing that it was tedious to use (much less implement).  Signatures needed to be computed and passed along with every request.  This meant that if you wanted to serve assets out of the repository directly, information would have to be either encoded within URLs or it would need to be available as a cookie.  And if you were going to store a cookie, then you had to implement some kind of server-side token registry with expiration and refresh in order to offer any kind of assurance of fidelity.  However, to do so meant that we’d be drifting from the spec and doing our own thing.

Fortunately, lots of other vendors were noticing the same shortcomings of OAuth 1.0 (not really shortcomings, per se, but definitely things that were not within the bounds of the specification to address).  OAuth2 represents a best effort by Facebook, Twitter, LinkedIn and a whole host of other vendors to address many of the issues we found we needed to deal with.

Cloud CMS - Clients

We decided early on that we wanted platform owners to be able to create as many OAuth2 “client” key/secret combinations as they wanted.  That way, they could provision client/keys on a per-application basis.  Or they could have a single client/key service a whole bunch of applications.

Furthermore, if for any reason a client/key combination were compromised (as in, some hacker out there figured out your client secret), you could monitor this, identify it and then shut down the client.  Create a new client key/secret, issue it and away you go.

Thus, we let you manage as many client key/secrets as you’d like.

In addition, we let you define on a per client key/secret basis what kinds of features or flows you’d like to enable.  You might restrict certain clients from participating in an untrusted client flow (like the “implicit” flow).  You can just toggle this stuff on and off.

Cloud CMS - Authentication Grants

Another feature that we wanted to implement is what we call “authentication grants”.  These are basically alternative “username/password” combinations that you can grant to a principal running on a specified client.  Sounds kind of complicated, right?

The idea is that you often have a mobile app that just wants to sign on to Cloud CMS as a user.  You set up the user ahead of time.  The user might be called “app”.  To sign on, you need to send username and password information over the wire.

Anytime you send password information over the wire, there is a risk.  We rely on HTTPS (SSL), so the risk in transport (i.e. over the network) is minimized.  OAuth2 requires SSL for this very reason.

However, there is still a risk in the application code itself.  What if someone could crack it open and see what password you set up?  Fortunately, this isn’t very easy to do with compiled application code like native iOS, Android or Appcelerator Titanium code.

But there is a really big problem if you’re using HTML5/JS running in the browser.  Basically, anything running in a browser is completely snoop-able.  You don’t have to be a really proficient “hacker” to open up the source code of the application and find the embedded passwords in the <script></script> blocks.

We provide a few tools to protect against this.

One is to create an Authentication Grant object which provisions alternative credentials (a key and secret) that can be used to authenticate as the “app” user against a specific client key/secret combination (and only that combination).  That way, if a snooper were to figure out the Authentication Grant secret, they’d still only be able to use it for a) the “app” user and b) for that specific client.
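In practice, the grant’s key and secret simply stand in for the user’s real credentials when authenticating.  A hedged sketch with the JavaScript driver might look like the following; the exact parameter names are assumed here, so consult the driver documentation:

new Gitana({
   "clientId": "SOMECLIENTID",
   "clientSecret": "SOMECLIENTSECRET"
}).authenticate({
   // the Authentication Grant key/secret are sent instead of the "app"
   // user's real username/password
   "username": "SOMEAUTHGRANTKEY",
   "password": "SOMEAUTHGRANTSECRET"
}).then(function() {
   // we are now authenticated as the "app" user without shipping its real password
});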

Thus, if you detect foul play, you can shut down the Authentication Grant.  Worst case, they gain wrongful access to that one user on that one application.  The upside is that they can’t gain greater access to your platform.  They’re constrained, you can detect it, and shut it down.

Another tool is to specify valid Domain URLs from which token-requesting authorization calls are allowed to arrive.  You can constrain those who wish to authenticate for a given Client or Authorization Grant so that they must arrive from a specified set of Domain URLs.  That way, if someone tries to use your Authentication Grant username/password for a completely different application, it won’t work.

This works but it’s also pretty easy to trick.  HTTP headers can be manipulated and things like that.  However, it’s a good safeguard and yet another way that you can detect foul play.

Finally, we offer auto-provisioning of Client keys and Authentication Grant keys running in an implicit “untrusted” capacity.  If you use any of the Gitana drivers, you can elect to have the appropriate client/auth keys sent to you when you connect.  These can then change once per application deployment or even once per connection.  This is really the most foolproof way to bolster security for HTML5/JS applications.

It should also be mentioned that in all cases we fully support OAuth2 refresh tokens and expiration of access tokens.  So all the while, the client code must re-assert the validity of its tokens, which offers all kinds of chances to detect tampering and shut it down.
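For completeness, a refresh request looks something like this; again, the token endpoint URL is an assumption and the drivers normally perform this step for you:

// OAuth2 refresh: exchange the refresh token for a fresh access token.
$.ajax({
   url: 'https://api.cloudcms.com/oauth/token',
   type: 'POST',
   headers: {
      'Authorization': 'Basic ' + btoa('SOMECLIENTID:SOMECLIENTSECRET')
   },
   data: {
      grant_type: 'refresh_token',
      refresh_token: 'SOMEREFRESHTOKEN'
   },
   success: function(token) {
      // a fresh access_token (and possibly a new refresh_token) is returned
   }
});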

Cloud CMS supports OAuth2 for everything in its REST API and provides convenience functions with all of its drivers so that connecting and working with Cloud CMS is completely transparent.  You normally don’t have to deal with any of the OAuth2 capabilities under the hood.  But, should you need to, we’ve really given you a good engine so that you can ensure the security of your mobile applications.