Evolving your JSON schema

Cloud CMS is provided in one of two ways - either as a public cloud service or as an on-premise installation (using Docker images). The former absolutely prohibits any database access. The only way you can work with content is through our APIs.

The latter approach still lets you accomplish everything via the APIs but you would have the additional option of querying (read-only) the underlying MongoDB database if you wanted to since it is all under your control.
It is worth noting that even with an on-premise installation, a support contract with Cloud CMS would prohibit you from making DB-level changes. All changes to content should come through our API. We cannot guarantee consistency of data modified outside of our API (such as if you made changes to MongoDB collections under the hood).
With this in mind, let's go through several realistic scenarios that use our API to evolve your schema over time:
1. A new required property is added to the JSON schema.
Suppose you have a content definition called "custom:car" with two properties, "model" (a string) and "year" (a number). 
{
  "_qname": "custom:car",
  "type": "object",
  "properties": {
    "model": {
      "type": "string"
    },
    "year": {
      "type": "number"
    }
  }
}
And then suppose you have 3 content instances of this type:
- { "model": "toyota-rav4", "year": 2015 }
- { "model": "bmw-m5", "year": 2013 }
- { "model": "subaru-forrester", "year": 2009 }
Now you go to your "custom:car" definition and you add a new property "rating" (a number). You make this property required.
{
  "_qname": "custom:car",
  "type": "object",
  "properties": {
    "model": {
      "type": "string"
    },
    "year": {
      "type": "number"
    },
    "rating": {
      "type": "number"
    }
  },
  "required": ["rating"]
}
And you save the definition. 
The definition saves successfully. It does so because it is valid JSON schema. Nothing about the definition, its parent type chain, mandatory features, constraints or other restrictions puts it in conflict with other definitions in the branch's dictionary.
That said, after the save, your three content instances of this type will be deemed invalid the next time anyone attempts to update them (save them again). Meanwhile, they're still there and still active. Nothing is deleted or adjusted automatically. The content instances simply wait for you to do something about them.
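To make the effect of the new "required" entry concrete, here is a minimal sketch in Python. It is not the Cloud CMS validator; the helper name missing_required is hypothetical and it only checks the "required" list of the definition against plain in-memory dictionaries.

```python
def missing_required(schema, instance):
    """Return the required property names that the instance does not carry."""
    return [name for name in schema.get("required", []) if name not in instance]


# The updated definition from above.
car_schema = {
    "_qname": "custom:car",
    "type": "object",
    "properties": {
        "model": {"type": "string"},
        "year": {"type": "number"},
        "rating": {"type": "number"},
    },
    "required": ["rating"],
}

# The three existing content instances, untouched since the definition changed.
cars = [
    {"model": "toyota-rav4", "year": 2015},
    {"model": "bmw-m5", "year": 2013},
    {"model": "subaru-forrester", "year": 2009},
]

# Every instance would fail the next time it is saved.
invalid_models = [car["model"] for car in cars if missing_required(car_schema, car)]
```

All three instances end up in invalid_models, which matches the situation described above: they remain readable, but none of them would pass validation on the next save.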
You could write a script at this point that finds all content instances of type "custom:car" and touches them. A touch operation performs a no-change update. As-is, each update would be attempted and its transaction would fail because the required "rating" field is missing on all of these instances. Instead, your script could adjust the JSON to include a rating and then perform the update. After three updates, you are ready to go.
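A rough sketch of such a script in Python follows. It works on in-memory dictionaries only: the save argument is a hypothetical callable standing in for whatever API update call you use, and the default rating of 0 is an assumption chosen purely for illustration.

```python
def backfill_rating(instances, default_rating, save):
    """Add a rating to every instance that lacks one, then save it.

    `save` is a hypothetical stand-in for an API update call.
    Returns the number of instances that were updated.
    """
    updated = 0
    for instance in instances:
        if "rating" not in instance:
            instance["rating"] = default_rating
            save(instance)  # one update (one change set) per instance
            updated += 1
    return updated


cars = [
    {"model": "toyota-rav4", "year": 2015},
    {"model": "bmw-m5", "year": 2013},
    {"model": "subaru-forrester", "year": 2009},
]

saved = []  # records what would have been sent to the API
count = backfill_rating(cars, 0, saved.append)
```

Running the function a second time updates nothing, since every instance already has a rating, so the script is safe to re-run.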
Note: With this approach exactly as described, you're actually committing four change sets. The first is the definition change. The second, third and fourth are the three content item updates. Thus, the operation is spread out across four different transactions.
You may wish to do all of this in one transaction (a single change set that has four nodes on it: the definition and three content instances). In this example it doesn't matter much, but it does matter if you need the Cloud CMS branch to stay consistent, i.e. nothing is half written or half updated; either the definition and all instances are updated or nothing is. Cloud CMS provides multi-document transactional commits for this purpose (something that MongoDB has not traditionally provided on its own).
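The all-or-nothing idea can be sketched in Python as follows. This is an in-memory illustration, not the Cloud CMS transaction API: commit_change_set and the node ids ("car-1") are hypothetical, and validation is reduced to the "required" check on the nodes that are part of the change set.

```python
import copy


def commit_change_set(branch, change_set, definition_id="custom:car"):
    """Apply a change set to a copy of the branch and keep it only if it validates.

    branch and change_set map a node id to its JSON document.
    Returns (branch_after, errors); on errors the original branch is returned as-is.
    """
    staged = copy.deepcopy(branch)
    staged.update(copy.deepcopy(change_set))
    required = staged[definition_id].get("required", [])
    # Only the instances being written are validated, against the staged definition.
    errors = [
        f"{node_id} is missing '{name}'"
        for node_id, doc in change_set.items()
        if node_id != definition_id
        for name in required
        if name not in doc
    ]
    if errors:
        return branch, errors  # nothing is written, not even the definition
    return staged, []


old_definition = {"_qname": "custom:car", "type": "object"}
new_definition = {**old_definition, "required": ["rating"]}
branch = {
    "custom:car": old_definition,
    "car-1": {"model": "toyota-rav4", "year": 2015},
}

# Definition plus an instance that still lacks a rating: the whole change set is rejected.
rejected, errors = commit_change_set(
    branch,
    {"custom:car": new_definition, "car-1": {"model": "toyota-rav4", "year": 2015}},
)

# Definition plus a corrected instance: both are written together.
accepted, no_errors = commit_change_set(
    branch,
    {"custom:car": new_definition, "car-1": {"model": "toyota-rav4", "year": 2015, "rating": 4}},
)
```

Staging the changes on a copy means a failed change set leaves the original branch exactly as it was, which is the consistency property described above.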
Note: If the added property were not "required", the second step of adding the "rating" property to the content instances might not be necessary. This is a lazy commit approach. If the property is optional, you don't have to deal with it until someone actually does decide to update one of the content instances. And then it is up to them, at that time, to determine whether they want that optional property.
2. A property is removed, and we wish to cleanse that property from existing content instances.
This is very similar to #1. Removing the property from the definition can still result in a definition that is deemed to be valid. This validity check is based on whether the dictionary compiles properly during the transaction. Removal of a property does not take into account whether there are existing instances and whether those instances would be valid.
Rather, those instances would remain and each would now have an excess property. Suppose you completed #1 and now decided to remove the "rating" property. The definition would save and now you'd have three content instances which still have the "rating" property. They're still available and work with the API. In fact, they're still technically valid since the "rating" property is meaningless outside of the JSON schema definition (which makes no claim to the rating property). The validator that Cloud CMS uses is configured to ignore extraneous properties and so those 3 content instances would continue to save just fine (unlike scenario #1 where they wouldn't because they would be deemed to be in an invalid state).
Thus, as in scenario #1, you'd now have to reconcile those three content instances. You could use a script or do it by hand but either way, the matter to be settled is what to do with the extra "rating" value that each content instance has. You could delete it or rename it to keep it around. It is up to you.
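Both options (delete the value, or keep it under another name) can be sketched in a few lines of Python. The helper retire_property and the replacement name "legacyRating" are hypothetical; the function works on a plain dictionary and leaves the save step to you.

```python
def retire_property(instance, name, rename_to=None):
    """Return a copy of the instance without `name`.

    If `rename_to` is given, the old value is kept under that new name instead
    of being discarded.
    """
    cleaned = dict(instance)
    if name in cleaned:
        value = cleaned.pop(name)
        if rename_to is not None:
            cleaned[rename_to] = value
    return cleaned


car = {"model": "bmw-m5", "year": 2013, "rating": 4}

deleted = retire_property(car, "rating")
renamed = retire_property(car, "rating", rename_to="legacyRating")
```

The function returns a copy rather than modifying its input, so the original JSON is still on hand if an update fails.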
Note: Within Cloud CMS, it is also possible to register a server-side script (JavaScript) as a behavior that automatically does all of this when the "beforeUpdateNode" policy is triggered. The script could look for instances and deal with the excess properties. However, you still have the question of what to do with the excess property.
3. The datatype on a schema property is changed (e.g. from a string to a specific content-type association) and existing values need to be changed accordingly.
This is also very similar to #1. You might change the "custom:car" definition so that its "year" property is a string. This will save because it is a valid definition. The three existing content instances will not be hindered and will continue to be served through the API without any impact. However, if you attempted to save one of those content instances, you would get an error because the "year" value on the content instances is a number whereas the definition requires it to be a string.
As before, you could use a script to reconcile this. In the end, this isn't much different from the scenarios above.
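For the number-to-string case, the conversion step of such a script might look like the Python sketch below. The helper year_to_string is hypothetical, and it assumes you want whole-number years written without a decimal point.

```python
def year_to_string(instance):
    """Return a copy of the instance whose numeric "year" is converted to a string."""
    converted = dict(instance)
    year = converted.get("year")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(year, (int, float)) and not isinstance(year, bool):
        if float(year).is_integer():
            converted["year"] = str(int(year))  # 2013 or 2013.0 becomes "2013"
        else:
            converted["year"] = str(year)
    return converted


cars = [
    {"model": "toyota-rav4", "year": 2015},
    {"model": "bmw-m5", "year": 2013},
    {"model": "subaru-forrester", "year": 2009},
]

converted_cars = [year_to_string(car) for car in cars]
```

Instances whose "year" is already a string pass through unchanged, so the conversion can be applied repeatedly without harm.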
4. Deletion of types
Suppose you attempt to delete the "custom:car" definition. This will fail because there are instances of this type. You would have to first delete the three "custom:car" content instances and then delete the "custom:car" definition. Alternatively, you can do it all at once in a single transaction. In general, we recommend transactions because they leave your branch data in a consistently good state.
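The ordering rule can be illustrated with a small in-memory Python sketch. None of this is the Cloud CMS API: the functions and the TypeInUseError exception are hypothetical, and the sketch assumes each content instance records its type in a "_type" field.

```python
class TypeInUseError(Exception):
    """Raised when a definition still has content instances."""


def delete_definition(definitions, nodes, qname):
    """Delete a definition, refusing if any instance of that type remains."""
    in_use = [node for node in nodes if node.get("_type") == qname]
    if in_use:
        raise TypeInUseError(f"{qname} still has {len(in_use)} instance(s)")
    del definitions[qname]


def delete_type_and_instances(definitions, nodes, qname):
    """Remove the instances first, then the definition they depended on."""
    nodes[:] = [node for node in nodes if node.get("_type") != qname]
    delete_definition(definitions, nodes, qname)


definitions = {"custom:car": {"type": "object"}}
nodes = [
    {"_type": "custom:car", "model": "toyota-rav4"},
    {"_type": "custom:car", "model": "bmw-m5"},
    {"_type": "custom:car", "model": "subaru-forrester"},
]

# Deleting the definition first is refused while instances exist.
try:
    delete_definition(definitions, nodes, "custom:car")
    blocked = False
except TypeInUseError:
    blocked = True

# Deleting instances and then the definition succeeds.
delete_type_and_instances(definitions, nodes, "custom:car")
```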
5. Deletion of dependent types
Suppose your "custom:car" extends another definition called "custom:vehicle". If you attempt to delete the definition "custom:vehicle", Cloud CMS will raise an exception since the "custom:vehicle" type has sub-types that are dependent on it. Similarly, you might imagine a "custom:car" that has a feature definition applied to it. The feature definition might be "custom:color" which introduces a "color" property to your car. If you attempt to delete the "custom:color" feature definition, the operation will fail because there are one or more content types using that feature.
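Before attempting such a delete, a script could check what depends on a definition. The Python sketch below is an illustration over plain dictionaries; the "parent" and "features" keys are assumed names chosen for this example and are not necessarily how Cloud CMS stores these relationships.

```python
def dependents_of(definitions, qname):
    """Return the qnames of definitions that extend `qname` or use it as a feature."""
    return sorted(
        other
        for other, definition in definitions.items()
        if definition.get("parent") == qname
        or qname in definition.get("features", [])
    )


definitions = {
    "custom:vehicle": {},
    "custom:color": {},
    "custom:car": {"parent": "custom:vehicle", "features": ["custom:color"]},
}

vehicle_dependents = dependents_of(definitions, "custom:vehicle")
color_dependents = dependents_of(definitions, "custom:color")
car_dependents = dependents_of(definitions, "custom:car")
```

A definition is only safe to delete when its list of dependents is empty, which here is true for "custom:car" but not for "custom:vehicle" or "custom:color".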