API Server

The Cloud CMS API Server is a Java application that launches inside of a Java Servlet Container. The Java application surfaces a REST API as well as backend services and DAOs to support connectivity to MongoDB, Elasticsearch and a slew of Amazon services including S3, SNS, SQS, Route 53, CloudFront and more.

Properties File

Cloud CMS is primarily configured via a properties file that is auto-detected and loaded when the underlying Spring Framework starts up. This properties file is typically named docker.properties. The file should be loaded from the classpath root (and is typically bootstrapped from a classes directory).

For the most part, this properties file consists of key=value pairs that are static in nature. This key has that value and so on. Values may be strings, numbers or booleans (true/false). You do not need to wrap strings in quotes. Cloud CMS will handle the conversion for you.

In addition, you may wish to pull in environment variables from your Docker container OS. This is useful if you're launching Docker in AWS (or similar) and wish to store sensitive values (such as passwords, access keys, etc.) outside of the docker.properties config file. It also allows a single Docker configuration to be more easily reused across environments (by just changing environment variables).

To pull in environment variables, use the ${VARIABLE} syntax as the property value. For example,

cluster.aws.accessKey=${AWS_ACCESS_KEY}
cluster.aws.secretKey=${AWS_SECRET_KEY}

For information on AWS, keep on reading. This is just an example.
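
To make the substitution concrete, here is a minimal Python sketch of how ${VARIABLE} placeholders in key=value lines could be resolved against environment variables. This is an illustration only and not how Cloud CMS implements it internally. The function names resolve_placeholders and load_properties are hypothetical, and leaving unknown variables untouched is an assumption made for the example.

```python
import os
import re

# Matches ${NAME} placeholders such as ${AWS_ACCESS_KEY}.
_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve_placeholders(value, env=None):
    """Replace each ${NAME} in value with env[NAME].

    Unknown names are left untouched (an assumption for this sketch).
    """
    env = os.environ if env is None else env
    return _PLACEHOLDER.sub(lambda m: env.get(m.group(1), m.group(0)), value)


def load_properties(text, env=None):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, raw = line.partition("=")
        props[key.strip()] = resolve_placeholders(raw.strip(), env)
    return props
```

Passing an explicit env dictionary makes the behavior easy to try out without touching the real environment.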

XML File

The Cloud CMS API is a Spring MVC application that leverages Spring Security, Spring Transactions and a lot of other really good services to provide secure and fast request processing. As such, all extensions of the underlying services are done via Spring Bean configuration.

To make this easily extensible for customers, Cloud CMS uses an XML configuration file to let you override and configure beans. For the most part, you won't have to do this as most of the customizations are done via the .properties file above.

However, there may be times that you wish to really modify how the API behaves by wiring in your own beans. We recommend doing this in a cloudcms-distribution-context.xml file. This file should be located in the classes/gitana/distributions directory.

The file should essentially look like this:

<?xml version="1.0" encoding="UTF-8"?>
<beans
        xmlns="http://www.springframework.org/schema/beans"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:util="http://www.springframework.org/schema/util"
        xsi:schemaLocation="http://www.springframework.org/schema/beans
                            http://www.springframework.org/schema/beans/spring-beans-4.1.xsd
                            http://www.springframework.org/schema/util
                            http://www.springframework.org/schema/util/spring-util-4.1.xsd">

    <!-- insert your customizations here -->

</beans>

This file is optional but it may be referenced in some of the sections below.

Admin User Password

One of the first things you'll usually want to configure is the admin user's password. The admin user is created when you first start up Cloud CMS. You can set the password like this:

gitana.admin.password=admin

The admin password will be enforced to this value and cannot be accessed or changed at runtime. When you restart the API, the admin password will be reset to the value provided.

Note that we're storing the password in plaintext here. That's often fine for development and testing. But in production, you will generally prefer to take advantage of property encryption to protect your password.

If you use property-level encryption, you will provide the setting more or less like this:

gitana.admin.password.enc=ENC(S/3GXvZYOhr/jgBlTIpT1VSfi34WUeq1gu9JAMEyN3z2URnuEIHcpVwXMEn3ucvKOxHIlkjtyXYqDFqDjnhMP5x+Q0Wnf4Ov4fJ1FXHAom7pKxGJH0IHtvgXeJ1VJ7bYpPyPeOwG3hpBpBNbVIJBbGjv0MDxu8wgokMA7/qR7Vs=)
gitana.admin.password=${gitana.admin.password.enc}

The encrypted value is generated using a public key and can only be read by the Cloud CMS API server itself. You are free to use encryption with any properties that you wish.

Concurrent Request Rate Limits

You can limit concurrent requests on a per-tenant and per-user basis within Cloud CMS. This keeps track of how many HTTP requests are "in-flight" at any given moment. When the number of "in-flight" or concurrent requests exceeds the specified amount, an HTTP 429 status code is returned.

From a web architecture viewpoint, a 429 is a valid status code response and client code that calls into Cloud CMS is expected to handle this gracefully.
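
As one way a client might handle a 429 gracefully, the following Python sketch retries with exponential backoff. It is a sketch under stated assumptions rather than a recommended client: call_with_retry is a hypothetical helper, send stands in for whatever function issues your HTTP request and returns its status code, and the delay values are arbitrary.

```python
import time


def call_with_retry(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns a status other than 429.

    send is any zero-argument callable returning an HTTP status code.
    Waits base_delay, then 2x, 4x, ... between attempts (exponential backoff).
    Returns the last status code seen.
    """
    status = None
    for attempt in range(max_attempts):
        status = send()
        if status != 429:
            return status
        # Only wait if another attempt is still to come.
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return status
```

The sleep function is injectable so that the backoff schedule can be checked without actually waiting.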

To specify rate limits per-tenant, use the following:

gitana.ratelimiting.tenant.enabled=false
gitana.ratelimiting.tenant.defaultMaxConnections=-1

To specify rate limits per-user, use the following:

gitana.ratelimiting.user.enabled=false
gitana.ratelimiting.user.defaultMaxConnections=-1

Guest Mode

Cloud CMS supports an optional guest user account that you can enable to allow your front end applications to emulate anonymous connections into the API. By default, the guest account is disabled.

org.gitana.platform.services.authentication.guest.allowLogin=false
org.gitana.platform.services.authentication.guest.username=guest
org.gitana.platform.services.authentication.guest.password=guest

The guest account works like any other account. The guest user must be granted access rights into your projects, your repositories and your nodes just like any other user before it can be used as intended.

Attempted Logins

By default, Cloud CMS will track the number of attempted logins per account. When the maximum number of attempts is reached, the account will be locked out for a period of time. This helps to prevent brute force password attacks by eliminating the ability for script runners to guess at passwords for a given account.

By default, the limit is 3. If a user attempts to log in more than 3 times in a row, their account is locked for a period of 15 minutes. This is expressed in milliseconds below.

org.gitana.platform.services.authentication.attemptedLoginCheck.enabled=true
org.gitana.platform.services.authentication.attemptedLoginCheck.maxAttempts=3
org.gitana.platform.services.authentication.attemptedLoginCheck.resetTimeout=900000
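
The lockout behavior described above can be modeled roughly as follows. This Python sketch is not the server's implementation. The class AttemptedLoginCheck and its methods are hypothetical, and it assumes that the lock begins as soon as the number of consecutive failures reaches maxAttempts and that a successful login clears the counter.

```python
class AttemptedLoginCheck:
    """Tracks consecutive failed logins per account (illustrative only)."""

    def __init__(self, max_attempts=3, reset_timeout_ms=900000):
        self.max_attempts = max_attempts
        self.reset_timeout_ms = reset_timeout_ms  # 900000 ms = 15 minutes
        self._failures = {}      # account -> consecutive failure count
        self._locked_until = {}  # account -> unlock time in ms

    def is_locked(self, account, now_ms):
        return now_ms < self._locked_until.get(account, 0)

    def record_failure(self, account, now_ms):
        count = self._failures.get(account, 0) + 1
        if count >= self.max_attempts:
            # Assumption: the lock starts once max_attempts is reached.
            self._locked_until[account] = now_ms + self.reset_timeout_ms
            count = 0
        self._failures[account] = count

    def record_success(self, account):
        self._failures.pop(account, None)
```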

You may also specify the name of a request parameter or cookie that, if present, will disable the attempted login check:

org.gitana.platform.services.authentication.attemptedLoginCheck.disableToken=

System Locales

At a system level, there are two default locale settings that need to be considered.

JVM Default Locale

The JVM Default Locale is used internally for generation of error messages and other logging-level elements. The JVM determines this at startup. If nothing can be determined, it will fall back to using en.

To specify this, set the following environment variable:

export LANG=en_US.utf8

Gitana Default Locale

The Gitana Default Locale is the locale that is used as a default while localizing content retrieved via the API. You can adjust this to have the API always serve back content in a specific locale, such as Spanish, provided that a Spanish set of translation nodes exists for your multilingual content.

You can adjust this by setting the gitana.defaultLocale property in your properties file.

gitana.defaultLocale=en_US

The default is for this setting to be undefined. This means that the API will serve back the master node for any multilingual content. All localization is therefore dynamic and specified via the HTTP call (where locale is passed in via a request parameter or header).

Auditing

The Cloud CMS auditing service tracks every operation against auditable objects within the system and writes those operations to a special Audit collection. This Audit collection provides a reliable capture of what users did within your system and records every method invocation, including:

  • the method invoked
  • arguments passed to the method
  • the value returned from the method
  • the invoker of the method
  • when the method was invoked
  • exceptions that were raised

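Auditing is turned on or off with the following property. The value shown here leaves auditing disabled; set it to true to enable auditing: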

org.gitana.repo.audit.AuditService.enabled=false

Logging

The API uses Log4j2 as its logging engine and provides a configuration-based way for you to customize the logging of various sub-services within the product. Each logger within the product logs at a given log level. You can increase or reduce the amount of logging for each logger by adjusting its respective log level.

Custom Log Levels

To customize the log levels, add a log4j2-docker.xml file to the classpath. This file should sit at the root of the classpath. If you're unfamiliar with how to mount this into Docker, take a look at the quickstart example provided in the Cloud CMS Docker distribution. It provides a sample log4j2-docker.xml file with blocks of configuration that you can easily comment out to get started quickly with debugging.

The log4j2-docker.xml file provides a way for you to specify the Log4j2 log level for individual service beans or entire packages of services at once.

By default, the log levels within Cloud CMS are pretty conservative and are optimized for production usage. However, if you're running Cloud CMS in development or would like to get more log information to diagnose a problem, you can adjust the log4j2-docker.xml configuration to provide the level of granularity that you seek. Often, this involves adjusting log levels from INFO to DEBUG.

Note that DEBUG should only be used while debugging or in development. We do not recommend running Cloud CMS with DEBUG logging in place on production. It will produce far too many logs and will also run more slowly.

Let's say you wanted to enable more logging for the rules engine within the product. You could do that with a log4j2-docker.xml file like this:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration>
    <Loggers>

        <!-- Set all classes in the org.gitana.platform.services.rule package to log with DEBUG -->
        <Logger name="org.gitana.platform.services.rule" level="DEBUG"/>

    </Loggers>
</Configuration>

Log Outputs (stdout / stderr)

Several log files are maintained by the API server Java application. These can be accessed by exposing the log folder as a "volume" on the API server's docker container. Update the docker-compose.yml config file so that the folder: /opt/tomcat/logs is available to the docker host system.

This example mounts the logs folder to a host folder named "api-logs":

api:
    build: ./api
    networks:
        - cloudcms
    depends_on:
        - mongodb
        - elasticsearch
    env_file:
        - ./api/api.env
    ports:
        - "8080:8080"
    volumes:
        - ./api-logs:/opt/tomcat/logs

After running a new build (docker-compose build --force-rm) the folder will be created and the log files will be visible to the host.

The default server logging goes to cloudcms.log.

If you require logging of each API call to the API server you can monitor the Tomcat access logs: localhost_access_log.*.txt

Access Logs

Every request and response that the API processes is logged into its own Log4j2 appender. You can customize this appender to have those entries write to a different location, format differently, roll over to disk or even roll over to S3 periodically.

By default, the access logs appender (named RequestFile) is defined like this:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="info" packages="org.gitana.platform.services.log">
    <Properties>
        <Property name="baseDir">logs</Property>
    </Properties>
    <Appenders>
        <RollingFile name="RequestFile" fileName="${baseDir}/cloudcms-requests.log" filePattern="${baseDir}/cloudcms-requests/cloudcms-requests-%d{yyyy-MM-dd-HH-mm}-%i.log" append="false">
            <PatternLayout>
                <pattern>%msg%n</pattern>
            </PatternLayout>
            <Policies>
                <OnStartupTriggeringPolicy/>
                <TimeBasedTriggeringPolicy interval="30"/>
                <SizeBasedTriggeringPolicy size="256 MB"/>
            </Policies>
            <DefaultRolloverStrategy max="20"/>
        </RollingFile>
    </Appenders>
</Configuration>

You can override these settings via the same process as described above in which you add your own log4j2-docker.xml file. This file will be picked up by Cloud CMS and its settings will be merged into an overall Log4j2 configuration set. Specifically, you can redeclare the RequestFile RollingFile implementation and Cloud CMS will let your implementation override the one provided out-of-the-box.

Upon making changes, make sure to rebuild the API docker container (docker-compose build --force-rm) and restart.

You should now see JSON objects describing API calls written to the logs folder as cloudcms-requests.log.

Console Logging of Access Logs

Access logs are also written to the console log by default. You can control the formatting of this console log by using the org.gitana.platform.services.request.console.template property.

The default value looks like this:

org.gitana.platform.services.request.console.template=[  ]<->:   

This uses a Handlebars template to produce the output text and you're free to modify this as you please. The following variables are available:

  • id - the ID of the request
  • count - the integral count of the request
  • host - the request host
  • tenant - the tenant identifier (either tenant ID or tenant slug) for the request
  • accessToken - the access token being used by the request
  • method - the request method
  • path - the request path
  • startDate - when the request was started
  • userName - the name of the user making the request
  • userId - the ID of the user making the request
  • userEmail - the email of the user making the request
  • headers - a stringified JSON object dump of the request headers
  • params - a stringified JSON object dump of the request parameters

For each request, there are two logging events: one for the "request" phase and one for the "response" phase. The kind variable can be used to determine or log which phase is being logged:

  • kind - the kind of event (either req for request or res for response)

For the response phase, the following is additionally available:

  • executionTime - the amount of time (in ms) required to service the request within the API
  • contentType - the mimetype of the response to the request
  • status - the status code response for the request
  • size - the size of the response to the request
  • stats - a summary string that looks like (21 ms, 92, 200)

Given the default setting, shown above, the RequestConsole appender will write entries for two events (request phase and response phase) and will log out like this:

[2019-10-22T02:20:17Z api appuser-1c980e98e0f5078f1f55]<10-req>: POST /applications/f0f3632a9ba6b7d0834d/settings/query 
[2019-10-22T02:20:17Z api appuser-1c980e98e0f5078f1f55]<10-res>: POST /applications/f0f3632a9ba6b7d0834d/settings/query (21 ms, 92, 200)
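
To illustrate how the variables combine into a line like the ones above, here is a small Python sketch that substitutes {{name}} placeholders. It is a simplified stand-in for Handlebars, not the real engine, and the TEMPLATE string is hypothetical: it was chosen only because it reproduces the sample response line, and the mapping of fields (for example, that "api" is the tenant) is a guess rather than the documented default.

```python
import re


def render(template, variables):
    """Substitute {{name}} placeholders; unknown names render as empty text."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(variables.get(m.group(1), "")),
        template,
    )


# Hypothetical template, chosen only to reproduce the sample line above.
TEMPLATE = (
    "[{{startDate}} {{tenant}} {{userName}}]"
    "<{{count}}-{{kind}}>: {{method}} {{path}} {{stats}}"
)
```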

S3 Rollover

Cloud CMS includes an S3RolloverStrategy implementation that you can use to have your access logs roll over to S3 (in addition to rolling over on disk). This provides a convenient way to get your access logs off the server. And if you're running in a cluster, this provides a way to get your logs all collected into a single place.

To use this strategy, you simply need to add it to your RollingFile appender. This strategy takes a few arguments:

  • accessKey: the AWS account access key
  • secretKey: the AWS account secret key
  • region: the S3 region
  • bucket: the S3 bucket name
  • prefix: a prefix to prepend to any created keys (such as cloudcms/production)

You can use a Properties block to accomplish this elegantly as shown in the code below:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="info" packages="org.gitana.platform.services.log">
    <Properties>
        <Property name="REQUESTS_S3_ACCESS_KEY"></Property>
        <Property name="REQUESTS_S3_SECRET_KEY"></Property>
        <Property name="REQUESTS_S3_REGION">us-east-1</Property>
        <Property name="REQUESTS_S3_BUCKET"></Property>
        <Property name="REQUESTS_S3_PREFIX">cloudcms/production</Property>
    </Properties>
    <Appenders>
        <RollingFile name="RequestFile" fileName="${baseDir}/cloudcms-requests.log" filePattern="${baseDir}/cloudcms-requests/cloudcms-requests-%d{yyyy-MM-dd-HH-mm}-%i.log">
            <PatternLayout>
                <pattern>%msg%n</pattern>
            </PatternLayout>
            <Policies>
                <OnStartupTriggeringPolicy/>
                <TimeBasedTriggeringPolicy interval="30"/>
                <SizeBasedTriggeringPolicy size="10M"/>
            </Policies>
            <S3RolloverStrategy accessKey="${REQUESTS_S3_ACCESS_KEY}" secretKey="${REQUESTS_S3_SECRET_KEY}" region="${REQUESTS_S3_REGION}" bucket="${REQUESTS_S3_BUCKET}" prefix="${REQUESTS_S3_PREFIX}" />
        </RollingFile>
    </Appenders>
</Configuration>

API Pagination

By default, any API calls that support pagination will be given a paginated limit of 25. In other words, if your API call doesn't specify how many records it wants back, it will get back 25 records at most. Cloud CMS does this to help protect against code that errantly forgets to include pagination in cases where you have very large record sets. Very large record sets imply lots of time to execute, lots of memory consumed and so on.

gitana.api.pagination.limit.default=25

Be on the safe side and specify a limit in your paginated calls.

In addition, Cloud CMS will enforce a maximum pagination limit of 1000. If you try to retrieve more than 1000 records in a result set, your result will be capped. We cap this at 1000 by default though you're free to change this in your own Docker installations to suit your needs:

gitana.api.pagination.limit.max=1000

If you need to retrieve more than 1000 results, we recommend making multiple calls to paginate through the total set using the limit and skip options. See our documentation on Pagination.
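
Walking a large result set with limit and skip might look like the following Python sketch. The helper fetch_all and the fetch_page callable are hypothetical: fetch_page(skip, limit) stands in for whatever API call you make and is assumed to return a list of at most limit records starting at offset skip.

```python
def fetch_all(fetch_page, limit=100, max_limit=1000):
    """Collect every record by walking pages with limit and skip.

    fetch_page(skip, limit) is a hypothetical callable that returns a list
    of at most `limit` records starting at offset `skip`.
    """
    limit = min(limit, max_limit)  # the server caps the limit anyway
    results = []
    skip = 0
    while True:
        page = fetch_page(skip, limit)
        results.extend(page)
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < limit:
            return results
        skip += limit
```

With 2500 records and a requested limit of 5000, this sketch makes three calls at offsets 0, 1000 and 2000.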

Clustering

Each API server that spins up supports clustering. When an API server comes online, it searches for other API servers that might be out there and part of the same cluster. If it finds any, it connects to them and redistributes any cache state to balance out the cluster.

Similarly, if a Cloud CMS API server goes offline, the other servers in the cluster become aware and re-balance as needed. In this way, a Cloud CMS API cluster is an ephemeral thing - servers may join and leave the cluster as demand increases or falls away.

Every API server, regardless of whether you intend it to participate in a multi-server cluster or not, requires that you provide a cluster.group.name and a cluster.group.password:

cluster.group.name=
cluster.group.password=

You can set these to anything you like. However, if you do intend to have other API servers join the cluster, they will also need to specify the same cluster.group.name and cluster.group.password.

When Cloud CMS starts up, it will bind a port that it will use to communicate with other cluster members (should they come along). As mentioned, even if your cluster is just size 1, you will need to bind this port.

cluster.port=5800

The default port is 5800 if you don't otherwise specify it. This means that any other API servers can communicate with this API server instance on port 5800.

Cloud CMS will assume the IP address of the first network interface it spots. In most cases, this will resolve to localhost or 127.0.0.1 for development or simple server configurations. For more complex configurations where you may have multiple network interfaces, you can specify which interface(s) should be considered via the following configuration:

cluster.interfaces.enabled=true
cluster.interfaces.interface=

The cluster.interfaces.interface property can either be the IP address of the interface or a comma-delimited value of multiple interfaces to consider in order.

Note that the use of interfaces applies most specifically when using multicast or tcpip discovery providers. For other provider types, such as aws or zookeeper, you will likely not need to bother with this.

In cloud deployment scenarios, where containers are distributed across multiple hosts (and across availability zones), you will need to provide a publicly accessible IP address per server instance. This public-address is an IP address which other Cloud CMS API Servers can use to connect to -this- API server.

cluster.public-address=

The public address identifies the public IP address of the box as seen by the outside world. If you have two EC2 instances, with IP addresses 12.34.56.78 and 12.34.56.79, each box will have a slightly different configuration with cluster.public-address=12.34.56.78 for the first box and cluster.public-address=12.34.56.79 for the second.

Suppose box #1 comes online first. It will bind to 12.34.56.78:5800. Now suppose that box #2 comes online. It will bind to 12.34.56.79:5800. If configured properly, it will then call out to try to find other boxes that are members of the same cluster. It will find the first box and connect to it on 12.34.56.78.

Thus, the public address is important as it establishes an IP address that the outside world can use to connect to your API server. This IP address must be publicly accessible.

If you're configuring this and get stuck, use Telnet to make sure that your cluster port can be connected to. It must be available from one API server to the other.
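
If Telnet isn't available on the box, a few lines of Python can perform the same reachability check. This is a minimal sketch: can_connect is a hypothetical helper, and the host and port you pass in are whatever your other API server is using.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False
```

Run it from one API server against the other server's address and cluster port (5800 by default).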

Note that the public-address property is typically required for aws, zookeeper and other dynamic and cloud-friendly deployment scenarios.

By default, Cloud CMS will perform the handshake described above. If it can't find something on 5800, it'll make another attempt at 5801. Then 5802 and so on. It will try three times and if nothing works, it'll consider things to have failed. This feature is provided to make it easier to have things work in cases where there are minor port conflicts or multiple API containers running on a single host.

cluster.port-auto-increment=true

Note that port auto incrementing is enabled by default.
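
The auto-increment behavior can be pictured with the following Python sketch. It only models the sequence of ports tried and is not the clustering code itself. The function find_cluster_port and the can_bind predicate are hypothetical, and the sketch assumes that the three tries are 5800, 5801 and 5802.

```python
def find_cluster_port(can_bind, start_port=5800, attempts=3):
    """Return the first bindable port among start_port, start_port + 1, ...

    can_bind(port) is a hypothetical predicate; returns None if all
    `attempts` candidates are taken.
    """
    for port in range(start_port, start_port + attempts):
        if can_bind(port):
            return port
    return None
```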

Cloud CMS's clustering and discovery of other API servers runs automatically at startup. It then remains active throughout the lifetime of the server. As other servers come and go, your API server will log messages to indicate that it has discovered a new member or that a member left.

By default, Cloud CMS will wait 10 seconds upon startup to "allow things to settle". For low latency or typical environments, this is more than sufficient. However, for slow network scenarios, you may wish to increase this.

cluster.initial.wait.seconds=10

Cluster Discovery

When Cloud CMS servers are brought online, they discover one another and then get on about business. The exact strategy used to discover one another is configurable. The following discovery services are available:

  • Multicast
  • TCP/IP
  • Amazon Web Services
  • ZooKeeper

By default, the Cloud CMS API server is configured to use Multicast. This means that it will work out-of-the-box for scenarios where containers are launched against the host's network (such as when Docker is launched with --net=host).

However, for cases where containers are launched on their own network or launched as part of a Docker Machine running on a cloud provider (such as EC2), you will need to use a different discovery mechanism.

Interface Configuration

Depending on how Docker is set up, you may have multiple network interfaces defined. For example, if we bash into the container running the API, we can run:

ip address

And we may see something like:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1
    link/ipip 0.0.0.0 brd 0.0.0.0
3: ip6tnl0@NONE: <NOARP> mtu 1452 qdisc noop state DOWN group default qlen 1
    link/tunnel6 :: brd ::
121: eth0@if122: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:ac:13:00:05 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.19.0.5/16 brd 172.19.255.255 scope

This indicates that there are multiple network addresses set up. In this case, when Cloud CMS starts up, it will make a best guess about which network interface to bind to. This isn't ideal because while Cloud CMS is very clever and will make a good guess, it may still get this wrong.

For example, from the list above, Cloud CMS might thoughtfully choose 127.0.0.1. That's the first one. And if the Docker containers were running in host mode, this could work. However, if you're running Docker in bridge mode (which is the mode that most of the Cloud CMS quickstart kits ship with), then this will result in a situation where each cluster member is bound to a loopback address which won't let them see each other.

To resolve this, we want to make sure that the cluster members bind to the 172.19.0.5 address. That's for this specific case. We won't know the precise IP address of course, since Docker assigns those dynamically, but we do know that it will be a wild card value essentially matching 172.*.*.*.

Thus, we may opt to adjust the docker.properties file to include:

cluster.interfaces.enabled=true
cluster.interfaces.interface=172.*.*.*

That way, when the API servers start up, they'll bind to their respective 172.*.*.* address and will be able to see each other.
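
As a rough picture of how a wildcard such as 172.*.*.* selects an interface, the Python sketch below matches candidate addresses against the pattern. It is an approximation that uses shell-style matching from the standard library, which may differ in detail from how Cloud CMS evaluates the pattern. The helper pick_interface is hypothetical.

```python
from fnmatch import fnmatch


def pick_interface(addresses, pattern):
    """Return the first address matching a wildcard pattern, or None."""
    for address in addresses:
        if fnmatch(address, pattern):
            return address
    return None
```

Given the addresses from the listing above (127.0.0.1 and 172.19.0.5), the pattern skips the loopback address and selects 172.19.0.5.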

Discovery Services

The following Discovery Services are available:

Multicast

Cloud CMS supports multicast for scenarios where containers are deployed to the same network as the host. In this case, everything (host and containers) is bound under the same network interface and multicast can be used as a communication mechanism between the Cloud CMS API servers.

Multicast is typically very good for development servers or even test servers that run on top of a simplified or common network configuration. When Docker is launched with the --net=host option, the network used by the containers is the same as that of the host and multicast applies.

cluster.multicast.enabled=true
cluster.multicast.group=224.2.2.3
cluster.multicast.port=54327
cluster.multicast.loopbackmode.enabled=true

In anything that is production grade (such as cloud deployments), you will typically have your containers running in isolated network environments and so multicast will simply not work.

Suppose, for example, that you have multiple API servers defined in your Docker setup. With multicast, you don't have to worry about how many there are, since they can all use the same configuration. We might set it up like this:

cluster.multicast.enabled=true
cluster.multicast.group=224.2.2.3
cluster.multicast.port=54327
cluster.multicast.loopbackmode.enabled=true

cluster.interfaces.enabled=true
cluster.interfaces.interface=172.*.*.*

Note that we're forcing every server to look to the 172.*.*.* network interface (instead of the loopback). Please see the section above on interfaces for more information. By providing an explicit interface, we remove the possibility of Cloud CMS auto-configuring for the wrong network interface.

TCP/IP

Cloud CMS provides a way for you to name all of the servers explicitly that are participating in your cluster. The members field in the tcpip config lets you explicitly spell out the <ip>:<port> network reachable addresses for the API server.

In this way, if you wish to have no dynamic or automatic discovery at all, you can use tcpip discovery to spell out all participants from the beginning.

cluster.tcpip.enabled=true
cluster.tcpip.members=
cluster.tcpip.timeout.seconds=-1

In general, you should think of tcpip discovery mode as a fallback if nothing else works. Running in tcpip mode trades off any dynamic detection of servers except for those that start up on the given list of IP addresses. This can be good for some scenarios but for most "cloudy" deployments, you will want to use a different discovery service such as aws or zookeeper.

Suppose, for example, that you have two API servers defined in your Docker setup. One container might be called api1 and the other might be called api2. These containers, if running on the same host, might have ports mapped for 5801 and 5802 respectively.

The cluster could be configured using TCP/IP like this:

cluster.tcpip.enabled=true
cluster.tcpip.members=api1:5801,api2:5802
cluster.tcpip.timeout.seconds=-1

cluster.interfaces.enabled=true
cluster.interfaces.interface=172.*.*.*

Note that we're forcing both servers to look to the 172.*.*.* network interface (instead of the loopback). Please see the section above on interfaces for more information. By providing an explicit interface, we remove the possibility of Cloud CMS auto-configuring for the wrong network interface.

Amazon Web Services

If you're running your containers on Amazon AWS (EC2), then you will likely want to take advantage of the aws discovery service. To use this service, just provide the following:

cluster.aws.enabled=true
cluster.aws.accessKey=
cluster.aws.secretKey=
cluster.aws.region=
cluster.aws.timeout.seconds=-1
cluster.aws.tag.key=
cluster.aws.tag.value=
cluster.aws.iamrole=
cluster.aws.securitygroup=
cluster.aws.portrange=5800-5820

When your API server starts up, it connects to Amazon's API to find other EC2 instances that came online and were running API servers for the same cluster group and password. It then auto-configures for the public IP addresses and ports of those API servers.

This is a very nice and efficient mechanism for discovery. It gives you the advantages of elastic instances (you can add and remove instances on the go) and avoids the need to detect and wire in IP addresses ahead of time.

The required properties are:

cluster.aws.enabled=true
cluster.aws.region=

You must also specify either an access/secret key pair (cluster.aws.accessKey and cluster.aws.secretKey) or an IAM role (cluster.aws.iamrole).

The user associated with the IAM role should have the "ec2:DescribeInstances" policy. See the Hazelcast documentation.

All other properties are optional:

  • cluster.aws.timeout.seconds specifies the maximum amount of time to wait for a member of the cluster to discover another member of the cluster.
  • cluster.aws.portrange specifies a port or a range of ports that will be scanned when discovering any cluster members

An API server will make a socket request to each IP address using the port range defined by cluster.aws.portrange.
If cluster.aws.portrange=5800, then it will do a single request to each IP address at that single port.
If cluster.aws.portrange=5800-5899, then it will make 100 socket requests to each IP address (one for each port).
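
The arithmetic above can be checked with a short Python sketch that expands a port range value into the list of ports to probe. The helper expand_port_range is hypothetical and only mirrors the two value formats shown (a single port or a start-end range).

```python
def expand_port_range(value):
    """Turn "5800" or "5800-5899" into a list of port numbers."""
    first, sep, last = value.partition("-")
    start = int(first)
    end = int(last) if sep else start
    if end < start:
        raise ValueError("port range end is below its start: " + value)
    # Both ends of the range are included.
    return list(range(start, end + 1))
```

Because both ends are included, 5800-5899 expands to 100 ports and the sample value 5800-5820 expands to 21.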

Filtering

By default, the EC2 discovery process looks across all of the EC2 instances in your region. It connects to each one and tries to authenticate using the cluster group name and password. If the EC2 instance doesn't respond or the cluster parameters do not match, the EC2 instance is filtered out. The remaining set after filtering comprises the cluster members.

For efficiency purposes, you can improve this lookup process by filtering. You can filter either on tags or security groups:

  • cluster.aws.tag.key and cluster.aws.tag.value filters to only consider EC2 instances with a matching tag key/value pair.
  • cluster.aws.securitygroup filters to only include EC2 instances in a given security group.

Public Address

There are several ways to launch Cloud CMS within AWS. These may include: the AWS Docker Engine for Docker Machine, Amazon's Elastic Container Service, Amazon Elastic Beanstalk or perhaps another way. Depending on how you launch, you'll end up with Docker containers running in a network environment over which you will have some degree of control.

The ideal scenario is one where the networking environment is running in host mode so that the container's networking environment is shared with the host. In this configuration, the default network interface is usually the public interface. The clustering mechanism usually picks up the right "public" IP address in this scenario.

However, in some environments (such as when using multi-container Elastic Beanstalk Dockerrun.aws.json files to launch), you'll simply be given a network environment. These may be bridged environments with multiple interfaces. In these scenarios, it isn't always possible for the clustering mechanism to pick out the right public IP address.

Furthermore, AWS maintains the notion of "public" IP addresses vs. "private" IP addresses for your EC2 instances. In most cases, what you're looking to use is the "private" IP address and the clustering mechanism ought to be able to pick it out for you.

However, at times when it cannot, you may need to set the cluster.public-address property to nudge things along. This property tells your container how to identify itself to other members in the cluster. Suppose you have Container A and Container B. Container B starts up and calls out to Container A to see if it is a member of the cluster. Container A must reply with a "public address" that matches what Container B expects. For simple networking configurations, this is trivial and automatic. However, if you have multiple interfaces, it is possible for Container B to pick the wrong interface and thus the wrong IP address in response.

In these scenarios, you can force the public address to the private IP of your EC2 instance like this:

cluster.public-address=ec2:private-ip

If you want to force to the public IP, you can do so like this:

cluster.public-address=ec2:public-ip

Configuring for EC2 or ECS

In order for EC2 Discovery to work within an AWS EC2 or ECS Cluster, you will need to accomplish the following:

  1. Ensure that your API Docker container starts up in host mode (not bridge).
  2. Ensure that the API selects the correct network interface when it starts up.

When using ECS, make sure that your taskdef.json file starts the container in host mode by setting the following:

"networkMode": "host"

If you're launching in docker-compose, you can adjust your docker-compose.yml file to feature something like this:

services:

  api:
    build: ./api
    network_mode: host
    env_file:
      - ./api/api.env
    ports:
      - "80:8080"
      - "5800:5800"

If you're launching using the docker command, pass --network=host as an argument so that the container launches with host networking.

Adjust your docker.properties file to make sure Cloud CMS binds to the correct interface, like this:

cluster.interfaces.enabled=true
cluster.interfaces.interface=10.0.*.*

Please note that the 10.0.*.* value depends on your CIDR block definition. If the cluster spans more than one subnet or uses a custom VPC, verify that the container instances within the cluster have network connectivity to each other (for example, by checking with tracepath).

ZooKeeper

If you're running your containers in the cloud and are either not running on AWS or elect not to use AWS EC2 Discovery Services for any reason, then you may choose to use ZooKeeper as an alternative.

Apache ZooKeeper provides a directory service whereby Cloud CMS API Servers register themselves as they come online. When additional servers come online, they discover the previously registered servers and so on.

The following properties must be set:

cluster.zookeeper.enabled=true
cluster.zookeeper.url=
cluster.zookeeper.path=

In effect, ZooKeeper provides the same mechanism as AWS Discovery Services but will work for any cloud provider. The only caveat is that you must run ZooKeeper yourself as part of your environment so that any API servers running can connect and utilize it as a service.
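
A filled-in sketch is shown below. The host name and path are hypothetical, and the url format is an assumption: it is written here as a conventional ZooKeeper host:port connect string, which you should confirm against your own deployment.

```properties
cluster.zookeeper.enabled=true

# assumed format: a ZooKeeper connect string (2181 is ZooKeeper's default client port)
cluster.zookeeper.url=zookeeper.mycompany.com:2181

# hypothetical path under which the API servers register themselves
cluster.zookeeper.path=/cloudcms/cluster
```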

Debugging Cluster Configuration

Cloud CMS uses Hazelcast under the hood to establish and configure the cluster. Hazelcast's log output is routed through Log4j2.

As such, you can configure logging for the entire Hazelcast top-level package by configuring a logger for the com.hazelcast namespace, like this:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration>
    <Loggers>

        <Logger name="com.hazelcast" level="DEBUG"/>

    </Loggers>
</Configuration>

To increase logging from Hazelcast’s member joining mechanism, you can do something like this:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration>
    <Loggers>

        <Logger name="com.hazelcast.impl.TcpIpJoiner" level="DEBUG"/>

    </Loggers>
</Configuration>

For more information on how to set up custom Log4j2 loggers, see the section on Custom Logging and read up on how to add your own log4j2-custom.xml files to the classpath.

Starting with version 3.2.36 of Cloud CMS, the API container now includes OpenJDK 11. This version of OpenJDK includes the addition of Java Modules and includes some enhanced security around the use of internal JVM classes. Hazelcast attempts to access some internal classes for optional optimization purposes and so the following warning will appear in your logs:

WARNING: Hazelcast is starting in a Java modular environment (Java 9 and newer) but without proper access to required Java packages. Use additional Java arguments to provide Hazelcast access to Java internal API. The internal API access is used to get the best performance results. Arguments to be used:
 --add-modules java.se --add-exports java.base/jdk.internal.ref=ALL-UNNAMED --add-opens java.base/java.lang=ALL-UNNAMED --add-opens java.base/java.nio=ALL-UNNAMED --add-opens java.base/sun.nio.ch=ALL-UNNAMED --add-opens java.management/sun.management=ALL-UNNAMED --add-opens jdk.management/com.sun.management.internal=ALL-UNNAMED

This warning message is expected and is not an error. It does not impact the performance or security of the Cloud CMS API application.

For more information on Hazelcast and Java 9 module support, see https://docs.hazelcast.org/docs/3.11/manual/html-single/index.html#running-in-modular-java.

For further information from the OpenJDK project, see http://openjdk.java.net/jeps/261.

The OpenJDK project notes that:

The --add-exports and --add-opens options must be used with great care. You can use them to gain access to an internal API of a library module, or even of the JDK itself, but you do so at your own risk: If that internal API is changed or removed then your library or application will fail.

As such, we've elected to leave the default OpenJDK policies in place and just let the warning message be. You should feel free to ignore it. Our expectation is that Hazelcast will resolve this in a future release of their library.

Note that you may also see warning messages similar to this:

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.hazelcast.internal.networking.nio.SelectorOptimizer (file:/opt/tomcat/webapps/ROOT/WEB-INF/lib/hazelcast-3.12.jar) to field sun.nio.ch.SelectorImpl.selectedKeys
WARNING: Please consider reporting this to the maintainers of com.hazelcast.internal.networking.nio.SelectorOptimizer
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release

These warning messages should only appear once and can be ignored as well. They're related to the above. It's just the JDK indicating that Hazelcast attempted to access an internal class (sun.nio.ch.SelectorImpl). Hazelcast is working to resolve this and we expect that they soon will.

Binary Storage

Cloud CMS lets you configure one or more back-end Binary Storage providers that the system will use to persist and retrieve binary files for a given datastore. Binary Storage providers are sensitive to datastore scoped configurations allowing tenants to customize storage on a per-datastore, per-project and per-platform basis.

At their core, Binary Storage providers are configured at the Spring bean level. You can define as many Binary Storage providers as you wish. Each Binary Storage provider instance is a singleton that plays the role of manufacturing Binary Storage instances when requested by upstream services. The provider framework takes on responsibility for caching storage instances where appropriate.

Binary Storage providers are used to store binary files and attachments wherever they are needed in the product. Attachments to content nodes, for example, are stored via a Binary Storage provider, as are images attached to principals, archives, projects and more.

By default, Cloud CMS has several providers wired for you out-of-the-box so as to make it simple to set up global binary persistence to a single location. This is the most common use case and it also doesn't preclude further configuration and customization later.

The following Binary Storage provider types are available out-of-the-box:

  • mongodb
  • file
  • AWS_S3
  • caching
  • fallback
  • s3gridfs
  • azure_blob

To configure the global Binary Storage provider, set the following property to one of the values above:

org.gitana.platform.services.binary.storage.provider={providerType}

If not otherwise specified, the default is to use a provider type of mongodb (which is for GridFS).

Mongo DB (Grid FS) provider

This provider is configured by default and will be used if you don't override the global setting.

GridFS is a "file system" implementation that is provided by Mongo DB whereby binary files are written into Mongo DB. The advantage here is that your files will reside in Mongo DB (keeping everything in one place) and will enjoy all of the replica set and shard architecture advantages that Mongo DB affords. Furthermore, GridFS is low latency and works nicely in a clustered or distributed setting where all servers in the cluster can fall back on it as a single source of the truth.

One downside with GridFS is that, since everything goes into MongoDB, your MongoDB data partitions and volumes need to grow to accommodate the total storage size of the binary files. If you start putting really big binary files into Cloud CMS, your MongoDB storage requirements will increase likewise. This can have cost implications and may introduce some challenges from a DevOps perspective in terms of managing EBS volumes and the like.

GridFS is a good solution if your total binary file size is predictable and manageable.

To enable the mongodb GridFS as the global Binary Storage provider, set the following:

org.gitana.platform.services.binary.storage.provider=mongodb

Client Encryption

Cloud CMS can automatically encrypt any files written to Mongo DB Grid FS using either symmetric or asymmetric keys. This ensures that the files stored inside of Grid FS are encrypted at rest and encrypted as they are transferred between GridFS and the Cloud CMS API server.

To enable symmetric encryption using an AES shared secret, you will need to add the following properties:

org.gitana.platform.services.binary.storage.provider.mongodb.encryption.enabled=true
org.gitana.platform.services.binary.storage.provider.mongodb.encryption.keyPath=

Where keyPath is the path to a file on disk that holds your AES encryption key.

To enable asymmetric encryption using public and private RSA keys, you will need to add the following properties:

org.gitana.platform.services.binary.storage.provider.mongodb.encryption.enabled=true
org.gitana.platform.services.binary.storage.provider.mongodb.encryption.publicKeyPath=
org.gitana.platform.services.binary.storage.provider.mongodb.encryption.privateKeyPath=

Where publicKeyPath is the path to a file on disk that holds your RSA public key and privateKeyPath is the path to a file on disk that holds your RSA private key.

File System provider

For non-clustered environments where you only have a single Cloud CMS API server, the file system Binary Storage provider is available and will let you store all of your binary files on local disk. This is ideal for development boxes.

This provider is not distributed or cluster-aware. It cannot be used in clustered API deployments in any capacity since each server in the cluster is essentially managing its own local store.

To enable the file system as the global Binary Storage provider, set the following:

org.gitana.platform.services.binary.storage.provider=file

You must also specify the storagePath where files are to be written:

org.gitana.platform.services.binary.storage.provider.file.storagePath=/data/cms/binaries

This directory must exist and the API process must have sufficient read/write privileges to the directory.

Client Encryption

Cloud CMS can automatically encrypt any files written to the file system using either symmetric or asymmetric keys. This ensures that the files stored on disk by this provider are encrypted at rest at a file-level.

To enable symmetric encryption using an AES shared secret, you will need to add the following properties:

org.gitana.platform.services.binary.storage.provider.file.encryption.enabled=true
org.gitana.platform.services.binary.storage.provider.file.encryption.keyPath=

Where keyPath is the path to a file on disk that holds your AES encryption key.

To enable asymmetric encryption using public and private RSA keys, you will need to add the following properties:

org.gitana.platform.services.binary.storage.provider.file.encryption.enabled=true
org.gitana.platform.services.binary.storage.provider.file.encryption.publicKeyPath=
org.gitana.platform.services.binary.storage.provider.file.encryption.privateKeyPath=

Where publicKeyPath is the path to a file on disk that holds your RSA public key and privateKeyPath is the path to a file on disk that holds your RSA private key.

Amazon S3 provider

Cloud CMS can be configured to write and read binary files from Amazon S3 directly. This enables your API cluster to use Amazon's scalable S3 storage without concern for growth in local disk volumes. Since each server in the cluster communicates to S3 and uses S3 as a common resource, this Binary Storage provider implementation is cluster-safe and ready to go.

To enable the AWS_S3 backend as the global Binary Storage provider, set the following:

org.gitana.platform.services.binary.storage.provider=AWS_S3

You must then specify the Amazon API keys to use and any additional properties for the S3 connection pool:

org.gitana.platform.services.binary.storage.provider.AWS_S3.accessKey=
org.gitana.platform.services.binary.storage.provider.AWS_S3.secretKey=
org.gitana.platform.services.binary.storage.provider.AWS_S3.bucketName=
org.gitana.platform.services.binary.storage.provider.AWS_S3.region=
org.gitana.platform.services.binary.storage.provider.AWS_S3.maxConnections=500

One downside to using S3 directly is latency. Since every binary operation requires a network connection back to S3, latency will eventually prove to be an issue. Fortunately, Cloud CMS provides per-server local caching for optimized performance. Read on to learn more about this.
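
Putting the S3 settings above together, a filled-in configuration might look like the following sketch. The bucket name and region are hypothetical, and the keys are pulled from environment variables so that they stay out of the properties file.

```properties
org.gitana.platform.services.binary.storage.provider=AWS_S3

# credentials drawn from the container's environment
org.gitana.platform.services.binary.storage.provider.AWS_S3.accessKey=${AWS_ACCESS_KEY}
org.gitana.platform.services.binary.storage.provider.AWS_S3.secretKey=${AWS_SECRET_KEY}

# hypothetical bucket and region
org.gitana.platform.services.binary.storage.provider.AWS_S3.bucketName=mycompany-cloudcms-binaries
org.gitana.platform.services.binary.storage.provider.AWS_S3.region=us-east-1

org.gitana.platform.services.binary.storage.provider.AWS_S3.maxConnections=500
```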

Client Encryption

Cloud CMS can automatically encrypt any files written to Amazon S3 using either symmetric or asymmetric keys. This ensures that the files stored within S3 by this provider are encrypted at rest at a file-level. It also ensures that those files will be encrypted as they travel over the wire back to the API server.

To enable symmetric encryption using an AES shared secret, you will need to add the following properties:

org.gitana.platform.services.binary.storage.provider.AWS_S3.encryption.enabled=true
org.gitana.platform.services.binary.storage.provider.AWS_S3.encryption.keyPath=

Where keyPath is the path to a file on disk that holds your AES encryption key.

To enable asymmetric encryption using public and private RSA keys, you will need to add the following properties:

org.gitana.platform.services.binary.storage.provider.AWS_S3.encryption.enabled=true
org.gitana.platform.services.binary.storage.provider.AWS_S3.encryption.publicKeyPath=
org.gitana.platform.services.binary.storage.provider.AWS_S3.encryption.privateKeyPath=

Where publicKeyPath is the path to a file on disk that holds your RSA public key and privateKeyPath is the path to a file on disk that holds your RSA private key.

To enable Amazon KMS (Key Management Services), where KMS manages your public and private keys, you will need to add the following properties:

org.gitana.platform.services.binary.storage.provider.AWS_S3.encryption.enabled=true
org.gitana.platform.services.binary.storage.provider.AWS_S3.encryption.kmsDefaultCustomerMasterKeyId=
org.gitana.platform.services.binary.storage.provider.AWS_S3.encryption.kmsRegion=

Where kmsDefaultCustomerMasterKeyId is the alias or identifier of your KMS Master Key set and kmsRegion is the region (such as us-east-1) where the KMS service is running.

Caching provider

A caching provider lets you add cluster-aware caching to any other provider. It wraps an existing provider with caching so that binary files are written to local disk and served back from local disk whenever possible. As assets are updated, local disk cache is maintained and purged as needed across the cluster.

To enable the caching provider as the global Binary Storage provider, set the following:

org.gitana.platform.services.binary.storage.provider=caching

You must set up caching like this:

org.gitana.platform.services.binary.storage.provider.caching.primaryProviderType=AWS_S3
org.gitana.platform.services.binary.storage.provider.caching.cachePath=/data/cms/binaries

In this example, the caching provider is set up to wrap around the AWS_S3 provider to offer cluster-aware, disk-based caching. Binary files are written to disk at /data/cms/binaries.

If cachePath is not provided, a temp directory path will be used (and cached disk state will not survive server restarts).
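
A fuller sketch of caching in front of S3 is shown below. It assumes that the wrapped AWS_S3 provider still reads its own connection settings from the AWS_S3 properties described earlier; the bucket name and region are hypothetical.

```properties
# the caching provider is the global provider and wraps AWS_S3
org.gitana.platform.services.binary.storage.provider=caching
org.gitana.platform.services.binary.storage.provider.caching.primaryProviderType=AWS_S3

# an explicit cache directory, so that cached files survive server restarts
org.gitana.platform.services.binary.storage.provider.caching.cachePath=/data/cms/binaries

# assumption: the wrapped provider is configured through its own properties
org.gitana.platform.services.binary.storage.provider.AWS_S3.accessKey=${AWS_ACCESS_KEY}
org.gitana.platform.services.binary.storage.provider.AWS_S3.secretKey=${AWS_SECRET_KEY}
org.gitana.platform.services.binary.storage.provider.AWS_S3.bucketName=mycompany-cloudcms-binaries
org.gitana.platform.services.binary.storage.provider.AWS_S3.region=us-east-1
```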

Fallback provider

The fallback Binary Storage provider provides a safe way to layer a "master" provider on top of another (or several others) with the objective of gradually migrating binary dependencies from the other providers to the master. The fallback Binary Storage provider takes a list of providers and binds them together into this configuration.

The first provider in the list is the "master" provider. The master provider is the preferred provider.

  • When binary files are read, the master provider is consulted first. If the file isn't found there, the other providers are consulted in turn. If none of the providers have the file, a 404 is returned. If any of the providers have the file, it is streamed back.
  • When binary files are created or updated, the master provider receives the file.
  • When binary files are deleted, they are deleted from ALL providers.

This is ideal for situations where you may have data already existing in one provider (GridFS) and want to transition to using another provider (S3). In this case, you'd set S3 as the master provider and GridFS as the other provider in the list. Binary data that cannot be found in S3 will fall back to being served from GridFS. However, any new binary data going forward will be written solely to S3.

To enable the fallback provider as the global Binary Storage provider, set the following:

org.gitana.platform.services.binary.storage.provider=fallback

And then configure the provider like this:

org.gitana.platform.services.binary.storage.provider.fallback.providerTypes=AWS_S3,mongodb

S3 with Grid FS fallback and file caching

A specific provider is offered out-of-the-box to support S3 with file caching turned on. It further supports GridFS as a fallback in case binary content pre-existed therein.

To enable it as the global Binary Storage provider, set the following:

org.gitana.platform.services.binary.storage.provider=s3gridfs

You must then configure the provider like this:

org.gitana.platform.services.binary.storage.provider.s3gridfs.cache=true
org.gitana.platform.services.binary.storage.provider.s3gridfs.cachePath=/data/cms/binaries

If cachePath is not provided, a temp directory path will be used. Note that a temp directory means that the local disk cache (per server) will not survive server restarts. First requests for binary assets to newly started servers will pull down from S3 and begin rebuilding the cache anew.

We recommend s3gridfs as a simplified means of configuring S3 in production clusters.

One additional requirement for the s3gridfs provider is that you register a custom Spring bean that defines the S3 settings. This is done by adding some XML to the cloudcms-distribution-context.xml file. The custom bean should look like this:

<util:map id="defaultBinaryStorageConfiguration">
    <entry key="accessKey" value=""/>
    <entry key="secretKey" value=""/>
    <entry key="bucketName" value=""/>
    <entry key="region" value=""/>
</util:map>

You will need to fill in the values for accessKey, secretKey, bucketName and region. You can simply drop in these values or you may opt to draw these from the docker.properties file.

For example, you could set the XML to:

<util:map id="defaultBinaryStorageConfiguration">
    <entry key="accessKey" value="${custom.accessKey}"/>
    <entry key="secretKey" value="${custom.secretKey}"/>
    <entry key="bucketName" value="${custom.bucketName}"/>
    <entry key="region" value="${custom.region}"/>
</util:map>

And then add the following to your docker.properties file:

custom.accessKey=
custom.secretKey=
custom.bucketName=
custom.region=

Now just plug in your accessKey, secretKey, bucketName and region to your docker.properties file.

Azure Blob Storage

Cloud CMS provides a binary storage provider that will store your binary files into Azure Blob Storage.

To use it, configure your binary storage provider like this:

org.gitana.platform.services.binary.storage.provider=azure_blob
org.gitana.platform.services.binary.storage.provider.azure_blob.connectionString=
org.gitana.platform.services.binary.storage.provider.azure_blob.containerName=
org.gitana.platform.services.binary.storage.provider.azure_blob.prefix=

Where:

  • connectionString is an Azure Blob Storage connection string. It describes the authentication credentials for the connection along with any additional settings you need to supply to the connection.
  • containerName is the name of the root container to store binary objects within.
  • prefix is an optional prefix to prepend to any storage paths within your container.
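
As an illustrative sketch, the connection string can be drawn from an environment variable so that the account key stays out of the properties file. The variable name, container name and prefix below are hypothetical.

```properties
org.gitana.platform.services.binary.storage.provider=azure_blob

# an Azure Storage connection string typically has the form
# DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...;EndpointSuffix=core.windows.net
# here it is read from a hypothetical environment variable
org.gitana.platform.services.binary.storage.provider.azure_blob.connectionString=${AZURE_STORAGE_CONNECTION_STRING}

# hypothetical container and path prefix
org.gitana.platform.services.binary.storage.provider.azure_blob.containerName=cloudcms-binaries
org.gitana.platform.services.binary.storage.provider.azure_blob.prefix=production
```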

Mongo DB

Mongo DB provides the primary data store for Cloud CMS. Cloud CMS creates a connection pool that it uses to communicate with Mongo DB while Cloud CMS is in service. You can configure Cloud CMS to connect to Mongo DB running either as a Docker container or as a standalone service.

In addition, you can configure Cloud CMS to connect to Mongo DB running as a standalone service, in a replica set or in a sharded configuration.

Hosts

Use the mongodb.hosts setting to specify a comma-delimited set of <host>:<port> entries. Each entry should be the network-accessible address of either a mongod process (in the case of a standalone or replica set configuration) or a mongos process (in the case of a sharded configuration).

To connect to a single mongod or mongos process, you might use:

mongodb.hosts=mongodb.mycompany.com:27017

Or, to connect to multiple servers in a replica set:

mongodb.hosts=repl1.mycompany.com:27017,repl2.mycompany.com:27017,repl3.mycompany.com:27017

Cloud CMS will initialize the MongoDB driver connection and manage that connection to best take advantage of what was supplied. If you supplied a list of replicas, for example, Cloud CMS will automatically migrate between members of the replica set on failure or when you switch primary.

Authentication

By default, Cloud CMS assumes that connectivity to Mongo DB is unauthenticated. In other words, it assumes that Mongo DB has been configured in such a way as to not require authentication.

This is generally acceptable for development. But for production environments, you will want to make sure that Mongo DB is configured for authentication, that a user exists in Mongo DB with sufficient access privileges and that you supply the username and password of that user like this:

mongodb.default.authentication.required=true
mongodb.default.authentication.username=
mongodb.default.authentication.password=
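
For example, an authenticated connection to a replica set might be sketched as follows. The host names reuse the examples above, and the environment variable names are hypothetical; pulling the credentials from the environment keeps them out of docker.properties.

```properties
mongodb.hosts=repl1.mycompany.com:27017,repl2.mycompany.com:27017,repl3.mycompany.com:27017

mongodb.default.authentication.required=true

# hypothetical environment variable names
mongodb.default.authentication.username=${MONGODB_USERNAME}
mongodb.default.authentication.password=${MONGODB_PASSWORD}
```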

SSL

By default, Cloud CMS assumes that connectivity to Mongo DB does not use SSL. If you wish to enable SSL, then set the following property:

mongodb.default.ssl.enabled=true

Depending on the nature of your SSL Certificate Authority (CA) certificate file, you may also need to register that certificate with the API Server as a keystore or a truststore. Please see the section on SSL KeyStore and TrustStore for more information on how to set this up.

Grid FS

By default, Cloud CMS will store binary files into Mongo DB's Grid FS storage system. This is specified via the following configuration option:

org.gitana.platform.services.binary.storage.provider=mongodb

You can use this setting to change to a different storage provider or activate a custom implementation that you've built.

Count

Mongo DB has an interesting "feature" (which some might argue is a bug and others would religiously defend as not being one) in that count() operations against collections take O(n) time. This means that the amount of time needed to count rows in a collection will increase as the collection size increases.

This opens up a downside in that a request could come along and perform a count() and therefore take an unpredictable amount of time. To defend against this, Cloud CMS lets you specify the maximum amount of time you wish a count() operation to execute before it is forced to fail. A forced fail means that the operation fails, the database cursor is released, memory is cleaned up and the API server nicely releases any resources it might be holding on to.

mongodb.defaultMaxCountTimeMs=2000

In addition, you may opt to limit the maximum number of items to count. This is an alternative to defaultMaxCountTimeMs in that you can lock down the maximum count size, letting you be sure that things can't get out of hand. This makes your request calls more predictable but also means that your total record set size may be inaccurate for large counts.

mongodb.defaultMaxCount=100000

As an example, suppose defaultMaxCountTimeMs were set to 10 seconds and there were 10 million items in a collection. If you ran a count() and it took 11 seconds to execute, an exception would be raised and the operation would fail. You could raise defaultMaxCountTimeMs to 11 seconds and then things would work (but the operation would still take 11 seconds to complete).

But what if you really didn't care whether the results had 10 million items or 1000 items? Your end users aren't going to paginate through 10 million entries, are they? (or maybe they are... hmmm... it is your call)

Still, suppose we knew they'd never do that. Perhaps our UI front end doesn't even provide pagination or we have some other kind of UI control which is far more intuitive. Anyway, in that case, we could set defaultMaxCount to 1000.

Now when the count() operation occurs, it will be extremely fast. The reason is that it takes far less time to count the first 1000 results than the first 10 million.
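
The worked example above corresponds to settings along these lines. The values simply restate the numbers used in the example (10 seconds and 1000 items). This sketch assumes the two properties can be set side by side, which the text above does not explicitly confirm.

```properties
# fail any count() that runs for longer than 10 seconds
mongodb.defaultMaxCountTimeMs=10000

# stop counting once 1000 items have been counted
mongodb.defaultMaxCount=1000
```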

Find

Another potentially expensive operation in Mongo DB is a find(). An end user could run a query that runs for a very long time. While the query is running, the Mongo DB database connection is consumed, a thread is hung in the API server and the end user is waiting.

In general, when building scalable and fail-fast applications, we'd rather nip this in the bud. To do so, we can limit the maximum amount of time a find() operation can run like this:

mongodb.defaultMaxFindTimeMs=60000

Now, if the find() operation runs for more than the prescribed amount of time, an exception is raised, the operation fails and the resources are released. Nice and clean.

Slow Queries

While developing your front-end applications, you'll occasionally experience slow queries that build up as you put in more and more content. These queries are usually slow because they haven't been indexed. Cloud CMS lets you add custom indexing to branches and so you'll want to do that.

To help you along, Cloud CMS offers the ability to log slow queries to its log file. This is analogous to the tenant log file that you already find in Cloud CMS; however, it will be available to your developers and administration team.

mongodb.query.explain=false

When set to true, this will produce a lot of explanations, so we only recommend enabling it on development or non-production environments.

Write Concern

By default, the Write Concern for MongoDB is set to ACKNOWLEDGED. This means that any writes to the MongoDB database will wait for acknowledgement from the DB before proceeding. This is a relatively safe way to run as it ensures that MongoDB is aware of any data that was committed and puts MongoDB in a position where it has the opportunity to control the situation from a disaster recovery perspective.

That said, you may wish to change the Write Concern to make it more robust. The JOURNALED setting, for example, tells Cloud CMS to wait for MongoDB to acknowledge that the data was written to its journal before continuing. This takes a little longer but ensures that MongoDB can fully recover from its own journal via a repair or on next startup.

If you are running a replica set, then the W1, W2 and W3 settings tell Cloud CMS to wait until the data was successfully written to 1, 2 or 3 members of the replica set respectively. You may also choose to set this to MAJORITY to tell Cloud CMS to wait until the majority of replica set members commit the data before proceeding.

With replica sets, it is important to understand that the primary (W1) must be committed and the other members are eventually consistent by default. You may choose to set the non-primaries as slaveOk (done within MongoDB) to support reads from non-primary members. However, if you do this, you will need to make sure that the WriteConcern configured here enforces consistency across the non-primary members on commit (in other words, the data should be written across all members or none on each commit).

To adjust the write concern, use the following setting:

mongodb.default.writeconcern=ACKNOWLEDGED
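
For a replica set, a stricter setting might look like the sketch below. MAJORITY is one of the values described above; whether it is the right trade-off between write latency and durability depends on your deployment.

```properties
# wait until a majority of replica set members have committed each write
mongodb.default.writeconcern=MAJORITY
```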

Replica Set and Read Preference

If you're connecting to a replica set, you may need to identify the replica set on connection and/or specify the read preference on connect. These can be controlled with these configuration properties:

mongodb.default.replicaSetName=rs0
mongodb.default.readPreference=SECONDARY

The exact values will vary depending on your connection needs.

Amazon DocumentDB

Cloud CMS supports Amazon DocumentDB as a MongoDB-compatible alternative to running your own MongoDB containers.
Amazon DocumentDB is a fast, scalable, highly available, and fully managed document database service that supports MongoDB workloads.

To use Amazon DocumentDB, simply adjust your docker.properties file to point to the DocumentDB cluster endpoint.

If you're using TLS/SSL with DocumentDB, you'll need to download the Amazon distributed PEM file and add it to the TrustStore (as described in Configuring SSL for MongoDB).

See https://docs.aws.amazon.com/documentdb/latest/developerguide/security.encryption.ssl.html for more information.

In addition, please note that the Cloud CMS API container MUST be run inside of the same AWS VPC as your Amazon DocumentDB cluster itself.
If you wish to run Cloud CMS in a separate environment while connecting to DocumentDB, you will need to set up an SSH tunnel.

This is Amazon's restriction (not ours) when using DocumentDB. For more information, see https://docs.aws.amazon.com/documentdb/latest/developerguide/connect-from-outside-a-vpc.html.

At a minimum, with TLS off, your configuration might look something like this:

# indicate that we're connecting to AWS Document DB
mongodb.default.engine=aws-document-db

# fill in the AWS Document DB endpoint here
mongodb.hosts=

# fill in the AWS Document DB connection user credentials here
mongodb.default.authentication.required=true
mongodb.default.authentication.username=
mongodb.default.authentication.password=

Where:

  • mongodb.default.engine must be set to aws-document-db
  • mongodb.hosts should point to your Amazon DocumentDB cluster endpoint.
  • mongodb.default.authentication.username is your cluster auth username
  • mongodb.default.authentication.password is your cluster auth password

In addition, if you're connecting to a replica set, you may need to specify the following connection properties:

mongodb.default.replicaSetName=rs0
mongodb.default.readPreference=SECONDARY

Elastic Search

Elastic Search provides a secondary index of searchable content for Cloud CMS. It supports full-text search and structured queries from within Cloud CMS using the Elastic Search DSL.

When content is written into Cloud CMS, it is written primarily into Mongo DB. It is then secondarily written into Elastic Search. As content is created, updated and deleted, the Elastic Search indexes are kept in sync so that text-based search is available against every branch in a repository.

Elastic Search runs as a separate service from the API server. Each API server, upon starting up, creates a connection pool to your Elastic Search endpoint that it uses to communicate, execute queries and maintain its search indexes in real time.

Cloud CMS supports Elastic Search clustering. It only needs to know the IP address of one member of the cluster and the cluster name to connect.

The configuration properties are specified like this:

elasticsearch.remote.cluster.name=elasticsearch
elasticsearch.remote.hosts=
elasticsearch.remote.defaultPort=9300

Where hosts is a comma-delimited set of <host>:<port> or simply <host> entries.
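
The accepted forms of the hosts value can be illustrated with a short Python sketch. This is not the Cloud CMS implementation: parse_hosts is a hypothetical helper, the host names are placeholders, and it assumes that an entry without a port falls back to elasticsearch.remote.defaultPort.

```python
def parse_hosts(hosts: str, default_port: int) -> list[tuple[str, int]]:
    """Split a comma-delimited hosts value into (host, port) pairs."""
    result = []
    for entry in hosts.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty entries such as a trailing comma
        host, sep, port = entry.partition(":")
        # Assumption: a bare <host> entry uses the configured default port.
        result.append((host, int(port) if sep else default_port))
    return result


# Placeholder host names; 9300 mirrors elasticsearch.remote.defaultPort above.
pairs = parse_hosts("es1.example.internal:9301, es2.example.internal", 9300)
print(pairs)
```

The sketch handles plain host names and IPv4 addresses only.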

Providers

At this time, Cloud CMS provides two Elastic Search client providers that can be used to communicate with the Elastic Search cluster members.

  • The Transport Client Provider (transport) offers a fast, Java-optimized client that communicates on port 9300.
  • The HTTP Client Provider (http) offers communication using REST over HTTP on port 9200.

The clients provided by these two Providers effectively do the same thing.

The Transport Client is an older client that performs a bit faster whereas the HTTP Client is newer and offers more compatibility between versions of Elastic Search as well as support for SSL and Basic Authentication.

The Transport Client is enabled by default. If you wish to use SSL or Basic Authentication, you will need to switch to the HTTP Client.

To use the Transport Client Provider, you can set the following:

elasticsearch.remote.defaultProviderType=transport

To use the HTTP Client Provider, you can set the following:

elasticsearch.remote.defaultProviderType=http

Basic Authentication

If you wish to use Basic Authentication to communicate with Elastic Search cluster members, you will need to provide the following in your properties file:

elasticsearch.remote.username=
elasticsearch.remote.password=

Fill in the values for your username and password respectively.

You will also need to enable the http Client Provider as described above.

SSL

If you wish to enable SSL to communicate with the Elastic Search cluster members, you will further need to provide the following in your properties file:

elasticsearch.remote.ssl=true

You will also need to enable the http Client Provider as described above.

Depending on the nature of your SSL Certificate Authority (CA) certificate file, you may also need to register that certificate with the API Server as a keystore or a truststore. Please see the section on SSL KeyStore and TrustStore for more information on how to set this up.

Bulk Optimization

In many cases, the Elastic Search clients will attempt to chunk writes and copy operations for efficiency. You can use the following setting to limit the maximum number of items to include in each chunk.

elasticsearch.remote.bulkPartitionChunkSize=250

The default setting of 250 means that the Elastic Search client (either http or transport) will divide up any writes or copies into multiple calls consisting of at most 250 items each. This reduces the number of items processed in any individual call.

In general, you shouldn't need to adjust this setting. However, if you find that your Elastic Search cluster is being overly stressed (for example, high CPU utilization), you may opt to reduce this so that individual calls process more quickly (in that each individual call has a smaller number of items to process). This may slow down the API but may free up execution threads within the Elastic Search cluster.
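
The effect of the chunk size can be sketched in a few lines of Python. This only illustrates the partitioning idea; partition is a hypothetical helper and is not the code the Elastic Search clients actually run.

```python
from typing import Iterator, Sequence


def partition(items: Sequence, chunk_size: int = 250) -> Iterator[Sequence]:
    """Yield consecutive slices of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


documents = list(range(1000))  # stand-in for 1000 items to index
chunk_lengths = [len(chunk) for chunk in partition(documents)]
print(chunk_lengths)  # four calls of 250 items each
```

Lowering the chunk size produces more calls, each carrying fewer items.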

Elastic Search HTTP Provider Settings

The http client for Elastic Search has the following settings preconfigured. You may wish to override these depending on network considerations and the configuration of your Elastic Search cluster.

# the timeout (in milliseconds) until a connection is established
elasticsearch.http.connectTimeoutMs=10000

# the timeout (in milliseconds) for waiting for data or, put differently, a maximum period of inactivity between two consecutive data packets
elasticsearch.http.socketTimeoutMs=60000

# the maximum timeout (in milliseconds) to honour in case of multiple retries of the same request
elasticsearch.http.maxRetryTimeoutMs=120000

One consideration is to make sure that socketTimeoutMs is high enough to account for any long-running processing that may occur within your Elastic Search service. If the API (using the http Elastic Search client) sends over a request that takes 20 seconds to process, then you'll need to make sure that socketTimeoutMs is larger than 20 seconds. If socketTimeoutMs is less than 20 seconds, the client may hang up before the call completes (since the socket isn't sending anything and it considers it to be a "timeout").

In this case, you'll see an error similar to:

org.gitana.platform.services.exception.IndexEngineException: listener timeout after waiting for [150000] ms

By default, the socketTimeoutMs is set high to allow for long-running operations in the Elastic Search cluster.

However, this should still raise the question of why such a long-running operation occurs at all. Some attention may be needed to size the ES cluster so that it can support the required throughput.
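
As a rough sanity check, the relationship between socketTimeoutMs and your slowest expected call can be tested before deploying. The Python sketch below is illustrative only: load_properties and socket_timeout_ok are hypothetical helpers, the parser handles only simple key=value lines, and the 20 and 90 second figures are example workloads.

```python
def load_properties(text: str) -> dict[str, str]:
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def socket_timeout_ok(props: dict[str, str], slowest_call_ms: int) -> bool:
    """True if socketTimeoutMs exceeds the slowest expected call."""
    return int(props["elasticsearch.http.socketTimeoutMs"]) > slowest_call_ms


props = load_properties("""
elasticsearch.http.connectTimeoutMs=10000
elasticsearch.http.socketTimeoutMs=60000
elasticsearch.http.maxRetryTimeoutMs=120000
""")
print(socket_timeout_ok(props, 20_000))  # a 20 second call fits
print(socket_timeout_ok(props, 90_000))  # a 90 second call would time out
```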

Amazon Elastic Search Service

Cloud CMS supports Amazon Elastic Search Service as an Elastic Search-compatible alternative to running your own Elastic Search containers. Amazon Elasticsearch Service is a fully managed service that makes it easy for you to deploy, secure, and operate Elasticsearch at scale with zero down time.

To use Amazon Elastic Search Service, simply adjust your docker.properties file to point to your Elastic Search Service endpoint. You should opt to use the http provider, like this:

elasticsearch.remote.defaultProviderType=http
elasticsearch.remote.defaultPort=80
elasticsearch.remote.cluster.name=elasticsearch
elasticsearch.remote.hosts=

Where elasticsearch.remote.hosts is your Amazon Elastic Search Service endpoint.

Email Provider

Cloud CMS allows you to configure an email provider that will be used to send email to people that you invite to participate in your project. The default email provider is used to dispatch those invitations and is also used to send emails during a coordinated registration process.

To set your email provider, use the following properties:

oneteam.emailprovider.host=
oneteam.emailprovider.port=-1
oneteam.emailprovider.username=
oneteam.emailprovider.password=
oneteam.emailprovider.smtp.enabled=true
oneteam.emailprovider.smtp.secure=true
oneteam.emailprovider.smtp.requiresauth=true
oneteam.emailprovider.smtp.starttls=true
oneteam.emailprovider.from=

Job Dispatcher

Each API server that spins up supports two primary functions. The first is to handle incoming web requests and turn them around quickly. The second is to process background jobs that are queued up in the distributed job queue. Background jobs typically include content indexing, exporting and complex mimetype conversions or data extractions.

These kinds of jobs "take a while" and so they're frequently moved off the request and placed in the asynchronous job queue. Every server in the cluster (or in an API worker cluster) that is configured to process jobs will work together to coordinate the distribution of job work by all cluster members.

By default, every API server performs both functions (web request handler and job worker). However, you can control this behavior via the following flag:

gitana.jobdispatcher.enabled=true

For more advanced and scalable deployments of Cloud CMS, you will want to run two tiers of API servers -- one tier for handling web requests and the other for working on background jobs. That way, any long-running intensive work won't steal CPU cycles from your web request handling. You can use the flag above to achieve this by enabling the job dispatcher only for the API servers in the job worker tier.

Any server running the job dispatcher can cap the number of jobs it runs concurrently using this setting:

gitana.jobqueue.server.maxConcurrentJobs=25

You can also configure the maximum number of jobs that the job dispatcher will dispatch per tenant platform:

gitana.jobqueue.server.maxConcurrentJobsPerPlatform=25

If you're running a multi-tenant offering, you can use this to ensure that no single tenant may draw too much job handling. If you're running single tenant, you may disable this by setting:

gitana.jobqueue.server.maxConcurrentJobsPerPlatform=-1
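
To see how the two caps interact, consider the following Python sketch. It is a simplified model under stated assumptions, not the actual dispatcher algorithm: select_jobs is a hypothetical function, jobs are taken in queue order, and the caps are set to small values so the effect is visible.

```python
from collections import Counter


def select_jobs(queued, max_jobs=25, max_per_platform=25):
    """Pick jobs in queue order while honoring both caps.

    queued is a list of (job_id, platform_id) pairs. A max_per_platform
    of -1 disables the per-platform cap, mirroring the setting above.
    """
    picked, per_platform = [], Counter()
    for job_id, platform_id in queued:
        if len(picked) >= max_jobs:
            break
        if max_per_platform != -1 and per_platform[platform_id] >= max_per_platform:
            continue  # this tenant is at its cap; leave the job queued
        picked.append(job_id)
        per_platform[platform_id] += 1
    return picked


# Tenant "a" has queued four jobs ahead of the two jobs from tenant "b".
queue = [(1, "a"), (2, "a"), (3, "a"), (4, "a"), (5, "b"), (6, "b")]
print(select_jobs(queue, max_jobs=4, max_per_platform=2))   # [1, 2, 5, 6]
print(select_jobs(queue, max_jobs=4, max_per_platform=-1))  # [1, 2, 3, 4]
```

With the per-platform cap in place, tenant "b" still gets work done even though tenant "a" queued first. With the cap disabled, tenant "a" takes every slot.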

In addition, individual job workers can be turned on and off. This allows you to segment your API workers so that you can allocate job workers to certain servers (so that heavier tasks can be allocated to higher powered servers and so on).

Here's an example of some of the jobs offered being enabled:

gitana.workers.webcapture.enabled=true
gitana.workers.transfer.import.enabled=true
gitana.workers.transfer.export.enabled=true
gitana.workers.transfer.copy.enabled=true
gitana.workers.bulkTransactionCommit.enabled=true
gitana.workers.bulkTransactionCleanup.enabled=true
gitana.workers.export.enabled=true
gitana.workers.create-project.enabled=true
gitana.workers.binaryStorageMigration.enabled=true
gitana.workers.indexDatastore.enabled=true
gitana.workers.indexPlatform.enabled=true
gitana.workers.oneteamStartProjectCopy.enabled=true
gitana.workers.replication.export.enabled=true
gitana.workers.replication.import.enabled=true
gitana.workers.publication.export.enabled=true
gitana.workers.publication.import.enabled=true
gitana.workers.pdfPreview.enabled=true
gitana.workers.rendition.enabled=true
gitana.workers.generateThumbnails.enabled=true
gitana.workers.nodeListReduction.enabled=true
gitana.workers.index.enabled=true
gitana.workers.filefolder-reindex.enabled=true
gitana.workers.search-reindex.enabled=true
gitana.workers.finalize-release.enabled=true
gitana.workers.create-release.enabled=true
gitana.workers.indexBranch.enabled=true
gitana.workers.interactionPageInsight.enabled=true
gitana.workers.awsDominantLanguage.enabled=true
gitana.workers.awsSentiment.enabled=true
gitana.workers.awsTranscode.enabled=true
gitana.workers.awsTranscribe.enabled=true
gitana.workers.awsTranslate.enabled=true
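
If you segment workers across servers, generating the flags from a single list can help keep the tiers consistent. The Python sketch below is one way to do that. The worker names come from the list above, but the grouping into "heavy" and "light" is hypothetical, and it assumes that setting a worker's enabled flag to false turns that worker off.

```python
# Worker names taken from the list above; the grouping is hypothetical.
HEAVY_WORKERS = ["awsTranscode", "pdfPreview", "rendition", "generateThumbnails"]
LIGHT_WORKERS = ["index", "indexBranch", "search-reindex"]


def worker_flags(enabled, disabled):
    """Render gitana.workers.<name>.enabled lines for one server."""
    lines = [f"gitana.workers.{name}.enabled=true" for name in enabled]
    lines += [f"gitana.workers.{name}.enabled=false" for name in disabled]
    return "\n".join(lines)


# A higher-powered server takes the heavy workers and skips the light ones.
heavy_server = worker_flags(HEAVY_WORKERS, LIGHT_WORKERS)
print(heavy_server)
```

The printed lines can be pasted into the docker.properties file for that server.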

Job Reaper

This section describes features that will be available in Cloud CMS version 3.2.38 and beyond.

Cloud CMS offers a job reaping feature that runs automatically in the background and cleans up records for jobs that no longer need to be retained. These jobs have already executed and completed. Their runtime state no longer needs to be maintained for purposes of system operation or bookkeeping. They can be cleaned up to reduce the size of your jobs collection and keep the system fast.

The Job Reaper is designed to implement a cap on the size of the Jobs collection. It runs immediately when an API server starts up. It then goes to sleep and wakes up periodically to check on the state of the Jobs collection.

By default, the Job Reaper is disabled. You can enable it like this:

gitana.jobreaper.enabled=true

You can also specify the following properties:

gitana.jobreaper.maxAgeMs=2592000000
gitana.jobreaper.sleepTimeMs=86400000

Where maxAgeMs is the maximum age of the records in milliseconds. The default value, shown above, is for 30 days. The reaper will look for jobs that are older than 30 days and will remove them for you automatically.

The sleepTimeMs specifies how long the reaper should sleep in-between runs. The default value, shown above, is for 24 hours. Once the reaper completes, it will wait 24 hours before running again. Note that the reaper will also attempt to run on startup of each API server.

Be sure to leave these values fairly high. The provided values of 30 days and 24 hours (for the sleep period) should be sufficient for most production systems. Bear in mind that, for a large implementation of Cloud CMS, you may have hundreds of jobs running at any one point and so these timeframes should be large, allowing for those jobs to execute, complete and be inspected if needed. After a sufficient amount of time has passed, those records can then be reaped.
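
Because both values are expressed in milliseconds, it is easy to drop or add a zero. The Python sketch below derives the two defaults from readable durations and models the age check. The helper names are hypothetical, and the assumption that age is measured from a job's completion time is ours, not something the reaper documents.

```python
from datetime import datetime, timedelta, timezone


def to_ms(delta: timedelta) -> int:
    """Convert a timedelta to whole milliseconds."""
    return delta // timedelta(milliseconds=1)


max_age_ms = to_ms(timedelta(days=30))      # value for gitana.jobreaper.maxAgeMs
sleep_time_ms = to_ms(timedelta(hours=24))  # value for gitana.jobreaper.sleepTimeMs
print(max_age_ms, sleep_time_ms)  # 2592000000 86400000


def is_reapable(finished_at: datetime, now: datetime, max_age_ms: int) -> bool:
    """True if a completed job's record is older than max_age_ms."""
    return (now - finished_at) > timedelta(milliseconds=max_age_ms)


now = datetime(2024, 3, 1, tzinfo=timezone.utc)
print(is_reapable(datetime(2024, 1, 1, tzinfo=timezone.utc), now, max_age_ms))   # 60 days old
print(is_reapable(datetime(2024, 2, 15, tzinfo=timezone.utc), now, max_age_ms))  # 15 days old
```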

Web Shot

Cloud CMS features the ability to take snapshots of web pages and other HTTP resources. Most of this is integrated into the product automatically. The Cloud CMS analytics engine, for example, captures snapshots automatically for pages whose interaction is being reported upon.

The Cloud CMS Web Shot server provides the ability to capture these snapshots. The Cloud CMS API calls out to this server when needed.

This setting is optional. If provided, Cloud CMS will capture snapshots for your assets.

To configure the location of the Cloud CMS Web Shot server, use this property:

org.gitana.platform.services.webdriver.webshot.endpoint=

Backdoor Authentication

At times, you will want to configure a secret 'backdoor' password that allows you to log in as or impersonate any user in your platform. You can do so by enabling backdoor authentication (set enabled to true; it is shown disabled here) and specifying a password like this:

org.gitana.platform.services.authentication.backdoor.enabled=false
org.gitana.platform.services.authentication.backdoor.password=

Field Encryption

Cloud CMS generates a salt value that is used to encrypt fields such as password fields or credential keys and secrets. This salt value takes a number of inputs -- one of which is a secret key that you can provide like this:

org.gitana.platform.services.encryption.secret=<anything you like>

Ticket Encryption

Cloud CMS writes back a GITANA_TICKET cookie with every API request. The ticket provides a cookie-based way to store the access token that is required for every request to the Cloud CMS API. In this way, the Cloud CMS API can be used to serve assets directly to a browser where requests do not originate from a driver but instead from native HTML elements like IMG tags (with src attributes).

This GITANA_TICKET can be encrypted so that the access token is never out in the open. To encrypt the ticket, you simply need to provide an encryption secret for the ticket generator, like this:

org.gitana.platform.services.ticket.secret=
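
One way to produce a hard-to-guess secret is to generate it randomly. The Python sketch below uses the standard library secrets module. The helper name new_secret and the 32-byte length are our choices; this guide does not state any length or character requirements for the secret.

```python
import secrets


def new_secret(num_bytes: int = 32) -> str:
    """Return a random URL-safe string suitable as a properties value."""
    return secrets.token_urlsafe(num_bytes)


ticket_secret = new_secret()
print(f"org.gitana.platform.services.ticket.secret={ticket_secret}")
```

The generated value contains only letters, digits, hyphens and underscores, so it needs no escaping in a properties file.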

Deletions

Cloud CMS retains deleted nodes in per-branch collections (known as "deletions" collections). The product provides services so that editorial teams can quickly restore deletions back to the originating branch in the event that something was deleted accidentally. The "deletions" collection (per-branch) provides a fast and efficient way to discover recent deletions and restore without a deep interrogation of the master node list.

Note that each branch's "deletions" collection is essentially a copy of the node made available to support query and recovery quickly. The actual master copy of the deleted node is always contained in the master node record. Cloud CMS is a copy-on-write system and so a master record of all nodes is always retained. In the end, your data is always recoverable.

That said, the facility of the deletions collection can be customized according to your editorial needs. Given that it tracks all deletions, the collection can grow to be quite large and so you may want to set automatic collection capping so that the total number of available deletions is limited in size over time.

To enable auto-capping, set the following properties:

org.gitana.platform.services.deletion.DeletionService.autocap.enabled=true
org.gitana.platform.services.deletion.DeletionService.autocap.maxSize=10000
org.gitana.platform.services.deletion.DeletionService.autocap.resetSize=7500

The maxSize setting describes the maximum number of deleted records allowed before the collection is capped back to the resetSize. In this case, if the collection has 10000 deletion records in it and you delete just one more node, the deletion records will drop their oldest entries and the collection will resize to 7500.
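
The capping behavior described above can be modeled with a toy Python class. This is a sketch of the described behavior only, not the DeletionService implementation; DeletionsCollection is a hypothetical name and integers stand in for deleted nodes.

```python
from collections import deque


class DeletionsCollection:
    """Toy model of a per-branch deletions collection with auto-capping."""

    def __init__(self, max_size: int = 10000, reset_size: int = 7500):
        self.max_size = max_size
        self.reset_size = reset_size
        self.records = deque()  # oldest record on the left

    def add(self, record) -> None:
        self.records.append(record)
        if len(self.records) > self.max_size:
            # Over the cap: drop the oldest records down to reset_size.
            while len(self.records) > self.reset_size:
                self.records.popleft()


deletions = DeletionsCollection()
for node_id in range(10000):
    deletions.add(node_id)
print(len(deletions.records))  # 10000, still at the cap

deletions.add(10000)           # one more deletion exceeds maxSize
print(len(deletions.records))  # 7500
print(deletions.records[0])    # 2501, the oldest surviving record
```

Keeping resetSize well below maxSize means the trim happens occasionally rather than on every deletion.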

Antivirus Scanning

Provides Cloud CMS with support for virus scanning and file quarantine on upload and storage.

To achieve this, the Cloud CMS API must be configured to connect with a Cloud CMS Antivirus Server.

You can either run this Antivirus endpoint yourself (by hosting your own container) or you can connect to a free hosted Antivirus Server that Cloud CMS operates for its customers in the cloud.

Connectivity to this server is performed via HTTPS and requires authentication. Files are transferred to the Antivirus Server, scanned and then discarded.

By default, the Cloud CMS API is configured to connect to the free hosted Antivirus Server. You do not have to make any additional changes to take advantage of this facility. However, you may wish to run your own Antivirus servers locally. There are good reasons for this, including:

  1. A lower latency connection (it might be slow to connect to the cloud for each scan)
  2. Lack of public internet connectivity from your API container
  3. Security policies (your organization does not wish to have content transferred to an external location for scanning)

In these cases, you can run the Cloud CMS Antivirus container locally. You then need to adjust the following properties for the API:

org.gitana.platform.services.antivirus.enabled=true
org.gitana.platform.services.antivirus.scanURL=http://{host}/scan
org.gitana.platform.services.antivirus.statusURL=http://{host}/status
org.gitana.platform.services.antivirus.username=
org.gitana.platform.services.antivirus.password=

Using these properties, you can enable or disable automatic virus scanning. You should set the other properties to match the settings provided through environment variables to your Cloud CMS Antivirus container.

For more information, please read our documentation on Antivirus Scanning.

Archive Registries

By default, Cloud CMS is configured with a public Archive Registry that gives editorial users access to Project Templates. These can be installed via the Create Project Wizard.

You can disable the default Archive Registry by using this setting:

oneteam.archiveregistry.default.enabled=false

You may also wish to register your own Archive Registries. To do so, add an XML block similar to the one shown here to your XML configuration file, once for each Archive Registry:

<bean class="org.gitana.platform.services.archive.registry.ArchiveRegistry">
    <property name="id"><value>{id}</value></property>
    <property name="enabled"><value>true</value></property>
    <property name="url"><value>{url}</value></property>
    <property name="username"><value>{username}</value></property>
    <property name="password"><value>{password}</value></property>
</bean>

Where:

  • id is a unique ID for your registry (example: myregistry)
  • url is a URL to your registry.json file (example: http://www.myserver.com/registry.json)
  • username and password are optional basic authentication credentials

Third Party / OS Libraries

The Cloud CMS API runs in an OS environment that has been configured and optimized to support its runtime needs. These include OS updates and third party installations of common libraries.

ImageMagick

https://www.imagemagick.org/script/index.php

Provides Cloud CMS with services for mimetype transformation, extraction and more for image formats.

org.gitana.platform.services.transform.imagemagick.basepath=/usr/bin

FFMpeg

https://ffmpeg.org

Provides Cloud CMS with video services including mimetype conversion, extraction and image manipulation of frames.

org.gitana.platform.services.transform.ffmpeg.basepath=/usr/bin

LibreOffice

https://www.libreoffice.org

Provides Cloud CMS with support for OpenDocument and Microsoft Office formats.

org.gitana.platform.services.transform.openoffice.enabled=true
org.gitana.platform.services.transform.openoffice.path=/opt/libreoffice6.2
org.gitana.platform.services.transform.openoffice.portNumbers=9100
org.gitana.platform.services.transform.openoffice.maxTasksPerProcess=200
org.gitana.platform.services.transform.openoffice.taskExecutionTimeout=120000
org.gitana.platform.services.transform.openoffice.taskQueueTimeout=30000

For versions of Cloud CMS API prior to 3.2.37, these settings should be:

org.gitana.platform.services.transform.openoffice.enabled=true
org.gitana.platform.services.transform.openoffice.path=/usr/lib64/libreoffice
org.gitana.platform.services.transform.openoffice.portNumbers=9100
org.gitana.platform.services.transform.openoffice.maxTasksPerProcess=200
org.gitana.platform.services.transform.openoffice.taskExecutionTimeout=120000
org.gitana.platform.services.transform.openoffice.taskQueueTimeout=30000

GeoLite2 Database

https://dev.maxmind.com/geoip/geoip2/geolite2

Provides Cloud CMS with the ability to interpret latitude/longitude information. The database for GeoLite2 is included with the Cloud CMS API.

org.gitana.platform.services.geolocation.databaseFilePath=/opt/geoip2/GeoLite2-City.mmdb

Zip/Unzip

org.gitana.platform.services.zip.executable.path=zip
org.gitana.platform.services.zip.timeout=1800000
org.gitana.platform.services.unzip.executable.path=unzip
org.gitana.platform.services.unzip.timeout=1800000

Web / HTTP

The API runs inside of a servlet container. You can adjust the server-side request-handling characteristics through the following properties.

Multipart File Handling

Set the maximum size (in bytes) of any multipart requests. This is the maximum size of the entire multipart request (as a summation of the sizes of all parts). This is set to 1GB by default.

org.gitana.platform.services.webapp.multipart.maxUploadSize=1073741824

Set the maximum size (in bytes) of any individual parts in the multipart request. This is set to 512MB by default.

org.gitana.platform.services.webapp.multipart.maxUploadSizePerFile=536870912

Set the maximum size (in bytes) for a part that is allowed to reside in a memory buffer during multipart processing. This is set to 128KB by default.

org.gitana.platform.services.webapp.multipart.maxInMemorySize=131072

Set the default encoding for any parts. This is set to the servlet spec ISO-8859-1 by default.

org.gitana.platform.services.webapp.multipart.defaultEncoding=ISO-8859-1
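
The default limits can be related to one another with a small Python sketch. It is illustrative only: check_multipart and fits_in_memory are hypothetical helpers that mirror the documented defaults, not the servlet container's actual enforcement logic.

```python
MAX_UPLOAD_SIZE = 1024 ** 3                 # 1073741824 bytes (1GB)
MAX_UPLOAD_SIZE_PER_FILE = 512 * 1024 ** 2  # 536870912 bytes (512MB)
MAX_IN_MEMORY_SIZE = 128 * 1024             # 131072 bytes (128KB)


def check_multipart(part_sizes):
    """Return a list of limit violations for a request's part sizes (bytes)."""
    problems = []
    if sum(part_sizes) > MAX_UPLOAD_SIZE:
        problems.append("request exceeds maxUploadSize")
    for index, size in enumerate(part_sizes):
        if size > MAX_UPLOAD_SIZE_PER_FILE:
            problems.append(f"part {index} exceeds maxUploadSizePerFile")
    return problems


def fits_in_memory(size: int) -> bool:
    """True if a part is small enough to stay in the memory buffer."""
    return size <= MAX_IN_MEMORY_SIZE


mb = 1024 ** 2
print(check_multipart([400 * mb, 400 * mb]))            # []
print(check_multipart([600 * mb]))                      # part 0 is too large
print(check_multipart([400 * mb, 400 * mb, 400 * mb]))  # total is too large
print(fits_in_memory(64 * 1024))                        # True
```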

Notifications (UI Server)

The API can be configured to send notification messages to the UI Server and other endpoints when content is created, updated or deleted. These messages are delivered asynchronously and can be used to invalidate caches on external applications (among other things).

This section describes how to configure notifications between the Docker API container and the Docker UI container.

If you're looking for information on how to set up ad-hoc notifications between an Application instance and a custom application, see Configuring Amazon SNS.

Active MQ (UI Server)

To enable Active MQ between the API and the UI Server, add the following to your cloudcms-distribution-context.xml file:

<bean id="cloudcmsUIServerApplicationDeployer" class="org.gitana.platform.services.application.deployment.CloudCMSApplicationDeployer" parent="abstractApplicationDeployer">
    <property name="type"><value>${gitana.default.application.deployer.uiserver.type}</value></property>
    <property name="deploymentURL"><value>${gitana.default.application.deployer.uiserver.deploymentURL}</value></property>
    <property name="domain"><value>${gitana.default.application.deployer.uiserver.domain}</value></property>
    <property name="baseURL"><value>${gitana.default.application.deployer.uiserver.baseURL}</value></property>
    <property name="notificationsEnabled"><value>${gitana.default.application.deployer.uiserver.notifications.enabled}</value></property>
    <property name="notificationsProviderType"><value>${gitana.default.application.deployer.uiserver.notifications.providerType}</value></property>
    <property name="notificationsProviderConfiguration">
        <map>
            <entry key="host"><value>${gitana.default.application.deployer.uiserver.notifications.configuration.host}</value></entry>
            <entry key="port"><value>${gitana.default.application.deployer.uiserver.notifications.configuration.port}</value></entry>
            <entry key="username"><value>${gitana.default.application.deployer.uiserver.notifications.configuration.username}</value></entry>
            <entry key="password"><value>${gitana.default.application.deployer.uiserver.notifications.configuration.password}</value></entry>
        </map>
    </property>
    <property name="notificationsTopic"><value>${gitana.default.application.deployer.uiserver.notifications.topic}</value></property>
</bean>

Then add the following to docker.properties to override the Active MQ settings:

gitana.default.application.deployer.uiserver.notifications.enabled=true
gitana.default.application.deployer.uiserver.notifications.providerType=activemq
gitana.default.application.deployer.uiserver.notifications.topic=cloudcms.ui.topic
gitana.default.application.deployer.uiserver.notifications.configuration.username=
gitana.default.application.deployer.uiserver.notifications.configuration.password=
gitana.default.application.deployer.uiserver.notifications.configuration.host=activemq
gitana.default.application.deployer.uiserver.notifications.configuration.port=61616

Amazon SNS (UI Server)

To enable Amazon SNS between the API and the UI Server, add the following to your cloudcms-distribution-context.xml file:

<bean id="cloudcmsUIServerApplicationDeployer" class="org.gitana.platform.services.application.deployment.CloudCMSApplicationDeployer" parent="abstractApplicationDeployer">
    <property name="type"><value>${gitana.default.application.deployer.uiserver.type}</value></property>
    <property name="deploymentURL"><value>${gitana.default.application.deployer.uiserver.deploymentURL}</value></property>
    <property name="domain"><value>${gitana.default.application.deployer.uiserver.domain}</value></property>
    <property name="baseURL"><value>${gitana.default.application.deployer.uiserver.baseURL}</value></property>
    <property name="notificationsEnabled"><value>${gitana.default.application.deployer.uiserver.notifications.enabled}</value></property>
    <property name="notificationsProviderType"><value>${gitana.default.application.deployer.uiserver.notifications.providerType}</value></property>
    <property name="notificationsProviderConfiguration">
        <map>
            <entry key="accessKey"><value>${gitana.default.application.deployer.uiserver.notifications.configuration.accessKey}</value></entry>
            <entry key="secretKey"><value>${gitana.default.application.deployer.uiserver.notifications.configuration.secretKey}</value></entry>
            <entry key="region"><value>${gitana.default.application.deployer.uiserver.notifications.configuration.region}</value></entry>
        </map>
    </property>
    <property name="notificationsTopic"><value>${gitana.default.application.deployer.uiserver.notifications.topic}</value></property>
</bean>

Then add the following to docker.properties to override the Amazon SNS settings:

gitana.default.application.deployer.uiserver.notifications.enabled=true
gitana.default.application.deployer.uiserver.notifications.providerType=sns
gitana.default.application.deployer.uiserver.notifications.topic=arn:aws:sns:us-east-1:accountId:queueName
gitana.default.application.deployer.uiserver.notifications.configuration.accessKey=
gitana.default.application.deployer.uiserver.notifications.configuration.secretKey=
gitana.default.application.deployer.uiserver.notifications.configuration.region=us-east-1
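
Since the topic ARN embeds a region, it is worth checking that it agrees with the configured region value. The Python sketch below does this using the general ARN layout (arn:partition:service:region:account-id:resource). parse_sns_topic_arn is a hypothetical helper, the account ID and topic name are placeholders, and whether Cloud CMS itself validates this is not stated in this guide.

```python
def parse_sns_topic_arn(arn: str) -> dict[str, str]:
    """Split an SNS topic ARN into its named components."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "sns":
        raise ValueError(f"not an SNS topic ARN: {arn}")
    return {
        "partition": parts[1],
        "region": parts[3],
        "account": parts[4],
        "topic": parts[5],
    }


# Placeholder account ID and topic name.
topic = parse_sns_topic_arn("arn:aws:sns:us-east-1:123456789012:cloudcms-ui")
configured_region = "us-east-1"  # the ...notifications.configuration.region value
print(topic["region"] == configured_region)  # True
```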

Multifactor Authentication (MFA)

Cloud CMS lets you set up Multifactor Authentication (MFA) to provide better security for your users. With Multifactor Authentication configured, your users will have the option to receive a verification code on their mobile devices that is required for login. This helps ensure that an attacker who obtains your users' account credentials cannot use those credentials alone to compromise your system.

Cloud CMS lets you configure Multifactor Authentication at runtime using a Service Descriptor from within your tenant. However, if you're setting up Cloud CMS via Docker, you also have the option to adjust default MFA bindings as well as configure how the admin user and backdoor passwords work with MFA accounts.

Define System Authenticators

For each provider type (such as Authy or Duo), there will typically be a Registrar object that you can use to quickly register a new system-defined Authenticator of that type.

For example, you might define an Authy authenticator with ID test like this:

<bean class="org.gitana.platform.services.authenticator.authy.AuthyAuthenticatorRegistrar">
    <property name="id"><value>test</value></property>
    <property name="apiKey"><value>API_KEY</value></property>
    <property name="apiUrl"><value>API_URL</value></property>
</bean>

Or a Duo authenticator with ID test like this:

<bean class="org.gitana.platform.services.authenticator.duo.DuoAuthenticatorRegistrar">
    <property name="id"><value>test</value></property>
    <property name="integrationKey"><value>INTEGRATION_KEY</value></property>
    <property name="secretKey"><value>SECRET_KEY</value></property>
    <property name="apiHost"><value>API_HOST</value></property>
</bean>

These authenticator instances must then be bound to principals or behaviors as described below.

Configure MFA for Admin Account

You can lock down the admin account so that attempts to log in as admin will require multifactor authentication. To do so, use the BindAdminUserSystemAuthenticator bean.

Here is an example where we bind the admin account to a specific predefined Authy user with a given Authy ID.

<bean class="org.gitana.platform.services.authenticator.BindAdminUserSystemAuthenticator">
    <property name="bindingProperties">
        <bean class="org.gitana.platform.services.authenticator.authy.AuthyAuthenticatorBindingPropertiesBeanFactory">
            <property name="authyId"><value>12345678</value></property>
        </bean>
    </property>
    <property name="descriptor">
        <bean class="org.gitana.platform.services.authenticator.authy.AuthyAuthenticatorDescriptorBeanFactory">
            <property name="id"><value>test</value></property>
        </bean>
    </property>
</bean>

Note that this makes use of the AuthyAuthenticatorBindingPropertiesBeanFactory and AuthyAuthenticatorDescriptorBeanFactory classes to make configuration easier.

And here is an example using the Duo provider:

<bean class="org.gitana.platform.services.authenticator.BindAdminUserSystemAuthenticator">
    <property name="bindingProperties">
        <bean class="org.gitana.platform.services.authenticator.duo.DuoAuthenticatorBindingPropertiesBeanFactory">
            <property name="userId"><value>DUO_USER_ID</value></property>
            <property name="username"><value>DUO_USER_NAME</value></property>
        </bean>
    </property>
    <property name="descriptor">
        <bean class="org.gitana.platform.services.authenticator.duo.DuoAuthenticatorDescriptorBeanFactory">
            <property name="id"><value>test</value></property>
        </bean>
    </property>
</bean>

This uses the DuoAuthenticatorBindingPropertiesBeanFactory and DuoAuthenticatorDescriptorBeanFactory classes to make configuration easier.

Configure MFA for a Principal

You can also bind an authenticator into place at the system level for any arbitrary principal provided that you know that principal's Domain ID and Principal ID.

Use the BindPrincipalSystemAuthenticator bean to achieve this.

Here is an example:

<bean class="org.gitana.platform.services.authenticator.BindPrincipalSystemAuthenticator">
    <property name="bindingProperties">
        <bean class="org.gitana.platform.services.authenticator.authy.AuthyAuthenticatorBindingPropertiesBeanFactory">
            <property name="authyId"><value>12345678</value></property>
        </bean>
    </property>
    <property name="descriptor">
        <bean class="org.gitana.platform.services.authenticator.authy.AuthyAuthenticatorDescriptorBeanFactory">
            <property name="id"><value>test</value></property>
        </bean>
    </property>
</bean>

Configure MFA for Backdoor Password

You can also bind an authenticator in place for any user who attempts to log in using the backdoor password.

Use the BindBackdoorSystemAuthenticator bean to achieve this.

Here is an example:

<bean class="org.gitana.platform.services.authenticator.BindBackdoorSystemAuthenticator">
    <property name="bindingProperties">
        <bean class="org.gitana.platform.services.authenticator.authy.AuthyAuthenticatorBindingPropertiesBeanFactory">
            <property name="authyId"><value>12345678</value></property>
        </bean>
    </property>
    <property name="descriptor">
        <bean class="org.gitana.platform.services.authenticator.authy.AuthyAuthenticatorDescriptorBeanFactory">
            <property name="id"><value>test</value></property>
        </bean>
    </property>
</bean>

Setting up SSL

The Cloud CMS API supports connecting via SSL to MongoDB, Elastic Search and any custom HTTPS endpoints (for web hooks or other purposes).

SSL Handshake

When a connection is made to an SSL endpoint (such as an HTTPS URL), the API needs to obtain the SSL Certificate that it will use to encrypt any data that it sends to that endpoint. To do so, the API asks the remote server for its SSL Certificate.

If the remote server does not hand back an SSL Certificate, then you'll see something like this:

javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure

This means that the remote server either isn't speaking SSL (HTTPS) or the remote server is somehow misconfigured. Check your ports and URLs to make sure you're configured to hit an SSL endpoint.

Once the API gets back an SSL Certificate, it has to validate that the certificate is legitimate. It does so by checking the certificate against the Certificate Authorities (CAs) that it knows about, which are held in its Trust Store. Specifically, the certificate must have been issued (signed) by a Certificate Authority that the API trusts; this is what asserts that the certificate is real and valid.

The Cloud CMS API comes preinstalled with support for a wide range of popular Certificate Authorities (CAs).

Once the Cloud CMS API has validated the SSL Certificate, it will proceed to use it to send encrypted data over the wire.

However, if none of the known Certificate Authorities can validate the SSL certificate, then the API will not be able to communicate over SSL to the remote server. In this case, you'll get an error that looks like this:

Exception in thread "main" javax.net.ssl.SSLHandshakeException: 
sun.security.validator.ValidatorException: 
PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: 
unable to find valid certification path to requested target

If the remote server is using a self-signed SSL Certificate (meaning that it was generated locally) or if you're using an SSL Certificate that was signed by a CA that Cloud CMS doesn't know about, then you will need to register the SSL Certificate with the Cloud CMS API Trust Store.
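The trust decision described above can be reproduced locally with the openssl command-line tool. The following is a minimal sketch and not part of Cloud CMS: it generates a throwaway self-signed certificate (the example.test name and the temporary file paths are assumptions chosen for illustration), checks it against the default CA set, and then checks it again with the certificate itself supplied as a trusted CA. The Java Trust Store used by the API behaves analogously once a certificate has been imported.

```shell
#!/usr/bin/env bash
# Sketch: why a self-signed certificate is rejected until it is explicitly trusted.
# The subject "example.test" and the temporary directory are illustrative assumptions.

workdir="$(mktemp -d)"
cert="$workdir/example-cert.crt"
key="$workdir/example-cert.key"

# Create a throwaway self-signed certificate (no interactive prompts).
openssl req -newkey rsa:2048 -new -x509 -days 1 -nodes \
    -subj "/CN=example.test" -out "$cert" -keyout "$key" 2>/dev/null

# 1. Validate against the system's default CAs only. No known CA issued
#    this certificate, so validation is expected to fail.
if openssl verify "$cert" >/dev/null 2>&1; then
    default_result="trusted"
else
    default_result="rejected"
fi

# 2. Validate again with the certificate itself supplied as a trusted CA,
#    which is analogous to importing it into a trust store.
if openssl verify -CAfile "$cert" "$cert" >/dev/null 2>&1; then
    explicit_result="trusted"
else
    explicit_result="rejected"
fi

echo "default CAs only:     $default_result"
echo "certificate imported: $explicit_result"
```

On a recent OpenSSL release, the first check should report the certificate as rejected and the second as trusted, which mirrors the PKIX error above disappearing once the certificate is registered.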

We provide examples below for how to do this for MongoDB, Elastic Search and an arbitrary custom HTTPS endpoint (useful for web hook endpoints).

Configuring SSL for MongoDB

Let's start by looking at MongoDB. We'll walk through the following steps:

  1. Generate an SSL Certificate
  2. Configure MongoDB for SSL
  3. Register the SSL Certificate with the Cloud CMS API
  4. Configure the Cloud CMS API to connect to MongoDB via SSL

After these steps are complete, Cloud CMS will connect properly to MongoDB over SSL.

In addition to this guide, we recommend reading MongoDB's official documentation on SSL, which goes into much greater depth than we do here. The steps below should give you the basic idea.

These instructions assume a Linux environment and assume you're using the Cloud CMS quickstart kit.

Generate an SSL Certificate

This step is only necessary if you do not already have an SSL certificate.
If you already have one, you can skip this step.

To generate a certificate, open a shell window and run the following:

  1. Generate your self-signed SSL certificate
openssl req -newkey rsa:2048 -new -x509 -days 3650 -nodes -out mongodb-cert.crt -keyout mongodb-cert.key
  2. Create a PEM file
cat mongodb-cert.key mongodb-cert.crt > mongodb.pem
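The openssl req command above prompts interactively for the certificate subject. As a sketch, assuming you want the certificate's Common Name to match the MongoDB service name (mongodb in the quickstart kit), the subject can instead be passed on the command line with -subj and the result inspected with openssl x509. The CERT_CN variable is introduced here for illustration only.

```shell
#!/usr/bin/env bash
set -eu

# Assumption: the Common Name should match the hostname the API uses to
# reach MongoDB ("mongodb" in the quickstart docker-compose.yml).
CERT_CN="mongodb"

# Generate the self-signed certificate without interactive prompts.
openssl req -newkey rsa:2048 -new -x509 -days 3650 -nodes \
    -subj "/CN=${CERT_CN}" \
    -out mongodb-cert.crt -keyout mongodb-cert.key

# Bundle the key and certificate into the PEM file used by MongoDB.
cat mongodb-cert.key mongodb-cert.crt > mongodb.pem

# Print the subject and validity period to confirm what was generated.
openssl x509 -in mongodb-cert.crt -noout -subject -dates
```

The last command prints the subject and the notBefore / notAfter dates, which is a quick way to confirm the hostname and the ten-year validity period before copying the files into place.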

Configure MongoDB for SSL

To configure MongoDB to use SSL:

  1. Copy mongodb-cert.crt and mongodb.pem next to your MongoDB Dockerfile.

  2. Append the following to the MongoDB Dockerfile:

ADD mongodb-cert.crt /etc/ssl/mongodb-cert.crt
ADD mongodb.pem /etc/ssl/mongodb.pem

RUN chmod +777 /etc/ssl/mongodb-cert.crt
RUN chmod +777 /etc/ssl/mongodb.pem

  3. Append the following to the MongoDB mongod.conf:

net:
    ssl:
        mode: requireSSL
        allowConnectionsWithoutCertificates: true
        PEMKeyFile: /etc/ssl/mongodb.pem
        CAFile: /etc/ssl/mongodb-cert.crt

Note that allowConnectionsWithoutCertificates is set here so that we don't have to set up mutual TLS authentication.

You should then rebuild your MongoDB container. When the container comes back up, you should see:

waiting for connections on port 27017 ssl

Register the SSL Certificate with the Cloud CMS API

To configure the Cloud CMS API to use SSL to connect to MongoDB:

  1. Copy mongodb-cert.crt next to your API Dockerfile.

  2. Append the following to the API Dockerfile:

# merge our certificate into the truststore
ADD mongodb-cert.crt /etc/ssl/mongodb-cert.crt
RUN \
    cd $JAVA_HOME/lib/security \
    && keytool -keystore cacerts -storepass changeit -noprompt -trustcacerts -alias mongodb -import -file /etc/ssl/mongodb-cert.crt

Note: The sample above is for 3.2.37 and above. If you're using a version of Cloud CMS prior to this, adjust $JAVA_HOME/lib/security to $JAVA_HOME/lib/jre/security.

Configure the Cloud CMS API to connect to MongoDB via SSL

Adjust your docker.properties to tell the API to use SSL:

mongodb.default.ssl.enabled=true

You should then rebuild and restart your API container.
The container should come up without any issue.

Troubleshooting

If the API starts up and you see an error like this:

java.security.cert.CertificateException: No name matching mongodb found

It means that MongoDB is configured with an SSL Certificate whose hostname doesn't match the hostname that the API is connecting to. For the quickstart Docker kit that Cloud CMS ships, the MongoDB hostname is mongodb. This is the service name within docker-compose.yml.

In Cloud CMS version 3.2.31 and beyond, hostname checking is disabled by default. This means that you shouldn't see the error above in more recent versions of Cloud CMS.

For security reasons, you may wish to turn hostname checking back on. You can do so by adding the following to docker.properties:

mongodb.default.ssl.invalidHostNameAllowed=false
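Before enabling hostname checking, it can help to confirm that the certificate actually names the host the API connects to. The sketch below is a hypothetical helper, not a Cloud CMS tool: cert_matches_host is an illustrative function name, and it relies on the -checkhost option of openssl x509 found in recent OpenSSL releases. It approximates, but is not guaranteed to be identical to, the check performed by the Java client.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the certificate file names the given host.
# Usage: cert_matches_host CERT_FILE HOSTNAME
cert_matches_host() {
    local cert_file="$1" host="$2"
    # openssl prints "Hostname <host> does match certificate" on a match and
    # "Hostname <host> does NOT match certificate" otherwise.
    openssl x509 -in "$cert_file" -noout -checkhost "$host" 2>&1 \
        | grep -q "does match certificate"
}

# Example: check the MongoDB certificate against the quickstart service name.
if [ -f mongodb-cert.crt ]; then
    if cert_matches_host mongodb-cert.crt mongodb; then
        echo "mongodb-cert.crt matches host 'mongodb'"
    else
        echo "mongodb-cert.crt does NOT match host 'mongodb'"
    fi
fi
```

If the certificate does not match, regenerate it with a subject that names the MongoDB host, or leave hostname checking disabled.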

For more information, see https://docs.mongodb.com/manual/tutorial/configure-ssl.

Configuring SSL for Elastic Search

Now let's look at Elastic Search.

To take advantage of SSL / TLS in Elastic Search, you will need to install the Enterprise version of their offering. SSL support is not available, by default, in the open source (OSS) version.

Further, the Cloud CMS sample docker distribution kits use the open source (OSS) version of Elastic Search. As such, to configure Elastic Search for SSL, you will need to make sure your Elastic Search containers are the Enterprise version and not the OSS version.

Let's walk through this. We'll do the following:

  1. Generate an SSL Certificate
  2. Configure Elastic Search for SSL
  3. Register the SSL Certificate with the Cloud CMS API
  4. Configure the Cloud CMS API to connect to Elastic Search via SSL

After these steps are complete, Cloud CMS will connect properly to Elastic Search over SSL.

In addition to this guide, we recommend reading Elastic Search's official documentation on SSL, which goes into much greater depth than we do here. The steps below should give you the basic idea.

https://www.elastic.co/guide/en/elasticsearch/reference/current/configuring-tls-docker.html

These instructions assume a Linux environment and assume you're using the Cloud CMS quickstart kit.

Generate a Certificate

This step is only necessary if you do not already have an SSL certificate.
If you already have one, you can skip this step.

To generate a certificate, open a shell window and run the following:

  1. Generate your self-signed SSL certificate
openssl req -newkey rsa:2048 -new -x509 -days 3650 -nodes -out elasticsearch-cert.crt -keyout elasticsearch-cert.key
  2. Create a PEM file
cat elasticsearch-cert.key elasticsearch-cert.crt > elasticsearch.pem

Configure Elastic Search for SSL

In order to use SSL, we must convert the Docker Compose elasticsearch service to use the Enterprise version.

If you already have Elastic Search Enterprise, you may opt to configure this to your heart's content. However, for those who do not, we provide here a few drop-in replacement snippets that will get you up and running straight away using the Trial version of Elastic Search Enterprise.

  1. Replace the FROM line in the Elastic Search Dockerfile with the following:
FROM docker.elastic.co/elasticsearch/elasticsearch:6.7.0
  2. Append the following to the Elastic Search Dockerfile:
ADD elasticsearch-cert.crt /usr/share/elasticsearch/config/certificates/elasticsearch-cert.crt
ADD elasticsearch-cert.key /usr/share/elasticsearch/config/certificates/elasticsearch-cert.key
ADD elasticsearch.pem /usr/share/elasticsearch/config/certificates/elasticsearch.pem
  3. Add these lines to the elasticsearch.env file:
ELASTIC_PASSWORD=password
xpack.license.self_generated.type=trial
xpack.security.enabled=true
xpack.security.http.ssl.enabled=true
xpack.security.transport.ssl.enabled=true
xpack.security.transport.ssl.verification_mode=certificate
xpack.ssl.certificate=certificates/elasticsearch-cert.crt
xpack.ssl.key=certificates/elasticsearch-cert.key
#xpack.ssl.certificate_authorities=certificates/ca/ca.crt

Here, password can be set to any password you like. A password is required because the Enterprise Trial version of Elastic Search sets up a default user with the name elastic.

You should then rebuild and restart your Elastic Search container.

When the container comes back up, you should be able to run:

curl -k -u elastic:password https://localhost:9200/

Note that -u elastic:password should have the password value substituted with whatever you chose above.

Register the SSL Certificate with the Cloud CMS API

To configure the Cloud CMS API to use SSL to connect to Elastic Search:

  1. Copy elasticsearch-cert.crt next to your API Dockerfile.

  2. Append the following to the API Dockerfile:

# merge our certificate into the truststore
ADD elasticsearch-cert.crt /etc/ssl/elasticsearch-cert.crt
RUN \
    cd $JAVA_HOME/lib/security \
    && keytool -keystore cacerts -storepass changeit -noprompt -trustcacerts -alias elasticsearch -import -file /etc/ssl/elasticsearch-cert.crt

Note: The sample above is for 3.2.37 and above. If you're using a version of Cloud CMS prior to this, adjust $JAVA_HOME/lib/security to $JAVA_HOME/lib/jre/security.

Configure the Cloud CMS API to connect to Elastic Search via SSL

Adjust your docker.properties to tell the API to use SSL:

# turn on elastic search SSL
elasticsearch.remote.defaultProviderType=http
elasticsearch.remote.ssl=true
elasticsearch.remote.defaultPort=9200
elasticsearch.remote.hosts=elasticsearch:9200
elasticsearch.remote.username=elastic
elasticsearch.remote.password=password

Note that in the above, we also switch to the Elastic Search http client and now use port 9200.

We also provide a username and a password. This is because the Enterprise version of Elastic Search comes with XPack and we turn security on. This generates a user and assigns a password (as per the Elastic Search configuration above).

You should then rebuild your API container.
The container should come up without any issue.

Troubleshooting

By default, the Elastic Search HTTP Client within Cloud CMS is configured so that Hostname Verification is disabled.

For security reasons, you may wish to adjust this behavior. You can do so by adjusting docker.properties to provide the following:

elasticsearch.remote.invalidHostNameAllowed=false

For more information, see https://www.elastic.co/guide/en/elasticsearch/reference/current/configuring-tls-docker.html

Configuring SSL for Custom HTTPS Endpoints

These instructions assume a Linux environment and assume you're using the Cloud CMS quickstart kit.

Generate a Certificate

To generate a certificate, open a shell window and run the following:

  1. Generate your self-signed SSL certificate
openssl req -newkey rsa:2048 -new -x509 -days 3650 -nodes -out custom-cert.crt -keyout custom-cert.key
  2. Create a PEM file
cat custom-cert.key custom-cert.crt > custom.pem

Register the SSL Certificate with the Cloud CMS API

To configure the Cloud CMS API to use SSL to connect to your custom HTTPS endpoint:

  1. Copy custom-cert.crt next to your API Dockerfile.

  2. Append the following to the API Dockerfile:

# merge our certificate into the truststore
ADD custom-cert.crt /etc/ssl/custom-cert.crt
RUN \
    cd $JAVA_HOME/lib/security \
    && keytool -keystore cacerts -storepass changeit -noprompt -trustcacerts -alias custom -import -file /etc/ssl/custom-cert.crt

Note: The sample above is for 3.2.37 and above. If you're using a version of Cloud CMS prior to this, adjust $JAVA_HOME/lib/security to $JAVA_HOME/lib/jre/security.

You should then rebuild and restart your API container.
The container should come up without any issue.

Inspecting SSL Certificates

To inspect the SSL certificates registered in the API service's keystore (using Docker Compose), first open a bash shell in the API container like this:

docker-compose exec api bash

And then list the certificates in the keystore like this:

cd $JAVA_HOME/lib/security
keytool -keystore cacerts -storepass changeit -list -v

Note: The sample above is for 3.2.37 and above. If you're using a version of Cloud CMS prior to this, adjust $JAVA_HOME/lib/security to $JAVA_HOME/lib/jre/security.
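To check for specific certificates rather than scanning the full listing, keytool can look up a single alias. The sketch below assumes it is run inside the API container, where keytool is on the PATH and JAVA_HOME is set. The alias_registered function is a hypothetical helper name, and the aliases are the ones used in the import examples in this document.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when ALIAS exists in the given trust store.
# Usage: alias_registered ALIAS [KEYSTORE_PATH]
alias_registered() {
    local alias_name="$1"
    # Default path matches 3.2.37 and above; pass a second argument to override.
    local keystore="${2:-${JAVA_HOME:-}/lib/security/cacerts}"
    keytool -list -keystore "$keystore" -storepass changeit \
        -alias "$alias_name" >/dev/null 2>&1
}

# Report on the aliases used in the import examples.
for name in mongodb elasticsearch custom; do
    if alias_registered "$name"; then
        echo "$name: registered"
    else
        echo "$name: not found"
    fi
done
```

An alias reported as not found usually means the corresponding import step was skipped or the container was not rebuilt afterwards.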

Troubleshooting

This section provides various troubleshooting scenarios. These aren't in any particular order but are provided to help others with suggestions on how to diagnose typical problems that can arise.

Reset the Virtual Driver

Cloud CMS maintains a "Virtual Driver User" that the UI Server uses to support virtual hosting. The Cloud CMS UI (User Interface) connects as this Virtual Driver User to fetch API credentials for one or more tenants that try to log in. In this way, Cloud CMS supports multiple tenants on the same user interface box. Each tenant has their own API credentials and these credentials are fetched, dynamically, using the Virtual Driver User.

The credentials for the Virtual Driver User must match between the API and the UI. In other words, the credentials specified in your docker.properties file for the API must match those specified in the ui.env file for the UI.

In the docker.properties file, these are specified like this:

cloudcmsnet.virtualdriver.clientKey=
cloudcmsnet.virtualdriver.clientSecret=
cloudcmsnet.virtualdriver.authGrantKey=
cloudcmsnet.virtualdriver.authGrantSecret=
cloudcmsnet.virtualdriver.username=
cloudcmsnet.virtualdriver.password=

In the ui.env file, these are specified like this:

CLOUDCMS_VIRTUAL_DRIVER_CLIENT_KEY=<value of cloudcmsnet.virtualdriver.clientKey>
CLOUDCMS_VIRTUAL_DRIVER_CLIENT_SECRET=<value of cloudcmsnet.virtualdriver.clientSecret>
CLOUDCMS_VIRTUAL_DRIVER_AUTHGRANT_KEY=<value of cloudcmsnet.virtualdriver.authGrantKey>
CLOUDCMS_VIRTUAL_DRIVER_AUTHGRANT_SECRET=<value of cloudcmsnet.virtualdriver.authGrantSecret>

These values can be changed at any time.
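Because the four shared values must agree between the two files, a small script can compare them without printing any secrets. This is a minimal sketch under stated assumptions: read_value and check_virtual_driver are hypothetical helper names, both files are assumed to hold plain KEY=value lines, and ${VARIABLE} placeholders in docker.properties are compared literally rather than resolved.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of KEY from a KEY=value file.
read_value() {
    local file="$1" key="$2"
    awk -F= -v k="$key" '$1 == k { sub(/^[^=]*=/, ""); print; exit }' "$file"
}

# Compare the four shared virtual driver settings without echoing their values.
# Usage: check_virtual_driver DOCKER_PROPERTIES UI_ENV
check_virtual_driver() {
    local props="$1" env_file="$2" status=0 pair prop_key env_key
    for pair in clientKey:CLIENT_KEY clientSecret:CLIENT_SECRET \
                authGrantKey:AUTHGRANT_KEY authGrantSecret:AUTHGRANT_SECRET; do
        prop_key="cloudcmsnet.virtualdriver.${pair%%:*}"
        env_key="CLOUDCMS_VIRTUAL_DRIVER_${pair##*:}"
        if [ "$(read_value "$props" "$prop_key")" = "$(read_value "$env_file" "$env_key")" ]; then
            echo "OK       $prop_key"
        else
            echo "MISMATCH $prop_key / $env_key"
            status=1
        fi
    done
    return "$status"
}

# Example invocation (file paths are assumptions):
# check_virtual_driver docker.properties ui.env
```

The function returns a non-zero status when any pair differs, so it can be used as a pre-flight check before restarting the API and UI containers.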

If you change these values, or if your virtual driver user isn't connecting or is somehow preventing the API from starting up, you may want to reset the API so that it bootstraps the values from api.env anew.

To do this, you can either launch the Cloud CMS API with the following Java parameter:

-DresetVirtualDriver=true

Or you can set an environment variable (using api.env):

CLOUDCMS_RESET_VIRTUAL_DRIVER=true

When your API starts up, it will remove any old virtual driver client and authentication grant information and create it anew using the values from your api.env file.