Why do I need at least three nodes for a cluster?

A node needs a way to determine whether it is still a member of the cluster when another node becomes unreachable. If a node can communicate with a majority of the nodes, it knows that it is still a member of the cluster. If the cluster started as a two-node cluster and the connection between the nodes fails, neither node can reach a majority, so the nodes cannot decide which of them forms the cluster. Both nodes will therefore refuse to accept incoming connections. It is advised to use an odd number of nodes so that, in case of a network split, one side can still form a majority.
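The majority rule can be illustrated with a few lines of Python. This is only a sketch of the arithmetic, not CipherMail's actual implementation; the function name has_quorum is made up for this example, and it assumes that "majority" means strictly more than half of the configured nodes, counting the node itself.

```python
def has_quorum(reachable_nodes: int, total_nodes: int) -> bool:
    """Return True if reachable_nodes is a strict majority of total_nodes.

    reachable_nodes includes the node doing the check.
    This is a hypothetical helper for illustration only.
    """
    return 2 * reachable_nodes > total_nodes


# Two-node cluster, link between the nodes is down:
# each node only sees itself (1 out of 2), so neither side has a majority.
print(has_quorum(1, 2))

# Three-node cluster split into 2 + 1:
# the side with two nodes keeps the majority, the single node does not.
print(has_quorum(2, 3))
print(has_quorum(1, 3))
```

With an even number of nodes, a split into two equal halves leaves no side with a majority, which is why an odd number of nodes is advised.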

Can a node run in a different data center than the other nodes?

Yes, that is possible. Just make sure that the connection between the data centers is reliable. Whether two data centers are enough depends on the level of high availability (HA) you want to achieve. Suppose you have two nodes running in data center A and one node in data center B (you need an odd number of nodes). If the connection between data centers A and B fails, the two nodes in data center A have the majority (2 out of 3) and continue to function. The node in data center B, however, does not have the majority and therefore stops accepting connections. If data center A is completely destroyed (for example by a fire), the nodes in data center A no longer work. The node in data center B still does not have the majority (it only has 1 out of 3) and therefore stops accepting connections until it is manually bootstrapped. For the highest level of HA, use an odd number of data centers and at least 5 nodes.
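The scenarios above can be checked with a small Python sketch. It is an illustration only: the function survives_loss and the node layouts are assumptions made for this example, and the five-node layout (2 + 2 + 1 across three data centers) is just one possible way to read the advice above. The sketch assumes that a cluster keeps running only if the remaining nodes form a strict majority of all configured nodes.

```python
def survives_loss(layout: dict, lost_dc: str) -> bool:
    """Return True if the nodes outside lost_dc still form a strict majority.

    layout maps a data center name to the number of nodes it hosts.
    Hypothetical helper for illustration only.
    """
    total = sum(layout.values())
    remaining = total - layout[lost_dc]
    return 2 * remaining > total


# The layout from the answer above: two nodes in A, one node in B.
two_dcs = {"A": 2, "B": 1}

# An assumed five-node layout spread over three data centers.
three_dcs = {"A": 2, "B": 2, "C": 1}

for name, layout in (("two data centers", two_dcs),
                     ("three data centers", three_dcs)):
    for dc in layout:
        status = "keeps" if survives_loss(layout, dc) else "loses"
        print(f"{name}: losing {dc} -> cluster {status} the majority")
```

In the two data center layout, losing data center A leaves the remaining node without a majority. In the assumed three data center layout, the loss of any single data center still leaves at least 3 out of 5 nodes.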

If two nodes are running in data center A and one node in data center B and the connection between the data centers fails, can I make sure that CipherMail stays functional in both data centers?

If the connection between the data centers fails, the two nodes in data center A keep functioning because they have the majority (2 out of 3). The node in data center B stops accepting incoming connections. It is possible to force a bootstrap of the node in data center B. This results in two separate clusters, one in data center A and one in data center B. The two clusters are, however, not synchronized (a split-brain situation). Any change made on a gateway in either data center will cause the two systems to diverge. When the full cluster is later restored, some changes might be overwritten (depending on which node is bootstrapped). It is therefore advised to do this only when restoring the connection between the data centers will take a long time, or when there is some other issue that might take a long time to solve (for example, a fire destroyed data center A).