How to troubleshoot Mule cluster communication issues on hybrid or on-prem environments


GOAL

To troubleshoot communication issues between your Mule cluster nodes in an on-prem or hybrid environment.
 

PROCEDURE

The following ordered steps are recommended for any cluster communication issue scenario:
 

1) Enable cluster verbose mode

Please follow the instructions from here: How to enable cluster verbose logging in Mule Runtime 
 

2) Identify the cluster configuration file

Locate the $MULE_HOME/.mule/mule-cluster.properties file on each server. Note that the ".mule" folder may be hidden in some environments.
This file is created automatically when the cluster is created from Runtime Manager (hybrid environments), and must be created manually in on-prem (non-hybrid) environments.
If the file is missing:
  • Verify if any non-Mule related software is deleting it
  • Consider restoring a backup
  • Or create it manually. For more information, please read Create and Manage a Cluster Manually documentation
Note: it is normal to see the "mule.clusterSize=0" property in clusters created from Runtime Manager; this is not an issue.
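
For reference, a mule-cluster.properties file for a two-node unicast cluster typically looks like the sketch below (the cluster ID, node ID, and IP addresses are illustrative; see the Create and Manage a Cluster Manually documentation for the full list of supported properties):

# illustrative values only
mule.clusterId=my-cluster
mule.clusterNodeId=1
mule.cluster.nodes=192.168.0.50,192.168.0.199
mule.cluster.multicastenabled=false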
 

3) Please check that the cluster configuration file is not corrupted

The mule-cluster.properties file may get corrupted, so please validate the following (example commands are shown after this list):
  • The file can be opened and read
  • The file permissions are correct (e.g. 644 on Linux)
  • There are no hidden or control characters
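
As a quick sanity check on Linux, assuming $MULE_HOME is set for your installation, commands along these lines can be used:

ls -l $MULE_HOME/.mule/mule-cluster.properties    # permissions should be readable by the Mule user (e.g. 644)
cat -A $MULE_HOME/.mule/mule-cluster.properties   # non-printing/control characters show up as ^M, ^@, etc.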
 

4) Validate if the wrapper.conf is overriding the cluster configuration file

The properties in mule-cluster.properties can be overridden through the $MULE_HOME/conf/wrapper.conf file, so confirm which values are actually in effect on each node. More info: How to override cluster properties
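
As an illustration of the kind of entry to look for in wrapper.conf (the index 20 and the addresses are hypothetical), a cluster property passed as a Java system property would look like this:

wrapper.java.additional.20=-Dmule.cluster.nodes=192.168.0.50,192.168.0.199

If entries like this exist, compare them against the values in mule-cluster.properties to determine which configuration is actually in use.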
 

5) Check cluster members on every node

In the cluster.log file (or in the mule_ee.log file if you are using runtimes 3.7.x through 4.1.x), each cluster node should print the list of members every few minutes. For example:
Members [2] {
    Member [192.168.0.50]:5701 this
    Member [192.168.0.199]:5701
}
If there is a communication issue, the logged list of members will be incomplete, for example:
Members [1] {
    Member [192.168.0.50]:5701 this
}
In the above example, you may need to troubleshoot communication issues from 192.168.0.50 to 192.168.0.199.

In the rare scenario where the list of members does not include the current node, please check: Error com.hazelcast.partition.NoDataMemberInClusterException on Mule 4 clusters

If a communication issue is confirmed, also check the Mule EE log ($MULE_HOME/logs/mule_ee.log) on each node, since a node restart or unresponsiveness could be triggering the communication issue.
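
As a quick way to review the member-list snapshots over time (assuming the default log locations; on runtimes 3.7.x through 4.1.x use mule_ee.log instead), something like the following can be run on each node:

grep -A 3 "Members \[" $MULE_HOME/logs/cluster.log | tail -12    # most recent member lists and their entries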
 

6) Recognize if you have a unicast or multicast cluster configuration

In hybrid environments, you may check if it's UNICAST or MULTICAST from the Runtime Manager UI in Anypoint Platform, but you may also know this by checking the mule-cluster.properties file for the mule.cluster.multicastenabled property. 
 
  • mule.cluster.multicastenabled=false → UNICAST
  • mule.cluster.multicastenabled=true → MULTICAST
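
For example, the value in use on each node can be confirmed with:

grep multicastenabled $MULE_HOME/.mule/mule-cluster.properties $MULE_HOME/conf/wrapper.conf    # check both files in case of overrides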
 


7.A) If you have a unicast cluster

Check the mule-cluster.properties file for the mule.cluster.nodes property to identify the IP addresses of the cluster members.
That property may also specify the ports; otherwise, the default ports are 5701, 5702, and 5703.

Please check that every node has open communication to the other nodes using the destination IP address and ports. You may use the following article as guidance on how to do networking tests: Network connectivity testing
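
As a minimal sketch of such a test (assuming netcat is available and the peer node is 192.168.0.199 listening on the default ports), run from each node:

nc -zv 192.168.0.199 5701    # repeat for ports 5702 and 5703, and in both directions between the nodes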

More info: Mule Runtime High Availability (HA) Cluster Overview - Prerequisites
 

7.B) If you have a multicast cluster

For multicast environments, please perform the following checks:
  • The same checks as for a unicast cluster
  • Also, validate that the multicast IP address 224.2.2.3 is available
  • Finally, check that the UDP port 54327 is open
If you have more than one multicast cluster in the same host, please check the mule.cluster.multicastgroup property in mule-cluster.properties or wrapper.conf file, since it may contain additional IP addresses to validate.

To verify multicast IP addresses, please check: How To Check If Multicast Works On The Network Hosting Mule Cluster, or record a tcpdump on one node while sending pings to the multicast IP address from another node. Example:

In node 1, considering that "eth0" is the network interface:
tcpdump -ni eth0 host 224.2.2.3
In node 2, send pings to 224.2.2.3 and the tcpdump should record each ping.
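
A minimal ping from node 2 could look like the following (hypothetical packet count; on some systems the outgoing interface must be specified for multicast traffic):

ping -c 5 224.2.2.3    # each packet should appear in the tcpdump output on node 1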
 

8) Your issue could be related to a known and already fixed defect...

So please consider updating your runtime to the latest patch of a standard support version. You may read Product Versioning and Back Support Policy to identify the runtimes under standard or extended support, and download the latest monthly patch from the MuleSoft Help Center.
 

9) If the issue remains...

Please contact MuleSoft Support and provide the following from each node (an example packaging command is shown after this list):
  • Mule EE logs
  • Cluster logs 
  • Cluster configuration file 
  • Wrapper configuration file
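
As an illustration (paths assume default locations and a node named "node1"), the following could be run on each node to package those files for the support ticket:

tar czf node1-cluster-diagnostics.tar.gz \
    $MULE_HOME/logs/mule_ee.log \
    $MULE_HOME/logs/cluster.log \
    $MULE_HOME/.mule/mule-cluster.properties \
    $MULE_HOME/conf/wrapper.conf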