GOAL
To troubleshoot communication issues between your Mule cluster nodes in an on-prem or hybrid environment.
PROCEDURE
The following is the ordered list of steps that we recommend performing in any communication issue scenario:
1) Enable cluster verbose mode
Please follow the instructions from here:
How to enable cluster verbose logging in Mule Runtime
2) Identify the cluster configuration file
Locate the $MULE_HOME/.mule/mule-cluster.properties file on each server. Please consider that the ".mule" folder may appear hidden in some environments.
That file is created automatically when the cluster is created from Runtime Manager in hybrid environments, or it is created manually in on-prem (not hybrid) environments.
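For example, on Linux you can quickly confirm whether the file exists (and review its permissions) with a command like the following; the path assumes a default installation:
ls -la $MULE_HOME/.mule/mule-cluster.properties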
If the file is missing:
- Verify if any non-Mule related software is deleting it
- Consider restoring a backup
- Or create it manually. For more information, please read the Create and Manage a Cluster Manually documentation
Note: it is normal to see the "mule.clusterSize=0" property in clusters created from Runtime Manager; this should not be an issue.
3) Check that the cluster configuration file is not corrupted
The mule-cluster.properties file may get corrupted, so please validate that (see the example commands below):
- The file can be opened
- The file permissions are correct (for example, 644 on Linux)
- There are no hidden characters
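As a quick sanity check on Linux, commands like the following (illustrative, assuming a default installation path) display the permissions and any non-printing characters in the file:
ls -l $MULE_HOME/.mule/mule-cluster.properties
cat -A $MULE_HOME/.mule/mule-cluster.properties
In the cat -A output, control characters such as ^M (Windows line endings) or ^@ (null bytes) would indicate hidden characters.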
4) Validate if the wrapper.conf is overriding the cluster configuration file
It's possible to override properties from the mule-cluster.properties file using the $MULE_HOME/conf/wrapper.conf file, so confirm which property values are actually in use. More info:
How to override cluster properties
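As an illustrative example (the property value and the wrapper index are hypothetical; the exact mechanism is described in the article above), an entry like the following in wrapper.conf would take precedence over the same property in mule-cluster.properties:
wrapper.java.additional.21=-Dmule.cluster.nodes=192.168.0.50,192.168.0.199
To spot such overrides quickly, you can search the file for cluster properties:
grep "mule.cluster" $MULE_HOME/conf/wrapper.conf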
5) Check cluster members on every node
In the cluster.log file (or in the mule_ee.log file if you are using runtime 3.7.x up to 4.1.x), each cluster node should print the list of members once every few minutes. For example:
Members [2] {
Member [192.168.0.50]:5701 this
Member [192.168.0.199]:5701
}
If there is a communication issue, a different (shorter) list of members is logged, for example:
Members [1] {
Member [192.168.0.50]:5701 this
}
In the above example, you may need to troubleshoot communication issues from 192.168.0.50 to 192.168.0.199.
There is an unusual scenario where the list of members does not include the current node; for that case, please check:
Error com.hazelcast.partition.NoDataMemberInClusterException on Mule 4 clusters
If you do get a communication issue, you should also check the Mule EE log ($MULE_HOME/logs/mule_ee.log file) of each node, since a plain restart or unresponsiveness could be triggering the communication issue.
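If the logs are large, a quick way to pull the most recent member lists is a command like the following (the file name depends on your runtime version, as noted above):
grep -A 4 "Members \[" $MULE_HOME/logs/mule_ee.log | tail -30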
6) Recognize if you have a unicast or multicast cluster configuration
In hybrid environments, you may check if it's UNICAST or MULTICAST from the Runtime Manager UI in Anypoint Platform, but you may also know this by checking the mule-cluster.properties file for the mule.cluster.multicastenabled property.
| mule.cluster.multicastenabled=false | UNICAST |
| mule.cluster.multicastenabled=true | MULTICAST |
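A quick way to check the property from the command line (the path assumes a default installation):
grep multicastenabled $MULE_HOME/.mule/mule-cluster.properties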
7.A) If you have a unicast cluster
Check the mule.cluster.nodes property in the mule-cluster.properties file to know the IP addresses of the cluster members. That property may also contain the ports; otherwise, the default ports are 5701, 5702 and 5703.
Please check that every node has open communication to the other nodes using the destination IP addresses and ports. You may use the following article as guidance on how to do networking tests:
Network connectivity testing
More info:
Mule Runtime High Availability (HA) Cluster Overview - Prerequisites
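For example, using the sample IP addresses and default port from the previous steps (these values are illustrative only), you could test TCP connectivity from one node to the other member with:
nc -vz 192.168.0.199 5701
or, if nc is not available:
telnet 192.168.0.199 5701
A timeout or a refused connection points to a firewall, routing, or listener problem between the nodes.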
7.B) If you have a multicast cluster
For multicast environments, please perform the following checks:
- The same checks as for any unicast cluster
- Also, validate that the multicast IP address 224.2.2.3 is available
- Finally, check that the UDP port 54327 is open
If you have more than one multicast cluster in the same host, please check the mule.cluster.multicastgroup property in the mule-cluster.properties or wrapper.conf file, since it may contain additional IP addresses to validate.
To know how to check multicast IP addresses, please read:
How To Check If Multicast Works On The Network Hosting Mule Cluster
Alternatively, record a tcpdump on one node while sending pings to the multicast IP address from another node. Example:
In node 1, considering that "eth0" is the network interface:
tcpdump -ni eth0 host 224.2.2.3
In node 2, send pings to 224.2.2.3 and the tcpdump should record each ping.
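The ping side of that test could look like the following on node 2 (eth0 is again just an example interface; on some Linux systems the interface must be specified for multicast destinations):
ping -I eth0 224.2.2.3
Each echo request should then appear in the tcpdump output on node 1; if nothing is captured, multicast traffic is likely being blocked between the nodes.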
8) Your issue could be related to a known and already fixed issue...
So please consider updating your runtime to the latest version of a standard support runtime. You may read
Product Versioning and Back Support Policy to know which runtimes are in standard or extended support, and download the latest monthly patch for it from our MuleSoft Help Center.
9) If the issue remains...
Please contact MuleSoft Support and provide the following from each node:
- Mule EE logs
- Cluster logs
- Cluster configuration file
- Wrapper configuration file
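For example, on Linux you could bundle those files from one node into a single archive before attaching it to the support case (the file names and paths are examples and may differ per runtime version and installation):
tar czf node1-cluster-files.tar.gz $MULE_HOME/logs/mule_ee.log $MULE_HOME/logs/cluster.log $MULE_HOME/.mule/mule-cluster.properties $MULE_HOME/conf/wrapper.conf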