DESCRIPTION: EX SSL-VPN: Policy Replication fails on 10.6.3
Policy replication fails when the configuration contains a high number of ACL rules and resources.
After upgrading to 10.6.3, replicating from the master reports that replication failed. The recipients log a successful replication, but the changes remain pending and are not applied. The AMC console also shows Java errors until mgmt-server is restarted.
***** Snippet of management.log *****
Nov 27 05:24:27 127.0.0.1/127.0.0.1 AMC: 2013-11-27 05:24:27 +0000 WARNING com.aventail.mgmt.gbladmin.conversations.SenderConversation - Connection failure trying to start replication conversation: ; nested exception is: <167> java.net.ConnectException: Connection refused
Nov 27 05:24:27 127.0.0.1/127.0.0.1 AMC: 2013-11-27 05:24:27 +0000 ERROR com.aventail.mgmt.gbladmin.conversations.ReplicationSenderConversation - Policy replication to lab-internal-3 failed with error: CONNECTION_FAILED
The logs contain many Java exceptions. receiver-management.log shows Java memory errors:
Dec 3 05:20:04 127.0.0.1/127.0.0.1 java.lang.OutOfMemoryError: Java heap space
Dec 3 05:39:36 127.0.0.1/127.0.0.1 java.lang.OutOfMemoryError: GC overhead limit exceeded
Nov 27 05:26:49 127.0.0.1/127.0.0.1 AMC: 2013-11-27 05:26:49 +0000 VERBOSE com.aventail.mgmt.sql.Sql - Query to obtain local user list took 0.211 seconds
Nov 27 05:26:50 127.0.0.1/127.0.0.1 AMC: 2013-11-27 05:26:50 +0000 ERROR com.aventail.mgmt.gbladmin.conversations.ReplicationSenderConversation - Policy replication to cpu3-lab-internal-4 failed with error: INSTALL_FAILED
Nov 27 05:26:57 127.0.0.1/127.0.0.1 AMC: 2013-11-27 05:26:57 +0000 WARNING com.aventail.mgmt.gbladmin.conversations.ReplicationSenderConversation -
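To check whether a node is hitting the memory errors shown above, the log can be searched for OutOfMemoryError entries. The following is a minimal sketch only: count_oom is a hypothetical helper name, and the log file is passed as an argument because the full log path is not stated here.

```shell
# Hypothetical helper: count JVM out-of-memory lines in a log file.
# Usage: count_oom <path-to-log>
count_oom() {
    # -F treats the pattern as a fixed string, -c prints the number of matching lines.
    # grep exits non-zero when nothing matches, so "|| true" keeps the
    # function from failing in scripts that run with "set -e".
    grep -cF 'java.lang.OutOfMemoryError' "$1" || true
}
```

A non-zero count on either the sender or a receiver suggests that node is affected by the memory problem described in this article.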
Tracking ID DTS #137424
SonicWall engineering has addressed the issue with a workaround (a hand edit, described below) and a test hotfix. Please contact Technical Support to obtain the test hotfix.
The root cause is the JVM running out of memory during replication. To increase the memory allocation, do the following on BOTH the sender and the receiver nodes:
1. Edit /usr/local/app/mgmt-server/bin/start.sh: on line 107, add -Xmx512m after $JAVA_HOME/bin/java (start.sh attached).
2. Restart the management server: # /etc/init.d/mgmt-server restart
3. Retry the replication; it should now complete successfully.
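The hand edit in step 1 could also be scripted. The sketch below is not part of the official workaround: add_heap_flag is a hypothetical helper name, and it assumes the java invocation in start.sh contains the literal text $JAVA_HOME/bin/java. It changes every line containing that text rather than only line 107, so review the result before restarting.

```shell
# Hypothetical helper: insert -Xmx512m after $JAVA_HOME/bin/java in a start script.
# Usage: add_heap_flag <path-to-start.sh>
add_heap_flag() {
    file=$1
    # Do nothing if a maximum-heap flag is already present, to avoid adding it twice.
    if grep -q -e '-Xmx' "$file"; then
        echo "heap flag already set in $file"
        return 0
    fi
    # Keep a backup copy (<file>.bak), then append the flag right after the
    # java binary. "&" in the replacement stands for the matched text.
    sed -i.bak 's|\$JAVA_HOME/bin/java|& -Xmx512m|' "$file"
}
```

On an appliance this would be run as add_heap_flag /usr/local/app/mgmt-server/bin/start.sh on the sender and on each receiver, followed by the mgmt-server restart from step 2.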
Note: This issue is officially fixed in 10.6.5 and 10.7.1.