03/26/2020
EX SSL-VPN: Error Message: Threads blocked and pool could not be grown
In the system message log, an administrator sees an error message very similar to the following:
All 256 threads blocked and pool could not be grown, stack traces will be dumped every 60 seconds.
Users may be unable to log in to the appliance, or logins could take a long time.
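To confirm whether and how often this message appears, an exported copy of the system message log can be scanned for it. The following is a minimal sketch, not a vendor tool. It assumes the log is available as plain text lines, and the function name `find_blocked_events` is hypothetical. The thread count and dump interval are captured rather than hard-coded, since the message on a given appliance may differ slightly from the example above.

```python
import re

# Pattern built from the message wording quoted above. The thread count
# and dump interval are captured because they may vary per appliance.
BLOCKED_RE = re.compile(
    r"All (\d+) threads blocked and pool could not be grown, "
    r"stack traces will be dumped every (\d+) seconds"
)


def find_blocked_events(lines):
    """Return (line_number, thread_count, interval_seconds) for each match."""
    events = []
    for number, line in enumerate(lines, start=1):
        match = BLOCKED_RE.search(line)
        if match:
            events.append((number, int(match.group(1)), int(match.group(2))))
    return events
```

Passing the lines of an exported log file (for example, `find_blocked_events(open(path))`, where `path` is wherever the log was saved) yields one tuple per occurrence, which makes it easier to see whether the condition is isolated or recurring.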
The policy service (also referred to as policyserver) is a process that runs on the appliance and is responsible for user authentication and authorization. It is at the heart of the policy model, and to remain efficient it needs to keep certain system resources, such as memory and CPU, in reserve. If memory usage climbs too high, or CPU usage and load are too high, policyserver will not create new threads to handle tasks because of the additional burden that would put on the system. The sections below outline a few scenarios that show why policyserver may get into this condition.
Possible causes of this condition, and their resolutions, are given below.
If you are using nested group lookup on your LDAP or AD authentication server, make sure that you are also caching the lookup result; searching the entire directory tree takes time and increases CPU usage. Here is how to enable caching:
In addition, setting your appliance's nested group lookup level to 2 could also help. Here is how to make that change:
When an appliance is heavily loaded with incoming user requests or authentication requests and the backend authentication server(s) or DNS server(s) are slow to respond or completely unresponsive, the appliance may need to grow its pool of threads to service these requests. Ensure that these backend servers are responding normally to minimize this condition for the appliance. If policyserver reaches a CPU or memory threshold and decides it cannot grow its thread pool, the message above will be logged, and users may notice slower access through the VPN.
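One way to check whether a backend server is slow or unresponsive is to time a DNS lookup and a TCP connection from a host on the same network as the appliance. The following is a rough sketch under stated assumptions: the helper names (`timed`, `check_dns`, `check_tcp`, `classify`) are hypothetical, the 1-second "slow" threshold is an arbitrary illustration rather than a vendor figure, and a successful TCP connection shows only that the port is reachable, not that authentication itself is fast.

```python
import socket
import time


def timed(func, *args, **kwargs):
    """Run func and return (elapsed_seconds, result, error)."""
    start = time.monotonic()
    try:
        result = func(*args, **kwargs)
        error = None
    except OSError as exc:  # covers DNS failures, refusals, and timeouts
        result = None
        error = exc
    return time.monotonic() - start, result, error


def check_dns(hostname):
    """Time a name lookup using the system resolver."""
    elapsed, _, error = timed(socket.getaddrinfo, hostname, None)
    return elapsed, error


def check_tcp(host, port, timeout=5.0):
    """Time a TCP connection to a backend server, then close it."""
    elapsed, conn, error = timed(socket.create_connection, (host, port), timeout)
    if conn is not None:
        conn.close()
    return elapsed, error


def classify(elapsed, error, slow_after=1.0):
    """Label a measurement; slow_after is an assumed threshold in seconds."""
    if error is not None:
        return "unresponsive"
    return "slow" if elapsed > slow_after else "ok"
```

For example, `classify(*check_tcp("ldap.example.com", 389))` would label a placeholder LDAP server. Substitute the real hostname and port of the authentication server configured on the appliance.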
As documented in KB item #5329, there is a policyserver bug which has been resolved in a hotfix where tunnel users cannot log into an appliance because policyserver has gotten into a thread blocking condition. There is a link to a hotfix for appliance versions 9.0.0, 9.0.1, and 9.0.2 in that KB article.
In appliance release 10.0.1 and earlier, if an IP address pool resource is used in an access control rule, policyserver can get into a thread blocking condition when it checks that rule. This problem is exacerbated when a client system makes a large number of requests to policyserver in a very short period of time (on the order of about 60 per second, for example).
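To judge whether a client is generating bursts near the rate mentioned above, the request timestamps for that client can be run through a sliding-window count. The following is a minimal sketch, assuming the timestamps have already been extracted from a log or packet capture as numbers. The function name `peak_rate` is hypothetical, and how the timestamps are obtained is outside the scope of this example.

```python
from collections import deque


def peak_rate(timestamps, window=1.0):
    """Return the largest number of events seen in any `window`-long span.

    Timestamps and window must use the same unit (seconds by default).
    """
    recent = deque()
    peak = 0
    for ts in sorted(timestamps):
        recent.append(ts)
        # Drop events that fall outside the window ending at ts.
        while ts - recent[0] >= window:
            recent.popleft()
        peak = max(peak, len(recent))
    return peak
```

A result approaching 60 with the default one-second window would suggest the client is in the range described above.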
To work around this issue, a customer can do the following:
This problem is fully resolved in appliance release 10.0.2. Customers running that release should not see thread blocking conditions due to the problem described in this section.