System: Ubuntu 18.04.5 LTS
Webserver: Apache Tomcat 8.5.39
24 cores, 8 GB RAM
We're using the following startup script to run our optimised maxima image, since using the command directly has resulted in failure:
#!/bin/bash
/var/lib/maximapool/2019090200/maxima-optimised-rub -eval "(cl-user::run)"
We're not sure whether this is the root of the problem we've been experiencing over the past few days, but we suspect it might be related.
At irregular intervals, our MaximaPool has produced runaway processes that each use up one CPU core and increasingly large amounts of RAM (> 5 GB). We guess this is due to syntactically correct but hard-to-compute requests like those described here:
https://moodle.org/mod/forum/discuss.php?d=278063
We managed to reproduce the behaviour by entering the following example on the MaximaPool status page Test form:
array('AlgEquiv', 'y=(x-a)^5999', 'y=(x-a)^6000', 0, '', '');
Our pool configuration is as follows:
| Name | Value |
|---|---|
| Root directory: | /var/lib/maximapool |
| Min pool size: | 10 |
| Max pool size: | 10 |
| Limit on number of processes starting up: | 5 |
| Maintenance cycle time: | 500 ms |
| Number of data points for averages: | 5 |
| Pool size safety multiplier: | 3.0 |
The values in process.conf for the startup and execution timeouts and for the maximum lifetime are at their defaults:
### Maximum lifetimes.
# This is the time that a process is allowed to take when starting up (ms).
startup.timeout = 10000
# This is the time added to the lifetime of a process when it is taken to use
# so that it wont be killed while in use (ms).
execution.timeout = 30000
# This is the lifetime given to a process (ms).
maximum.lifetime = 600000
For now we have worked around the issue with a cronjob that kills all maximapool processes that have been running for more than 2 minutes. This workaround is undesirable, though, as it is not very robust: multiple bad requests at the same time might still pose a problem for our server. We're also considering giving the machine more RAM to provide some safety margin.
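For illustration, a watchdog of this kind could look roughly like the sketch below. It is not the exact script in use: the process name pattern `maxima-optimised`, the 120-second threshold, and the names `select_stale_pids` and `main` are assumptions made for the example. It relies on the `etimes` (elapsed seconds) output column of the procps `ps`, and it only prints what it would do unless called with `--kill`.

```shell
#!/bin/bash
# Watchdog sketch: find pool processes older than a threshold and stop them.
# PATTERN and MAX_AGE_SECONDS are assumed values for illustration only.
PATTERN="${PATTERN:-maxima-optimised}"
MAX_AGE_SECONDS="${MAX_AGE_SECONDS:-120}"

# Reads lines of the form "PID ELAPSED_SECONDS COMMAND..." on stdin and
# prints the PIDs whose command contains $1 and whose age exceeds $2 seconds.
select_stale_pids() {
    awk -v pat="$1" -v max="$2" '
        {
            # Strip the PID and elapsed-time columns so only the command is matched.
            cmd = $0
            sub(/^[ \t]*[0-9]+[ \t]+[0-9]+[ \t]+/, "", cmd)
        }
        index(cmd, pat) > 0 && ($2 + 0) > (max + 0) { print $1 }
    '
}

# Dry run by default; pass --kill to actually send SIGTERM.
main() {
    ps -e -o pid= -o etimes= -o args= |
        select_stale_pids "$PATTERN" "$MAX_AGE_SECONDS" |
        while read -r pid; do
            if [ "${1:-}" = "--kill" ]; then
                kill -TERM "$pid"
            else
                echo "would terminate PID $pid"
            fi
        done
}

main "$@"
```

Note that a purely age-based rule like this also terminates idle pool processes that are simply older than the threshold, not only the runaway ones, so the pool has to keep replacing them.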