Hello,

At my company, we are testing both gearmand and shinken for our next monitoring infrastructure.
We are facing a problem with the shinken pollers: it seems that every check (run via nagios Perl plugins, both the official ones and our own) ends up as a zombie process.
Sometimes we have up to 800 zombies at a time.
The checks themselves do complete, though.
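
In case it helps with the diagnosis, here is a minimal sketch of how the zombies could be counted per parent PID, to confirm that they all belong to the poller processes. It assumes a Linux procps `ps`; the function names `count_zombies_by_parent` and `read_ps` are only illustrative and are not part of shinken.

```python
import subprocess
from collections import Counter


def count_zombies_by_parent(ps_output):
    """Count zombie processes per parent PID.

    Expects lines of the form 'STAT PPID COMMAND', as printed by
    `ps -e -o stat= -o ppid= -o comm=` (assumed: Linux procps).
    """
    counts = Counter()
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        stat, ppid = fields[0], fields[1]
        # A process state starting with 'Z' marks a zombie (defunct) process.
        if stat.startswith("Z") and ppid.isdigit():
            counts[int(ppid)] += 1
    return counts


def read_ps():
    """Run ps without headers and return its raw text output."""
    result = subprocess.run(
        ["ps", "-e", "-o", "stat=", "-o", "ppid=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Print parent PIDs, with the parent holding the most zombies first.
    for ppid, n in count_zombies_by_parent(read_ps()).most_common():
        print(ppid, n)
```

If nearly all zombies share the parent PIDs of the poller workers, that would suggest the workers are not reaping their finished check processes.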

The same plugins show no problems when run by nagios and gearmand.
Does anyone have any ideas?

We are checking 16000 services with one poller. The same poller is used for shinken and for nagios / gearmand (not at the same time, of course).
The average load is higher with shinken than with nagios / gearmand.

The poller has 8 cores / 16 hardware threads and 12 GB of RAM.
Another physical server acts as arbiter, broker (ndo and NPCD), receiver and reactionner, and a VM runs the scheduler.

Any idea is welcome.

Regards