Thread: Host monitoring immediately goes into HARD state

  1. #1

    In my infrastructure I have servers that periodically do heavy work. Say CPU utilization hits 100% every 10 minutes and stays there for no longer than 5 minutes. I want to be notified only if a server stays in this state for longer than 15 minutes; a short peak in utilization is normal.

    Because of that, I created a special host class in
    which looks like this:

    define host {
            name                            linux_1min_15tries_every1min
            use                             linux
            max_check_attempts              15
            check_interval                  1
            retry_interval                  1
            flap_detection_enabled          0
            register                        0
    }
    Then I use it in
    like this:

    define host{
            use                     linux_1min_15tries_every1min
            contact_groups          admins
            host_name               serv1
    }
    However, the host goes into CRITICAL state in Shinken every time the utilization peaks. It looks like it reaches the HARD state after the first check. Also, the attempt counter never goes up; it always stays at 1/15. Any ideas how to fix that?

    Shinken version is 1.4.2

    Screenshot from 2015-08-07 16-41-12.jpg

  2. #2
    OK, so my problem was that I modified the host check, which is basically a ping, but didn't modify the service checks, which are how the CPU and memory monitoring is handled. After applying the same settings to the service checks, everything works as expected.
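
    For anyone hitting the same issue, here is a minimal sketch of what the service-side change might look like. The template name linux_service_1min_15tries, the parent template generic-service, the service description, and the check_cpu command are placeholders chosen for illustration, not taken from the original configuration; only the three timing directives mirror the host template above.

    ```
    # Hypothetical service template mirroring the host template's retry settings
    define service {
            name                            linux_service_1min_15tries  ; placeholder name
            use                             generic-service             ; assumed parent template
            max_check_attempts              15
            check_interval                  1
            retry_interval                  1
            register                        0                           ; template only, not a real service
    }

    # Hypothetical CPU service on serv1 that inherits the settings above
    define service {
            use                             linux_service_1min_15tries
            host_name                       serv1
            service_description             CPU
            check_command                   check_cpu                   ; placeholder command
    }
    ```

    With these values, and assuming the default interval length of 60 seconds, a failing check should be retried once a minute and only reach the HARD state on the 15th consecutive failure, which is when notifications go out.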
