Thread: satellite pollers do not receive or run any service checks

  1. #1
    Junior Member
    Join Date
    Nov 2013

    satellite pollers do not receive or run any service checks

    • Hi almighty Shinken folks,

      First of all, I really appreciate Shinken.

      CONTEXT:
      1 central Shinken: arbiter/broker/scheduler and main poller
      8 other satellite pollers (8 poller_tag values)
      1 realm: All
      all on Shinken 1.4
      we use nrpe_booster (nrpe_poller), as 90% of our checks are NRPE checks
      9500 services

      PROBLEM:
      The satellite pollers (the ones with a poller_tag) do not seem to receive or run anything for services. Only host checks seem to work.
      Via remote poller:
      - host checks from the main poller -> OK
      - service checks from the main poller -> OK
      - host checks from a poller_tag poller -> OK
      - service checks from a poller_tag poller -> KO

    The hard part is:
    - I have no way to know where the configuration stops propagating, and no tool to troubleshoot it. Do you?
    - I have no way to read the configuration held by the arbiter or the scheduler (the in-memory configuration, not the files) to compare the two and see whether my file configuration is misunderstood by the arbiter/scheduler/poller, or whether my poller_tag is dropped somewhere in the pipeline between the files and the poller_tag poller. Do you?

    All my poller_tag'ed service definitions (in order):
    - use a generic_service template (no poller_tag, register = 0)
    - have poller_tag MYTAG
    - are applied to hostgroup MYTAG
    Then my poller_tag'ed host definitions (in order):
    - use templateX, which uses generic_template (only the latter has poller_tag None)
    - have poller_tag MYTAG
    - belong to hostgroup MYTAG
    Then my satellite poller definitions:
    - have poller_tag MYTAG
    - use generic_poller (which has poller_tag None and register 0)
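
    The in-memory configuration is out of reach here, but the file side can at least be listed mechanically. Below is a minimal sketch, not a Shinken tool: it parses Nagios-style "define <type> { ... }" blocks from text and reports which objects carry a poller_tag directly. The sample text, the object names (web01, disk) and the helper names parse_objects / poller_tag_report are all assumptions made up for illustration, and template inheritance is deliberately not resolved.

```python
import re

# Hypothetical sample mirroring the layout described above; all names are illustrative.
SAMPLE = """
define service {
    name                 generic_service
    register             0
}
define service {
    use                  generic_service
    service_description  disk
    hostgroup_name       MYTAG
    poller_tag           MYTAG
}
define host {
    use                  templateX
    host_name            web01
    poller_tag           MYTAG   ; inline comment
}
"""

# Properties tried, in order, to find a human-readable label for an object.
NAME_KEYS = ("name", "host_name", "service_description", "poller_name")


def parse_objects(text):
    """Parse 'define <type> { key value }' blocks into a list of dicts."""
    objects, current = [], None
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop ';' comments
        if not line or line.startswith("#"):
            continue
        match = re.match(r"define\s+(\w+)\s*\{", line)
        if match:
            current = {"_type": match.group(1)}
        elif line.startswith("}"):
            if current is not None:
                objects.append(current)
            current = None
        elif current is not None:
            parts = line.split(None, 1)  # key, then the rest as the value
            current[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return objects


def poller_tag_report(objects):
    """Return (type, label, poller_tag) per object; '<unset>' if not set directly."""
    report = []
    for obj in objects:
        label = next((obj[k] for k in NAME_KEYS if k in obj), "?")
        report.append((obj["_type"], label, obj.get("poller_tag", "<unset>")))
    return report


if __name__ == "__main__":
    for row in poller_tag_report(parse_objects(SAMPLE)):
        print(row)
```

    Pointing parse_objects at the real object files would show which definitions set poller_tag themselves and which rely on a template or on the host, which is the part that is hard to see by eye across many files.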

    MORE TESTS/INFO:

    - Same issue even in Shinken 1.2.2.
    - In debug mode I can see that my tagged remote pollers receive some configuration from my central scheduler, but I cannot tell whether it includes the services, even in debug mode.
    - When I force a reschedule in Thruk, the check stays in PENDING state and the target resource never receives the expected NRPE traffic (checked with tcpdump).
    - In Thruk the next scheduled check time is never updated, so it ends up in the past.
    - I set check_for_orphaned_hosts=1 and check_for_orphaned_services=1, but to no effect.

    I have tried quite a bunch of trial-and-error fixes without success, so I really need some constructive help here.

    Please, almighty Shinkeners, help!

  2. #2
    Junior Member
    Join Date
    Nov 2013

    Re: satellite pollers do not receive or run any service checks

    Hi folks,

    Here's the update:
    It's now working as expected (almost). We'll see for how long this time.

    But HOW:
    - I upgraded to 1.4.0 (which causes this new annoying BUG: "Warning : [Livestatus Broker] Closing socket failed: [Errno 107] Transport endpoint is not connected" each time I reschedule a check via Thruk).
    - I disabled the nrpe_booster (nrpe_poller) (1) --> this module is definitely not STABLE yet, especially with my architecture (satellite pollers).
    - I disabled MongoDB retention (back to pickle retention) for my scheduler. It seemed to screw up my scheduling and troubleshooting.
    - I restart Shinken like so: "/etc/init.d/shinken stop; rm -f /tmp/retention.dat ; /etc/init.d/shinken start" to avoid scheduling WEIRDNESS caused by the retention file.

    (1) When I say disable the nrpe_booster, I mean EVERYWHERE! It is not possible to keep it for the central poller and disable it only for the satellite pollers. It has to be disabled in ALL the "define command" blocks as well.

    With proper tuning in nagios.cfg I can now survive without the nrpe_booster so far... but for how long is a mystery.

    I hope it will help other people.

    Regards, Aurélien Lemaire / Smile Hosting

  3. #3
    Shinken project leader
    Join Date
    May 2011
    Bordeaux (France)

    Re: satellite pollers do not receive or run any service checks

    As I have said many times since this NRPE module came out, it should NOT be used until you have mastered the architecture and especially the NRPE internals (versions and things like that). But whatever.

    For the poller tag, remember that this property is inherited from the host (and not from the hostgroup; you can, and should, remove that part of the configuration), but only if the service is not already tagged. You should take a closer look at your service templates to check for this.
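
    The inheritance rule above can be illustrated with a small model. This is only a sketch of the rule as stated in this post, not Shinken's actual code: the function names (explicit_value, effective_poller_tag) and the sample objects are hypothetical, and it assumes that a poller_tag of "None" counts as "not tagged".

```python
def explicit_value(obj, templates, prop, _seen=None):
    """Return the first value of prop found on obj or along its 'use' chain, else None."""
    seen = _seen if _seen is not None else set()
    if prop in obj:
        return obj[prop]
    # Follow templates depth-first, left to right, guarding against loops.
    for name in (n.strip() for n in obj.get("use", "").split(",")):
        if name and name in templates and name not in seen:
            seen.add(name)
            value = explicit_value(templates[name], templates, prop, seen)
            if value is not None:
                return value
    return None


def effective_poller_tag(service, host, templates):
    """Service tag wins if set; otherwise fall back to the host's tag.

    Assumption: a tag of "None" is treated the same as no tag at all.
    """
    tag = explicit_value(service, templates, "poller_tag")
    if tag is not None and tag != "None":
        return tag
    host_tag = explicit_value(host, templates, "poller_tag")
    return host_tag if host_tag is not None else "None"


# Hypothetical objects shaped like the configuration described in post #1.
TEMPLATES = {
    "generic_service": {"register": "0"},
    "generic_template": {"register": "0", "poller_tag": "None"},
    "templateX": {"register": "0", "use": "generic_template"},
}
HOST = {"host_name": "web01", "use": "templateX", "poller_tag": "MYTAG"}
SERVICE = {"service_description": "disk", "use": "generic_service"}

if __name__ == "__main__":
    print(effective_poller_tag(SERVICE, HOST, TEMPLATES))
```

    Under these assumptions an untagged service on a MYTAG host ends up with MYTAG, while a service whose own template chain already carries a different tag keeps that tag and never reaches the MYTAG poller, which is why the service templates are worth checking.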

    To dump the configuration, if I remember correctly the shinken-admin command can dump objects from the arbiter, but I have not used it in a long time, so I do not remember how to do it.

    As for MongoDB, I never got such screwed-up problems, especially because the data saved there is in fact the same as what is saved with pickle.
    No direct support by personal message. Please open a thread so everyone can see the solution