
Thread: [Solved] Distributed Shinken. Can't connect to the poller

  1. #1
    Administrator
    Join Date
    Jun 2011
    Posts
    216

    [Solved] Distributed Shinken. Can't connect to the poller

    Hi there!

    I was trying to set up a distributed Shinken by following the official doc.

    First, I think I found a bug: when I declare a new poller in shinken-specific.cfg and reuse an already declared name, no error is raised.
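
    A quick way to catch this before starting the daemons is to scan the file for repeated names. The sketch below is a hypothetical helper, not part of Shinken; it assumes pollers are declared in plain define poller{ ... } blocks and it ignores comments and other object types.

```python
import re
from collections import Counter

# Hypothetical helper, not part of Shinken: report poller_name values
# that appear in more than one "define poller{ ... }" block.
POLLER_BLOCK = re.compile(r"define\s+poller\s*\{(.*?)\}", re.DOTALL)
POLLER_NAME = re.compile(r"^\s*poller_name\s+(\S+)", re.MULTILINE)


def duplicate_poller_names(cfg_text):
    names = []
    for block in POLLER_BLOCK.findall(cfg_text):
        match = POLLER_NAME.search(block)
        if match:
            names.append(match.group(1))
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


# Made-up sample: the same name is declared twice.
SAMPLE = """
define poller{
    poller_name   poller-1
    address     localhost
    port 7771
}
define poller{
    poller_name   poller-1
    address     192.168.238.70
    port 7771
}
"""

print(duplicate_poller_names(SAMPLE))
```

    An empty list means every poller_name is unique.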

    Then, with a "good" conf file, the main Shinken doesn't manage to connect to my second poller:

    Code:
    [1311580930] [poller-2] Init de connection with myscheduler at PYROLOC://localhost:7768/Checks
    [1311580930] [poller-2] Scheduler myscheduler is not initilised or got network problem: connection failed
    [1311580930] Sent failed!
    Here's the relevant part of shinken-specific.cfg:
    Code:
    define poller{
        poller_name   poller-2
        address     192.168.238.70
        port 7771
    }
    The TCP connection is made, as I can see here (netstat -plan | grep 7771 on the .70 server):
    Code:
    tcp    0   0 0.0.0.0:7771      0.0.0.0:*        LISTEN   6483/python
    tcp    0   0 192.168.238.70:7771   192.168.238.143:45167  ESTABLISHED 6483/python
    tcp    0   0 192.168.238.70:7771   192.168.238.143:45160  ESTABLISHED 6483/python
    Is it normal to have two connections?

  2. #2
    Administrator
    Join Date
    Jun 2011
    Posts
    216

    Re: Distributed Shinken. Can't connect to the poller

    I found this in the log of the poller :


    Code:
    We have our schedulers : {0: {'wait_homerun': {}, 'name': u'myscheduler', 'uri': u'PYROLOC://localhost:7768/Checks', 'actions': {}, 'instance_id': 0, 'running_id': 0, 'address': u'localhost', 'active': True, 'port': 7768, 'con': None}}
    The address is wrong: it should be the IP of the other Shinken, shouldn't it?
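
    To make the problem concrete: on the poller host, "localhost" points at the poller's own machine, not at the scheduler. The sketch below rebuilds the URI from the logged address and port and flags loopback addresses. The function names are made up for illustration and are not Shinken's; the URI layout is copied from the poller log above.

```python
import ipaddress


def is_loopback(address):
    """True when the address only makes sense on the local machine."""
    if address == "localhost":
        return True
    try:
        return ipaddress.ip_address(address).is_loopback
    except ValueError:
        # Some other host name; no DNS lookup is attempted here.
        return False


def scheduler_uri(link):
    # Same layout as the URI printed in the poller log.
    return "PYROLOC://%s:%d/Checks" % (link["address"], link["port"])


# Values taken from the scheduler entry the poller received.
link = {"name": "myscheduler", "address": "localhost", "port": 7768}
print(scheduler_uri(link), is_loopback(link["address"]))
```

    A loopback address in a conf that is sent to a remote satellite is a strong hint that the satellite will try to reach the scheduler on itself.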

  3. #3
    Administrator
    Join Date
    Jun 2011
    Posts
    216

    Re: Distributed Shinken. Can't connect to the poller

    OK, that was it. When I put the scheduler's own IP in the address field in shinken-specific.cfg (it was localhost before), it works.
    But I get a very unexpected side effect: nagios.log is no longer written.
    When I change it back, I get the error again, and nagios.log is written again ;D

    I'll look into it later, because I still see some strange behavior :P

  4. #4
    Administrator
    Join Date
    Jun 2011
    Posts
    216

    Re: Distributed Shinken. Can't connect to the poller

    I've had a little time to dig deeper.

    When I launch the arbiter from bash, I get this:


    Code:
    Dispatching Realm All
    [All] Dispatching 1/1 configurations
    [All] Schedulers order : myscheduler
    [All] Dispatching configuration 0
    [All] Trying to send conf 0 to scheduler myscheduler
    [All] WARNING : configuration dispatching error for scheduler myscheduler
    WARNING : All schedulers configurations are not dispatched, 1 are missing
    [All] Trying to send configuration to receiver receiver-1
    [All] Dispatch OK of for configuration to receiver receiver-1
    Run baby, run...
    Scheduler configuration 0 is unmanaged!!
    Warning : Missing satellite reactionner for configuration 0 :
    Warning : Missing satellite poller for configuration 0 :
    Warning : Missing satellite broker for configuration 0 :
    So the arbiter doesn't manage to dispatch the configuration to the scheduler.

    EDIT :

    I've uncommented some print statements in satellitelink.py and found this:

    Code:
    Try to put conf: (<shinken.objects.config.Config object at 0xb7953bcc>, {}, [], {'pollers': {0: {'passive': False, 'name': u'poller-1', 'poller_tags': ['None'], 'instance_id': 0, 'reactionner_tags': [], 'address': u'localhost', 'active': True, 'port': 7771}}, 'reactionners': {0: {'passive': False, 'name': u'reactionner-1', 'poller_tags': [], 'instance_id': 0, 'reactionner_tags': ['None'], 'address': u'localhost', 'active': True, 'port': 7769}}})
    PYROLOC://192.168.238.143:7768/ForArbiter <DynamicProxy for PYROLOC://192.168.238.143:7768/ForArbiter> 120 3
    Traceback (most recent call last):
     File "/usr/lib/python2.5/site-packages/shinken/satellitelink.py", line 94, in put_conf
      self.con.put_conf(conf)
     File "/usr/lib/python2.5/site-packages/Pyro/core.py", line 390, in __call__
      return self.__send(self.__name, args, kwargs)
     File "/usr/lib/python2.5/site-packages/Pyro/core.py", line 467, in _invokePYRO
      self.adapter.bindToURI(self.URI)
     File "/usr/lib/python2.5/site-packages/Pyro/protocol.py", line 255, in bindToURI
      raise ProtocolError('connection failed')
    ProtocolError: connection failed
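
    A "connection failed" at this level can be checked without Pyro at all, by trying a plain TCP connection to the same address and port. This is a minimal sketch using only the standard library; can_connect is a made-up name, and the demonstration uses a throwaway local listener so that it runs anywhere.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Self-contained demonstration against a temporary local listener.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]
print(can_connect("127.0.0.1", port))
listener.close()
```

    Running can_connect("192.168.238.143", 7768) on the machine where the arbiter runs would show whether the scheduler accepts connections on that address at all.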

  5. #5
    Administrator
    Join Date
    Jun 2011
    Posts
    216

    Re: Distributed Shinken. Can't connect to the poller

    OK, I've got it.


    If I put a DNS name or an IP in the scheduler definition (to fix the first issue), the arbiter doesn't manage to give the conf to the scheduler. Therefore, the other satellites don't get their conf either; they are all waiting for it. This is why no logs are written: Shinken doesn't really start.


    If I leave the localhost address, poller-2 gets a bad conf. In fact, I think no distinction is made between a local satellite and a distributed satellite. The address of the scheduler has to be changed when sending the conf to an "external" poller. I don't know how to fix that, guys, maybe in dispatcher.py.


    Here are some debug prints I added:

    Code:
    [All] poller satellite order : poller-2 (spare:False), poller-1 (spare:False),
    [All] Trying to send configuration to poller poller-2
    printing cfg_for_satellite_part : 
    {'instance_id': 0, 'active': True, 'port': 7768, 'name': u'myscheduler', 'address': u'localhost'}
    The poller got a bad conf; that's why it can't connect to the scheduler.


    EDIT :

    Fixed! The scheduler.ini was bad. The host line has to be 0.0.0.0 to listen on all interfaces. Thanks Nap'
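
    For anyone hitting the same thing, here is a small check of the host line in a daemon ini file. It is only a sketch: the [daemon] section name and the sample values are assumptions for illustration, and bind_host_problem is a made-up helper, not something Shinken ships.

```python
import configparser

# Hosts that only accept connections from the local machine.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def bind_host_problem(ini_text, section="daemon"):
    """Return a warning string, or None if the host line looks fine.

    The section name "daemon" is an assumption; adjust it to your file.
    """
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    host = parser.get(section, "host", fallback=None)
    if host is None:
        return "no host line in [%s]" % section
    if host.strip() in LOOPBACK_HOSTS:
        return "host=%s only accepts local connections" % host.strip()
    return None


# Made-up ini contents for illustration.
print(bind_host_problem("[daemon]\nhost=localhost\nport=7768\n"))
print(bind_host_problem("[daemon]\nhost=0.0.0.0\nport=7768\n"))
```

    With host set to 0.0.0.0 the daemon listens on all interfaces, so both the local arbiter and a remote poller can reach it.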
