Thread: High availability, problem with restart of Shinken

  1. #1
    Junior Member
    Join Date
    Nov 2011
    Posts
    8

    High availability, problem with restart of Shinken

    Hi,

    I followed the high-availability tutorial at: http://www.shinken-monitoring.org/wi...bility_shinken

    I want to test whether my spare daemons take over if those on the master server fail.

    I am working with VMware Workstation and have 2 virtual machines.


    I added these lines to /etc/shinken/shinken-specific.cfg on the master server:

    Code:
    define scheduler{
        scheduler_name    scheduler-spare
        address           server3
        port              7768
        spare             1
    }

    define poller{
        poller_name       poller-spare
        address           server3
        port              7771
        spare             1
    }

    define reactionner{
        reactionner_name  reactionner-spare
        address           server3
        port              7769
        spare             1
    }

    define receiver{
        receiver_name     receiver-spare
        address           server3
        port              7773
        spare             1
    }

    define broker{
        broker_name       broker-spare
        address           server3
        port              7772
        spare             1
        modules           Simple-log,Livestatus
    }

    define arbiter{
        arbiter_name      arbiter-spare
        address           server3
        host_name         server3
        port              7772
        spare             0
    }

    But I get an error when I restart all the daemons on the master server:

    Failed : shinken.pyro_wrapper.PortNotFree; sorry, the port 7772 is not free : couldn't start pyro daemon, : [Errno 99] Cannot assign requested addres ( full output is in /tmp/badstart_for-arbiter) ...failed! failed!


    I tried to update Pyro with apt-get install pyro, but I already have the latest version.

    Whatever port I set for the spare arbiter, it says: the port "XXXX" is not free....

    I also don't understand why, in the tutorial, spare is set to 0 for the spare arbiter, and why the arbiter has the same port as the broker.


  2. #2
    Administrator
    Join Date
    Jun 2011
    Posts
    216

    Re: High availability, problem with restart of Shinken

    I guess your problem is that the spare doesn't handle the HA correctly.

    I see that what you've added to your master configuration is a copy/paste from the tutorial. I assume there are no mistakes in it and that you replaced "server3" with the hostname of your server.

    By the way, did you copy shinken-specific.cfg to server3?

    If so, try pasting some lines of nagios.log here so we can see how Shinken behaves.

  3. #3
    Junior Member
    Join Date
    Nov 2011
    Posts
    8

    Re: High availability, problem with restart of Shinken

    No, I put my own hostname for the spare arbiter; I just forgot to show you the modified configuration ^_^

    And yes, the file is copied to server3.

    OK, no problem, I'll try to show you the nagios log.

  4. #4
    Administrator
    Join Date
    Jun 2011
    Posts
    216

    Re: High availability, problem with restart of Shinken

    I've just noticed your edit.

    This is not a Pyro version issue, don't worry. In fact, every Shinken satellite listens on a TCP port. Without distributed monitoring, everything is done at the localhost level. The ports differ because each satellite has to listen on its own port.

    By the way, I've just noticed that the port for the arbiter is 7772 in the example, and it should be 7770.
    Try changing it for both arbiters in both shinken-specific.cfg files.
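
The rule that each daemon needs its own port can be checked mechanically. Below is a minimal Python sketch, not part of Shinken: `parse_defines` and `find_port_clashes` are hypothetical helper names, and the parser assumes the flat `define type{ key value }` layout used in the configuration posted earlier in this thread. It reports every address/port pair claimed by more than one daemon.

```python
import re
from collections import defaultdict

# Matches one "define <type>{ ... }" block; assumes no nested braces.
DEFINE_RE = re.compile(r"define\s+(\w+)\s*\{(.*?)\}", re.DOTALL)


def parse_defines(text):
    """Return a list of (daemon_type, {key: value}) tuples."""
    daemons = []
    for daemon_type, body in DEFINE_RE.findall(text):
        props = {}
        for line in body.splitlines():
            parts = line.split(None, 1)  # key, then the rest of the line
            if len(parts) == 2 and not parts[0].startswith("#"):
                props[parts[0]] = parts[1].strip()
        daemons.append((daemon_type, props))
    return daemons


def find_port_clashes(text):
    """Map each (address, port) used more than once to the daemons using it."""
    users = defaultdict(list)
    for daemon_type, props in parse_defines(text):
        if "address" in props and "port" in props:
            name = props.get(daemon_type + "_name", daemon_type)
            users[(props["address"], props["port"])].append(name)
    return {key: names for key, names in users.items() if len(names) > 1}


# Two definitions taken from the configuration posted in this thread.
SAMPLE = """
define broker{
    broker_name   broker-spare
    address       server3
    port          7772
    spare         1
}
define arbiter{
    arbiter_name  arbiter-spare
    address       server3
    port          7772
}
"""

if __name__ == "__main__":
    for (address, port), names in find_port_clashes(SAMPLE).items():
        print("%s:%s is claimed by %s" % (address, port, ", ".join(names)))
```

Run against the two definitions above, it flags server3:7772 as shared by broker-spare and arbiter-spare, which is the clash the suggested change to 7770 removes.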

  5. #5
    Junior Member
    Join Date
    Nov 2011
    Posts
    8

    Re: High availability, problem with restart of Shinken

    Oh OK, I thought it was an error ^^. I'll try that, then I'll show you my nagios log if it still doesn't work.

    And should I leave spare at 0 for the slave arbiter?

  6. #6
    Administrator
    Join Date
    Jun 2011
    Posts
    216

    Re: High availability, problem with restart of Shinken

    Well, you can try both.
    But I think the value should be 1.

  7. #7
    Junior Member
    Join Date
    Nov 2011
    Posts
    8

    Re: High availability, problem with restart of Shinken

    It doesn't work with either 1 or 0 for spare; I left it at 1. I cloned server1 to create server3, then changed the IP and hostname. Maybe that's the problem?

    Here are the last 100 lines of nagios.log:

    Code:
    [1322369579] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369579] Sent failed!
    [1322369580] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369580] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369580] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369580] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369580] Sent failed!
    [1322369581] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369581] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369581] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369581] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369581] Sent failed!
    [1322369582] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369582] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369582] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369582] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369582] Sent failed!
    [1322369583] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369583] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369583] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369583] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369583] Sent failed!
    [1322369584] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369584] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369584] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369584] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369584] Sent failed!
    [1322369585] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369585] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369585] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369585] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369585] Sent failed!
    [1322369586] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369586] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369586] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369586] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369586] Sent failed!
    [1322369587] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369587] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369587] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369587] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369587] Sent failed!
    [1322369588] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369588] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369588] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369588] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369588] Sent failed!
    [1322369589] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369589] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369589] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369589] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369589] Sent failed!
    [1322369590] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369590] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369590] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369590] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369590] Sent failed!
    [1322369591] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369591] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369591] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369591] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369591] Sent failed!
    [1322369592] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369592] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369592] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369592] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369592] Sent failed!
    [1322369593] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369593] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369593] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369593] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369593] Sent failed!
    [1322381776] [reactionner-1] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322381776] [reactionner-1] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322381776] [reactionner-1] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322381776] [reactionner-1] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322381776] Sent failed!
    [1322369594] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369594] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369594] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369594] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369594] Sent failed!
    [1322381777] [reactionner-1] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322381777] [reactionner-1] Scheduler scheduler-1 is not initilised or got network problem: ('no object found by this name', u'Checks')
    [1322381777] [reactionner-1] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322381777] [reactionner-1] Scheduler scheduler-1 is not initilised or got network problem: ('no object found by this name', u'Checks')
    [1322381777] Sent failed!
    [1322381777] [broker-1] [broker-1] Connection problem to the scheduler scheduler-1 : connection lost
    [1322369595] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369595] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369595] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369595] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369595] Sent failed!
    [1322381778] [broker-1] [broker-1] the scheduler 'scheduler-1' is not initilised : ('no object found by this name', u'Broks')
    [1322381778] [broker-1] [broker-1] Connection problem to the poller poller-1 : connection lost
    [1322369596] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369596] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369596] [poller-2] Init de connection with scheduler-1 at PYROLOC://localhost:7768/Checks
    [1322369596] [poller-2] Scheduler scheduler-1 is not initilised or got network problem: connection failed
    [1322369596] Sent failed!

  8. #8
    Administrator
    Join Date
    Jun 2011
    Posts
    216

    Re: High availability, problem with restart of Shinken

    OK, I assume that 1 stands for master and 2 for slave.

    If they have different IP and MAC addresses, the cloning has no effect.

    Did you put localhost as your master's hostname? Please use only hostnames or IP addresses in those fields. Do not use localhost.

    This issue also seems to be related to distributed monitoring; see the "Declare the new poller on the main configuration file" part: http://www.shinken-monitoring.org/wi...ibuted_shinken
    This is because satellites send their configuration over the network. The data sent contains the Pyro URI, which comes from the cfg file. If the URI has localhost in it, it will fail.
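
One quick way to find such entries is to scan the configuration for daemons whose address points at the loopback interface. A rough Python sketch under stated assumptions: `is_loopback` and `loopback_daemons` are hypothetical helpers, not Shinken tools; the parser only understands flat `define type{ ... }` blocks with one `key value` pair per line; and the sample addresses are made up for illustration.

```python
import ipaddress
import re

# Matches one "define <type>{ ... }" block; assumes no nested braces.
DEFINE_RE = re.compile(r"define\s+(\w+)\s*\{(.*?)\}", re.DOTALL)


def is_loopback(address):
    """True for 'localhost' and for literal loopback IPs such as 127.0.0.1."""
    if address.lower() in ("localhost", "localhost.localdomain"):
        return True
    try:
        return ipaddress.ip_address(address).is_loopback
    except ValueError:
        # A regular hostname: not resolved here, so assumed to be non-loopback.
        return False


def loopback_daemons(text):
    """Return the names of daemons whose address is a loopback address."""
    found = []
    for daemon_type, body in DEFINE_RE.findall(text):
        props = dict(
            line.split(None, 1) for line in body.splitlines()
            if len(line.split(None, 1)) == 2
        )
        address = props.get("address", "").strip()
        if is_loopback(address):
            found.append(props.get(daemon_type + "_name", daemon_type).strip())
    return found


# Hypothetical excerpt: scheduler-1 on localhost, as the log lines suggest.
SAMPLE = """
define scheduler{
    scheduler_name  scheduler-1
    address         localhost
    port            7768
}
define poller{
    poller_name     poller-2
    address         192.168.1.12
    port            7771
}
"""

if __name__ == "__main__":
    for name in loopback_daemons(SAMPLE):
        print("%s uses a loopback address; remote daemons cannot reach it" % name)
```

Any daemon it lists would be advertised to the other machine with a URI such as PYROLOC://localhost:7768/Checks, which the remote side resolves to itself rather than to the master.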

  9. #9
    Junior Member
    Join Date
    Nov 2011
    Posts
    8

    Re: High availability, problem with restart of Shinken

    I just replaced localhost with the IP of server1 in both shinken-specific.cfg files, then tried to restart, but I still get the same error.

  10. #10
    Junior Member
    Join Date
    Nov 2011
    Posts
    8

    Re: High availability, problem with restart of Shinken

    Something is weird: I have the same config on server1 and server3, but when I restart Shinken on server3, there is no error, unlike on server1.
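
One detail of the error text may help explain the difference between the two servers. On Linux, errno 99 is EADDRNOTAVAIL ("Cannot assign requested address"), which bind() returns when the requested IP address is not configured on the local machine; a port that is genuinely in use gives EADDRINUSE (errno 98) instead. If that is what happens here (an assumption, since the contents of /tmp/badstart_for-arbiter were not posted), the arbiter on server1 would be trying to listen on an address that only server3 owns, which would also fit the fact that changing the port number makes no difference. A small Python sketch to reproduce the two cases; `bind_error` and `explain` are hypothetical helpers, and 192.0.2.1 is a documentation-only address standing in for an IP this machine does not own.

```python
import errno
import socket


def bind_error(host, port):
    """Try to bind a TCP socket; return None on success, else the errno."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return None
    except OSError as exc:
        return exc.errno
    finally:
        sock.close()


def explain(code):
    """Translate a bind() errno into a short diagnosis."""
    if code is None:
        return "bind succeeded"
    if code == errno.EADDRNOTAVAIL:
        return "address is not configured on this machine"
    if code == errno.EADDRINUSE:
        return "port is already in use"
    return errno.errorcode.get(code, "unknown error")


if __name__ == "__main__":
    # Port 0 lets the kernel pick any free port, so only the address matters.
    for host in ("127.0.0.1", "192.0.2.1"):
        print(host, "->", explain(bind_error(host, 0)))
```

Running the same check on server1 with server1's own IP and then with server3's IP should show which address the local machine is actually able to bind.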
