Thread: Spare arbiter and recover

  1. #1
    Junior Member
    Join Date
    Apr 2014
    Posts
    17

    Spare arbiter and recover

    Hi!

    I think I've noticed some strange behavior involving the spare arbiter and the main arbiter on recovery.

    As you can see in the configuration below, I have two arbiters configured: one active and one spare. Each has local access to all configuration files (they are rsynced between the servers).

    The main:
    Code:
    define arbiter {
        arbiter_name  arbiter1
        host_name    arbiter_FQDN
        address     XX.XX.XX.XX
        port      7770
        modules     CommandFile
        use_ssl     0
    }
    The spare:
    Code:
    define arbiter {
        arbiter_name  arbiter1-spare
        host_name    arbiter-spare_FQDN
        address     XX.XX.XX.XX
        port      7770
        spare      1
        modules     CommandFile
        use_ssl     0
    }
    When I test whether the spare arbiter takes over after the main arbiter fails, everything goes well. To simulate the failure, I just stop the arbiter process.
    But unlike other daemons such as the scheduler or poller, when the main arbiter comes back up (while the spare is active), it does not wait for a configuration and does not act as a spare. I think this is because the arbiter can read the configuration, sees that it is not a spare, and acts accordingly. However, the spare stays active, so I end up with 2 active arbiters.
    To get back to a single active arbiter, I have to restart the spare arbiter. This can be troublesome, as the 2 arbiters may send different configurations.

    Should I open a new issue on GitHub, or did I just miss something about the spare arbiter?

  2. #2
    Junior Member
    Join Date
    Aug 2014
    Posts
    9

    Re: Spare arbiter and recover

    Hello,

    Which Shinken version do you have?
    First, I suggest enabling DEBUG logging so you can provide more details.
    I have just configured my own spare to test whether it works correctly, and it does.
    When I start my main arbiter again, I get this log (on my spare arbiter):

    2014-08-27 09:47:28,165 [1409132848] Debug : Debug perf: ping [args:1.59740447998e-05] [aqu_lock:1.90734863281e-05] [calling:2.21729278564e-05] [json:4.41074371338e-05]
    2014-08-27 09:47:28,169 [1409132848] Debug : Debug perf: what_i_managed [args:3.19480895996e-05] [aqu_lock:3.38554382324e-05] [calling:3.88622283936e-05] [json:7.89165496826e-05]
    2014-08-27 09:47:28,173 [1409132848] Debug : HTTP: calling lock for have_conf
    2014-08-27 09:47:28,173 [1409132848] Debug : Debug perf: have_conf [args:0.000118017196655] [aqu_lock:0.000859022140503] [calling:0.000881910324097] [json:0.000914812088013]
    2014-08-27 09:47:28,177 [1409132848] Debug : Debug perf: ping [args:3.09944152832e-05] [aqu_lock:3.2901763916e-05] [calling:3.69548797607e-05] [json:5.00679016113e-05]
    2014-08-27 09:47:35,524 [1409132855] Debug : HTTP: calling lock for put_conf
    2014-08-27 09:47:39,164 [1409132859] Debug : Debug perf: put_conf [args:1.65340185165] [aqu_lock:2.24260997772] [calling:5.2905189991] [json:5.29283881187]
    2014-08-27 09:47:39,173 [1409132859] Debug : Received message to not run. I am the spare, stopping.
    2014-08-27 09:47:39,173 [1409132859] Debug : Debug perf: do_not_run [args:7.15255737305e-06] [aqu_lock:8.10623168945e-06] [calling:0.000357151031494] [json:0.000375032424927]
    2014-08-27 09:47:39,760 [1409132859] Debug : I wait for master
    2014-08-27 09:47:39,760 [1409132859] Info : Waiting for master death
    2014-08-27 09:47:39,761 [1409132859] Info : I'll wait master for 180 seconds
    2014-08-27 09:47:58,830 [1409132878] Debug : HTTP: calling lock for have_conf
    2014-08-27 09:47:58,831 [1409132878] Debug : Debug perf: have_conf [args:6.79492950439e-05] [aqu_lock:0.000437021255493] [calling:0.000452041625977] [json:0.000494003295898]
    2014-08-27 09:47:58,834 [1409132878] Debug : Received message to not run. I am the spare, stopping.
    The only difference between your configuration and mine is that I specify "spare 0" in my main arbiter definition.
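
    For reference, here is a sketch of the main arbiter definition from the first post with the spare flag made explicit. Only the "spare 0" line is new; everything else is copied from the original definition. Whether this changes the recovery behaviour is an assumption to test, not something confirmed here.

```
define arbiter {
    arbiter_name  arbiter1
    host_name     arbiter_FQDN
    address       XX.XX.XX.XX
    port          7770
    # Added for the test: explicitly mark this arbiter as the master.
    spare         0
    modules       CommandFile
    use_ssl       0
}
```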

    Regards

  3. #3
    Junior Member
    Join Date
    Apr 2014
    Posts
    17

    Re: Spare arbiter and recover

    I use version 2.0.3.

    I followed your advice and set both arbiters to debug mode. It seems I had missed it... According to the logs, when the main arbiter comes back up, it automatically takes back its old role:

    Code:
    2014-08-27 17:29:20,872 [1409153360] Info :  Starting HTTP daemon
    2014-08-27 17:29:20,873 [1409153360] Info :  Using a 8 http pool size
    2014-08-27 17:29:20,877 [1409153360] Info :  And arbiter is launched with the hostname:arbiter1 from an arbiter point of view of addr:arbiter1
    2014-08-27 17:29:20,877 [1409153360] Info :  And arbiter is launched with the hostname:arbiter1-spare from an arbiter point of view of addr:arbiter1
    2014-08-27 17:29:20,877 [1409153360] Info :  Begin to dispatch configurations to satellites
    2014-08-27 17:29:21,470 [1409153361] Info :  Dispatching Realm Manager
    .....
    And the spare goes back to its spare state:
    Code:
    2014-08-27 17:28:04,145 [1409153284] Debug :  HTTP: calling lock for have_conf
    2014-08-27 17:28:04,145 [1409153284] Debug :  Debug perf: have_conf [args:4.50611114502e-05] [aqu_lock:0.000291109085083] [calling:0.00029993057251] [json:0.000324010848999]
    2014-08-27 17:28:04,148 [1409153284] Debug :  Received message to not run. I am the spare, stopping.
    2014-08-27 17:28:04,148 [1409153284] Debug :  Debug perf: do_not_run [args:3.09944152832e-06] [aqu_lock:4.05311584473e-06] [calling:0.000328063964844] [json:0.000346183776855]
    2014-08-27 17:28:05,229 [1409153285] Debug :  HTTP: calling lock for have_conf
    2014-08-27 17:28:05,229 [1409153285] Debug :  Debug perf: have_conf [args:4.6968460083e-05] [aqu_lock:0.00040602684021] [calling:0.000421047210693] [json:0.000449895858765]
    It was not so clear without the logs!
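
    For anyone comparing the logs of both arbiters, here is a minimal Python sketch that pulls out the lines showing which role each daemon is in. The function name role_events and the labels are hypothetical. The marker strings are copied from the log excerpts in this thread, not from Shinken's documentation, so other versions may word them differently.

```python
import re

# Matches the line prefix seen in the excerpts in this thread, e.g.
# "2014-08-27 09:47:39,173 [1409132859] Debug : Received message ..."
LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d+ "
    r"\[(?P<epoch>\d+)\]\s+(?P<level>\w+)\s*:\s*(?P<msg>.*)$"
)

# Message fragments taken from the quoted logs; the labels are illustrative.
MARKERS = {
    "Received message to not run": "spare told to stand down",
    "Waiting for master death": "spare waiting for master",
    "Begin to dispatch configurations": "arbiter dispatching (active)",
}


def role_events(lines):
    """Return (epoch, label) tuples for lines that reveal an arbiter's role."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a timestamped log line
        for fragment, label in MARKERS.items():
            if fragment in match.group("msg"):
                events.append((int(match.group("epoch")), label))
    return events
```

    Running role_events over each arbiter's log file (an open file object works as the lines argument) and comparing the epochs side by side should show whether the spare received its "do not run" message after the main arbiter started dispatching again.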

  4. #4
    Junior Member
    Join Date
    Aug 2014
    Posts
    9

    Re: Spare arbiter and recover

    Hi,

    Can you attach both arbiters' logs and tell me exactly when you restarted your main arbiter?
    Also, did you try adding "spare 0" to the main arbiter in your arbiter configuration, just to test whether that changes anything?
    I think I have the same version on my side and I didn't see this error.

    Cheers

