Thread: [RESOLVED] shinken-broker/Livestatus crash

  1. #1

    [RESOLVED] shinken-broker/Livestatus crash

    Hi there !

    Since my update to build #158, I've got problems with the broker and the livestatus module.
    This time I "cleaned" my install by deleting the old .db files. Ever since, I'm getting this:
    Code:
    [1325676262] [broker-master] ERROR : the module Livestatus just crash! Please look at the traceback:
    [1325676262] [broker-master] Traceback (most recent call last):
     File "/usr/local/lib/python2.6/dist-packages/shinken/modules/livestatus_broker/livestatus_broker.py", line 781, in main
      self.do_main()
     File "/usr/local/lib/python2.6/dist-packages/shinken/modules/livestatus_broker/livestatus_broker.py", line 1021, in do_main
      response, keepalive = self.livestatus.handle_request(open_connections[socketid]['buffer'])
     File "/usr/local/lib/python2.6/dist-packages/shinken/modules/livestatus_broker/livestatus.py", line 88, in handle_request
      request.parse_input(data)
     File "/usr/local/lib/python2.6/dist-packages/shinken/modules/livestatus_broker/livestatus_request.py", line 93, in parse_input
      query.parse_input('\n'.join(wait_cmds))
     File "/usr/local/lib/python2.6/dist-packages/shinken/modules/livestatus_broker/livestatus_wait_query.py", line 71, in parse_input
      host_name, service_description = object.split(';', 1)
    ValueError: need more than 1 value to unpack
    
    [1325676263] [broker-master] Error : the external module Livestatus goes down unexpectly!
    [1325676263] [broker-master] Setting the module Livestatus to restart
    I'm not sure, but it seems related to Thruk: when I request an external command, or on some refreshes (not all of them), the broker crashes. That's very weird!
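The traceback above ends at `object.split(';', 1)` being unpacked into two names. As a minimal sketch of why that raises (not the actual Shinken code): when the string contains no `;`, `split` returns a single element and the two-name unpacking fails with a `ValueError`. The helper `split_wait_object` below is hypothetical and only illustrates one way to guard against that case; the sample object names are borrowed from later in this thread.

```python
def split_wait_object(obj):
    """Hypothetical helper: split 'host;service' into its two parts.

    Returns (host_name, service_description). service_description is
    None when the string carries no ';' separator, instead of raising.
    """
    parts = obj.split(';', 1)
    if len(parts) == 2:
        return parts[0], parts[1]
    return parts[0], None


# The unguarded form from the traceback fails when no ';' is present.
# Python 2.6 reports "need more than 1 value to unpack"; Python 3 words
# the same ValueError differently.
try:
    host_name, service_description = "host-name-to-check".split(';', 1)
except ValueError as exc:
    unpack_error = str(exc)

with_separator = split_wait_object("host-name-to-check;Camera Image")
without_separator = split_wait_object("host-name-to-check")
```

Whether returning `None` for the service is the right behaviour for the module is an open question; the point is only that the input reaching line 71 apparently had no `;` in it.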

    In my logs I've got lots of these (sometimes for different realms):
    Code:
    1325680792] Warning : 39 actions never came back for the satellite 'poller-corporate'. I'm reenable them for polling
    I checked the firewalls and they're OK. I double-checked shinken-specific.cfg: all port numbers and IP addresses are OK.

    On launch, everything seems all right too:
    Code:
    [1325674430] I correctly loaded the modules : [PickleRetentionArbiter,CommandFile]
    [1325674431] All : (in/potential) (schedulers:1) (pollers:1/1) (reactionners:1/1) (brokers:1/1) (receivers:1/1)
    [1325674431] Internal : (in/potential) (schedulers:1) (pollers:1/1) (reactionners:1/1) (brokers:1/1) (receivers:1/1)
    [1325674431] Clients-B2B : (in/potential) (schedulers:2) (pollers:1/2) (reactionners:1/1) (brokers:1/1) (receivers:1/1)
    [1325674431] Bxxxxxxxx : (in/potential) (schedulers:1) (pollers:1/2) (reactionners:1/1) (brokers:1/1) (receivers:1/1)
    [1325674431] Hxxxxxxxx : (in/potential) (schedulers:1) (pollers:1/2) (reactionners:1/1) (brokers:1/1) (receivers:1/1)
    [1325674431] Rxxxxxxxx : (in/potential) (schedulers:1) (pollers:1/2) (reactionners:1/1) (brokers:1/1) (receivers:1/1)
    [1325674431] Clients-B2C : (in/potential) (schedulers:1) (pollers:1/1) (reactionners:1/1) (brokers:1/1) (receivers:1/1)
    [1325674431] Corporate : (in/potential) (schedulers:1) (pollers:1/1) (reactionners:1/1) (brokers:1/1) (receivers:1/1)

  2. #2
    Shinken project leader
    Join Date
    May 2011
    Location
    Bordeaux (France)
    Posts
    2,131

    Re: shinken-broker/Livestatus crash

    Is it happening for some specific hosts or services?
    No direct support by personal message. Please open a thread so everyone can see the solution

  3. #3
    Senior Member
    Join Date
    Oct 2011
    Posts
    139

    Re: shinken-broker/Livestatus crash

    Wait queries are only used for external commands, such as rescheduling hosts/services.
    I have fixed a bug in the livestatus module that wasn't using the correct separator (space in place of ';'), so I don't think the problem is there. Maybe it's an empty response from the livestatus module. It would be better to restart the broker module in debug mode and provide the broker-debug file.

  4. #4

    Re: shinken-broker/Livestatus crash

    Here is the broker debug when it is crashing at an external command.
    Gist is here : https://gist.github.com/1560427

  5. #5
    Senior Member
    Join Date
    Oct 2011
    Posts
    139

    Re: shinken-broker/Livestatus crash

    Ok
    Could you please give us the exact steps to reproduce the crash ?

    thanx

  6. #6

    Re: shinken-broker/Livestatus crash

    Sorry for the delay, I was struggling with (totally off topic) VLC and HTTP Live Streaming (Pain in the a** for not much worth)...

    @naparuba: My bad, I did not see your answer between mine and dguenault's. No, it's not specific to one host/service; it is systematic.

    To reproduce, I just send an external command (a reschedule) from Thruk to one of my realms, nothing more. Then it instantly crashes. It has happened ever since I installed the latest release from GitHub that day (naparuba-shinken-0.8.5-158-g32b68cd.tar.gz). There was no error during setup, and it's a step I have already done several times with no problem whatsoever.
    If I just submit a passive check result, the broker is OK, so it can swallow commands without crashing, just not a reschedule.
    For a moment I thought it was crashing because I had Shinken files from an old install, but after checking file atimes that's not the case.

    In my shinken-specific.cfg I added the CommandFile module to the arbiter.

  7. #7
    Senior Member
    Join Date
    Oct 2011
    Posts
    139

    Re: shinken-broker/Livestatus crash

    Which setup method did you use, and on which distro and version (Red Hat 5/6, Debian, ...)?
    What was the error message?

  8. #8

    Re: shinken-broker/Livestatus crash

    I used 'python setup.py install --install-dir=/usr/bin' on Debian squeeze amd64.
    The error message is the same as in my first post, crashing about "ValueError: need more than 1 value to unpack".

  9. #9

    Re: shinken-broker/Livestatus crash

    Hi again.

    It's been a few days with this issue, and I updated the livestatus_query.py file to the latest from Git (hoping it would help).
    No luck, the "need more than 1 value to unpack" error is still there :'(
    Then I tried replacing the split at line 71 with the old one, using ' ' instead of ';'... no luck (I know it was stupid).

    I noticed one thing: it only happens for service reschedules; if I reschedule a check_ping it's OK.

  10. #10

    Re: shinken-broker/Livestatus crash

    Last update: I did pinpoint Thruk as the culprit, but I need one clarification to be sure my test is OK.
    I'm testing external commands sent through a small shell script like this:
    Code:
    #!/bin/sh
    
    now=`date +%s`
    commandfile='/var/lib/shinken/rw/nagios.cmd'
    
    /usr/bin/printf "[%lu] SCHEDULE_SVC_CHECK;host-name-to-check;Camera Image;$now\n" $now > $commandfile
    I need to know if my script is OK, because when I send the command with it there is no crash in the broker. The same command sent through Thruk causes the crash.
    In the logs the two commands look exactly the same, so I'm not sure about the script. I'm not seeing any result coming back in my logs, but that was already the case before the crashes started.

    So, is this small bit of shell OK?
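For comparison, here is a rough Python sketch of what the shell script does. It builds the same `SCHEDULE_SVC_CHECK` line in the usual Nagios external command layout (`[timestamp] COMMAND;host;service;check_time`) and writes it to the command file. The function names are made up for illustration, the path is the one from the script, and none of this has been tested against a running Shinken.

```python
import time


def format_schedule_svc_check(host_name, service_description, check_time, now=None):
    """Build one SCHEDULE_SVC_CHECK external command line (hypothetical helper)."""
    if now is None:
        now = int(time.time())
    return "[%d] SCHEDULE_SVC_CHECK;%s;%s;%d\n" % (
        now, host_name, service_description, check_time)


def send_external_command(line, command_file="/var/lib/shinken/rw/nagios.cmd"):
    """Write a prepared command line to the command file (a named pipe)."""
    with open(command_file, "w") as pipe:
        pipe.write(line)


# Same values as the shell script: check time equal to "now".
example_line = format_schedule_svc_check(
    "host-name-to-check", "Camera Image", 1325680792, now=1325680792)
```

One caveat, going only by the traceback: the crash is raised in livestatus_wait_query.py, and a line written straight to the command file presumably never reaches that code, so a clean run of this test may not say much about what Thruk sends.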
