Thread: Discovering interfaces to monitor on a router/switch

  1. #21

    Re: Discovering interfaces to monitor on a router/switch

    After restarting memcached (it hangs very often during my tests, I don't know why), I get messages like:

    Code:
    [1350464493] Error :  [SnmpBooster] Host not found: (--ip--)
    [1350464493] Error :  [SnmpBooster] Host not found: (--ip--)
    [1350464493] Error :  [SnmpBooster] Host not found: (--ip--)
    [1350464493] Error :  [SnmpBooster] Host not found: (--ip--)
    [1350464493] Error :  [SnmpBooster] Host not found: (--ip--)
    [1350464493] Error :  [SnmpBooster] Host not found: (--ip--)

    Doing a dump of memcache, I can see lots of information, very similar to what we have in the Defaults_unified.ini file.
    I couldn't find any of my device-specific config in memcache, though.

    At some point of my debugging I saw SNMP bulk traffic, so something worked:

    Code:
    09:57:03.426118 IP poller-ip.33003 > monitored-equip-ip.161: C=MY-COMMUNITY GetBulk(27) N=0 M=25 .1.3.6.1.2.1.2.2.1.2
    09:57:03.442573 IP monitored-equip-ip.161 > poller-ip.33003: C=MY-COMMUNITY GetResponse(543) .1.3.6.1.2.1.2.2.1.2.1=[|snmp]                                
    09:57:03.448706 IP poller-ip.33003 > monitored-equip-ip.161: C=MY-COMMUNITY GetBulk(29) N=0 M=25 .1.3.6.1.2.1.2.2.1.2.508                                 
    09:57:03.464182 IP monitored-equip-ip.161 > poller-ip.33003: C=MY-COMMUNITY GetResponse(591) .1.3.6.1.2.1.2.2.1.2.509=[|snmp]                               
    09:57:03.466646 IP poller-ip.33003 > monitored-equip-ip.161: C=MY-COMMUNITY GetBulk(29) N=0 M=25 .1.3.6.1.2.1.2.2.1.2.533                                 
    09:57:03.482454 IP monitored-equip-ip.161 > poller-ip.33003: C=MY-COMMUNITY GetResponse(621) .1.3.6.1.2.1.2.2.1.2.534=[|snmp]                               
    09:57:03.484908 IP poller-ip.33003 > monitored-equip-ip.161: C=MY-COMMUNITY GetBulk(29) N=0 M=25 .1.3.6.1.2.1.2.2.1.2.558

    But that was only once.

    Thanks

  2. #22
    Administrator
    Join Date
    Dec 2011
    Posts
    278

    Re: Discovering interfaces to monitor on a router/switch

    There seems to be something wrong with the communication between your memcache server and the snmp_poller.py module.

    Can you add a logger.warning statement to your snmp_poller in the arbiter function, as mentioned in the last post? This will tell you if it is writing successfully to memcache. Also check your arbiter log file.

    How many services have you added for SnmpBooster?

    Try setting the IP to 127.0.0.1 for your memcache server.

    Can you try an alternate memcache implementation, like memcachedb, on a different port? Just update your shinken-specific.cfg file.

    Cheers,

    xkilian

  3. #23

    Re: Discovering interfaces to monitor on a router/switch

    After some intensive debugging, I got this so far:

    Code:
    [1350484467] Error :  [SnmpBooster] openglx log: no old datas for obj_key 10.226.40.112
    [1350484467] Error :  [SnmpBooster] Error adding : Host (--ip--) - Service if.ge-0_0_2.1001 - Error related to: 'TRIGGERGROUP'

    So it was trying to add a new host service to memcached but failed; that's why my other node can't do it.

    I'll hack around a little. I'm not experienced in Python, so it may take a while :P

  4. #24

    Re: Discovering interfaces to monitor on a router/switch

    Adding a [TRIGGERGROUP] (no child params) to the Defaults_unified.ini file (again, it was copied from the genDev script) changed the error.

    Now we have:

    Code:
    [1350485069] Error :  [SnmpBooster] openglx log: no old datas for obj_key (--ip--)
    [1350485069] Error :  [SnmpBooster] Error adding : Host (--ip--) - Service if.ge-0_0_4.50 - Error related to: 'Snmp_poller' object has no attribute 'datasource'

    So I tried to define datasource in the __init__ definition of Snmp_poller, but I probably got it wrong because it didn't change anything... I said I'm not a Python guy


    I got lots of messages, maybe I shouldn't have used 400+ services per host at the first try? :-\ :-P

  5. #25

    Re: Discovering interfaces to monitor on a router/switch

    Try a single host with a chassis and an interface, and see if that works well.

    Are you seeing an error in the Arbiter output, or did you add a logger call that has a bug in it?


    xkilian

  6. #26

    Re: Discovering interfaces to monitor on a router/switch

    Quote:
    So I tried to define datasource in the __init__ definition of Snmp_poller, but I probably got it wrong because it didn't change anything... I said I'm not a Python guy

    Not sure what you are doing there...

    All OID, DATASOURCE, DSTEMPLATE, TRIGGER, and TRIGGERGROUP information is maintained in the Defaults_unified.ini file.

    You should never, ever, modify snmp_poller.py. Python knowledge is not required to use the poller!

    Okay, you added a TRIGGERGROUP to your Shinken service definition.

    In your Defaults_unified.ini file:

    Does the TRIGGERGROUP exist? Does the TRIGGERGROUP refer to existing TRIGGERs?

    Each TRIGGER can only make use of DATASOURCES which are collected as part of the DSTEMPLATE you are using.

    Each DSTEMPLATE refers to DATASOURCEs which are collected.

    Each DATASOURCE defines what to do with the collected data (scaling calculations, type of data, etc) and OIDs.

    Each OID refers to an SNMP oid.

    That is the long-winded logical explanation, which should help you identify whether you made a mistake. SnmpBooster should tell you if something doesn't make sense, but it is not perfect.
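
    To make the chain TRIGGERGROUP -> TRIGGER -> DATASOURCE -> OID concrete, here is a minimal Python sketch that walks it and reports dangling references. The dictionary layout, the key names (`triggers`, `ds`, `oid`) and the sample entries are assumptions for illustration only; they are not the actual Defaults_unified.ini format, and `check_chain` is a hypothetical helper, not part of SnmpBooster.

```python
# Hypothetical, simplified model of the reference chain described above.
# The real Defaults_unified.ini layout may differ; all names are illustrative.
config = {
    "OID": {"ifInOctets": ".1.3.6.1.2.1.2.2.1.10"},
    "DATASOURCE": {"in_octets": {"oid": "ifInOctets", "type": "DERIVE"}},
    "DSTEMPLATE": {"standard-interface": {"ds": ["in_octets"]}},
    "TRIGGER": {"in_high": {"ds": ["in_octets"]}},
    # "out_high" is deliberately undefined to show a broken reference.
    "TRIGGERGROUP": {"interface-hc": {"triggers": ["in_high", "out_high"]}},
}


def check_chain(cfg, dstemplate, triggergroup):
    """Return a list of problems found; an empty list means consistent."""
    template = cfg["DSTEMPLATE"].get(dstemplate)
    if template is None:
        return ["DSTEMPLATE %r does not exist" % dstemplate]
    problems = []
    collected = set(template["ds"])

    # Each DATASOURCE in the template must exist and point to a known OID.
    for ds_name in sorted(collected):
        ds = cfg["DATASOURCE"].get(ds_name)
        if ds is None:
            problems.append("DATASOURCE %r does not exist" % ds_name)
        elif ds["oid"] not in cfg["OID"]:
            problems.append("DATASOURCE %r uses unknown OID %r" % (ds_name, ds["oid"]))

    # The TRIGGERGROUP must exist and refer only to existing TRIGGERs.
    group = cfg["TRIGGERGROUP"].get(triggergroup)
    if group is None:
        problems.append("TRIGGERGROUP %r does not exist" % triggergroup)
        return problems
    for trig_name in group["triggers"]:
        trig = cfg["TRIGGER"].get(trig_name)
        if trig is None:
            problems.append("TRIGGERGROUP %r refers to missing TRIGGER %r"
                            % (triggergroup, trig_name))
            continue
        # A TRIGGER may only use DATASOURCEs collected by the DSTEMPLATE.
        for ds_name in trig["ds"]:
            if ds_name not in collected:
                problems.append("TRIGGER %r uses DATASOURCE %r not in DSTEMPLATE %r"
                                % (trig_name, ds_name, dstemplate))
    return problems


for problem in check_chain(config, "standard-interface", "interface-hc"):
    print(problem)
```

    With the sample data, the only problem reported is the TRIGGERGROUP pointing at a TRIGGER that was never defined.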

    xkilian

  7. #27

    Re: Discovering interfaces to monitor on a router/switch

    Also note that if your memcache is restarted/crashes, all the host data is lost and the poller will complain about missing host keys!

    Use memcachedb to have a persistent memcached service so you do not run into this. Of course, if your memcache is crashing, you have other problems.
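
    One way to confirm whether a host entry survived a restart is to ask memcached for the key directly. The sketch below speaks the memcached text protocol over a plain socket. The address 127.0.0.1:11211 is the usual memcached default, and the idea that the key is simply the host address (as the `obj_key` log lines seem to suggest) is an assumption, not something confirmed in this thread. `parse_get_reply` and `key_exists` are hypothetical helper names.

```python
import socket


def parse_get_reply(raw):
    """Return the stored bytes from a 'get' reply, or None if the key is absent."""
    if raw.startswith(b"END\r\n"):
        return None  # memcached answers a miss with a bare END line
    header, _, rest = raw.partition(b"\r\n")
    parts = header.split()
    # A hit starts with: VALUE <key> <flags> <bytes>
    if len(parts) < 4 or parts[0] != b"VALUE":
        raise ValueError("unexpected reply: %r" % header)
    size = int(parts[3])
    return rest[:size]


def key_exists(key, host="127.0.0.1", port=11211, timeout=2.0):
    """Ask a memcached server whether `key` is present (assumed key format)."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"get " + key.encode("ascii") + b"\r\n")
        raw = b""
        # Read until the terminating END line (or an ERROR line) arrives.
        while not raw.endswith((b"END\r\n", b"ERROR\r\n")):
            chunk = conn.recv(4096)
            if not chunk:
                break
            raw += chunk
    return parse_get_reply(raw) is not None
```

    Calling `key_exists("10.0.0.1")` right after a memcached restart would be expected to return False until the arbiter has written the host data again, under the key-format assumption above.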

    xkilian

  8. #28

    Re: Discovering interfaces to monitor on a router/switch

    xkilian,
    Don't know what the problem was, but now it works.

    After a few tests I changed to a smaller number of things to monitor (actually only chassis and chassis.device-traffic now), and data finally gets back to check_mk.

    At least we caught some questions about documentation and such, which I see you have already committed to git.

    Thanks for your help; I hope to see this feature in mainline Shinken soon.

    If you need a beta tester for the next steps, let me know. I'm very interested in this feature.


    Once more, thanks.
    Kindly,
    openglx

  9. #29

    Re: Discovering interfaces to monitor on a router/switch

    Oh, by the way, I have three interfaces registered and being monitored.

    Like this one:

    Code:
    define service {
      host_name             (--ip--)
      service_description   if.em0
      service_dependencies  ,chassis
      display_name          em0 Description: 1000.0 MBits/s ethernetCsmacd
      _display_order        888
      _ds_max               1000000000
      _ds_max_octets        125000000
      _dstemplate           standard-interface
      _inst                 map(interface-name,em0)
      notes                 1000.0 MBits/s ethernetCsmacd
      use                   default-snmp-template
      register              1
    }

    Its state is OK on the LiveStatus and with frequent checks. Still, the output from the plugin is:

    Code:
    FROM CACHE: em0: Waiting an additional check to calculate derive - Waiting an additional check to calculate derive - Waiting an additional check to calculate derive - Waiting an additional check to calculate derive - Waiting an additional check to calculate derive - Waiting an additional check to calculate derive

    And I don't see any service performance data available for it.

    What could be wrong here?


    Thanks

  10. #30

    Re: Discovering interfaces to monitor on a router/switch

    Is it actually getting data successfully from the network device? You can see with Wireshark whether the SNMP query was successful.

    For data collection of DERIVE data types:

    1 - Get data, state unknown: use it for instance mapping, return "instance mapping in progress"
    2 - Get data, state OK: return "waiting for additional data to calculate derive"
    3 - Get data, state OK: return performance data
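
    The three steps above can be sketched roughly as follows. This is only an illustration, not SnmpBooster's code: the `DeriveCheck` class, its `poll` method, and the returned strings are hypothetical, the rate is a plain counter delta divided by elapsed seconds, and counter wrap is ignored.

```python
class DeriveCheck:
    """Hypothetical three-step state machine for one DERIVE datasource."""

    def __init__(self):
        self.mapped = False    # step 1 done: instance has been mapped
        self.previous = None   # (timestamp, counter) from the last usable poll

    def poll(self, timestamp, counter):
        # Step 1: the first poll is only used for instance mapping.
        if not self.mapped:
            self.mapped = True
            return ("UNKNOWN", "instance mapping in progress")
        # Step 2: first usable sample, nothing earlier to compare against.
        if self.previous is None:
            self.previous = (timestamp, counter)
            return ("OK", "waiting for additional data to calculate derive")
        # Step 3: two samples available, so a rate can be computed.
        old_time, old_counter = self.previous
        elapsed = timestamp - old_time
        if elapsed <= 0:
            # No time has passed; keep the old sample and wait.
            return ("OK", "waiting for additional data to calculate derive")
        self.previous = (timestamp, counter)
        rate = (counter - old_counter) / elapsed
        return ("OK", "rate=%.1f/s" % rate)


check = DeriveCheck()
for when, value in [(0, 1000), (60, 1000), (120, 7000)]:
    print(check.poll(when, value))
```

    In this sketch, performance data only appears on the third poll, which is why a freshly added service shows the "waiting" message at first.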

    Cheers,

    xkilian
