Hi everybody ;D
Sorry for my English,
I need your help and some explanations to configure a distributed architecture with realms.
Context:
I want to set up a Shinken central and monitor some remote sites around the world by installing satellites.
I've seen that I have to use the realm concept to do that.
Configuration:
Shinken central ===> (arbiter, scheduler, poller, broker and reactionner) in realm bagneux, for example
Satellite1 ====> (only scheduler, poller) in realm test1, for example
Satellite2 ====> (only scheduler, poller) in realm test2, for example
etc.
PS: I will use a common broker and a common reactionner (see the sketch below).
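To show what I mean by "common", here is a rough sketch of the broker and reactionner I have in mind, declared once in the top realm; if I understand the manage_sub_realms comment in the default config correctly, setting it to 1 should make them also serve the test sub-realms (all names here are just my examples):

define broker {
    broker_name       broker-master
    address           localhost ; the Shinken central
    port              7772
    manage_arbiters   1 ; take data from the arbiter
    manage_sub_realms 1 ; also take jobs from schedulers of the test sub-realms
    realm             bagneux
}

define reactionner {
    reactionner_name  reactionner-master
    address           localhost
    port              7769
    manage_sub_realms 1 ; same idea for notifications/event handlers
    realm             bagneux
}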
1/ Does the configuration for realm bagneux and realms test1, test2, ... need to be defined only on my Shinken central, or do I need to define it on each satellite too? (in /etc/shinken/realms/bagneux.cfg)
2/ Same question for the poller and scheduler configuration of a satellite: only on the Shinken central, or also on each satellite?
3/ Same for the host configuration of a satellite: do I define all the config only on the Shinken central, using the satellite's realm, or on the remote satellite?
4/ Is it normal that there is no realm directive for service definitions? Is linking the services to hosts that have a realm enough? (see the sketch below)
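For question 4, this is the kind of service definition I assume is enough, with no realm of its own (generic-service and check_snmp_load are just placeholder names, not from my real config):

define service {
    use                  generic-service ; assumed template
    host_name            satellite-test  ; the realm should come from this host
    service_description  Load
    check_command        check_snmp_load ; placeholder command
}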
For now I define all the configuration only on the Shinken central, but I don't know if it works, because I see nothing in the logs of the remote satellite ;(
I show you my config:
Shinken central:
arbiter.cfg
define arbiter {
arbiter_name arbiter-master
#host_name node1 ; CHANGE THIS if you have several Arbiters
address localhost ; DNS name or IP
port 7770
spare 0 ; 1 = is a spare, 0 = is not a spare
## Interesting modules:
# - named-pipe = Open the named pipe nagios.cmd
# - mongodb = Load hosts from a mongodb database
# - pickle-retention-arbiter = Save data before exiting
# - nsca = NSCA server
# - vmware-auto-linking = Look up a vSphere server for dependencies
# - import-glpi = Import configuration from GLPI (need plugin monitoring for GLPI in server side)
# - tsca = TSCA server
# - mysql-mport = Load configuration from a MySQL database
# - ws-arbiter = WebService for pushing results to the arbiter
# - collectd = Receive collectd perfdata
# - snmp-booster = Snmp bulk polling module, configuration linker
# - import-landscape = Import hosts from Landscape (Ubuntu/Canonical management tool)
# - aws = Import hosts from Amazon AWS (here EC2)
# - ip-tag = Tag a host based on its IP range
# - file-tag = Tag a host if it is in a flat file
# - csv-tag = Tag a host from the content of a CSV file
modules named-pipe
#modules named-pipe, mongodb, nsca, vmware-auto-linking, ws-arbiter, collectd, mport-landscape, snmp-booster, AWS
# Enable https or not
use_ssl 0
# enable certificate/hostname check, will avoid man in the middle attacks
hard_ssl_name_check 0
## Uncomment these lines in a HA architecture so the master and slaves know
## how long they may wait for each other.
#timeout 3 ; Ping timeout
#data_timeout 120 ; Data send timeout
#max_check_attempts 3 ; If ping fails N or more, then the node is dead
#check_interval 60 ; Ping node every N seconds
}
scheduler.cfg
define scheduler {
scheduler_name scheduler-master ; Just the name
address localhost ; IP or DNS address of the daemon
port 7768 ; TCP port of the daemon
## Optional
spare 0 ; 1 = is a spare, 0 = is not a spare
weight 1 ; Some schedulers can manage more hosts than others
timeout 3 ; Ping timeout
data_timeout 120 ; Data send timeout
max_check_attempts 3 ; If ping fails N or more, then the node is dead
check_interval 60 ; Ping node every N seconds
## Interesting modules that can be used:
# - pickle-retention-file = Save data before exiting in flat-file
# - mem-cache-retention = Same, but in a MemCache server
# - redis-retention = Same, but in a Redis server
# - retention-mongodb = Same, but in a MongoDB server
# - nagios-retention = Read retention info from a Nagios retention file
# (does not save, only read)
# - snmp-booster = Snmp bulk polling module
#modules pickle-retention-file
modules
## Advanced Features
# Realm is for multi-datacenters
#realm All
realm Bagneux
# Skip initial broks creation. Boot fast, but some broker modules won't
# work with it!
skip_initial_broks 0
# In NATted environments, you declare each satellite ip[:port] as seen by
# *this* scheduler (if the port is not set, the port declared by the
# satellite itself is used)
#satellitemap poller-1=1.2.3.4:1772, reactionner-1=1.2.3.5:1773, ...
# Enable https or not
use_ssl 0
# enable certificate/hostname check, will avoid man in the middle attacks
hard_ssl_name_check 0
}
define scheduler {
scheduler_name scheduler-test ; Just the name
address 10.38.231.190 ; IP or DNS address of the daemon
port 7768 ; TCP port of the daemon
realm Test
spare 0
}
poller.cfg
define poller {
poller_name poller-master
address localhost
port 7771
## Optional
spare 0 ; 1 = is a spare, 0 = is not a spare
manage_sub_realms 0 ; Does it take jobs from schedulers of sub-Realms?
min_workers 0 ; Starts with N processes (0 = 1 per CPU)
max_workers 0 ; No more than N processes (0 = 1 per CPU)
processes_by_worker 256 ; Each worker manages N checks
polling_interval 1 ; Get jobs from schedulers each N seconds
timeout 3 ; Ping timeout
data_timeout 120 ; Data send timeout
max_check_attempts 3 ; If ping fails N or more, then the node is dead
check_interval 60 ; Ping node every N seconds
## Interesting modules that can be used:
# - booster-nrpe = Replaces the check_nrpe binary. Therefore it
# improves performance when there are lots of NRPE
# calls.
# - named-pipe = Allow the poller to read a nagios.cmd named pipe.
# This permits the use of distributed check_mk checks
# should you desire it.
# - snmp-booster = Snmp bulk polling module
modules booster-nrpe
## Advanced Features
#passive 0 ; For DMZ monitoring, set to 1 so the connections
; will be from scheduler -> poller.
# Poller tags are the tags that the poller will manage. Use None as tag name
# to manage untagged checks
#poller_tags None
# Enable https or not
use_ssl 0
# enable certificate/hostname check, will avoid man in the middle attacks
hard_ssl_name_check 0
# realm All
realm Bagneux
}
define poller {
poller_name poller-test
address 10.38.231.190
port 7771
realm Test
}
broker.cfg
define broker {
broker_name broker-master
address localhost
port 7772
spare 0
## Optional
manage_arbiters 1 ; Take data from Arbiter. There should be only one
; broker for the arbiter.
manage_sub_realms 1 ; Does it take jobs from schedulers of sub-Realms?
timeout 3 ; Ping timeout
data_timeout 120 ; Data send timeout
max_check_attempts 3 ; If ping fails N or more, then the node is dead
check_interval 60 ; Ping node every N seconds
## Modules
# Default: None
# Interesting modules that can be used:
# - simple-log = just all logs into one file
# - livestatus = livestatus listener
# - tondodb-mysql = NDO DB support
# - npcdmod = Use the PNP addon
# - graphite = Use a Graphite time series DB for perfdata
# - webui = Shinken Web interface
# - glpidb = Save data in GLPI MySQL database
modules webui,livestatus,npcdmod
# Enable https or not
use_ssl 0
# enable certificate/hostname check, will avoid man in the middle attacks
hard_ssl_name_check 0
## Advanced
# realm All
realm Bagneux
}
reactionner.cfg
define reactionner {
reactionner_name reactionner-master
address localhost
port 7769
spare 0
## Optional
manage_sub_realms 0 ; Does it take jobs from schedulers of sub-Realms?
min_workers 1 ; Starts with N processes (0 = 1 per CPU)
max_workers 15 ; No more than N processes (0 = 1 per CPU)
polling_interval 1 ; Get jobs from schedulers each 1 second
timeout 3 ; Ping timeout
data_timeout 120 ; Data send timeout
max_check_attempts 3 ; If ping fails N or more, then the node is dead
check_interval 60 ; Ping node every N seconds
## Modules
modules
# Reactionner tags are the tags that the reactionner will manage. Use None as
# tag name to manage untagged notifications/event handlers
#reactionner_tags None
# Enable https or not
use_ssl 0
# enable certificate/hostname check, will avoid man in the middle attacks
hard_ssl_name_check 0
## Advanced
#realm All
realm Bagneux
}
realms/bagneux.cfg
define realm {
realm_name Bagneux
realm_members Test
default 1 ;Is the default realm. Should be unique!
}
define realm {
realm_name Test
default 0
}
Host config: satellite-test.cfg
define host{
use linux-snmp,generic-host,host-pnp
contact_groups admins
host_name satellite-test
address 10.38.231.190
realm Test
}

From the log file of the Shinken central, everything is OK.
On the satellite:
poller.cfg
define poller {
# poller_name poller-master
poller_name poller-test
# address localhost
address 10.38.231.190
port 7771
## Optional
spare 0 ; 1 = is a spare, 0 = is not a spare
manage_sub_realms 0 ; Does it take jobs from schedulers of sub-Realms?
min_workers 0 ; Starts with N processes (0 = 1 per CPU)
max_workers 0 ; No more than N processes (0 = 1 per CPU)
processes_by_worker 256 ; Each worker manages N checks
polling_interval 1 ; Get jobs from schedulers each N seconds
timeout 3 ; Ping timeout
data_timeout 120 ; Data send timeout
max_check_attempts 3 ; If ping fails N or more, then the node is dead
check_interval 60 ; Ping node every N seconds
## Interesting modules that can be used:
# - booster-nrpe = Replaces the check_nrpe binary. Therefore it
# improves performance when there are lots of NRPE
# calls.
# - named-pipe = Allow the poller to read a nagios.cmd named pipe.
# This permits the use of distributed check_mk checks
# should you desire it.
# - snmp-booster = Snmp bulk polling module
modules booster-nrpe
## Advanced Features
#passive 0 ; For DMZ monitoring, set to 1 so the connections
; will be from scheduler -> poller.
# Poller tags are the tags that the poller will manage. Use None as tag name
# to manage untagged checks
#poller_tags None
# Enable https or not
use_ssl 0
# enable certificate/hostname check, will avoid man in the middle attacks
hard_ssl_name_check 0
# realm All
realm Test
}
scheduler.cfg
define scheduler {
scheduler_name scheduler-test ; Just the name
# address localhost ; IP or DNS address of the daemon
address 10.38.231.190 ; IP or DNS address of the daemon
port 7768 ; TCP port of the daemon
## Optional
spare 0 ; 1 = is a spare, 0 = is not a spare
weight 1 ; Some schedulers can manage more hosts than others
timeout 3 ; Ping timeout
data_timeout 120 ; Data send timeout
max_check_attempts 3 ; If ping fails N or more, then the node is dead
check_interval 60 ; Ping node every N seconds
## Interesting modules that can be used:
# - pickle-retention-file = Save data before exiting in flat-file
# - mem-cache-retention = Same, but in a MemCache server
# - redis-retention = Same, but in a Redis server
# - retention-mongodb = Same, but in a MongoDB server
# - nagios-retention = Read retention info from a Nagios retention file
# (does not save, only read)
# - snmp-booster = Snmp bulk polling module
#modules pickle-retention-file
modules
## Advanced Features
# Realm is for multi-datacenters
#realm All
realm Test
# Skip initial broks creation. Boot fast, but some broker modules won't
# work with it!
skip_initial_broks 0
# In NATted environments, you declare each satellite ip[:port] as seen by
# *this* scheduler (if the port is not set, the port declared by the
# satellite itself is used)
#satellitemap poller-1=1.2.3.4:1772, reactionner-1=1.2.3.5:1773, ...
# Enable https or not
use_ssl 0
# enable certificate/hostname check, will avoid man in the middle attacks
hard_ssl_name_check 0
}
And in the log files of the satellite I see nothing, neither in the scheduler log nor in the poller log.
Normally we should see some checks, no?
All the remote checks are reported on the scheduler of the Shinken central.