Monit exec does not run when the monitored process dies
Using Monit 5.15 on FreeBSD 10.2:
set daemon 5
set logfile syslog
set pidfile /var/run/monit.pid
set idfile /var/.monit.id
set statefile /var/.monit.state
set alert [email protected]
set mailserver localhost
set httpd port 2812 and
    use address 192.168.40.72
    allow 192.168.20.0/24
    allow admin:monit
check process haproxy with pidfile /var/run/haproxy.pid
    if failed host 192.168.40.72 port 9090 type tcp
        then exec "/bin/sh -c '/bin/echo `/bin/date` >> /tmp/monit.test'"
When I start monit with -vI and kill haproxy, I get the following output:
Adding net allow '192.168.20.0/24'
Adding credentials for user 'admin'
Runtime constants:
Control file = /usr/local/etc/monitrc
Log file = syslog
Pid file = /var/run/monit.pid
Id file = /var/.monit.id
State file = /var/.monit.state
Debug = True
Log = True
Use syslog = True
Is Daemon = True
Use process engine = True
Poll time = 5 seconds with start delay 0 seconds
Expect buffer = 256 bytes
Mail server(s) = localhost:25 with timeout 30 seconds
Mail from = (not defined)
Mail subject = (not defined)
Mail message = (not defined)
Start monit httpd = True
httpd bind address = 192.168.40.72
httpd portnumber = 2812
httpd ssl = Disabled
httpd signature = Enabled
httpd auth. style = Basic Authentication and Host/Net allow list
Alert mail to = [email protected]
Alert on = All events
The service list contains the following entries:
Process Name = haproxy
Pid file = /var/run/haproxy.pid
Monitoring mode = active
Existence = if does not exist then restart
Port = if failed [192.168.40.72]:9090 type TCP/IP protocol DEFAULT with timeout 5 seconds then exec '/bin/sh -c /bin/echo `/bin/date` >> /tmp/monit.test'
System Name = appsrv01
Monitoring mode = active
-------------------------------------------------------------------------------
pidfile '/var/run/monit.pid' does not exist
Starting Monit 5.15 daemon with http interface at [192.168.40.72]:2812
Starting Monit HTTP server at [192.168.40.72]:2812
Monit HTTP server started
'appsrv01' Monit 5.15 started
Sending Monit instance changed notification to [email protected]
'haproxy' process is running with pid 42999
'haproxy' zombie check succeeded
'haproxy' succeeded testing protocol [DEFAULT] at [192.168.40.72]:9090 [TCP/IP]
'haproxy' connection succeeded to [192.168.40.72]:9090 [TCP/IP]
'haproxy' process is running with pid 42999
'haproxy' zombie check succeeded
'haproxy' succeeded testing protocol [DEFAULT] at [192.168.40.72]:9090 [TCP/IP]
'haproxy' connection succeeded to [192.168.40.72]:9090 [TCP/IP]
'haproxy' process is running with pid 42999
'haproxy' zombie check succeeded
'haproxy' succeeded testing protocol [DEFAULT] at [192.168.40.72]:9090 [TCP/IP]
'haproxy' connection succeeded to [192.168.40.72]:9090 [TCP/IP]
'haproxy' process test failed [pid=42999] -- No such process
'haproxy' process is not running
Sending Does not exist notification to [email protected]
'haproxy' trying to restart
'haproxy' stop skipped -- method not defined
'haproxy' start method not defined
'haproxy' monitoring enabled
'haproxy' process test failed [pid=42999] -- No such process
'haproxy' process is not running
'haproxy' trying to restart
'haproxy' stop skipped -- method not defined
'haproxy' start method not defined
'haproxy' monitoring enabled
^CShutting down Monit HTTP server
Monit HTTP server stopped
Monit daemon with pid [48685] stopped
'appsrv01' Monit 5.15 stopped
Sending Monit instance changed notification to [email protected]
The exec line is never executed; I see no new lines in /tmp/monit.test.
If I change the checked port from 9090 to some invalid port, say 9190, and start monit (haproxy is running!), I see:
Starting Monit 5.15 daemon with http interface at [192.168.40.72]:2812
Starting Monit HTTP server at [192.168.40.72]:2812
Monit HTTP server started
'appsrv01' Monit 5.15 started
Sending Monit instance changed notification to [email protected]
'haproxy' process is running with pid 50703
'haproxy' zombie check succeeded
Socket test failed for [192.168.40.72]:9190 -- Connection refused
'haproxy' failed protocol test [DEFAULT] at [192.168.40.72]:9190 [TCP/IP] -- Connection refused
Sending Connection failed notification to [email protected]
'haproxy' exec: /bin/sh
'haproxy' process is running with pid 50703
'haproxy' zombie check succeeded
Socket test failed for [192.168.40.72]:9190 -- Connection refused
'haproxy' failed protocol test [DEFAULT] at [192.168.40.72]:9190 [TCP/IP] -- Connection refused
'haproxy' exec: /bin/sh
Why does the exec line run here, but not when I kill -9 haproxy? What I am trying to do is get monit to run the exec when haproxy fails. The exec line will then contain a command to switch the CARP IP over to another host. haproxy itself is monitored with zabbix, so the NOC can investigate the cause of the failure later.
Hi Dominik, I tried "if restarted" but it does not seem to work; "if 1 restart within 1 cycles" worked for me. Thanks anyway! – xcodex
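For anyone landing here, the workaround from the comment might look roughly like the fragment below. In the first log, once the pid is gone Monit only reports the existence test ("process is not running", "trying to restart") and no port test lines appear, which would explain why the exec attached to the port rule never fires. This is only a sketch: the script path /usr/local/sbin/carp-failover.sh is a hypothetical placeholder for the CARP switch command, and the commented-out "if does not exist then exec" alternative is an assumption to check against your Monit version with monit -t before relying on it.

```
check process haproxy with pidfile /var/run/haproxy.pid
    # Variant reported to work in the comment: fire after a restart attempt.
    # /usr/local/sbin/carp-failover.sh is a hypothetical placeholder script.
    if 1 restart within 1 cycles
        then exec "/usr/local/sbin/carp-failover.sh"

    # Assumed alternative (verify with monit -t): override the default
    # "if does not exist then restart" action shown in the service list.
    # if does not exist then exec "/usr/local/sbin/carp-failover.sh"

    # Original rule, kept for the case where the process is alive
    # but the port no longer answers.
    if failed host 192.168.40.72 port 9090 type tcp
        then exec "/usr/local/sbin/carp-failover.sh"
```

Putting the failover command in a script rather than inline also sidesteps the quoting loss visible in the service list above, where the single quotes around the sh -c argument are gone.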