Watchdog Daemon - Installation
Installation
In most Linux distributions you can install the daemon from the
package manager. For example, on Ubuntu/Debian-based systems (as
root, or using sudo):
apt-get install watchdog
However, you should also consider installing the lm-sensors package
and running the sensors-detect script, as that can help identify what
hardware your machine has. You can then check whether it has a
supporting watchdog driver module. Look in a file such as
/etc/modprobe.d/blacklist-watchdog.conf to see if the same or a
similar chip is mentioned. If so, add the driver named there to
/etc/modules (or modprobe it) and you should then have hardware
support. Better still is to add it to /etc/default/watchdog by
editing the line watchdog_module="none" to name the driver, as the
module is then loaded on demand. This also gets round the buggy
behaviour of systemd not loading explicitly listed modules that are
blacklisted for auto-load.
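As a rough sketch of the steps above, the shell functions below list the drivers named in a blacklist file and set watchdog_module in a defaults file. The function names are hypothetical (they are not part of the watchdog package), the "blacklist <module>" line format is assumed to follow the usual modprobe.d convention, and iTCO_wdt is only an example driver name; substitute whatever matches your hardware.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of the watchdog package.

# Print the module names listed in a modprobe blacklist file.
# Assumes the usual "blacklist <module>" line format.
list_blacklisted_modules() {
    file="${1:-/etc/modprobe.d/blacklist-watchdog.conf}"
    awk '$1 == "blacklist" { print $2 }' "$file"
}

# Rewrite the watchdog_module="..." line in a defaults file.
# Editing the real file under /etc needs root; sed -i assumes GNU sed.
set_watchdog_module() {
    module="$1"
    file="${2:-/etc/default/watchdog}"
    sed -i "s/^watchdog_module=.*/watchdog_module=\"$module\"/" "$file"
}

# Example usage (as root); iTCO_wdt is only an example driver name:
#   list_blacklisted_modules | grep -i itco
#   set_watchdog_module iTCO_wdt
```

Both functions take the file path as an optional argument, so they can be tried on a copy of the file before touching the real one.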
NOTE: To replace the daemon with any special build you should stop
it first, as described below. It is also a good idea to rename the
original version and keep it so you can revert if anything goes
wrong when testing V6.0.
Another way of identifying any hardware watchdog options is to use
the 'lshw' command (as root/sudo) to find the chipset(s) used, and
then to search for drivers or documentation for those chips to
establish whether they have watchdog timers and, if so, which driver
should work.
If all else fails, consider loading the 'softdog' module to emulate
the hardware. It is not nearly as good, but better than nothing.
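The fallback might look like the sketch below. Loading the module with modprobe is standard; the helper name watchdog_device_present is hypothetical, and /dev/watchdog is assumed to be the device node that appears once a watchdog driver is loaded.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given path is a character device.
# Defaults to /dev/watchdog, assumed here to be the watchdog device node.
watchdog_device_present() {
    [ -c "${1:-/dev/watchdog}" ]
}

# Example usage (as root):
#   modprobe softdog
#   watchdog_device_present && echo "watchdog device available"
```

The check only tests that the node exists. It deliberately does not open the device, since opening it may start the timer.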
Starting and Stopping
Normally the daemons are started and stopped by scripts such as
/etc/init.d/watchdog, and the usual commands to invoke them (as
root, or using sudo) are:
service watchdog start
service watchdog stop
However, this script actually swaps execution between 'watchdog'
and 'wd_keepalive', in a similar manner to what happens when the
system starts and stops.
NOTE: This swapping behaviour is essential in the unlikely case that
your kernel was compiled with the option CONFIG_WATCHDOG_NOWAYOUT,
or a WDT module was loaded with that option, as then you cannot turn
the watchdog off after starting it, so you always need to be running
something to stop a reboot. Swapping daemons is then a way of
allowing you to replace the binary and/or change the configuration
file safely.
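One way to find out whether your kernel was built with this option is to search its build configuration, as in the sketch below. The function name kernel_has_nowayout is hypothetical, and the default path /boot/config-$(uname -r) is an assumption that holds on many Debian/Ubuntu-style systems but not everywhere. The check does not cover a WDT module loaded with the option set.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel config enables NOWAYOUT.
# Assumes a config file such as /boot/config-$(uname -r) is available.
kernel_has_nowayout() {
    config="${1:-/boot/config-$(uname -r)}"
    grep -q '^CONFIG_WATCHDOG_NOWAYOUT=y' "$config"
}

# Example usage:
#   if kernel_has_nowayout; then
#       echo "NOWAYOUT set: the watchdog cannot be stopped once started"
#   fi
```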
The daemons are actually stopped by sending them the signal SIGTERM
(usually 15), which is the "polite" way of requesting a program to
terminate. They trap this signal and, when it is detected, break
from the main polling loop and exit cleanly (closing the watchdog
hardware). Thus it can take up to the configured polling time
interval for the process to stop.
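The delay can be illustrated with a toy polling loop in shell. This is only an analogy under stated assumptions, not the daemon's own code: polling_loop is a hypothetical name, and the final echo stands in for closing the watchdog device.

```shell
#!/bin/sh
# Illustration only: a polling loop that exits cleanly on SIGTERM.
# polling_loop is a hypothetical name; the real daemon is not a shell script.
polling_loop() {
    interval="${1:-1}"
    stop=0
    # Record the request; the loop notices it on its next pass.
    trap 'stop=1' TERM
    while [ "$stop" -eq 0 ]; do
        echo "poll"
        sleep "$interval"
    done
    echo "clean exit"   # the daemon would close the watchdog device here
}

# Example usage:
#   polling_loop 5 &
#   sleep 1           # give the loop time to install its trap
#   kill -TERM $!     # the loop ends once the current sleep finishes
```

The signal handler only sets a flag, so the loop finishes its current sleep before it checks the flag and exits. That is why stopping can take up to one polling interval.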
NOTE: If you kill the daemon by another signal, such as sending
SIGKILL (usually 9, a non-ignorable signal) or SIGINT (usually 2,
typically from Ctrl+C when running in the foreground) then it will
not close the watchdog device and you can expect a hardware reboot
to occur shortly unless the daemon is restarted!
Typically to really stop the daemon (and not just run wd_keepalive
in its place) you can do this using pkill (again as root
or using sudo):
pkill watchdog
By default pkill sends SIGTERM, which is what you normally want.
You can also start the daemon from the command line for testing. If
you want to see the output of the daemon and any child test/repair
process in real-time (rather than looking at log files such as
syslog) you can use the foreground option, for example:
watchdog --foreground
This will stop it becoming a background daemon, so it will run
like a normal foreground process. To stop it, open another terminal
window and send it SIGTERM using pkill (i.e. don't use Ctrl+C).
Last Updated on 26-Aug-2019 by
Paul Crawford
Copyright (c) 2014-19 by Paul S. Crawford. All rights reserved.
Email psc(at)sat(dot)dundee(dot)ac(dot)uk
Absolutely no warranty, use this information at your own risk.