This project provides a Flannel annotation watchdog for Kubernetes nodes. At a fixed polling interval, it checks whether the `flannel.alpha.coreos.com/backend-data` annotation is present on the node. If the annotation is missing (which typically indicates a Flannel restart or networking issue), the watchdog can optionally restart the Flannel container. A dry-run mode allows detection without restarting.
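As a minimal sketch of the check described above (not the watchdog's actual code), the annotation could be read with `kubectl` and a JSONPath query. The helper names `get_backend_data` and `annotation_present` and the `KUBECTL` override are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the KUBECTL override are hypothetical.
KUBECTL="${KUBECTL:-kubectl}"

# Print the value of the Flannel backend-data annotation for a node.
# Dots inside the annotation key are escaped so JSONPath treats it as one field.
get_backend_data() {
  local node="$1"
  "$KUBECTL" get node "$node" \
    -o jsonpath='{.metadata.annotations.flannel\.alpha\.coreos\.com/backend-data}'
}

# Return 0 if the annotation is present and non-empty, 1 otherwise.
annotation_present() {
  local value
  value="$(get_backend_data "$1" 2>/dev/null)" || return 1
  [ -n "$value" ]
}
```

A call such as `annotation_present "$(hostname)" || echo "backend-data annotation missing"` would then flag a node, assuming the node name matches the hostname, which is not guaranteed on every cluster.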
Download the script and the systemd service file, then install and enable the watchdog.
Use wget or curl to download the script and service files.

    # Create a working directory
    mkdir -p /opt/flannel-watchdog
    cd /opt/flannel-watchdog

    # Download the watchdog script
    wget https://gist.githubusercontent.com/mattmattox/11689c8d2c6fc5fff7dbd78ed3baae69/raw/flannel-watchdog.sh

    # Download the systemd service unit
    wget https://gist.githubusercontent.com/mattmattox/11689c8d2c6fc5fff7dbd78ed3baae69/raw/flannel-watchdog.service

    # Optional: create an environment file template
    cat > flannel-watchdog.env << 'EOF'
    # Polling cadence (seconds)
    LOOP_DELAY=30
    # Recovery wait behavior
    WAIT_TIMEOUT=180
    WAIT_INTERVAL=5
    # Dry run: 1=detection only, 0=restart containers
    DRY_RUN=1
    # Pattern to match flannel containers in docker ps
    MATCH_PATTERN=flannel
    EOF

Or with curl:

    # -f fails on HTTP errors instead of saving an error page; -L follows redirects
    curl -fL -o flannel-watchdog.sh \
      https://gist.githubusercontent.com/mattmattox/11689c8d2c6fc5fff7dbd78ed3baae69/raw/flannel-watchdog.sh
    curl -fL -o flannel-watchdog.service \
      https://gist.githubusercontent.com/mattmattox/11689c8d2c6fc5fff7dbd78ed3baae69/raw/flannel-watchdog.service

Install the script into `/usr/local/sbin/`:

    install -m 0755 flannel-watchdog.sh /usr/local/sbin/flannel-watchdog.sh

Copy the service unit into `/etc/systemd/system/`:

    install -m 0644 flannel-watchdog.service /etc/systemd/system/flannel-watchdog.service

Optionally, place an environment file in `/etc/sysconfig/flannel-watchdog` (or another path referenced by the service) with configuration overrides. For example:

    cp flannel-watchdog.env /etc/sysconfig/flannel-watchdog

Set `DRY_RUN` to 0 in the environment file when you want the watchdog to perform actual restarts.

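One way to switch modes without opening an editor is to rewrite the `DRY_RUN` line in place. The helper name `set_dry_run` is hypothetical, and the sketch assumes GNU `sed` and an environment file that already contains a `DRY_RUN=` line.

```shell
# Sketch only: set_dry_run is a hypothetical helper, not part of the watchdog.
# Usage: set_dry_run <env-file> <0|1>
set_dry_run() {
  local file="$1" value="$2"
  case "$value" in
    0|1) ;;
    *) echo "DRY_RUN must be 0 or 1" >&2; return 2 ;;
  esac
  # Replace the existing DRY_RUN assignment (GNU sed in-place edit).
  sed -i "s/^DRY_RUN=.*/DRY_RUN=${value}/" "$file"
}
```

For example, `set_dry_run /etc/sysconfig/flannel-watchdog 0 && systemctl restart flannel-watchdog.service`. The restart is there on the assumption that the unit loads this file through `EnvironmentFile=`, which systemd reads only when the service starts.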
Reload systemd and enable the watchdog:

    systemctl daemon-reload
    systemctl enable --now flannel-watchdog.service

Check its status and logs:

    systemctl status flannel-watchdog.service
    journalctl -u flannel-watchdog.service -f

| Variable | Description | Default |
|---|---|---|
| `LOOP_DELAY` | Seconds between annotation checks | 30 |
| `WAIT_TIMEOUT` | Seconds to wait for flannel recovery after restart | 180 |
| `WAIT_INTERVAL` | Polling interval during recovery wait loops | 5 |
| `DRY_RUN` | 1 = detect only (no restart), 0 = enable restart | 1 |
| `MATCH_PATTERN` | Pattern to match flannel containers in `docker ps` results | flannel |

Place an environment file (e.g., /etc/sysconfig/flannel-watchdog) to override these values without modifying the script.
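The override mechanism and the two recovery settings can be illustrated with a short sketch. It assumes the script applies a default only when a variable is unset or empty, and that recovery is polled every `WAIT_INTERVAL` seconds until `WAIT_TIMEOUT` has elapsed. The function name `wait_for_recovery` is hypothetical and not taken from the script.

```shell
# Sketch only: mirrors the documented defaults; the real script may differ.
# ":=" assigns the default only if the variable is unset or empty,
# so values supplied through the environment file take precedence.
: "${LOOP_DELAY:=30}"
: "${WAIT_TIMEOUT:=180}"
: "${WAIT_INTERVAL:=5}"
: "${DRY_RUN:=1}"
: "${MATCH_PATTERN:=flannel}"

# Poll a check command until it succeeds or WAIT_TIMEOUT seconds have passed.
# Usage: wait_for_recovery <check-command> [args...]
wait_for_recovery() {
  local waited=0
  while [ "$waited" -lt "$WAIT_TIMEOUT" ]; do
    if "$@"; then
      return 0            # check succeeded: recovered
    fi
    sleep "$WAIT_INTERVAL"
    waited=$((waited + WAIT_INTERVAL))
  done
  return 1                # timed out without recovery
}
```

With the defaults, a failing check would be retried up to 36 times (180 / 5) before the function gives up.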
By default, the watchdog runs in dry-run mode (DRY_RUN=1), which only logs detection events without restarting any container. You must set DRY_RUN=0 to allow automatic restarts.
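A dry-run gate of this kind might look like the sketch below. It assumes containers are selected with Docker's `--filter name=` option, whereas the actual script may match `docker ps` output differently. The function name `restart_flannel` and the `DOCKER` override are hypothetical.

```shell
# Sketch only: names are hypothetical; the real script may select containers differently.
DOCKER="${DOCKER:-docker}"
DRY_RUN="${DRY_RUN:-1}"
MATCH_PATTERN="${MATCH_PATTERN:-flannel}"

restart_flannel() {
  local ids id
  # List IDs of running containers whose name matches the pattern.
  ids="$("$DOCKER" ps -q --filter "name=${MATCH_PATTERN}")"
  if [ -z "$ids" ]; then
    echo "no container matching '${MATCH_PATTERN}' found"
    return 1
  fi
  for id in $ids; do
    if [ "$DRY_RUN" = "1" ]; then
      # Detection only: log what would happen, touch nothing.
      echo "DRY_RUN=1: would restart container ${id}"
    else
      echo "restarting container ${id}"
      "$DOCKER" restart "$id" >/dev/null
    fi
  done
}
```

Keeping the log line in both branches means a dry run shows exactly which containers a live run would affect.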
- It expects `docker` as the container runtime; adapt if you use `containerd`/`crictl`.
- Ensure the node has appropriate kubeconfig access to query annotations.
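On containerd-based nodes, an equivalent restart could be sketched with `crictl`, relying on the kubelet to recreate the stopped container. This is an assumption about how one might adapt the script, not something it ships with. The names `restart_flannel_cri` and `CRICTL` are hypothetical.

```shell
# Sketch only: a possible crictl adaptation, not part of the watchdog.
CRICTL="${CRICTL:-crictl}"
MATCH_PATTERN="${MATCH_PATTERN:-flannel}"

restart_flannel_cri() {
  local ids id
  # --name filters by container name (regular expression); -q prints only IDs.
  ids="$("$CRICTL" ps --name "$MATCH_PATTERN" -q)"
  [ -n "$ids" ] || return 1
  for id in $ids; do
    # crictl has no restart command; stopping the container leaves it to the
    # kubelet to start a new one according to the pod's restart policy.
    "$CRICTL" stop "$id"
  done
}
```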