Keeping Your System Running with a Host OS Watchdog

When you deploy an application onto a device, two major parts come into play. The first is resinOS (the host OS), the minimal underlying system tasked with managing the network connection and running Docker. The second is your application, which runs on the device as a Docker container. This setup means that as long as the host OS is healthy and online, there is always a way to manage your application and ensure that it runs, updates, and sends its logs. In the latest release of our host OS, we are adding a watchdog feature, which helps keep the host OS in a healthy state!

This recent change to resinOS enables the hardware watchdog, and it is intended as the first step towards a long-term solution. The watchdog is set up with a 10-second timeout, meaning that if systemd on the host OS or the kernel becomes unresponsive for longer than that, the device will be hard reset. It is implemented by setting RuntimeWatchdogSec=10 in systemd. If you are interested in watchdogs, you can read more about them in this blog post. The watchdog should protect devices from hardware lock-ups, which our customers tell us do sometimes happen for reasons beyond the reach of software. With it in place, you are more likely to stay connected, and thus able to push a new version of your application if needed.
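To make the timeout behaviour concrete, here is a minimal sketch in Python of the countdown logic a watchdog follows. It is a toy simulation only, not how resinOS or systemd talks to the hardware: the class name SimulatedWatchdog, its methods, and the demo function are all hypothetical names made up for illustration, and the clock is injected so the example runs instantly instead of waiting real seconds.

```python
import time


class SimulatedWatchdog:
    """Toy model of a watchdog countdown (hypothetical, not a real device API)."""

    def __init__(self, timeout_sec=10.0, clock=time.monotonic):
        self.timeout_sec = timeout_sec
        self._clock = clock
        self._last_kick = clock()

    def kick(self):
        """A healthy system calls this regularly to restart the countdown."""
        self._last_kick = self._clock()

    def expired(self):
        """True once more than timeout_sec has passed since the last kick.

        On real hardware this is the point at which the device is hard reset.
        """
        return self._clock() - self._last_kick > self.timeout_sec


def demo():
    """Drive the watchdog with a fake clock and record whether it expired."""
    now = [0.0]  # fake time in seconds, advanced by hand
    wd = SimulatedWatchdog(timeout_sec=10.0, clock=lambda: now[0])

    results = []
    now[0] = 5.0
    wd.kick()                      # system is responsive at t=5
    now[0] = 12.0                  # 7 s since the last kick: still fine
    results.append(wd.expired())
    now[0] = 15.5                  # 10.5 s since the last kick: too long
    results.append(wd.expired())
    return results


if __name__ == "__main__":
    print(demo())
```

The key point the sketch tries to capture is that the reset is triggered by silence: as long as something keeps kicking the timer within the 10-second window nothing happens, and only a system that has stopped responding lets the countdown run out.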

This watchdog update was released in version 1.24.0 of the host OS. So far it is enabled only for Raspberry Pi, with more devices to follow as they are tested. The quickest way to upgrade is to download the new host OS version from the dashboard and reflash or reprovision your device. If you are on an earlier version, we highly recommend upgrading!

We'd love to hear any reliability stories you'd like to share with us: drop by our forums, talk to us on Gitter, or find more info in our Community Central!