So yesterday I was enjoying a bit of well-deserved rest after the guests who came to @mlvanderwaal's birthday party had left, when I got a notification via Steemify about a message from fellow witness @rival. Our server had started to miss blocks, which meant something had failed.
I quickly flipped open the MacBook to find the cause. As I couldn't figure that out straight away, I disabled the witness account (to avoid missing any more blocks) and started working through the logs. The moment of failure was easily spotted, but the cause is still a mystery. My best guess at the moment is network issues, as memory, CPU, disk space, etc. were all within limits.
I will submit an issue on the Steemit GitHub just in case it is related to a bug in the code.
I cleaned the data and restarted the steemd (node) software to sync it back up to the blockchain. As the backup nodes seemed unaffected, I switched the witness signing key to our main backup, and I could finally go to sleep after a very long day.
The witness tool SteemTurbine I'm developing would have prevented missing any blocks, but currently it is running in debug mode on our test witness servers, so unfortunately it didn't spot the issue on our main server. I had actually planned to put SteemTurbine into production this week, but of course Murphy with his dreaded law came around just before that...
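For the curious, here is a rough idea of what such a watchdog boils down to. This is a minimal sketch and not SteemTurbine's actual code: it assumes you can poll a cumulative missed-block counter for the witness and that you have some way to switch the signing key to a backup. The names `MissedBlockWatcher`, `poll_once`, `fetch_total_missed` and `fail_over` are all hypothetical, and the two callables are placeholders you would have to wire up to your own node yourself.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MissedBlockWatcher:
    """Tracks a witness's cumulative missed-block counter between polls."""

    threshold: int = 1              # hypothetical: new misses tolerated before failing over
    baseline: Optional[int] = None  # counter value the next readings are compared against

    def should_fail_over(self, total_missed: int) -> bool:
        """Return True once the counter has grown by `threshold` or more since the baseline."""
        if self.baseline is None:
            # First reading: only record it, there is nothing to compare against yet.
            self.baseline = total_missed
            return False
        return total_missed - self.baseline >= self.threshold


def poll_once(
    watcher: MissedBlockWatcher,
    fetch_total_missed: Callable[[], int],  # placeholder: reads the counter from a node
    fail_over: Callable[[], None],          # placeholder: switches the signing key to a backup
) -> bool:
    """Run one check; call `fail_over` and return True if new misses were detected."""
    current = fetch_total_missed()
    if watcher.should_fail_over(current):
        fail_over()
        # Re-baseline so the same misses do not trigger a second failover.
        watcher.baseline = current
        return True
    return False
```

In practice you would call `poll_once` in a loop with a short sleep in between, and probably also alert yourself whenever it returns `True`.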
Until SteemTurbine is running in production, I will be checking on the server manually several times a day to catch any issues as soon as possible.
Special thanks to @rival for watching my back. Working together with this awesome community can and will make this platform thrive!