It is imperative for active BPs to ensure their producing nodes are reliable, and that in the event of failure they can continue to sign blocks from standby nodes without any human intervention.
Currently there are few ideal solutions for this. Various [issues](https://github.com/EOSIO/eos/issues/4025) have been raised on the EOS GitHub to make this process easier for Block Producers, but until those changes ship we must do the best we can with the tools at our disposal.
At Block Matrix we have been battle-testing an automated failover solution using `keepalived` for the case where the `nodeos` process is killed. We now have a lightweight solution in place, which automatically promotes a backup node via the producer API. You can watch this in action here:
https://www.youtube.com/watch?v=OuB40yd0z4M
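To make the promotion step concrete, here is a minimal sketch of a `keepalived` notify script. It is an illustration rather than the code in our repository, and it makes some assumptions: that `nodeos` runs with `producer_api_plugin` enabled, that its HTTP endpoint is reachable at `http://127.0.0.1:8888`, and that promotion and demotion map to the plugin's `/v1/producer/resume` and `/v1/producer/pause` endpoints. The function names and the file name `notify.py` are hypothetical.

```python
import sys
import urllib.request

# Assumption: local nodeos HTTP endpoint with producer_api_plugin enabled.
NODEOS_URL = "http://127.0.0.1:8888"

# keepalived state -> producer API path (assumed mapping for this sketch).
ENDPOINT_FOR_STATE = {
    "MASTER": "/v1/producer/resume",  # this node should start producing
    "BACKUP": "/v1/producer/pause",   # this node should stand by
    "FAULT": "/v1/producer/pause",    # health check failed, stop producing
}


def endpoint_for_state(state):
    """Return the producer API path for a keepalived state, or None if unknown."""
    return ENDPOINT_FOR_STATE.get(state.upper())


def call_producer_api(path, base_url=NODEOS_URL, timeout=5):
    """POST an empty body to the producer API and return the HTTP status code."""
    request = urllib.request.Request(base_url + path, data=b"", method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def main(argv):
    # keepalived calls notify scripts as: <script> <GROUP|INSTANCE> <name> <state> [priority]
    path = endpoint_for_state(argv[3])
    if path is None:
        return 0  # unrecognised state: do nothing
    call_producer_api(path)
    return 0


# Only act when keepalived has passed its arguments.
if __name__ == "__main__" and len(sys.argv) >= 4:
    sys.exit(main(sys.argv))
```

A script like this would be referenced from the `notify` option of a `vrrp_instance`, so that the node which becomes `MASTER` resumes production and a node which drops to `BACKUP` or `FAULT` pauses it.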
We have put together the code for this over on our [GitHub](https://github.com/BlockMatrixNetwork/eos-bp-failover), with some explanation of the process and a special addendum for AWS users to work around the multicast/unicast issue that prevents a vanilla `keepalived` setup from working in their environment.
We have several improvements planned, covering cases where `nodeos` continues to run but stalls or stops signing blocks. Once we have the relevant updates from the EOS dev team we will extend our examples to include them.
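As a rough illustration of what detecting a stalled node could look like, the sketch below compares the `head_block_time` reported by `/v1/chain/get_info` with the current time. This is not part of our published solution. The 30-second threshold, the endpoint address, and the function names are assumptions, and the timestamp format is assumed to be UTC with fractional seconds, as in `2018-06-14T12:00:00.500`.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

# Assumption: local nodeos HTTP endpoint with chain_api_plugin enabled.
NODEOS_URL = "http://127.0.0.1:8888"


def parse_head_block_time(value):
    """Parse a nodeos timestamp such as '2018-06-14T12:00:00.500' as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


def is_stalled(head_block_time, now, max_lag_seconds=30):
    """Return True if the head block is older than max_lag_seconds."""
    lag = now - parse_head_block_time(head_block_time)
    return lag > timedelta(seconds=max_lag_seconds)


def fetch_head_block_time(base_url=NODEOS_URL, timeout=5):
    """Read head_block_time from the chain API's get_info response."""
    with urllib.request.urlopen(base_url + "/v1/chain/get_info", timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["head_block_time"]
```

A check like `is_stalled(fetch_head_block_time(), datetime.now(timezone.utc))` could back a `keepalived` `vrrp_script`, exiting non-zero when the node has stalled. It only detects a head block that has stopped advancing, not a node that is in sync but failing to sign its own blocks.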
Happy HA'ing to all BPs!
---
[Block Matrix](https://blockmatrix.network) are an EOS block producer candidate, producer name: `blockmatrix1`