*Fault tolerance et al. is mentioned, but I don't think it is discussed (maybe indirectly in the configuration recommendation on p. 9). In part, we can deal with this in the clients by making them more robust, as discussed above. Perhaps add a section about fault tolerance or high availability at the end of #5:
Fault tolerance is currently covered by a footnote in section 7... a bit minimalistic. A new section 5.x sounds like a good plan, although I think the text below is not intended for all kinds of responders, but rather for those of root CAs and for transponders/global redirectors?
OCSP responders should be configured on a server with high-availability capability, i.e. redundant hardware components that correct or respond to failures. The OCSP responder system should be configured to recover automatically and continue operating after a single failure of a disk supporting the current OCSP database, a hardware security module, or another critical system component. This might be particularly important for OCSP responders that operate in whole or in part in transponder mode. In order to deal with site failures or network partitioning, OCSP service providers should provision multiple, topologically and geographically dispersed OCSP responders with mirrored OCSP databases and configuration. If possible, WAN high-availability capability should be employed.
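
To illustrate how a more robust client could make use of multiple dispersed responders, here is a minimal Python sketch of client-side failover. It is only an illustration under stated assumptions, not text proposed for the section: the function names `query_with_failover` and `http_post`, the responder URLs, and the timeout value are all hypothetical, and the request is assumed to be an already DER-encoded OCSP request sent by HTTP POST.

```python
import urllib.request
from typing import Callable, Sequence, Tuple


def http_post(url: str, der_request: bytes, timeout: float = 5.0) -> bytes:
    """POST a DER-encoded OCSP request and return the raw response body."""
    req = urllib.request.Request(
        url,
        data=der_request,
        headers={"Content-Type": "application/ocsp-request"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def query_with_failover(
    responder_urls: Sequence[str],
    der_request: bytes,
    fetch: Callable[[str, bytes], bytes] = http_post,
) -> Tuple[str, bytes]:
    """Try each responder in order; return (url, response) from the first
    one that answers. Hypothetical helper, not part of any library."""
    failures = []
    for url in responder_urls:
        try:
            return url, fetch(url, der_request)
        except OSError as exc:
            # Covers connection errors, timeouts and urllib's URLError.
            failures.append((url, exc))
    raise RuntimeError(f"all OCSP responders failed: {failures}")
```

A client would call `query_with_failover(["http://ocsp1.example.net", "http://ocsp2.example.net"], der_request)` with its own list of mirrored responders. The sketch only handles transport failures; a real client would still have to validate the signature and freshness of whichever response it receives.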