Fault tolerance is a specialized kind of planning for change.An example of successful fault tolerance occurred when the Cassini spacecraft, currently on an extended 
mission to Saturn, unexpectedly ceased communication with ground operations on Nov. 2, 2010.Communication was restored within the hour, at 
which point human operators began a slow and careful check of the spacecraft followed by recovery 
activities such as clearing error log files and turning science instruments back on. Recovery actions were 
completed on November 24, and the Cassini spacecraft continues to be healthy at the time of this writing.Software is growing to handle not just failures (e.g., the Cassini bit flip), but also a range of 
other unexpected situations (e.g., novel usage scenarios) and contingencies (e.g., debris temporarily 
blinding the spacecraft camera).However, as Dvorak et al. note, extending the techniques of onboard 
fault protection to cover a wider range of software failures can also increase algorithmic complexity,
which makes the software harder to verify [2]Developing robust software that can 
handle failures and other conditions that threaten the mission begins with a thorough systems and 
software hazards analysis.Software requirements to protect the spacecraft from mission-critical failures 
are derived from these hazards analyses.As described at the Cassini 
website, the onboard fault-protection software quickly switched to a backup computer and shut off non essential power loads.It is not surprising, given the long life and 
hostile environments encountered by spacecraft, that failures occur.As spacecraft and their 
missions become more complicated, the number of unwanted scenarios that the software must handle 
increases.The software had been designed to configure the 
spacecraft to a safe but degraded state when communication was lost.It switched to an alternate (low-gain) antenna, pointing it toward the sun to 
improve the chances of communicating with operators.The fault protection software on spacecraft continually monitors for deviations between the expected 
spacecraft state and the actual spacecraft state.Fault recovery software is especially important because it 
is usually invoked when something has already gone wrong on the spacecraft.It was 
later determined that this was caused by a flipped bit.