Title: Reliability Modelling for Long Term Digital Preservation
1 Reliability Modelling for Long Term Digital Preservation
Panos Constantopoulos, Martin Doerr, Meropi Petraki
Information Systems Laboratory
Institute of Computer Science
Foundation for Research and Technology - Hellas
Heraklion, Greece
May 12, 2005
The CIDOC CRM
2 Outline
- Problem statement
- Approach
- Case studies
- Conclusion
3 Problem Statement
- All digital material is vulnerable to loss
- Cultural and scientific memory needs long-term preservation
- We would like to have the Library of Alexandria back...
- A large museum may keep and describe a million objects
- It may not want to lose more than 10 objects per year
- 1 loss in 1000 years!
4 Problem Statement
- Risk factors
- Media decay and failure
- Access Component Obsolescence (format, H/W)
- Human and Software Errors
- External events
- Format Obsolescence
- Best studied. Measures are standards, technology preservation, migration.
- For knowledge in text form, textual databases, vector graphics and bitmap images: reasonably solved with XML and extensive documentation.
5 Problem Statement
- Hardware Obsolescence
- Systematic, foreseeable.
- Reasonable solution: carrier migration.
- Human errors
- Stochastic failure. Can be reduced but not avoided.
- Solution: replication and control
- Software errors
- Difficult to model and to foresee.
- Replication, multiple S/W platforms and control.
6 Problem Statement
- External Events
- Stochastic failure.
- Solution: replication and control
- Media decay
- Stochastic and systematic failure.
- Solution: preventive carrier migration, replication and control
7 Problem Statement
- Summary
- In the long term, the basic strategy is carrier migration, replication and control.
- The expected lifetime of information exceeds that of any platform and technology.
- The respective risk management has hardly been addressed
- The Gksan strategy: the longest human memories known
- People of the Haida and Qksan tribes in British Columbia, resident there since the Ice Age, keep historical oral memories on land ownership reaching back more than 10,000 years, by:
- Distribution to multiple, selected human carriers, annual quality control, and totem poles as mnemonic aids.
8 Approach
- Statistical modelling of the long-term risk of data loss due to media decay and failure and external events.
- Analyze risk factors of different configurations
- In models for long times, complex aging effects average out, e.g. preventive replacement results in a constant average failure rate. Long-term studies are simpler than short-term ones!
- Extrapolation of current technology
- Optimal strategy: maintain a constant failure rate at any time.
- This is independent of technology; it has to be reevaluated at each technology change and maintained for each technology period. Random processes have no memory.
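The remark that random processes have no memory can be checked numerically for a constant failure rate. The following is a minimal sketch, not taken from the slides: the function names and the rate of one failure per three years are assumptions chosen only for illustration.

```python
import math

def survival(rate_per_year: float, years: float) -> float:
    """Probability of surviving `years` under a constant failure rate."""
    return math.exp(-rate_per_year * years)

def conditional_survival(rate_per_year: float, age: float, extra: float) -> float:
    """P(survive to age + extra | already survived to age)."""
    return survival(rate_per_year, age + extra) / survival(rate_per_year, age)

RATE = 1.0 / 3.0  # hypothetical: on average one failure every 3 years

fresh = survival(RATE, 1.0)                   # brand-new carrier, next year
aged = conditional_survival(RATE, 10.0, 1.0)  # 10-year-old carrier, next year
print(f"new carrier: {fresh:.6f}  10-year-old carrier: {aged:.6f}")
```

Both probabilities coincide, which is why a configuration kept at a constant failure rate can be analysed period by period without tracking its history.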
9 Approach
- Analytical models that allow for
- Dominant factor analysis
- Cost/benefit analysis (future work) to achieve the politically set reliability goal.
- Memoryless Markov chains and fault trees
- Evaluation with the program SHARPE.
10 Case 1: Mirror Disks
- Assumptions
- Two identical disks, constant failure rate λ, system failure if both are destroyed
- MTTF = 1/λ: mean time to failure
- MTTR = 1/µ: mean time to repair
- MTTFD = 1/δ: mean time to failure detection
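[Figure: Markov chain state diagram for the mirror-disk system; states 2, 1, 1D and F; transitions labelled with the failure, failure-detection and repair (µ) rates]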
11 Case 1: Mirror Disks
[Figure: plot for Case 1; surviving labels: 120 d (4 months), 360 d (12 months), 740 d (2 years)]
12 Case 1: Mirror Disks
- Results
- MTTF = 3 yrs, MTTR = 50 hrs, MTTFD = 14 days
- Total MTTF = 106.46 yrs
- MTTFD = MTTR = 0 => MTTF = ∞ !
- The dominant factor is only the time to detect failure and to repair! Any quality of the disk can be compensated by faster detection and repair, within realistic limits.
- Any uncontrolled media will lose the data in the long term.
- => cost/benefit analysis to be done!
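The mirror-disk result can be approximated with a small first-step analysis of the Markov chain. This is a sketch under stated assumptions rather than the authors' SHARPE model: the state structure (2 = both disks healthy, 1 = one disk failed but undetected, 1D = failure detected and under repair, F = data lost) is a reading of the slide's diagram, the name `delta` for the detection rate is hypothetical, and a 365-day year is assumed.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: 365-day year
DAYS_PER_YEAR = 365

def mirror_mttf_years(mttf_years: float, mttr_hours: float, mttfd_days: float) -> float:
    """Mean time to data loss for two mirrored disks, starting with both healthy."""
    lam = 1.0 / mttf_years              # failure rate of one disk, per year
    mu = HOURS_PER_YEAR / mttr_hours    # repair rate, per year
    delta = DAYS_PER_YEAR / mttfd_days  # failure-detection rate, per year

    # Assumed transitions between states:
    #   2  --2*lam--> 1    one of the two disks fails
    #   1  --delta--> 1D   the failure is detected
    #   1  --lam----> F    the second disk fails before detection
    #   1D --mu-----> 2    repair completes
    #   1D --lam----> F    the second disk fails during repair
    hold_2 = 1.0 / (2.0 * lam)        # mean time in state 2 per visit
    hold_1 = 1.0 / (lam + delta)      # mean time in state 1 per visit
    hold_1d = 1.0 / (lam + mu)        # mean time in state 1D per visit
    p_detect = delta / (lam + delta)  # probability of 1 -> 1D
    p_repair = mu / (lam + mu)        # probability of 1D -> 2

    # T2 = time_per_cycle + p_cycle_ok * T2, solved for T2.
    time_per_cycle = hold_2 + hold_1 + p_detect * hold_1d
    p_cycle_ok = p_detect * p_repair
    return time_per_cycle / (1.0 - p_cycle_ok)

total = mirror_mttf_years(mttf_years=3.0, mttr_hours=50.0, mttfd_days=14.0)
print(f"total MTTF: {total:.2f} years")
```

With the parameters from the results, this sketch lands at roughly 106 years, in line with the total MTTF quoted there. Shrinking `mttfd_days` or `mttr_hours` drives `p_cycle_ok` towards 1 and the total MTTF towards infinity, which is the dominant-factor observation in numerical form.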
13 Case 2: Mirror Disks + Backup Tape
- MTTF = 1/λ: mean time to failure; MTTR = 1/µ: mean time to repair
- MTTFD = 1/δ: mean time to failure detection; 1, 2 = disks, 3 = tape.
14 Case 2: Mirror Disks + Backup Tape
Coming closer !
15 Case 2: Mirror Disks + Backup Tape
- Adding Fire !
- At least one more backup is needed, in a third room
16 Case 3: Distributed carriers
- Assumptions: data are distributed to N independent systems, each with mirror disks and tape.
- Question: what percentage of my data will exist after 1000 years? (Binomial model)
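The binomial model behind this question can be sketched in a few lines. The sketch assumes that each system holds an equal share of the data and that its time to failure is approximately exponential with a given system-level MTTF; the values of `N_SYSTEMS` and `SYSTEM_MTTF_YEARS` are hypothetical and are not results from the slides.

```python
import math

def system_survival(system_mttf_years: float, horizon_years: float) -> float:
    """P(one system still holds its data at the horizon), exponential assumption."""
    return math.exp(-horizon_years / system_mttf_years)

def surviving_systems_pmf(n: int, p: float) -> list:
    """Binomial probabilities that exactly k of n independent systems survive."""
    return [math.comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(n + 1)]

N_SYSTEMS = 10              # hypothetical number of independent systems
SYSTEM_MTTF_YEARS = 5000.0  # hypothetical per-system MTTF
HORIZON_YEARS = 1000.0      # horizon from the question above

p = system_survival(SYSTEM_MTTF_YEARS, HORIZON_YEARS)
pmf = surviving_systems_pmf(N_SYSTEMS, p)
expected_fraction = sum(k * q for k, q in enumerate(pmf)) / N_SYSTEMS
print(f"per-system survival: {p:.3f}  expected surviving share: {expected_fraction:.3f}")
```

The expected surviving share equals the per-system survival probability regardless of N; what changes with N is how the probability mass spreads around that value.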
17 Case 3: Distributed carriers
- If all data are on one system
- High probability to preserve all data
- High probability to lose all data
- If the data are distributed over many individual systems
- Some data will be lost for sure
- Some data will survive for sure
- Conclusion
- Optimal strategy may combine both modes!
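The contrast between the two modes can be illustrated by comparing the probability of losing nothing with the probability of losing everything. This is a sketch under assumptions: the data are partitioned (not replicated) across independent systems, and the per-system survival probability `P_SURVIVE` and the system count of 20 are hypothetical values.

```python
def all_or_nothing(n_systems: int, p_survive: float) -> tuple:
    """Return (P(no data lost), P(all data lost)) for data partitioned over n systems."""
    p_none_lost = p_survive ** n_systems         # every system must survive
    p_all_lost = (1.0 - p_survive) ** n_systems  # every system must fail
    return p_none_lost, p_all_lost

P_SURVIVE = 0.8  # hypothetical per-system survival probability over the horizon

one = all_or_nothing(1, P_SURVIVE)
many = all_or_nothing(20, P_SURVIVE)
print(f"1 system:   keep all {one[0]:.3f}, lose all {one[1]:.3f}")
print(f"20 systems: keep all {many[0]:.3f}, lose all {many[1]:.2e}")
```

With a single system both extremes carry substantial probability, while with many systems both become unlikely, so partial loss and partial survival dominate.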
18 Conclusions
- Some results seem not to be very intuitive
- The influence of failure detection and repair time
- The effect of data distribution
- The effect of external events
- Long-term risk modelling allows for simplifications that allow for analytical models.
- Analytical models can effectively be turned into decision support tools and combined with cost/benefit models
- Future work: a practical decision support tool