Title: Node Recovery
1Node Recovery
- Day 3, Session 6
- MetaArchive Distributed Digital Preservation
Workshop
Presented by Chris Helms
2Session Learning Objectives
3Restoring a lost cache
- Hardware/Software fault recovery
- Archival Units
- Archival Unit Configuration
4Restoring a lost primary node
- Preparing for Configuration Node Recovery
- Recovering the Title Database
- Recovering the Keystore
- Reconfiguring the Nodes in the Network
5Restoring data from the cache
- Viewing Cached Content
- Using your LOCKSS node as a cache proxy
- Proxy configuration options
- Cached AU retrieval and proxy testing
6Restoring a Lost Cache
- By design, LOCKSS stores harvested material in a
cache directory. LOCKSS maintains the cache's
integrity through peer polling and voting with
other members of your LOCKSS private network.
Occasionally an outside event such as a
power failure, faulty hardware, or a piece
of errant code can corrupt a file system,
leaving the LOCKSS cache in an unusable state.
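The idea of keeping a cache honest by voting can be illustrated with a toy model. This is only a sketch under loud assumptions: it is not the LOCKSS polling protocol, and the names `digest` and `needs_repair` are hypothetical. It simply hashes the local copy, hashes the copies held by peers, and flags the local copy for repair when a strict majority of peers agree on a different hash.

```python
import hashlib
from collections import Counter
from typing import Sequence


def digest(content: bytes) -> str:
    """Return a SHA-256 hex digest of the content (hash choice is an assumption)."""
    return hashlib.sha256(content).hexdigest()


def needs_repair(local: bytes, peer_copies: Sequence[bytes]) -> bool:
    """Toy majority vote: True when most peers agree on a hash that differs from ours."""
    if not peer_copies:
        # No peers voted, so there is nothing to compare against.
        return False
    votes = Counter(digest(copy) for copy in peer_copies)
    winning_hash, count = votes.most_common(1)[0]
    has_majority = count > len(peer_copies) / 2
    return has_majority and winning_hash != digest(local)


if __name__ == "__main__":
    print(needs_repair(b"damaged", [b"good", b"good", b"good"]))
    print(needs_repair(b"good", [b"good", b"good", b"bad"]))
```

In this model a damaged local copy loses the vote and would be repaired from a peer, while a single disagreeing peer does not outvote a healthy local copy.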
7Archival Units
- Archival Units (AUs) may be likened to a run of
journals or a web-based collection containing
images, videos, and text. Long-term access to
these AUs is ensured via LOCKSS harvesting and
the polling of loyal peers. The concept of having
loyal peers ensures that data may be obtained from
other peer nodes if the original content is no
longer available from the publisher.
8Archival Configuration
Adding Titles (manual)
9Title Selection
10Archival Configuration
Adding Titles (backup file)
Default cache config file BatchAuConfig
11Restoring a Lost Primary Node
- In order for LOCKSS to harvest and preserve data,
certain configuration files must remain available
via a primary node. These are the title database
and public keystore. As a convenience, these
files are stored within the conspectus database
and are readily available from any participating
LOCKSS node.
12Recovering the Title Database
13Recovering the Keystore
14Reconfiguring the Nodes in the Network
Creation and setup of the new web root directory
and Apache configuration files.
Modify the following line
Apache graceful restart
15Update the lockss.xml file
16Run hostconfig on all member nodes and modify the
Configuration URL
17Restart LOCKSS
- /etc/init.d/lockss stop
- /etc/init.d/lockss start
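When the restart has to be repeated on every member node, the two commands above can be wrapped in a small script. The following is a minimal sketch, assuming the init script lives at the path shown on this slide and that the script is run with sufficient privileges; the function names are hypothetical. It defaults to a dry run so the commands can be reviewed before anything is executed.

```python
import subprocess
from typing import List

# Path taken from the two commands on this slide.
INIT_SCRIPT = "/etc/init.d/lockss"


def restart_commands() -> List[List[str]]:
    """Return the stop and start commands in the order they should run."""
    return [[INIT_SCRIPT, "stop"], [INIT_SCRIPT, "start"]]


def restart_lockss(dry_run: bool = True) -> List[List[str]]:
    """Run stop then start; with dry_run=True only report what would run."""
    commands = restart_commands()
    if not dry_run:
        for command in commands:
            # check=True raises CalledProcessError if a step fails,
            # so a failed stop is not silently followed by a start.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    for command in restart_lockss(dry_run=True):
        print(" ".join(command))
```

Calling `restart_lockss(dry_run=False)` on the node itself would actually execute both steps.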
18Restoring Data From the Cache
- One of the key benefits of participating in a
LOCKSS private network is having reliable,
long-term access to your material. When the
original content is no longer available, the
LOCKSS cache proxy server can be used to serve
its harvested content. This harvested content
may be pulled from any node participating in
your LOCKSS private network. As we have seen in
the previous section, harvested content may span
archival units, key configuration files, and
plugins.
19Viewing cached content
23Using your LOCKSS node as a cache proxy
- A LOCKSS node runs a cache proxy that will serve
cached content if the original content is no
longer available. If the original content is
still available, the LOCKSS node will proxy the
request to the original site.
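The behavior described above can be modeled in a few lines. This is a toy sketch, not LOCKSS code: the publisher site and the cache are stood in for by plain dictionaries, and the function name `serve` is hypothetical. It only shows the order of the decision, with the original site preferred while it is available and the cache used as the fallback.

```python
from typing import Dict, Optional, Tuple


def serve(url: str,
          origin: Dict[str, bytes],
          cache: Dict[str, bytes]) -> Tuple[str, Optional[bytes]]:
    """Return (source, content) following the proxy rule on this slide."""
    if url in origin:
        # Original content is still available: proxy to the original site.
        return ("origin", origin[url])
    if url in cache:
        # Original is gone: fall back to the harvested copy.
        return ("cache", cache[url])
    # Neither the publisher nor the cache has the content.
    return ("missing", None)


if __name__ == "__main__":
    origin = {"http://example.org/a": b"live copy"}
    cache = {"http://example.org/a": b"harvested copy",
             "http://example.org/b": b"harvested only"}
    print(serve("http://example.org/a", origin, cache))
    print(serve("http://example.org/b", origin, cache))
```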
24Proxy access control
25Proxy Options
Notes
1. The audit proxy serves only cached
content, and never forwards requests to the
publisher or any other sites. By configuring a
browser to proxy to this port, you can view the
content stored on the cache. All requests for
content not on the cache will return a 404 Not
Found error.
2. The ICP server responds to
queries sent by other proxies and caches to let
them know if this cache has the content they are
looking for. This is useful if you are
integrating this cache into an existing network
structure with other proxies and caches that
support ICP.
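Because the audit proxy answers 404 Not Found for anything that is not on the cache, it can also be used from a script to check whether a URL was harvested. The sketch below uses only the Python standard library; the host name and port in the usage lines are placeholders (use the values shown on your node's proxy options page), and the function names are hypothetical.

```python
import urllib.error
import urllib.request
from typing import Dict


def proxy_mapping(host: str, port: int) -> Dict[str, str]:
    """Build the proxy mapping that urllib expects for plain HTTP requests."""
    return {"http": f"http://{host}:{port}"}


def is_cached(url: str, host: str, port: int, timeout: float = 10.0) -> bool:
    """Ask the audit proxy for url; True on 200, False on 404."""
    handler = urllib.request.ProxyHandler(proxy_mapping(host, port))
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # The audit proxy does not hold this URL.
            return False
        # Any other HTTP error is unexpected here, so let it surface.
        raise


if __name__ == "__main__":
    # Placeholder host and port; replace with your node's audit proxy settings.
    print(proxy_mapping("lockss-node.example.org", 9999))
```

A call such as `is_cached("http://example.org/a", "lockss-node.example.org", 9999)` needs network access to a real node, so it is left out of the runnable part above.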
26Proxy Info
27Excerpt from a PAC file
28Adding a PAC file to Mozilla Firefox
1. Windows: Tools/Options
   Mac OS X: File/Preferences
2. Advanced/Network/Settings
29Questions?