Preservation of Electronic Mail - PowerPoint PPT Presentation

About This Presentation
Title:

Preservation of Electronic Mail

Description:

Preservation of Electronic Mail. Druscie Simpson. NC State Archives. November 19, 2004 ... Email generates about 400,000 terabytes of new information each year ... – PowerPoint PPT presentation

Slides: 23
Provided by: dsi84
Learn more at: https://ils.unc.edu

Transcript and Presenter's Notes

Title: Preservation of Electronic Mail


1
Preservation of Electronic Mail
  • Druscie Simpson
  • NC State Archives
  • November 19, 2004

2
E-mail: The Digital Divide Also Multiplies
3
E-mail as a Burden
  • The Radicati Group and Merrill Lynch estimate
    that email is growing at a rate of 300% annually.
    (The Age, July 8, 2003)
  • The real problem is not more email, but larger and
    larger attachments, generating an average of 5MB
    of email content daily. (The Age, July 8, 2003)
  • Email generates about 400,000 terabytes of new
    information each year worldwide
  • About 31 billion emails are sent daily, on the
    Internet and elsewhere, a figure which is
    expected to double by 2006 (source: International
    Data Corporation (IDC)). The average email is
    about 59 kilobytes in size, thus the annual flow
    of emails worldwide is 667,585 terabytes. (How
    Much Information 2003, UC Berkeley)
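
The 667,585-terabyte figure follows directly from the two quoted inputs. A minimal check, assuming decimal units (1 kilobyte = 1,000 bytes, 1 terabyte = 10^12 bytes) and a 365-day year, neither of which the slide states:

```python
# Assumptions (not stated on the slide): decimal units and a 365-day year.
EMAILS_PER_DAY = 31_000_000_000   # about 31 billion e-mails sent daily
AVG_EMAIL_BYTES = 59 * 1_000      # about 59 kilobytes per e-mail
DAYS_PER_YEAR = 365
BYTES_PER_TERABYTE = 10**12


def annual_email_terabytes() -> int:
    """Return the yearly worldwide e-mail volume in terabytes."""
    total_bytes = EMAILS_PER_DAY * AVG_EMAIL_BYTES * DAYS_PER_YEAR
    return total_bytes // BYTES_PER_TERABYTE


print(f"{annual_email_terabytes():,} terabytes per year")
# prints "667,585 terabytes per year"
```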

4
What do I do with ALL that e-mail?!
  • Why are we so interested in E-Mail and Digital
    Records?
  • E-mail's far-reaching effects

5
Loss of Corporate Knowledge
  • Imagine you're new in the office. All of the
    information to do your job was on your computer.
    Your predecessor deleted the information before
    leaving or it was password protected. You don't
    have the password.

6
Legal Implications
  • If it is in an e-mail and it is sent from, received
    by, or stored on a government computer, it is
    a legal record
  • Never put anything in an e-mail you don't want on
    the front page of the local paper.
  • Always CYO (cover your office).

7
Users have several options for keeping their
saved e-mails
  • They may leave it on the mail provider's server
  • They may leave it on a web-based mail server such
    as Hotmail or Yahoo
  • They may store it in their e-mail client such as
    Outlook, Eudora, Netscape
  • They may store it on the file system of their PC
    as individual .eml files (MS Outlook Express
    Electronic Mail)

8
  • In each of these circumstances the actual byte
    stream used to represent the e-mail message is
    slightly different.
  • While an e-mail server and e-mail client are
    obliged to communicate with each other using
    standards (SMTP, POP3, and IMAP), they are not
    required to store the e-mail using any sort of
    standard.

9
We will be looking for a solution that will have
the widest possible use
  • Start with an IMAP server
  • Enhance the server with the ability to take the
    contents of its message store and create files in
    the desired standard XML format, called XMTP
  • Using XMTP, SMTP messages can be transformed via
    XSLT into HTML pages for viewing. XMTP has been
    used to implement a telemedicine consultation
    system using SMTP e-mail and HTML
  • In the testing phase, but not launched yet
  • http://sourceforge.net/projects/smtp/
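
The idea of turning a stored message into XML can be sketched with the Python standard library alone. This is only an illustration: the element names (`Message`, `Headers`, `Header`, `Body`) and the sample addresses are hypothetical and are not the XMTP schema, and the sketch handles single-part plain-text messages only.

```python
from email import message_from_string
import xml.etree.ElementTree as ET


def message_to_xml(raw: str) -> str:
    """Convert one single-part RFC 822 message into a small XML document.

    The element names used here are hypothetical, not the XMTP schema.
    """
    msg = message_from_string(raw)
    root = ET.Element("Message")
    headers = ET.SubElement(root, "Headers")
    for name, value in msg.items():
        # One <Header name="..."> element per header line, in original order
        header = ET.SubElement(headers, "Header", {"name": name})
        header.text = value
    body = ET.SubElement(root, "Body")
    payload = msg.get_payload()
    # Multipart messages return a list here; this sketch keeps plain text only
    body.text = payload if isinstance(payload, str) else ""
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample message, for illustration only
SAMPLE = (
    "From: sender@example.org\n"
    "To: archives@example.org\n"
    "Subject: Records schedule\n"
    "\n"
    "Please see the attached schedule.\n"
)

print(message_to_xml(SAMPLE))
```

Once a message is in XML, an XSLT stylesheet can render it as HTML. The Python standard library has no XSLT processor, so that step is left out of the sketch.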

10
  • IMAP seems to be the only protocol that supports
    moving and copying e-mail messages from place to
    place while preserving each message's
    native format.
  • This means that no matter where the e-mail
    message ends up, almost any IMAP-compliant e-mail
    client can send it to an archives server.
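
What an IMAP client does when a user drags a message from one account to another can be sketched with Python's `imaplib`: fetch the raw message from the source mailbox and append the same bytes to the archive mailbox. This is a minimal sketch rather than the project's tooling. The function name and its parameters are hypothetical, and both connections are assumed to be logged in already.

```python
import imaplib


def copy_message(
    source: imaplib.IMAP4,
    archive: imaplib.IMAP4,
    source_mailbox: str,
    archive_mailbox: str,
    msg_num: str,
) -> bytes:
    """Copy one message, byte for byte, from one IMAP connection to another.

    Hypothetical helper: both connections must already be authenticated.
    """
    # Open the source mailbox read-only so the original is left unmodified
    status, _ = source.select(source_mailbox, readonly=True)
    if status != "OK":
        raise RuntimeError(f"cannot open {source_mailbox}")
    status, data = source.fetch(msg_num, "(RFC822)")
    if status != "OK" or not data or not isinstance(data[0], tuple):
        raise RuntimeError(f"cannot fetch message {msg_num}")
    raw_message = data[0][1]  # the RFC 822 bytes exactly as the server sent them
    # Passing None for flags and date lets the archive server apply its defaults
    status, _ = archive.append(archive_mailbox, None, None, raw_message)
    if status != "OK":
        raise RuntimeError(f"cannot append to {archive_mailbox}")
    return raw_message
```

A caller would open the two connections with `imaplib.IMAP4_SSL(host)` and `login(user, password)` against its own servers and then call `copy_message(src, dst, "INBOX", "Archive", "1")`. The mailbox names in that call are placeholders.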

11
How?
  • Have the user send e-mail directly to a server
    hosted by the NC State Archives
  • Have the user send e-mail to an enhanced IMAP
    server maintained by their agency
  • This would enable the agency to access the
    archived e-mail messages locally
  • The IMAP server could then send us snapshots, or
    the XMTP files could be sent to us on electronic
    media via USPS

12
  • Have the user collect and send .pst files to the
    NC State Archives
  • Archives will open them with Outlook and move
    them to the enhanced IMAP server (process would
    be automated)
  • Archives should also be able to access packages
    of e-mail in other formats since Outlook can
    convert from Eudora, Netscape, etc.
  • Once loaded into Outlook, the e-mail packages
    would then be sent to the IMAP server.

13
  • Any strategy based on the interception of the
    data stream is out, since we want to collect the
    e-mail messages only after the user has been given
    a chance to cull and organize them.

14
  • Our proposal is to use hMailServer (a SourceForge
    open source project), which is an IMAP
    server that uses MySQL or Microsoft SQL Server as
    its message store.
  • http://www.hmailserver.com

15
  • The hMailServer installation contains a minimal
    MySQL-installation, so if you don't already have
    a database server in your network, MySQL is
    installed automatically when you install
    hMailServer.
  • The XML creation utility could interface directly
    with the message store instead of the IMAP
    protocol.
  • hMailServer comes with an attendant COM component
    that can be used to access the data store

16
Life of an e-mail message
  • E-mail message is sent to the user's mail server
  • User downloads the message to his/her mailbox
  • User optionally places the message into a folder
    on his/her local system
  • User creates a folder on the Archives IMAP
    server
  • User moves the mail from his/her inbox or
    specified folder to the folder on the Archives
    IMAP server
  • An administrator requests that the IMAP server
    create one or more XML files containing the
    user's e-mail
  • XML files are saved as a preservation copy

17
Access to Email 1
  • Load the XML into ENCompass
  • Utilize the IMAP server by enhancing it to
    provide web access to its native store similar to
    the user interface provided by Lurker
  • http://sourceforge.net/projects/lurker

18
Access to Email 2
  • Utilize Documentum by enhancing it to ingest
    the XML produced by the IMAP server.
  • Documentum server would be used purely as an
    e-mail repository, not as a document management
    application.
  • Utilize Documentum as a document management
    application to interfile e-mail messages into
    named record series

19
Access to Email 3
  • Move e-mail messages into a SharePoint Portal
    Server (SPP)
  • Use Outlook to collect the messages from the IMAP
    server and send them to SPP.
  • Switch-to-Switch Protocol. Protocol specified in
    the DLSw standard, used by routers to establish DLSw
    connections, locate resources, forward data, and
    handle flow control and error recovery.?
  • XML files would serve purely as a preservation
    copy.

20
This Particular Project
  • Take 6 gigabytes of e-mail from Governor Jim
    Hunt's administration (1993-2001; bulk dates
    1997-2001) and make it accessible and
    preservable.
  • E-mail has been appraised and culled to create
    the core for preservation
  • E-mail is in Microsoft Outlook .pst files and can
    be accessed only by using the correct version of
    Outlook
  • Create/utilize programs to move the e-mails out
    of Microsoft's proprietary .pst format into a
    non-proprietary and stable XML format

21
  • Also want to write software that is more
    universal in scope and can be used with most
    electronic records.
  • Hire a programmer to write code to convert the
    .pst files from their format to XML format
  • Take the converted XML files and load them onto
    our server and make them available to the public
    via the web and searchable through our online
    catalog system (ENCompass/MARS)

22
Wish us luck!
  • We are very excited to have this opportunity to
    explore this potential solution
  • We hope to take what we learn and apply it to the
    collection of other electronic government
    resources that are archival
  • We'll keep you posted!