NDNP and CONTENTdm: A Match?

Slides: 24
Provided by: ericsch4
Transcript and Presenter's Notes

1
Ohio Newspaper Digitization Project
- Since 2008
Read all about it!
NDNP and CONTENTdm: A Match?
Today's Lead Article: The Ohio Newspaper
Digitization Project: Read All About It
2
Connecting to collections Ohio Newspaper
Digitization Project
Read all about it!
- Since 2008
Two Men Suspected In Presentation
  • Today's presenters
  • Eric W. Schnittke, Project Coordinator
  • Phil Sager, Digital Projects Developer

3
The National Digital Newspaper Program
  • Enhance access to all American newspapers
  • Improve products of United States Newspaper
    Program (USNP) using current technologies
  • Establish standards and best practices for
    newspaper digital reformatting and access
  • Develop geographically-diverse program that
    benefits all US communities
  • Use multi-phased approach for research and scaled
    development
  • http://www.neh.gov/projects/ndnp.html

4
GOSSIP: Ohio Newspaper Digitization Project
  • The Ohio Historical Society's Involvement With
    NDNP
  • Chosen in the summer of 2008
  • Digitize 100,000 pages
  • Covering 1880-1922
  • Ten Ohio Regions, Advisory Board
  • Vendors for digitization, duplication
  • Start with sample roll, ramp up

5
What's the Buzz? Chronicling America!
  • www.loc.gov/chroniclingamerica/
  • LC's website with participants' entries
  • Highly searchable
  • Eleven states so far
  • 1880-1910
  • Start with sample roll, ramp up

7
http://www.loc.gov/chroniclingamerica/
9
How Hard Is It To Do Online Newspapers?
  • Can range from the
    • Crude to the sophisticated
    • Easy to the complicated

10
Questions to Think About
  • How much time, effort, money do you have to
    expend?
  • How strictly would you like to adhere to the
    latest best practices and standards?
  • There is a certain amount of tradeoff between the two

11
NDNP Specification
  • NDNP spec: gold standard with respect to
    newspapers on microfilm
  • Digitization standards
  • Metadata creation
  • See http://www.loc.gov/ndnp/pdf/NDNP_200911TechNotes.pdf
    for more info

12
NDNP Specification
  • Although in several ways it stops short of the
    ideal, for example:
    • Page-level (vs. article-level) representation
    • No added-value descriptive metadata beyond
      title/edition (e.g. manually keyed text such as
      birth and death notices, etc.)

13
Loading NDNP-formatted output into CONTENTdm
  • Two Choices
  • 1. OCLC NDNP loader software
  • Advantages
    • We wouldn't have to be involved much in the
      conversion process
    • Yet we would still have some control over what
      assets and metadata are processed

14
Loading NDNP-formatted output into CONTENTdm
  • Two Choices
  • 1. OCLC NDNP loader software
  • Disadvantages
    • New version still being worked on for CDMv5 (OHS
      NDNP data will be the test case)
    • Fee-based
    • License software (cf. OCR license)?

15
Loading NDNP-formatted output into CONTENTdm
  • Two Choices
  • 2. Vendor-prepared display files, plus tab file
  • Advantages: Free
  • Disadvantages
    • Probably will require more back-and-forth to get
      mappings correct for tab file
    • Batch upload with Project Client
    • May rule out option of JP2-based word-bounded
      highlighting

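As a rough illustration of the tab file option above, the following Python sketch writes one tab-delimited row per page. The column names and sample rows are hypothetical placeholders, not taken from the presentation; a real file would need headings that match the field mappings agreed with the vendor for the target collection.

```python
import csv
import io

# Hypothetical field names; real headings must match the field
# mappings configured for the target collection.
FIELDS = ["Title", "Date", "Page", "File Name"]


def build_tab_file(records):
    """Return tab-delimited text: a header row, then one row per page."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()


# Made-up sample rows, for illustration only.
sample_pages = [
    {"Title": "Example Gazette", "Date": "1900-01-01",
     "Page": "1", "File Name": "0001.jp2"},
    {"Title": "Example Gazette", "Date": "1900-01-01",
     "Page": "2", "File Name": "0002.jp2"},
]

tab_text = build_tab_file(sample_pages)
print(tab_text)
```

Generating the file from a script, rather than by hand, makes it cheaper to regenerate when a mapping turns out to be wrong, which is the back-and-forth the slide anticipates.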
16
Vendor discussion points
  • What metadata?
    • Basic descriptive and structural metadata (e.g.
      title, issue, pagination, etc.)?
    • Technical and administrative metadata?
  • METS/ALTO
    • ALTO: Analyzed Layout and Text Object
    • METS schema extension
    • Used to wrap word coordinate metadata (and other
      page layout data)
    • Any hope of using without the OCLC loader
      software?

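To make the word coordinate metadata above more concrete, here is a minimal Python sketch that reads ALTO-style String elements and returns the bounding boxes a viewer would need to highlight search hits. The sample XML is invented and its namespace is a placeholder; the attribute names (CONTENT, HPOS, VPOS, WIDTH, HEIGHT) follow the ALTO schema, but this is not a description of how CONTENTdm or the OCLC loader actually processes these files.

```python
import xml.etree.ElementTree as ET

# Invented sample; the namespace URI is a placeholder, since the real
# one depends on the ALTO version the vendor delivers.
SAMPLE_ALTO = """<alto xmlns="http://example.org/alto-placeholder">
  <Layout>
    <Page ID="P1" WIDTH="6000" HEIGHT="9000">
      <PrintSpace>
        <TextBlock ID="TB1">
          <TextLine>
            <String CONTENT="Ohio" HPOS="100" VPOS="200"
                    WIDTH="300" HEIGHT="80"/>
            <SP/>
            <String CONTENT="Newspaper" HPOS="420" VPOS="200"
                    WIDTH="650" HEIGHT="80"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def local_name(tag):
    """Strip any '{namespace}' prefix so the code works across versions."""
    return tag.rsplit("}", 1)[-1]


def word_boxes(alto_xml):
    """Return one dict per word with its text and bounding box."""
    root = ET.fromstring(alto_xml)
    boxes = []
    for element in root.iter():
        if local_name(element.tag) == "String":
            boxes.append({
                "word": element.get("CONTENT"),
                "x": float(element.get("HPOS")),
                "y": float(element.get("VPOS")),
                "width": float(element.get("WIDTH")),
                "height": float(element.get("HEIGHT")),
            })
    return boxes


def find_hits(boxes, term):
    """Case-insensitive exact match of a search term against the words."""
    term = term.lower()
    return [box for box in boxes if box["word"].lower() == term]


boxes = word_boxes(SAMPLE_ALTO)
print(find_hits(boxes, "newspaper"))
```

Matching on the local element name is a deliberate shortcut here: it avoids hard-coding a namespace that the vendor's output may not use.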
17
Vendor discussion points
  • What files?
    • Searchable PDFs only?
    • JPEG2000s only?
    • Both?
  • Implications for
    • Storage and backup
    • Online user experience

18
Ohio Jewish Chronicle (non-NDNP)
  • Looked at two methods
  • 1. Multi-page PDF method
  • Advantages
    • Like the Adobe Reader 9 viewer interface
    • Flexible format for printing

19
Ohio Jewish Chronicle (non-NDNP)
  • Looked at two methods
  • 1. Multi-page PDF method
  • Disadvantages
    • Don't like the old Adobe Readers (e.g. v6)
    • Need to perform second in-document search to see
      highlighted hits
    • Slower to load
    • Vendor cost per page higher

20
Ohio Jewish Chronicle (non-NDNP)
  • Second method
  • 2. TIFFs → JP2s with OCR
  • Advantages
    • Viewer is all server-side
    • Can take advantage of OCR and word-bounding
      functionality built into the Project Client
    • Vendor cost per page lower

21
Ohio Jewish Chronicle (non-NDNP)
  • Second method
  • 2. TIFFs → JP2s with OCR
  • Disadvantages
    • Viewer not as good (as Reader 9)
    • (Institutions trying alternatives like Zoomify,
      etc.)

22
Future Newspaper Efforts
  • NDNP spec is excellent, but demanding
  • However, producing NDNP-formatted output costs
    substantially more per page
  • Preference is to outsource microfilm
    digitization and metadata creation
  • Might be hard to convince some institutions
    (including our own) that the NDNP way is the best
    way to go

23
CLASSIFIEDS: Questions? Comments?
Wiki: http://ohsweb.ohiohistory.org/ondp
Ohio Memory: http://www.ohiomemory.org
Email us at eschnittke_at_ohiohistory.org, psager_at_ohiohistory.org