JSTOR Advanced Technology Research Group

Slides: 25
Provided by: burns1

Transcript and Presenter's Notes


1
JSTOR Advanced Technology Research Group
  • Working in collaboration with other researchers
    to provide access to advanced technologies within
    the same workspace as the literature and primary
    source material.
  • This will enhance discovery and further
    encourage the use of the materials, enabling new
    scholarship and the creation of new knowledge.

2
JSTOR
  • As a digital library, we've done quite well.

3
(No Transcript)
4
(No Transcript)
5
Research Support
  • JSTOR has always worked with researchers by
    providing datasets and supporting them in
    analyzing our data for their research.
  • We continue to do that and will rarely refuse a
    reasonable request (we handle 1-2 dozen requests
    a year for usage, citation, and content data).
  • Recently we decided to become a more active
    participant in the use of JSTOR data for research.

6
Evolving JSTOR's Technology
  • JSTOR is, for many scholars, their digital
    bookshelf (or part of it).
  • The real work takes place at the workbench, not
    the bookshelf.
  • Workbenches are trade specific (though tools
    need not be).
  • We need to recognize the diversity of practice
    and, as yet, neither we nor the practitioners
    really understand what is needed for each digital
    practice.
  • JSTOR's Showcase is where we bring technology
    and scholarship together and develop digital
    workbenches for our constituents.
  • It is an open, ongoing digital workshop with
    shared tools and materials, where we try to build
    workbenches with and for other scholars.
  • It will be extended to interwork with other
    facilities and provide APIs to our functionality
    and resources.

7
The plan
  • We will host tools and technologies from the
    community (including JSTOR), quickly and openly,
    working on JSTOR content (and others where
    available).
  • Showcase will be a step toward real offerings,
    but our betas will be as useful and usable as we
    can manage.
  • We will actively solicit and respond to feedback,
    so that our workbenches will evolve.
  • We will provide a place where researchers can
    expose their work to users for the mutual benefit
    of both.

8
Active projects
  • DfR: Simple text mining and corpus exploration
  • Visualizations of JSTOR usage, participants
  • Topic mapping (Blei / Princeton)
  • Document remastering from camera images, aka
    Decapod (Breuel / Kaiserslautern,
    Treviranus / Toronto)
  • Open Annotation Collaboration (Cole, Van de
    Sompel, Cohen, Sanderson et al.)

9
Data for Research - Examples
  • The long s
  • The British Empire
  • The golden age of social sciences

10
(No Transcript)
11
(No Transcript)
12
(No Transcript)
13
Foresite
  • A collaborative program with University of
    Liverpool and HP Labs, Bristol
  • Build a relationship graph of the entire JSTOR
    corpus.
  • Explore using an acetate overlay model.

14
Foresite OAI-ORE Explorer. With U. Liverpool and HP Labs
15
Decapod
  • A collaborative program with University of
    Kaiserslautern and University of Toronto. Funded
    by the Andrew W. Mellon Foundation.
  • Building a small, inexpensive, easy-to-operate,
    1-click paper-to-document digitization rig.
  • Apply state-of-the-art document understanding
    and usability to allow small institutions to
    digitize their collections.

16
Decapod
  • 1-click, paper to remastered document.
  • OSS software.
  • State-of-the-art document understanding.
  • Mobile friendly (reflow).
  • Operator friendly.
  • Budget friendly.
  • Based on OCRopus and Fluid.
  • Partners: DFKI/Kaiserslautern, ATRC/Toronto,
    JSTOR

17
Open Annotation Collaboration
  • A Mellon-funded project, starting in May 2009,
    with the overarching goals:
      • To facilitate the emergence of a Web- and
        resource-centric interoperable annotation
        environment that allows leveraging
        annotations across the boundaries of
        annotation clients, annotation servers, and
        content collections. Interoperability
        specifications will be devised.
      • To demonstrate, through implementations, an
        interoperable annotation environment enabled
        by the interoperability specifications in
        settings characterized by a variety of
        annotation client/server environments,
        content collections, and scholarly use cases.
      • To seed widespread adoption by deploying
        robust, production-quality applications
        conformant with the interoperable annotation
        environment in ubiquitous and specialized
        services and tools used by scholars (e.g.,
        JSTOR, Zotero, and MONK).
  • The partners in this project are:
      • University Library and Graduate School of
        Library and Information Science, University
        of Illinois at Urbana-Champaign
      • Center for History and New Media, George
        Mason University
      • Maryland Institute for Technology in the
        Humanities, University of Maryland
      • eResearch Laboratory, School of Information
        Technology and Electrical Engineering, The
        University of Queensland
      • Research Library, Los Alamos National
        Laboratory
      • JSTOR
  • Contacts:
      • Tim Cole, UIUC
      • Clare Llewellyn, JSTOR

18
What we are investigating.
  • Corpus Analyst's Workbench
  • Topic Mapping with LDA (Blei / Princeton)
  • Extract topic signatures and trace them through
    time.
  • Evolution of Ideas (Blei & Gerrish / Princeton)
  • Identify ideas.
  • Concept extraction using Association Rule Mining
    (Sanderson / Liverpool)
  • Citation strategies and the impact of those
    strategies (Adamic / UM)
  • Citation and similarity structures in corpora
    (Bergstrom & West / UW)
  • Eigenfactor and similar techniques.
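
As a rough illustration of the "Eigenfactor and similar techniques" item, here is a minimal sketch of a PageRank-style power iteration over a citation graph. It is not the published Eigenfactor method, which adds refinements not modeled here. The function name `rank_by_citations`, the damping value, and the toy journal names are all assumptions made for this example.

```python
def rank_by_citations(citations, damping=0.85, iterations=50):
    """Score nodes of a citation graph with a PageRank-style power iteration.

    `citations` maps each journal to the list of journals it cites.
    """
    cited = {t for targets in citations.values() for t in targets}
    nodes = sorted(set(citations) | cited)
    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives a small baseline share on each pass.
        new_score = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = citations.get(node, [])
            if targets:
                # Split this node's weight evenly among what it cites.
                share = damping * score[node] / len(targets)
                for target in targets:
                    new_score[target] += share
            else:
                # A node that cites nothing spreads its weight over all nodes.
                share = damping * score[node] / n
                for other in nodes:
                    new_score[other] += share
        score = new_score
    return score


# Hypothetical toy data: these journal names are invented for illustration.
toy_citations = {
    "Journal A": ["Journal B", "Journal C"],
    "Journal B": ["Journal C"],
    "Journal C": ["Journal A"],
    "Journal D": ["Journal C"],
}
scores = rank_by_citations(toy_citations)
for journal, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{journal}: {value:.3f}")
```

In this toy graph the most-cited journal ends up with the highest score and the never-cited one with the lowest; the scores sum to 1, so they can be read as shares of total influence.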

19
What we are investigating (cont.)
  • Linguist's Workbench
  • Tools for linguists using SEASR (Llora / UIUC,
    O'Donnell / U. Mich.)
  • Association Rule Mining from the journal
    Evolution (Sanderson / Liverpool).
  • Oldest English Words analysis (Pagel / Reading)
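
To make the association rule mining item concrete, here is a minimal sketch that finds rules between pairs of terms using the usual support and confidence measures. It does not reproduce the analysis named on the slide; the function name `mine_pair_rules`, the thresholds, and the toy term lists are assumptions for illustration only.

```python
from itertools import combinations


def mine_pair_rules(documents, min_support=0.5, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) rules between terms."""
    n = len(documents)
    term_sets = [set(doc) for doc in documents]
    vocabulary = sorted(set().union(*term_sets))
    # Support of a term: fraction of documents that contain it.
    single = {t: sum(t in s for s in term_sets) / n for t in vocabulary}
    rules = []
    for a, b in combinations(vocabulary, 2):
        # Support of a pair: fraction of documents containing both terms.
        pair_support = sum(a in s and b in s for s in term_sets) / n
        if pair_support < min_support:
            continue
        for antecedent, consequent in ((a, b), (b, a)):
            # Confidence: of the documents with the antecedent,
            # the fraction that also contain the consequent.
            confidence = pair_support / single[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, pair_support, confidence))
    return rules


# Hypothetical toy data: each list stands for the key terms of one article.
toy_documents = [
    ["selection", "fitness", "population"],
    ["selection", "fitness"],
    ["selection", "drift", "population"],
    ["fitness", "selection", "mutation"],
]
rules = mine_pair_rules(toy_documents)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

A real corpus would need frequent-itemset pruning (for example, Apriori-style) to stay tractable; the exhaustive pair loop here is only suitable for a small vocabulary.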

20
What we are investigating
  • Computer-Assisted analysis and mineable Knowledge
    bases
  • Art Auction Catalogues
  • Using accredited crowd-sourcing to review and
    correct a machine-generated corpus of documents
    and lot records.
  • Computer Aided Transcription of manuscripts.
  • Using modern image-matching techniques to
    scattershot transcribe documents _and_ build a
    database of script signatures for mining.
  • Digital Staining of Plant Specimens.
  • Use feature extraction, analysis and
    false-coloring to highlight morphological
    structures and create a database of signatures
    for mining.
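
The false-coloring step in digital staining can be pictured with a very small sketch. Here a global brightness threshold stands in for the feature extraction, which in practice would be far more involved; pixels darker than the threshold are tinted red and the rest stay gray. The function names, the mean-based threshold, and the toy pixel grid are assumptions for illustration, not the project's method.

```python
def mean_threshold(gray):
    """Return the mean intensity of a grayscale grid (values 0-255)."""
    values = [v for row in gray for v in row]
    return sum(values) / len(values)


def false_color(gray, threshold):
    """Map a grayscale grid to RGB, tinting pixels darker than `threshold` red."""
    colored = []
    for row in gray:
        out_row = []
        for value in row:
            if value < threshold:
                # Candidate structure (e.g. a vein): highlight it.
                out_row.append((255, 0, 0))
            else:
                # Background: keep the original gray level.
                out_row.append((value, value, value))
        colored.append(out_row)
    return colored


# Hypothetical toy image: dark values (60) stand in for vein pixels.
toy_leaf = [
    [200, 200, 60, 200],
    [200, 60, 60, 200],
    [60, 200, 200, 60],
]
stained = false_color(toy_leaf, mean_threshold(toy_leaf))
highlighted = sum(pixel == (255, 0, 0) for row in stained for pixel in row)
print(f"highlighted {highlighted} of {sum(len(r) for r in toy_leaf)} pixels")
```

The coordinates of the highlighted pixels are the kind of raw material from which a signature for later mining could be derived.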

21
Base Image
22
Analysis from previous illustration showing
reticulated venation.
23
What we are investigating
  • Community Specific / Miscellaneous
  • Tools for Secondary Schools (SEASR) (Llora /
    UIUC)
  • Reading level synopsis generation.
  • NEH/NSF/SSHRC/JISC Digging into Data program
    (community support participant in 1 proposal)

24
Summary
  • Thank you
  • John Burns, john.burns_at_jstor.org
  • Main site: http://www.jstor.org
  • Showcase: http://showcase.jstor.org
  • Data for Research: http://DfR.jstor.org