SCOPE: An XML Based Publishing Platform

8th International Symposium on Electronic Theses and Dissertations, ETD2005, Sydney

1
SCOPE: An XML Based Publishing Platform
  • Uwe Müller, Manuel Klatt
  • Humboldt-Universität zu Berlin
  • Electronic Publishing Group
  • u.mueller, manuel.klatt_at_cms.hu-berlin.de

2
Background
  • Humboldt University: 800 to 1,000 dissertations /
    year
  • Germany: duty to publish dissertations
  • Humboldt U.: ¼ of dissertations published
    electronically
  • conference proceedings
  • series (university series, preprint series,
    technical reports, ...)
  • electronic journals
  • Open Access campaign (Pre- / Postprints)
  • XML as central strategy

3
(No Transcript)
4
Why XML?
  • Standardized format
  • Long term preservation
  • easily convertible to
    • presentation formats (HTML, PDF)
    • other XML structures
  • qualified full text retrieval
  • contains structural and contextual information
    in a machine readable format

(Diagram labels: Office document, XML, HTML, PDF, digital signature)
5
XML: Restrictions to deal with
  • XML source does not contain layout information
  • rather linear structure
  • XML is not used as Authoring System
  • authors use their 'own' systems
    • Microsoft Word
    • LaTeX
    • Open Office / Star Office
    • FrameMaker
    • WordPerfect

6
How to bridge the gap?
  • meet the authors where they are
  • instructions and guidelines for authors
  • usage of style files (e.g., dissertation-hu.dot)
  • manuals, support hotline, regular courses
  • different conversion processes
  • SGML Author (plug-in for MS Word < 97)
  • Open Office / Star Office
  • exploit genuine XML format
  • MS Office 2003
  • XML according to DiML DTD
  • common pitfalls: tables, pictures

7
(No Transcript)
8
Conversion Process Using OO (Example)
(Diagram labels: example.doc; Open Office; example.sxw (zip file); content.xml; .gif; .jpg; example_stl.xml; example.xml; front.xml, chapter1.xml, chapter2.xml, chapter3.xml; example.html; front.html, chapter1.html, chapter2.html, chapter3.html)
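
The slide notes that example.sxw is a zip file and lists content.xml among its labels. A minimal sketch of that first step, reading content.xml out of the archive, might look as follows. The function names and the demo archive are illustrative assumptions and are not taken from the SCOPE code base.

```python
import io
import zipfile


def read_content_xml(sxw_bytes: bytes) -> str:
    """Return the text of content.xml stored inside an .sxw archive."""
    with zipfile.ZipFile(io.BytesIO(sxw_bytes)) as archive:
        return archive.read("content.xml").decode("utf-8")


def make_demo_sxw() -> bytes:
    """Build a tiny stand-in archive so the sketch runs without Open Office."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Placeholder content; a real content.xml is far more detailed.
        archive.writestr("content.xml", "<doc><p>Introduction</p></doc>")
    return buffer.getvalue()


if __name__ == "__main__":
    print(read_content_xml(make_demo_sxw()))
```

With a real file, the bytes would come from reading example.sxw in binary mode; the later conversion steps shown on the slide are not covered here.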
9
(No Transcript)
10
Principal Structure of a DiML document
  • <etd>
  • <front>..title...author...abstract...</front>
  • <body>
  • <chapter>
  • <section>
  • ...
  • </body>
  • <back>..bibliography...appendix...vita...</back>
  • </etd>
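
As a rough illustration of the structure above, the following sketch builds a skeleton with the same top-level elements using Python's standard library. Only etd, front, body, chapter, section and back appear as tags on the slide; treating title, author and bibliography as element names is an assumption, and the real DiML DTD defines considerably more.

```python
import xml.etree.ElementTree as ET


def build_skeleton(title: str, author: str) -> ET.Element:
    """Build a minimal etd tree with front, body and back parts."""
    etd = ET.Element("etd")

    front = ET.SubElement(etd, "front")
    ET.SubElement(front, "title").text = title      # assumed element name
    ET.SubElement(front, "author").text = author    # assumed element name

    body = ET.SubElement(etd, "body")
    chapter = ET.SubElement(body, "chapter")
    ET.SubElement(chapter, "section")

    back = ET.SubElement(etd, "back")
    ET.SubElement(back, "bibliography")              # assumed element name
    return etd


if __name__ == "__main__":
    root = build_skeleton("Example thesis", "A. Author")
    print(ET.tostring(root, encoding="unicode"))
```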

11
From flat structure to Hierarchy
  • only two types of styles in Word
    • paragraph styles
    • character styles
  • e.g., at the first occurrence of the Heading 1
    paragraph style, the converter has to know
    • Heading 1 is the beginning of a chapter
    • Heading 1 implies a head element
    • the element chapter can only occur in body
  • </front>
  • <body>
  • <chapter>
  • <head id="anyID">Introduction</head>
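
The step from a flat sequence of styled paragraphs to a nested tree can be sketched as below. This is not the converter used at Humboldt University; it is a simplified illustration that assumes the input is a list of (style, text) pairs, that "Heading 2" opens a section, and that ordinary text becomes a hypothetical p element.

```python
import xml.etree.ElementTree as ET


def build_body(paragraphs):
    """Turn a flat list of (style, text) pairs into a nested body element."""
    body = ET.Element("body")
    chapter = None   # currently open chapter, if any
    section = None   # currently open section, if any
    counter = 0      # used to generate head ids

    for style, text in paragraphs:
        if style == "Heading 1":
            # Heading 1 starts a new chapter and implies a head element.
            chapter = ET.SubElement(body, "chapter")
            section = None
            counter += 1
            ET.SubElement(chapter, "head", id=f"id{counter}").text = text
        elif style == "Heading 2":
            if chapter is None:
                raise ValueError("section heading before first chapter")
            section = ET.SubElement(chapter, "section")
            counter += 1
            ET.SubElement(section, "head", id=f"id{counter}").text = text
        else:
            parent = section if section is not None else chapter
            if parent is None:
                raise ValueError("body text before first chapter")
            ET.SubElement(parent, "p").text = text
    return body


if __name__ == "__main__":
    flat = [
        ("Heading 1", "Introduction"),
        ("Normal", "Some text."),
        ("Heading 2", "Scope"),
        ("Normal", "More text."),
    ]
    print(ET.tostring(build_body(flat), encoding="unicode"))
```

The converter has to carry state (the open chapter and section) because the Word styles themselves carry no nesting information.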

12
(No Transcript)
13
(No Transcript)
14
(No Transcript)
15
(No Transcript)
16
(No Transcript)
17
(No Transcript)
18
One Core, Multiple Views
  • HTML generation (static or dynamic)
  • performance problems with XSLT and huge documents
  • solution: division of XML sources into components
    (easier and faster to process)
  • PDF Print on Demand (http://www.proprint-service.de)
  • Current problems
    • changing Office systems and versions
    • ongoing implementations and adaptations necessary
    • but might be restricted to XSL coding
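
The division of one large XML source into smaller components can be sketched as follows. The component names front.xml and chapterN.xml follow the file names on the conversion slide; the function itself and the choice to split at front and chapter level are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def split_into_components(etd: ET.Element) -> dict:
    """Map component file names to the serialized front part and chapters."""
    components = {}
    front = etd.find("front")
    if front is not None:
        components["front.xml"] = ET.tostring(front, encoding="unicode")
    for number, chapter in enumerate(etd.findall("body/chapter"), start=1):
        components[f"chapter{number}.xml"] = ET.tostring(
            chapter, encoding="unicode"
        )
    return components


if __name__ == "__main__":
    document = ET.fromstring(
        "<etd><front><title>T</title></front>"
        "<body><chapter><head>A</head></chapter>"
        "<chapter><head>B</head></chapter></body></etd>"
    )
    for name, xml_text in split_into_components(document).items():
        print(name, xml_text)
```

Each component can then be transformed on its own, which keeps the input of any single transformation small.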

19
XML Based Publishing
  • characterized by
    • complex processes and workflows
    • many dependent tools and manual work steps
    • relatively high human effort
    • different processes for different publications,
      but with many identical steps and properties
    • ongoing development, changing versions
  • Basic Idea
    1. Raise concrete process description to an
       abstract level
    2. Implement integrated workflow system

20
SCOPE
  • support for authors and editors: provide an
    integrated publishing platform
  • XML based
  • aimed at the technological aspects of publishing
    processes
  • tool management
  • platform for distributed publishing
  • generic framework for different processes
  • Service Core for Open Publishing Environments

21
SCOPE goals
  • elementary Publication Components (Document
    Models, Authoring Tools, Conversion Scripts,
    Digital Signatures, ...)
  • Management System to organize and administer the
    Publication Components
    • modelling of relations and dependencies
    • version management
  • Publishing System
    • management and storage of documents
  • Workflow System
    • modelling of recurrent processes (technical
      validation, conversion processes, reviewing,
      conference organisation, ...)

22
Publication Components
23
Documents and Publication Comp.
  • Documents can occur in different formats
  • Publication Components can convert formats into
    each other and change properties
  • PCs: automatic and manual "tools"

24
Publication Components (PC)
  • Main Properties (Metadata)
    • source occurrence (base type properties) or
      property
    • target occurrence (base type properties) or
      property
    • parameters
    • necessary environment / interfaces
    • modules / used files
  • Examples
    • authoring tools, conversion scripts, Word macros,
      XSLT scripts, PDF checker, ...
  • Management system to register PCs and metadata
    (CVS based)
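
A minimal sketch of how the listed properties could be recorded and queried is given below. The field names mirror the bullet points above, but the class, the in-memory registry and the example entries are hypothetical; the slide describes the actual management system only as CVS based.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublicationComponent:
    """Metadata for one PC, following the properties named on the slide."""
    name: str
    source: str               # source occurrence, e.g. a document format
    target: str               # target occurrence
    parameters: tuple = ()
    environment: str = ""     # necessary environment / interfaces
    modules: tuple = ()       # modules / used files


class Registry:
    """Hypothetical in-memory stand-in for the PC management system."""

    def __init__(self):
        self._components = []

    def register(self, component: PublicationComponent) -> None:
        self._components.append(component)

    def find(self, source: str, target: str) -> list:
        """Return all components converting source into target."""
        return [
            pc for pc in self._components
            if pc.source == source and pc.target == target
        ]


if __name__ == "__main__":
    registry = Registry()
    registry.register(PublicationComponent(
        name="diml_to_html", source="diml", target="html",
        environment="XSLT processor", modules=("diml2html.xsl",),
    ))
    print([pc.name for pc in registry.find("diml", "html")])
```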

25
Metadata System
  • special publication component
  • basic information on each document
  • adaptable and configurable in terms of
    • data model
    • management processes (forms, ...)
    • presentation styles (browsing, search, ...)
  • configuration via XML files and style files
  • data entry forms can be used
    • internally
    • by external data managers (editors, by login)
    • by normal authors (document upload)

26
Publishing Process
  • Formal process model
  • assembled with the help of the PC management
    system (queries to the database, ...)
  • realized and monitored by workflow system
  • abstract state machine
  • PCs: atomic actions
  • integration of external workflow components
    (e.g. GAPWorks)
  • web based distributed access
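
One way to read the points above is that a process is a chain of PCs found by querying the registered components, and that the workflow system then steps through that chain as a state machine. The sketch below illustrates this reading only: the component list is made up, the document format serves as the state, and breadth-first search is a swapped-in technique, not something the slides specify.

```python
from collections import deque

# (name, source format, target format); all entries are hypothetical.
COMPONENTS = [
    ("word_to_sxw", "doc", "sxw"),
    ("sxw_to_diml", "sxw", "diml"),
    ("diml_to_html", "diml", "html"),
    ("diml_to_pdf", "diml", "pdf"),
]


def assemble_process(components, start, goal):
    """Breadth-first search for the shortest chain of PCs from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for name, source, target in components:
            if source == state and target not in seen:
                seen.add(target)
                queue.append((target, path + [name]))
    return None  # no chain of components leads to the goal


def run_process(components, start, path):
    """Apply each PC as one atomic action and log the visited states."""
    by_name = {name: (source, target) for name, source, target in components}
    state = start
    log = [state]
    for name in path:
        source, target = by_name[name]
        if source != state:
            raise RuntimeError(f"{name} cannot run in state {state}")
        state = target
        log.append(state)
    return log


if __name__ == "__main__":
    chain = assemble_process(COMPONENTS, "doc", "html")
    print(chain)
    print(run_process(COMPONENTS, "doc", chain))
```

The state log is what a monitoring view could display; manual steps such as reviewing would need additional states that this sketch leaves out.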

27
(No Transcript)
28
(No Transcript)
29
SCOPE Service
  • Support for authors and editors
    • tools, adaptations, advisory service
  • Hosting: centralized technology for distributed
    publishing
    • institutions within university
    • small research institutions, smaller universities
    • editorial boards of electronic journals
    • also single publication series, technical
      reports
  • Technology Transfer
    • Publication Components
    • modular structure but also HU specific
      components

30
Thank you
  • Questions?
  • u.mueller_at_cms.hu-berlin.de
  • manuel.klatt_at_cms.hu-berlin.de
  • http://edoc.hu-berlin.de/