Title: Introducing the Major Parts of a CMS
1. Introducing the Major Parts of a CMS
- Chapter 7 of Content Management Bible
2. In This Chapter
- A high-level view of CMS features
- Collecting content
- Managing content
- Publishing content using templates
3. A CMS Overview
- A content management system (CMS) is responsible for the collection, management, and publishing of chunks of information known as content components.
- Although logically separate, the three parts of the system can overlap physically to a large degree:
  - The management system can serve as part of the collection system
  - The management system can serve as part of the publication system
  - The publication system can serve as part of the collection system
4. A Schematic Overview of a CMS
5. The Collection System
6. An Overview of the Collection Process
7. An Overview of the Collection Process (Cont.)
- Authoring: Here someone creates the content from scratch.
- Acquisition: Here you gather the content from some existing source.
- Conversion: Here you strip unnecessary information from the content and change its markup language.
- Aggregation: Here you edit the content, divide it into components, and augment it to fit within your desired metadata system.
- Collection services: These CMS programs and functions aid the collection process. For example, the collection services produce the Web forms into which you enter the content for components.
8. Authoring
- Authoring refers to the process of creating content from scratch
- A CMS can help the author work efficiently and effectively by doing the following:
  - Providing an authoring environment (either a full application or extensions to the author's native environment).
  - Providing a clear purpose and audience for the author's efforts. In a CMS, you direct authors to create particular content components that are already defined in terms of their basic purpose and audiences.
  - Providing aids for including standard information. The CMS can easily fill in the create-date and author's name, for example, to save the author effort.
  - Providing templates that break down the content the author creates into its constituent elements. You may provide authors with a Microsoft Word template (a DOT file), for example, that already includes places to type the component's title, summary, and body.
  - Providing workflow, status, and version control for content that's in process.
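The "aids for including standard information" point can be pictured with a minimal sketch. The function name `new_component` and every field name below are hypothetical, chosen only for illustration; a real CMS would define its own component schema.

```python
from datetime import date

def new_component(component_type, author, today=None):
    """Return a blank component with the standard fields pre-filled."""
    return {
        "type": component_type,
        "author": author,  # supplied by the system, not typed by the author
        "create_date": (today or date.today()).isoformat(),
        "status": "draft",  # assumed name for the first workflow status
        "version": 1,
        # The elements the author is actually asked to write:
        "title": "",
        "summary": "",
        "body": "",
    }

draft = new_component("article", "jdoe", today=date(2024, 1, 15))
print(draft["create_date"])
```

The author only ever fills in `title`, `summary`, and `body`; everything else arrives already set.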
9. The authoring process is an essentially human process of creation and revision
Low volume but high quality
10. Acquiring
- Acquiring is the process of gathering information that wasn't originally created for your CMS
- This process might be partly manual or fully automated.
- Two types of acquired content:
  - Syndications: sources that are designed for reuse.
    - The information is delivered in a generally useful file format (XML, for example)
    - The information is already segmented and has metadata attached.
  - Source files: any sort of preexisting information that a computer stores.
    - Include non-electronic sources such as paper photographs, analog video, and printed text that, after you digitize them, end up in files as well.
    - Generally aren't designed for reuse and require work on your part to transform into a usable form.
11. Acquiring is the process of gathering information
that was created for some other purpose
High volume but low quality
12. Converting
- If the information that you create or acquire isn't in the format or structure that your system requires, then you must convert it to match the accepted standards of the content system
- The conversion process consists of the following three logical steps:
  - Stripping: remove and discard unneeded surrounding information such as page headers and footers, unnecessary content, and unwanted navigation.
  - Format mapping: change the information's file format to a standard one that the CMS supports. In addition, you separate its rendering format from its structure.
  - Structure mapping: make the information's structure explicit or change it as necessary.
- The result of a conversion process is information that conforms as well as possible to the standards you develop for format and structural tagging
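The three conversion steps can be sketched on a toy HTML fragment. This is only an illustration under stated assumptions: the sample page, the target tags (`component`, `title`, `body`, `para`), and the function names are all hypothetical, and regular expressions stand in for the real parser a production converter would need.

```python
import re
from xml.sax.saxutils import escape

RAW = """<div class="header">ACME Corp</div>
<nav>Home | Products</nav>
<p><font size="5"><b>Widget 3000</b></font></p>
<p>A sturdy widget for everyday use.</p>
<div class="footer">Page 1</div>"""

def strip_surroundings(html):
    """Stripping: discard the page header, footer, and navigation."""
    html = re.sub(r'<div class="(header|footer)">.*?</div>\n?', "", html)
    return re.sub(r"<nav>.*?</nav>\n?", "", html)

def map_format(html):
    """Format mapping (partial): drop rendering-only tags, keep their text."""
    return re.sub(r"</?font[^>]*>", "", html)

def map_structure(html):
    """Structure mapping: turn the bold first paragraph into an explicit title."""
    title = re.search(r"<p><b>(.*?)</b></p>", html).group(1)
    paragraphs = re.findall(r"<p>(?!<b>)(.*?)</p>", html)
    paras = "".join(f"<para>{escape(p)}</para>" for p in paragraphs)
    return f"<component><title>{escape(title)}</title><body>{paras}</body></component>"

converted = map_structure(map_format(strip_surroundings(RAW)))
print(converted)
```

Keeping the steps as separate functions mirrors the slide: each one can be tested and replaced on its own as new source formats arrive.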
13. Aggregating
- Aggregation is the process of bringing disparate information sources into one overall structure through:
  - Editorial processing: styling, consistency, and usage
  - Segmentation processing: break the information into chunks (content components)
  - Metatorial processing: add metadata to content
    - The metadata that you apply to the content enables the system to effectively store and retrieve it
    - The metadata system consists of the rules that you create for how to supply metadata values for each new piece of content that you bring into the system
- Aggregation is often a part of the conversion process
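One way to picture "rules for how to supply metadata values" is a table that maps each metadata field to a small function. The fields (`audience`, `language`) and both rules below are hypothetical examples, not rules taken from the source.

```python
def rule_audience(component):
    # Hypothetical rule: content that mentions "API" is aimed at developers.
    return "developer" if "API" in component["body"] else "general"

def rule_language(component):
    # Hypothetical default: content with no language tag is assumed English.
    return component.get("language", "en")

METADATA_RULES = {"audience": rule_audience, "language": rule_language}

def apply_metadata(component, rules=METADATA_RULES):
    """Return a copy of the component with every rule-driven field filled in."""
    tagged = dict(component)
    for field, rule in rules.items():
        tagged[field] = rule(component)
    return tagged

tagged = apply_metadata({"title": "Using the API", "body": "Call the API with a key."})
print(tagged["audience"], tagged["language"])
```

Because the rules live in one table, every new piece of content is tagged the same way, which is the consistency the metadata system is meant to provide.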
14. Aggregating comprises editorial processing,
segmentation, and metatorial processing
15. Editorial Processing
- All professional publishing groups use an editorial framework to guide their work:
  - Correctness rules: check generally accepted standards (punctuation, word usage, grammar)
  - Communication rules: ensure that the content projects a specific image and targets a specific audience
    - These cover the voice of the content (active, passive, first person, third person), other stylistic rules, and ways to communicate with the intended audience (such as the right vocabulary to use).
  - Consistency rules: ensure that you apply all the other rules evenly across the entire content base and that, after you define a term, you always use it the same way
- Editors can attempt to loosen the constraints on the content, so that you can use it in the widest variety of contexts
16. Segmentation
- Segmentation is the process of dividing content into convenient and useful chunks (components)
- The process that you use to create components depends on the source of the components:
  - Components that you author, you can create as you author them.
  - Components that you acquire, you can segment after, or as part of, the conversion process.
- The key question is how well marked each component is in its source.
17. Segmentation (Cont.)
- Components are generally marked in source files in the following ways:
  - By a file boundary: files for office location fact sheets
  - By a database record boundary: database full of employee data → Employee Profile components
  - By explicit markup: international restaurant menus from a good XML syndication source (<Menu>)
  - By implicit markup: printed product catalog → each new product is marked in the catalog by a hard page break
  - Not marked: Suppose that the print-product catalog that you receive marks each new product with the font Times New Roman, 12-point, Bold
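Two of the marking styles above lend themselves to a short sketch: explicit markup (one component per `<Menu>` element) and implicit markup (one component per hard page break, represented here by the form-feed character). The sample data, the wrapper tag `<Menus>`, the `<Name>` element, and the function names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

FEED = "<Menus><Menu><Name>Chez Nous</Name></Menu><Menu><Name>La Cantina</Name></Menu></Menus>"
CATALOG = "Widget 3000\nA sturdy widget.\fGadget X\nA clever gadget."

def segment_explicit(xml_text, tag="Menu"):
    """Explicit markup: every element with the given tag is one component."""
    root = ET.fromstring(xml_text)
    return [ET.tostring(el, encoding="unicode") for el in root.iter(tag)]

def segment_implicit(text, boundary="\f"):
    """Implicit markup: a hard page break (form feed) starts a new component."""
    return [chunk.strip() for chunk in text.split(boundary) if chunk.strip()]

menus = segment_explicit(FEED)
products = segment_implicit(CATALOG)
print(len(menus), len(products))
```

The explicit case needs no guesswork, while the implicit case works only as long as the page break is used for nothing else in the source.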
18. Collection Services
- Help get content into the repository:
  - Authoring components directly into the CMS repository
  - Loading previously created components into the repository one at a time or in bulk
- Authoring components
  - Web-based form
    - Allows an author to type blocks of text, enter metadata, and upload images and other media
    - Tight control and validation
  - Application such as Microsoft Word
    - Usually one file per component
    - Flexibility
    - Need to create a template that authors must follow (example file, DOT)
19. The Management System
20. An Overview of the Management System
- Responsible for the long-term storage of content components and a range of other resources
- Capable of telling you:
  - Details about your content, including what kinds of components you have and where in its life cycle each is.
  - How well utilized your staff is and what bottlenecks are coming up.
  - How you're using components in publications and which content is unused or ready for removal.
  - Who has access to what content and who's contributed the most
21. An Overview of the Management System (Cont.)
- A management system contains:
  - Repository: a place to store the content
  - Administration: an administration system for setting up and configuring the CMS
  - Workflow: defined sets of steps for doing the work necessary on the content to get it ready to be published
  - Connections: a set of connections (hardware and software) to other systems within the organization, ranging from networks and servers to data repositories
22. The Repository
- The repository is the set of databases, file directories, and other system structures (for example, custom settings for the CMS) that store the content of the system as well as any other data associated with the CMS
- Components and other CMS resources come into the repository via the collection services, and the publishing services extract them
- The repository can contain the following parts:
  - Content databases and files
  - Control and configuration files
23. The repository contains the CMS databases as well as other storage and retrieval mechanisms
24. Content Databases and Files
- Content databases and files hold the system's content components.
- Content databases can consist of standard relational databases, XML object databases, or a hybrid of the two:
  - RDB: one table per component class, one row per component instance, and one column per component element
  - XML: component classes, instances, and elements all have their own tags, which a DTD brings together into a complete system
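The relational layout described above (table = component class, row = component instance, column = component element) can be sketched with Python's built-in `sqlite3` module. The `employee_profile` class and its columns are hypothetical, echoing the Employee Profile example from the segmentation slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table per component class.
conn.execute("""
    CREATE TABLE employee_profile (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,   -- one column per component element
        job_title TEXT,
        bio       TEXT
    )
""")

# One row per component instance.
conn.executemany(
    "INSERT INTO employee_profile (name, job_title, bio) VALUES (?, ?, ?)",
    [
        ("Ana Ruiz", "Editor", "Joined in 2019."),
        ("Li Wei", "Designer", "Leads the brand team."),
    ],
)

rows = conn.execute(
    "SELECT name, job_title FROM employee_profile ORDER BY id"
).fetchall()
print(rows)
```

Because each element has its own column, a publishing template can pull just the name and job title without parsing the whole component.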
25. Content Databases and Files (Cont.)
- Content files hold content outside of any database:
  - You can store media as binary files and link them to database records.
  - You can store and distribute as files any content that the CMS is meant to use in its existing format, such as word-processing files, spreadsheets, and PDF files.
    - In this case, the CMS is functioning more as a document management system (DMS) than a CMS
  - Rather than using a database at all, the main content storage facility may consist of one or more XML files that the CMS services manage.
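A minimal sketch of the last option, an XML file acting as the content store, using the standard-library `xml.etree.ElementTree`. The document layout, the tag names, and `find_component` are assumptions for illustration; here the XML is an inline string, whereas a real system would read it from disk.

```python
import xml.etree.ElementTree as ET

# Hypothetical store: the component class is the tag, the instance is
# identified by its id attribute, and each child tag is one element.
STORE = """<components>
  <employee_profile id="1"><name>Ana Ruiz</name><job_title>Editor</job_title></employee_profile>
  <employee_profile id="2"><name>Li Wei</name><job_title>Designer</job_title></employee_profile>
</components>"""

def find_component(xml_text, component_class, component_id):
    """Return the elements of one component as a dict, or None if absent."""
    root = ET.fromstring(xml_text)
    for el in root.findall(component_class):
        if el.get("id") == component_id:
            return {child.tag: child.text for child in el}
    return None

profile = find_component(STORE, "employee_profile", "2")
print(profile)
```

Note that `ElementTree` does not validate against a DTD, so the structural guarantees the DTD provides would have to be checked by another tool.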
26. Control and Configuration Files
- Control and configuration files are the non-content files that you manage within the CMS repository:
  - Input and publishing templates
  - Staff and end-user data files and databases (profiles)
  - Rules files and databases: hold the definitions of component types, workflows, and personalization routines.
  - Meta information lists, content index files, and databases: augment the metadata that you store directly in the content files and databases
  - Log and other control files and structures (such as system catalogs and registries)
  - Scripts and automated maintenance routines: programs that the CMS uses to help manage content
27. The Administration System
- Responsible for setting the parameters and structure of the CMS
- The administration system affects all the parts of the CMS
- Collection system
  - Staff configuration: roles and access rights
  - Metatorial configuration: metadata fields and lists
  - System configuration: structure and workflows of the CMS
- Management system
  - Database administration tasks: user maintenance and permissions, backup, and archiving
  - Content-specific tasks: creating content types, performing metadata reviews, and creating workflows.
28. The Administration System (Cont.)
- The administration system affects all the parts of the CMS (Cont.)
- Publishing system
  - Ensure that all the hardware and software for displaying content is working according to plan.
  - Example for Web publications: ensure that the Web server, application server, content management application objects, databases, and other associated programs are always running and never overtaxed
29. The administration system affects all three parts of a CMS
30. The Workflow System
- Responsible for coordinating, scheduling, and enforcing schedules and staff tasks.
- The workflow system affects all three parts of the CMS
- Collection
  - Workflows for content collection, creation, and aggregation tasks
  - In most cases, the workflow follows a particular kind of content from creation until it's ready for publication
- Management
  - Workflows for standard administrative tasks such as backup and archiving
  - Workflows for reviewing, changing, and verifying the usefulness of content.
  - Workflows for scheduling data-mining and synthesis tasks
  - Workflows for managing the connection between the management system and other non-CMS systems that provide data to the CMS.
31. The Workflow System (Cont.)
- The workflow system affects all three parts of the CMS (Cont.)
- Publishing
  - Publication cycles and their associated workflows ensure that, each time you create a publication, it's as good as is humanly possible.
  - Example: your Web site operates on a daily publishing cycle, during which you update news and special announcements.
    - You may create a workflow that includes steps such as reviewing all pending content, performing test builds of all affected pages, testing personalization rules against new content, and changing status to publish.
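The daily publishing workflow in the example can be sketched as an ordered list of steps that halts at the first failure. The step identifiers and the `run_workflow` helper are hypothetical, and the handlers are stubs standing in for real review and test tasks.

```python
DAILY_PUBLISH_STEPS = [
    "review_pending_content",
    "test_build_affected_pages",
    "test_personalization_rules",
    "set_status_publish",
]

def run_workflow(steps, handlers):
    """Run steps in order; stop at the first step whose handler returns False.

    Returns (completed_steps, blocked_step), where blocked_step is None
    when every step succeeded.
    """
    completed = []
    for step in steps:
        if not handlers[step]():
            return completed, step
        completed.append(step)
    return completed, None

# Stub handlers: every step passes except the personalization test.
handlers = {step: (lambda: True) for step in DAILY_PUBLISH_STEPS}
handlers["test_personalization_rules"] = lambda: False

done, blocked_at = run_workflow(DAILY_PUBLISH_STEPS, handlers)
print(done, blocked_at)
```

Since the status change is the final step, content is never marked for publication unless every earlier check has passed.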
32. Connections
- Connect the management system to various infrastructure and data systems:
  - The organization's LAN and WAN environments
  - The organization's user-management system
  - Company metadata systems
  - Enterprise data systems
33. The Publishing System
34. An Overview of the Publishing System
- Responsible for pulling content components and other resources out of the repository and automatically creating publications out of them
  - Publishing templates: programs that build publications automatically
  - Publishing services: a set of tools for controlling what is published and how it is published
  - Connections: tools and methods used to include data from other (non-CMS) systems in finished publications
  - Web publications: the most common output for most CMSs
  - Other publications: non-Web publications, including electronic, print, and syndications
35. Publication System
36. Publishing Templates
- Publication templates are files that guide the creation of a publication from the content stored in the repository.
- CMS templates are programs that specify publication-building logic
- Templates include the following components:
  - Static elements: text, media, and scripts that pass directly through to the publication without further processing.
  - Calls to publication services: retrieve and format components and metadata from the repository and perform other necessary functions such as running personalization rules, converting content, and building navigation.
  - Calls to services outside the CMS: integrate publications into a wider organizational infrastructure by calling in enterprise data and functionality and other Web services.
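The mix of static elements and service calls can be sketched with a tiny made-up template syntax. The `[[service arg ...]]` notation, the `get_element` service, and the in-memory repository are all assumptions for this example; no real CMS template language is implied.

```python
import re

REPOSITORY = {
    "news-1": {"title": "Office opens", "body": "Our new office opens Monday."},
}

def get_element(component_id, element):
    """Hypothetical publication service: fetch one element of a component."""
    return REPOSITORY[component_id][element]

SERVICES = {"get_element": get_element}

# Text outside [[...]] is static and passes through untouched.
TEMPLATE = (
    "<h1>[[get_element news-1 title]]</h1>\n"
    "<p>[[get_element news-1 body]]</p>\n"
    "<footer>ACME</footer>"
)

def render(template, services=SERVICES):
    """Replace each [[service arg ...]] call with that service's result."""
    def call(match):
        name, *args = match.group(1).split()
        return services[name](*args)
    return re.sub(r"\[\[(.+?)\]\]", call, template)

page = render(TEMPLATE)
print(page)
```

A call to a service outside the CMS would fit the same pattern: register another function in `SERVICES` that fetches enterprise data. This sketch does not escape HTML in the inserted content, which a real template engine would need to do.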
37. The publication template is a program that builds
publications from the content in the repository
38. Publishing Services
- Publishing services are the application logic and business services that a CMS provides to aid in the creation of publications from the content and metadata in the repository.
  - Load and execute templates: These services process the personalization, conversion, content extraction, and navigation-building calls that the templates make to create a publication.
  - Provide publication-specific services: These services include output to PDF for print publications or incremental updates to a Web site.
  - Provide a bridge to non-CMS services: These are services that you need to call and that provide data that you need to include in publications.
39. Publishing Services (Cont.)
- How are publishing services triggered?
  - Dynamic publications (e.g. a live Web site): a request from a browser invokes the publishing services, which produce a single page
  - Static publications (e.g. static Web sites and other publications): a staff member or a prescheduled automation routine triggers the publication services, which then produce a complete publication.
- You may create part of a publication by using services that aren't part of the CMS by having the publishing services call them as independent software objects.
  - Non-CMS services generally provide e-commerce transactions and access to enterprise data and other resources not under the control of the CMS.
40. Connections
- Maintains connections to other (non-CMS) enterprise data systems
  - Examples: ERP application data, user data
- Data from these systems can be read live from the source at the time that you create a publication and laid out on the publication page at that time as well.
- You can also load enterprise data periodically from the source to the CMS repository ("acquired" content)
41. Web Publication
- Web publications are Internet, intranet, and extranet sites that a CMS produces.
- Dynamic Web publication: the CMS produces these sites one page at a time, in response to user clicks.
- The user's click passes a page request to the Web server that triggers the CMS publishing services to do the following:
  - Load a template
  - Pass it any parameters that came along with the user's request
  - Execute the code in the template to produce a finished page
  - Pass the finished page back to the Web server for display in the user's browser
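The four steps of a dynamic page request can be traced in a minimal sketch. It uses `string.Template` from the standard library as a stand-in for a CMS template; `handle_request`, the template store, and the sample content are hypothetical, and the Web server itself is left out.

```python
from string import Template

TEMPLATES = {"article": Template("<h1>$title</h1><p>Hello, $user.</p>")}
CONTENT = {"42": {"title": "Welcome"}}

def handle_request(template_name, params):
    """Produce one finished page for one request."""
    template = TEMPLATES[template_name]              # 1. load a template
    context = dict(CONTENT[params["id"]], **params)  # 2. pass it the request parameters
    page = template.substitute(context)              # 3. execute it to build the page
    return page                                      # 4. hand the page back to the Web server

html = handle_request("article", {"id": "42", "user": "Ana"})
print(html)
```

Each call produces exactly one page, which is what separates dynamic publication from a full static build.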
42. Web Publication (Cont.)
- Static Web publication: the CMS produces all the pages of these sites at once and serves them as HTML files.
  - The CMS administrator triggers a build of the static site using some user interface in the CMS.
  - The CMS then calls the appropriate publishing services and templates to produce all the pages of the site.
- Especially in dynamically built sites, the CMS publishing services are embedded within a Web server and an application server.
  - The Web server is software that takes care of the basic function of receiving requests from the Web user and sending her the results.
  - The application server is software that provides caching, database connection pooling, and other services that help the CMS scale and increase its performance.
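The static build described on this slide, where every page is produced in one pass and written out as an HTML file, can be sketched as follows. The component data, the single page template, and `build_site` are assumptions; the example writes to a temporary directory so that it leaves nothing behind.

```python
import tempfile
from pathlib import Path
from string import Template

PAGE = Template("<h1>$title</h1><p>$body</p>")
COMPONENTS = {
    "index": {"title": "Home", "body": "Welcome."},
    "about": {"title": "About", "body": "Who we are."},
}

def build_site(out_dir):
    """Render every component to <slug>.html and return the file names."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for slug, component in COMPONENTS.items():
        path = out / f"{slug}.html"
        path.write_text(PAGE.substitute(component), encoding="utf-8")
        written.append(path.name)
    return sorted(written)

with tempfile.TemporaryDirectory() as tmp:
    built = build_site(tmp)
    index_html = (Path(tmp) / "index.html").read_text(encoding="utf-8")

print(built)
```

Once the files exist, an ordinary Web server can serve them with no CMS involved at request time.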
43. The Web-publishing system can produce a fully
dynamic site
44. Other Publications
- You can potentially use the same templating engine that produces Web publications to create other publication formats
- Publications that aren't destined for the Web include the following:
  - Print publications
    - FrameMaker (MIF) files (which go to a printer for publication), QuarkXPress files, and PDF files
  - Electronic publications: static Web sites that you distribute on CD-ROM, or any other type of CD-ROM- or network-based multimedia system.
    - Microsoft Help files, e-mail files
  - Syndications: sets of content components that you publish for distribution and reuse in publications outside your CMS.
    - The most useful format for syndication is XML, but the most common format is ASCII, with a header that contains metadata for each syndicated component
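The "ASCII with a metadata header" style of syndication can be pictured with a small round-trip sketch. The exact layout here (`key: value` header lines, a blank line, then the body) is an assumption made for the example, not a format defined by the source.

```python
def to_syndicated_text(component):
    """Serialize a component as header lines, a blank line, then the body."""
    header = "\n".join(
        f"{key}: {value}" for key, value in sorted(component["metadata"].items())
    )
    return f"{header}\n\n{component['body']}\n"

def from_syndicated_text(text):
    """Parse the format produced by to_syndicated_text back into a component."""
    header, _, body = text.partition("\n\n")
    metadata = dict(line.split(": ", 1) for line in header.splitlines())
    return {"metadata": metadata, "body": body.rstrip("\n")}

item = {
    "metadata": {"title": "Office opens", "author": "Ana Ruiz"},
    "body": "Our new office opens Monday.",
}
wire = to_syndicated_text(item)
print(wire)
```

The receiving side has to know this layout in advance, which is one reason the slide calls XML the more useful format: the structure travels with the content.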