Title: Classification and Taxonomy
1. Classification and Taxonomy
Greg Argo
2. Brief origins of the organization of information
- Large amounts of information became difficult to store and retrieve.
- Although the classes used vary widely across cultures, grouping at
  the class level is nearly universal.
- Organizational structures provide the context in which humans
  transform information into knowledge.
- It's not just handy, it's essential.
3.
- Humans classify with a pronouncedly mental scalpel that helps us
  carve discrete mental slices out of reality because reality is not
  made up of insular chunks unambiguously separated from one another
  by sharp divides, but, rather, of vague, blurred-edge essences that
  often spill over into one another.
- Eviatar Zerubavel (1991), from The fine line: Making distinctions in
  everyday life
4.
- Cognitive scientists have noticed that much of our mental commerce
  with an environment deals with classes of things rather than with
  unique events and objects.
- Mark Stefik (1995), from Introduction to knowledge systems
- For example, the people seen below could probably all be placed in
  both the class "Cognitive Scientists" and the class "Nerds". Can you
  think of other possible classes? Possible relationships? Clinical
  vs. academic cognitive scientists? Beards and nerds?
5. Why consider classification and taxonomy together?
- Both are methods for grouping objects or ideas that share useful,
  although sometimes superficial, similarities
- Both group to make retrieval easier
- Both are very basic and pervasive elements of information
  architecture
- It is often difficult to tell them apart
- It is often unnecessary to tell them apart
6. Why tell them apart then?
- To become knowledgeable about the different limitations and
  possibilities in their interaction
- Differential demand on and payoff for users
- It is important to understand the specific qualities by which each
  can achieve organizational objectives
7. Specific qualities presented as keywords and key dichotomies
- Organization
- Retrieval
- Controlled vocabulary/thesauri
- Ambiguous vs. Exact
- Searching vs. Browsing
- Content-based vs. User-based
- Descriptive vs. Navigational
- Precision vs. Recall
- Structures vs. Applications
- Concise vs. Broad
8. Classifications, Taxonomies, and Ontologies - Classifications
- Relationships expressed are not essential, but are based on
  arbitrary, external attributes (color, genre, format, geography,
  subject, alphabetical order)
- Created broadly from the top down, based on conceptual frameworks
- Created by subject experts
- Usually don't change significantly after their creation
- Generally applicable to specific domains
9. Classifications, Taxonomies, and Ontologies - Taxonomies
- Relationships expressed are usually essential, based on internal
  properties of the related pieces of information
- Created concisely from the bottom up, from actual content
- Created by multidisciplinary teams
- Are process-oriented, and so are updated frequently
- Can often be used and reused in different situations and
  environments
- Relationships commonly represented hierarchically
- Can include many classifications connected together
10. Example of internal properties of a taxonomic relationship
- All zippers are clothes fasteners
- Not all clothes fasteners are zippers
- Because of the essential nature of their relationship, zippers is a
  sub-class of clothes fasteners, and clothes fasteners is a
  superordinate class of zippers
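The sub-class relationship above can be sketched in a few lines of Python. This is only an illustration: the `PARENT` table and the `is_subclass` helper are hypothetical names, and the "buttons" and "fasteners" entries are assumptions added to give the hierarchy more than one level.

```python
# Hypothetical parent links: each class points to its superordinate class.
PARENT = {
    "zippers": "clothes fasteners",
    "buttons": "clothes fasteners",    # assumed sibling, not from the slides
    "clothes fasteners": "fasteners",  # assumed higher level, not from the slides
}


def is_subclass(child: str, ancestor: str) -> bool:
    """Return True if `child` sits anywhere below `ancestor` in the hierarchy."""
    current = PARENT.get(child)
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)
    return False


print(is_subclass("zippers", "clothes fasteners"))  # True
print(is_subclass("clothes fasteners", "zippers"))  # False
```

The check only walks upward, which mirrors the asymmetry on the slide: every zipper is a clothes fastener, but not the reverse.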
11. Taxonomic Hierarchy
12. Classifications, Taxonomies, and Ontologies - Ontologies
- Like taxonomies, relationships expressed are also essential
- Scope is more overarching due to inclusion of supplemental
  information
- Descriptions and definitions of concepts and their corresponding
  relationships
- Can include many sub-class taxonomies connected together
13. Classifications, Taxonomies, and Ontologies
- Classifications guide users to a body of information
- Taxonomies guide users through a body of information
- Ontologies guide users in becoming proficient in the retrieval and
  understanding of a particular body of information
14. Classification
- To classify something is to identify it as a member of a known class
- On the Web, information architects organize classification schemes
  into either exact or ambiguous schemes
- Classification problems begin with data and identify predetermined
  classes as solutions
15. Exact classification schemes
- Items are categorized mutually exclusively
- Useful to users who know exactly what they are looking for
- By definition, are easier to create and maintain than ambiguous
  schemes
- Alphabetical, chronological, geographical
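As a rough sketch of what "mutually exclusive" means in practice, the Python below places each item in exactly one alphabetical group and one chronological position. The item list and the function names `alphabetical` and `chronological` are made up for illustration and do not come from the slides.

```python
from collections import defaultdict

# Hypothetical items as (title, year) pairs; not taken from the slides.
items = [("Zippers", 1999), ("Buttons", 2003), ("Buckles", 2001)]


def alphabetical(items):
    """Group titles under their first letter; each title lands in one group."""
    groups = defaultdict(list)
    for title, _year in items:
        groups[title[0].upper()].append(title)
    return {letter: sorted(titles) for letter, titles in sorted(groups.items())}


def chronological(items):
    """Order titles from oldest to newest."""
    return [title for title, _year in sorted(items, key=lambda item: item[1])]


print(alphabetical(items))   # {'B': ['Buckles', 'Buttons'], 'Z': ['Zippers']}
print(chronological(items))  # ['Zippers', 'Buckles', 'Buttons']
```

Because the grouping key is a fixed property of the item (first letter, year), no judgment call is needed to place it, which is one reason exact schemes are easier to maintain.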
16. Alphabetical schemes
- Directories and lists
- User must have a good idea of what they are searching for and be
  able to spell it
- On the Web, usually utilized deeper in the scheme, inside of
  sub-sites
17. Chronological schemes
- Have an intuitive advantage for users because they are organized in
  the same linear scheme in which humans experience the dimension of
  time
- Yearbooks, historical sites, and news headline sites
- eBay offers results organized by a few different types of
  chronologies
18. Geographical schemes
- Appeal intuitively to users' rich spatial faculties and needs in
  their experience of reality
- Geographical divisions coincide with governing bodies, which
  restrict and encourage behaviors through law and language
- Require knowledge of geographical divisions and map reading on the
  part of the user
19. Ambiguous classification schemes
- Items are categorized into intellectually meaningful groups
- Useful to users who don't know quite what information they are
  searching for
- Facilitate iterative, serendipitous learning
- Audience-based, subject-based, task-based
- Each should be based on scheme-specific research and development
  processes (e.g. user and task analyses)
20. Audience-based classification schemes
- Make sense if the informational domain caters to clearly delineated
  audiences
- Homepage becomes a filter that leads to sub-sites organized by some
  other scheme
- Suggest customization/personalization
- Recommendations are sometimes powerful, sometimes failures
21. IA research for audience-based classification schemes
- Map services and applications to their appropriate group
- Discern what types of technology use are associated with specific
  populations
- Find points of overlap between audience categories
- User research sessions, usage statistics, search log analysis, focus
  groups, critical incident reports
22. Subject-based classification schemes
- Most immediately recognized are the library classification schemes
  (DDC, LC)
- When used in IA, they generally work best when hybridized with other
  types of schemes
- Are challenging to implement because different words, symbols, and
  idioms mean different things to different people
- Breadth of subjects included should be decided early on because
  these parameters will affect much of the rest of the IA and content
  work for the Web site
23. IA research for subject-based classification schemes
- Ask the development team to write down each content item that will
  be part of the site
- IAs perform a card sorting exercise to establish initial subject
  categories
- Take it to the user
- Further card sorting
- Survey with questions about navigation
- Continually refine
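One common way to summarize card sorting results is to count how often participants put the same two cards in one pile. The sketch below assumes that approach; the slides do not prescribe an analysis method, and the `sorts` data and the `co_occurrence` function are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical card sorts: one list of piles per participant.
sorts = [
    [{"Hours", "Directions"}, {"Catalog", "Databases"}],
    [{"Hours", "Directions", "Catalog"}, {"Databases"}],
    [{"Hours", "Directions"}, {"Catalog", "Databases"}],
]


def co_occurrence(sorts):
    """Count how many participants placed each pair of cards in the same pile."""
    counts = Counter()
    for piles in sorts:
        for pile in piles:
            # Sort the pile so each pair is always stored in the same order.
            for pair in combinations(sorted(pile), 2):
                counts[pair] += 1
    return counts


pairs = co_occurrence(sorts)
print(pairs[("Directions", "Hours")])    # 3
print(pairs[("Catalog", "Databases")])   # 2
```

Pairs with high counts are candidates for the same initial subject category; pairs with low counts mark the places worth taking back to users in the next round.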
24. Task-based classification schemes
- Useful for action- and transaction-related Web sites
- Rarely drive a Web site on their own, but are typically embedded
  deeper as part of a hybrid scheme
- Desire of businesses to remove labor costs will likely increase
  their ubiquity
25. IA research for task-based classification schemes
- The field of usability arose from the need to research the success
  and value of tools and their applications
- Traditional usability tests are a good fit
- Analyses of video-taped sessions, navigation logs, heuristic
  reviews, surveys, critical incident reports
26. Taxonomies
- Information architects have two major types to utilize: descriptive
  and navigational
- They contrast well and each excels for different organizational and
  user needs
- Central ideas include creating hierarchies, controlled vocabularies,
  and variant/preferred term and synonym relationships
- Build on classifications by supporting applications and many
  different types of content, including images, email, search engines,
  process funnels, and site registration
27. Descriptive taxonomies
- Operate outside of a user's immediate awareness
- Supplement information retrieval during keyword searching
- IAs create controlled vocabularies and synonym rings, which they use
  to maintain consistency across applications and departments
- By analyzing emerging content and search logs, IAs maintain currency
  and map alternative terminology used by searchers back to the
  preferred form
28. Controlled vocabularies in descriptive taxonomies
- Done by attaching tags to content with metadata derived from
  controlled vocabulary usage logs
- The resulting thesaurus with related and variant terms makes a
  descriptive taxonomy more robust
29. Using the controlled vocabulary to increase recall or precision
- A user's search can be expanded to increase recall by mapping the
  search term to its variants
- Or a user's search can be narrowed to increase precision by mapping
  the user's term to the preferred term in the controlled vocabulary
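The two mappings above can be sketched as follows. This is a minimal illustration, not a real search engine: the `VOCABULARY` synonym ring, its terms, and the two function names are all assumptions made for the example.

```python
# Hypothetical synonym ring: preferred term -> variant terms.
VOCABULARY = {
    "zipper": ["zip", "zip fastener", "slide fastener"],
}

# Reverse lookup from any variant, or the preferred term itself,
# back to the preferred term.
PREFERRED = {
    variant: term for term, variants in VOCABULARY.items() for variant in variants
}
PREFERRED.update({term: term for term in VOCABULARY})


def expand_for_recall(query: str) -> list[str]:
    """Return the preferred term plus all variants, so more documents match."""
    term = PREFERRED.get(query.lower())
    if term is None:
        return [query]  # Unknown term: search it unchanged.
    return [term] + VOCABULARY[term]


def narrow_for_precision(query: str) -> str:
    """Return only the preferred term, so only consistently tagged content matches."""
    return PREFERRED.get(query.lower(), query)


print(expand_for_recall("zip"))     # ['zipper', 'zip', 'zip fastener', 'slide fastener']
print(narrow_for_precision("Zip"))  # zipper
```

Expansion trades precision for recall by searching every variant; narrowing does the opposite by relying on content having been tagged with the preferred term.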
30. More about descriptive taxonomies
- Created from the bottom up
- Are called descriptive because they are derived directly from the
  content that is being used
- Data management vocabularies allow workers in disparate domains to
  report information using the same terminology
- Makes it easier for management to mine information from this data in
  the future
31. Navigational taxonomies
- Have a lot of overlap with exact and ambiguous classification
  schemes
- In contrast to descriptive taxonomies, navigational taxonomies
  command the user's conscious awareness
- Allow users to guide the seeking process themselves by browsing
  instead of searching
32. Navigational taxonomies cont'd
- Created from the top down, based on mental models of users
- Hierarchical structures visually imply sequences of events and
  relationships
- These relationships provide context similar to words in a sentence
- Work best when users are unsure of what they are seeking
33. Breadth vs. Depth
- Breadth is how many categories are contained in each level
- Depth refers to how many levels are contained in the hierarchy
- Too broad and shallow gives the user too many choices and not enough
  content
- Too narrow and deep forces the user to click more than they will
  stand for
- It is best to err on the side of broad and shallow to allow for
  add-ons and to avoid restructuring the home page
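Breadth and depth can both be measured directly from a hierarchy. The sketch below assumes the navigation is stored as nested dictionaries; the `site` structure and the helper names `depth` and `breadth_per_level` are hypothetical.

```python
# Hypothetical site hierarchy as nested dicts; leaves are empty dicts.
site = {
    "Products": {"Zippers": {}, "Buttons": {}},
    "Support": {"Contact": {}, "Returns": {}},
    "About": {},
}


def depth(tree: dict) -> int:
    """Number of levels below this node (0 for a leaf)."""
    if not tree:
        return 0
    return 1 + max(depth(child) for child in tree.values())


def breadth_per_level(tree: dict) -> list[int]:
    """Number of categories at each level, top level first."""
    widths = []
    level = [tree]
    while True:
        # Collect every child of every node on the current level.
        children = [child for node in level for child in node.values()]
        if not children:
            break
        widths.append(len(children))
        level = children
    return widths


print(depth(site))              # 2
print(breadth_per_level(site))  # [3, 4]
```

Here the user needs at most two clicks to reach any category, and chooses among three options on the home page. Tracking these numbers as categories are added shows when a hierarchy is drifting toward either extreme.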
34. Summary
- Distinction is more pronounced in theory than in practice because
  both are essentially controlled vocabularies structured by logical
  relationships
- Generally, as one moves from classifications to taxonomies to
  ontologies, the structures, relationships, and supplemental
  descriptions become more complex
35. Summary cont'd
- Since humans seem to perform all three of these innately, it matters
  less what they are called than how their elements can be tailored to
  specific scenarios to improve retrieval of information, consistency
  of communication, and creation of knowledge
36. References
- Adams, K. (2000). Immersed in structure: The meaning and function of
  taxonomies. Internetworking, 3.2. Retrieved October 25, 2004 from
  http://www.internettg.org/newsletter/aug00/article_structure.html
- Brown, J., & Duguid, P. (2002). The social life of information.
  Boston: Harvard Business School Press.
- Conway, S., & Sligar, C. (2002). Unlocking knowledge assets.
  Redmond, Washington: Microsoft Press.
- Edols, L. (2001). Taxonomies are what? FreePint, 97, 9-11. Retrieved
  October 25, 2004 from the FreePint Web site:
  http://www.freepint.com/issues/041001.pdf
- Goodall, G. (2003). Business taxonomies and bibliographic objective.
  Facetation. Retrieved October 25, 2004 from
  http://www.deregulo.com/facetation/pdfs/businessTaxomies_goodall.pdf
- Nielsen, J. (2001). Designing web usability. Indianapolis, IN: New
  Riders Publishing.
- Rosenfeld, L., & Morville, P. (2002). Information architecture for
  the World Wide Web. Cambridge; Sebastopol, CA: O'Reilly.
37. References cont'd
- Shank, P. (2004). Get organized or get lost. OnlineLearningMag.
  Retrieved October 25, 2004 from
  http://www.onlinelearningmag.com/onlinelearning/magazine/article_display.jsp?vnu_content_id=1108349
- Stefik, M. (1995). Introduction to knowledge systems. San Francisco:
  Morgan Kaufmann.
- Svenonius, E. (2001). The intellectual foundation of information
  organization. Cambridge, MA: The MIT Press.
- Taylor, A. G. (1999). The organization of information. Englewood,
  CO: Libraries Unlimited.
- van Duyne, D. K., Landay, J. A., & Hong, J. I. (2003). The design of
  sites. Cambridge: Addison-Wesley.
- van Rees, R. (2003). Clarity in the usage of the terms ontology,
  taxonomy and classification. CIB73 2003 Conference Paper. Retrieved
  October 25, 2004 from http://vanrees.org/research/papers/cib2003.pdf
- Zerubavel, E. (1991). The fine line: Making distinctions in everyday
  life. New York: Free Press.