The growing pains of a controlled vocabulary

1
The growing pains of a controlled vocabulary
  • Karen Loasby
  • Information Architect
  • bbc.co.uk

2
Introduction
  • Karen Loasby
  • Information architect
  • Worked for the BBC for 4 years on search, navigation,
    metadata and content management projects
  • 2 years previously for the Guardian newspaper
    archiving the paper and arranging content on the
    website
  • MSc in Information Science from City University,
    London

3
Agenda
  • Background
  • The problem
  • Formal classification vs. Folk tags
  • Our middle ground
  • What happened
  • Learning points
  • Questions

4
Background
  • Content management project
  • Regional websites
  • Need for metadata
  • Authors around the UK

5
(No Transcript)
6
Problem
  • Faceted classification system
  • Authors to tag
  • Central control
  • But
  • Journalists are the specialists: they know the domain
    and the vocabulary.
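
As an illustration of what tagging against a faceted classification involves, here is a minimal sketch in Python. The facet names, the `ContentObject` class and the `find` helper are all assumptions made for this example (the facets are loosely based on the names, subject and location facets mentioned later in the talk); the talk does not describe the actual BBC data model.

```python
from dataclasses import dataclass, field

# Hypothetical facet names; the real BBC scheme may differ.
FACETS = ("subject", "location", "person", "organisation")


@dataclass
class ContentObject:
    """A piece of content tagged with terms grouped by facet."""
    title: str
    tags: dict[str, set[str]] = field(default_factory=dict)

    def tag(self, facet: str, term: str) -> None:
        """Attach a term under one facet, rejecting unknown facets."""
        if facet not in FACETS:
            raise ValueError(f"unknown facet: {facet}")
        self.tags.setdefault(facet, set()).add(term)


def find(items, **wanted):
    """Return the items that carry every requested facet=term pair."""
    return [
        item
        for item in items
        if all(term in item.tags.get(facet, set()) for facet, term in wanted.items())
    ]


# Illustrative data only.
story = ContentObject("Maternity unit opens")
story.tag("subject", "Pregnancy")
story.tag("location", "Leeds")

other = ContentObject("Festival preview")
other.tag("location", "Leeds")

matches = find([story, other], subject="Pregnancy", location="Leeds")
print([m.title for m in matches])
```

Because each facet is queried independently, the same content object can be found by subject, by location, or by both at once.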

7
Formal classification
  • Pre-determined terms
  • Centralised control
  • Rich relationships

8
Folk tags
  • What is it, then?
  • Folksonomy, ethnoclassification, social
    classification, social categorisation and so on

9
Comparing approaches
  • Formal
  • High maintenance
  • Consistent/predictable
  • Rich relationships
  • Can be artificial
  • Folk
  • Low maintenance
  • Quirky/surprising
  • Less added value
  • Real user language

10
A role for both
  • Where we are using folk tagging
  • And where we won't
  • Trust and authority
  • High value to business
  • Missing motivation from users
  • Broad domain/user base
  • To avoid tyranny of the minority

11
An experimental middle ground
  • Centralised control of terms
  • But encouraging absorption of user language
  • Higher maintenance than folk tags
  • Cheaper than professional cataloguing

12
BBC Experience
  • Semi-automatic classification
  • Terms suggested from the CVs
  • Terms are OK, or the suggested terms do not describe the content
  • Search or browse for terms
  • Terms are OK, or send suggestion to the CV team
  • CV team evaluate suggestion
  • Add to CV as a variant term or preferred term
  • Or say no to the term and change the classification
    on the content object
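
The review step in this workflow can be sketched roughly as follows. This is a minimal illustration, not the BBC system: the `ControlledVocabulary` class, the `review` function, the decision labels and the variant term "Expecting a baby" are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ControlledVocabulary:
    """Maps every known label (preferred or variant) to its preferred term."""
    preferred_for: dict[str, str] = field(default_factory=dict)

    def add_preferred(self, term: str) -> None:
        self.preferred_for[term.lower()] = term

    def add_variant(self, variant: str, preferred: str) -> None:
        # A variant must point at a term that is already in the vocabulary.
        target = self.preferred_for.get(preferred.lower())
        if target is None:
            raise KeyError(f"no such preferred term: {preferred}")
        self.preferred_for[variant.lower()] = target

    def lookup(self, label: str):
        """Return the preferred term for a label, or None if unknown."""
        return self.preferred_for.get(label.lower())


def review(cv, suggestion, decision, preferred=None):
    """Apply the CV team's decision on one suggested term.

    decision is one of "preferred", "variant" or "reject" (assumed labels).
    Returns the preferred term the suggestion now resolves to, or None.
    """
    if decision == "preferred":
        cv.add_preferred(suggestion)
    elif decision == "variant":
        cv.add_variant(suggestion, preferred)
    elif decision == "reject":
        pass  # vocabulary unchanged; the content object is reclassified instead
    else:
        raise ValueError(f"unknown decision: {decision}")
    return cv.lookup(suggestion)


cv = ControlledVocabulary()
cv.add_preferred("Pregnancy")
resolved = review(cv, "Expecting a baby", "variant", preferred="Pregnancy")
rejected = review(cv, "Stuff", "reject")
print(resolved, rejected)
```

Storing variants as pointers to a preferred term is one way to absorb user language while keeping a single centrally controlled label for each concept.
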
13
Operational system
  • 8000 requests in 10 months
  • From 160 journalists
  • Average of 50 terms per user
  • However, this varied wildly: our top user has
    suggested 476 terms
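
The figures on this slide can be checked with a few lines of Python. The variable names are chosen for this example; the numbers are taken from the slide, and the monthly rate is simply derived from them.

```python
total_requests = 8000   # term requests in 10 months (from the slide)
journalists = 160       # journalists making requests (from the slide)
months = 10
top_user = 476          # suggestions from the most active user (from the slide)

average = total_requests / journalists
per_month = total_requests / months

print(average)             # 50.0 terms per journalist
print(per_month)           # 800.0 requests per month for the CV team
print(top_user / average)  # 9.52, so the top user is far above the average
```

The mean hides the spread: one journalist alone suggested more than nine times the average.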

14
Graph showing variation between teams
15
Growth in the CVs
  • Up 15,000 terms in 10 months
  • Most growth in person/proper names
  • People, venues and organisations
  • Up by 50% to 35,000

16
Growth of facets
17
Types of terms
  • Mostly good
  • Only 200 terms actually rejected
  • Synonyms vs. entirely new terms
  • New for names (only 2 synonyms)
  • Synonyms for subject (15 synonyms)
  • Location needed colloquial terms

18
Resourcing
  • Handling the requests from journalists
  • First 3 months one IA
  • Subsequently 2 to 3 junior IAs
  • Too much: how to reduce?

19
Lessons learnt
  • Success with the journalists
  • They suggested terms!
  • Got the faceted classification
  • Began to suggest terms in our format
  • Some did engage at a detailed level

20
Lessons learnt
  • Difficulties for journalists
  • System looks as if it is totally automatic, as part of a
    content management system
  • Journalists are people too
  • Users struggle with a tagging system based on
    content objects rather than pages

21
Example
Subject: Pregnancy
22
Lessons learnt
  • Difficulties for journalists, cont.
  • They find it boring
  • Makes it harder to achieve the aim of finding and
    re-using content
  • Needed to do more pre-emptive work for them

23
Lessons learnt
  • Number of terms suggested depends on
  • Type of facet
  • Dynamism of content
  • Scope of the content
  • Enthusiasm of users

24
Next?
  • High value facets still need control
  • Make use of the metadata(!)
  • Sell the message
  • Federated management
  • Earlier in production
  • And for folk tagging?

25
  • Thanks to the IA team for their analysis work
  • Jon Carey
  • Adil Hussein
  • Christine Rimmer

26
Thank you
Questions or comments? Karen Loasby, karen.loasby_at_bbc.co.uk