Title: Weblogs for Research(ers)
1. Weblogs for Research(ers)
- Anjo Anjewierden
- Human Computer Studies laboratory
- Faculty of Science
- University of Amsterdam
- http://anjo.blogs.com
- Many thanks to Lilia Efimova, Rogier Brussee, Robert de Hoog, Stephanie Hendrick
- and the blogosphere in general
2. What is a weblog (1)?
- Most common descriptive definition: a weblog is
- a personal journal,
- updated regularly,
- published on the internet, and
- with posts (entries) appearing in reverse chronological order
3. What is a weblog (2)?
- Weblogs are social, as they encourage others to participate using two mechanisms:
- Posts have an explicit point of reference called a permalink
- Permalinks make it possible for people to link to each other's posts: share and discuss
- Readers, possibly without a weblog, are invited to join, as all posts have a comment link
4. Anatomy of Weblogs
5. Weblog Research is about
- Humans who share findings, thoughts, ideas and sometimes feelings in their weblogs
- Computers which make it possible to create weblogs, read weblogs, and to comment and to link
- Studies which analyse why and how people blog, about what and to whom
- Laboratory: weblog researchers need a stable environment in which to conduct their research
6. Do we want to research weblogs?
- Blog (short for weblog, we-blog) was Merriam-Webster's word of the year 2004. To blog, blogger, blogging, blogosphere, etc.
- Communications of the ACM (CACM) carried a special issue on weblogs (December 2004)
- Unfiltered and public: for the first time we get access to a large body of material on a particular person, written by that same person
- Research relevance: social studies, knowledge management (for professional weblogs), education, linguistics; even the term Semantic Blogging (combining Semantic Web and blogging) has been coined
- Compare: Digital Cities research by Beckers / Van den Bersselaar (at SWI)
7. BlogTrace: the Laboratory (1)
- Weblogs are represented as HTML pages
- Complex layout, difficult to find the posts
- Manual research is extremely labour intensive
- There is a serious lack of tools that support
weblog research
8. BlogTrace: the Laboratory (2)
- The BlogTrace spider makes data collection and research a lot easier
- Automatically extracts posts from the HTML
- Generates the link structure of the weblog and represents it as RDF/OWL
- Generates an RSS feed that contains all posts for a weblog
- Implemented using induction algorithms, which learn what is a post and what is layout
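The link-structure step can be illustrated with a minimal sketch. This is not BlogTrace's implementation (the slides do not describe it, and the induction algorithms that separate posts from layout are not reproduced here); it only shows how links and their anchor text could be collected from one HTML page using Python's standard library. The class name `LinkCollector`, the function `extract_links`, and the example URL are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects (target URL, anchor text) pairs from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []      # finished (target, anchor_text) pairs
        self._href = None    # href of the <a> element currently open, if any
        self._text = []      # text fragments seen inside that element

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the anchor text.
            anchor = " ".join("".join(self._text).split())
            self.links.append((self._href, anchor))
            self._href = None


def extract_links(html, base_url):
    """Return all (target URL, anchor text) pairs found in the HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page fragment and URL, for illustration only.
page = '<p>See this <a href="/2005/01/post.html">interesting  site</a>.</p>'
links = extract_links(page, "http://example.org/blog/")
```

Each pair, together with the URL of the page it was found on, gives the source document, target document and anchor text of one link.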
9. Ontologies used in BlogTrace
- DC: Dublin Core (names, dates, descriptions)
- FOAF: Friend of a Friend (documents, people)
- RSS 1.0 (RDF): RDF Site Summary (representation of full posts)
- Link ontology, for example a link (href in HTML) becomes:
- Link link:sourceDocument <http://.../>
- Link link:targetDocument <http://.../>
- Link link:anchorText interesting site
- Etc.
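As a rough sketch of the link representation above, one link can be written out as RDF triples in N-Triples syntax. The property names `sourceDocument`, `targetDocument` and `anchorText` come from the slide; the namespace URI `LINK_NS`, the `Link` dataclass and the `to_ntriples` helper are hypothetical, since the slides do not give the real URI of the link ontology or the serialisation BlogTrace uses.

```python
from dataclasses import dataclass

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical namespace: the real URI of the link ontology is not in the slides.
LINK_NS = "http://example.org/link#"


@dataclass(frozen=True)
class Link:
    source_document: str
    target_document: str
    anchor_text: str


def to_ntriples(link, node_id):
    """Serialise one link as N-Triples lines about a blank node."""
    subject = f"_:{node_id}"
    # Escape backslashes, quotes and newlines inside the literal.
    text = (link.anchor_text.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n"))
    return [
        f"{subject} <{RDF_TYPE}> <{LINK_NS}Link> .",
        f"{subject} <{LINK_NS}sourceDocument> <{link.source_document}> .",
        f"{subject} <{LINK_NS}targetDocument> <{link.target_document}> .",
        f'{subject} <{LINK_NS}anchorText> "{text}" .',
    ]


# Hypothetical documents, for illustration only.
example = Link("http://example.org/a", "http://example.org/b", "interesting site")
triples = to_ntriples(example, "link1")
```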
11. Weblogs can now be studied
- Even using Semantic Web technology (RDF/OWL)

    link:WeblogPostLink rdfs:subClassOf link:SimpleLink ;
        rdfs:comment "A WeblogPostLink is a SimpleLink if and only if both the source and the target documents are weblog posts (RSS items)." ;
        rdfs:label "WeblogPostLink" ;
        owl:intersectionOf (
            link:SimpleLink
            [ a owl:Restriction ;
              owl:onProperty link:sourceDocument ;
              owl:someValuesFrom rss:item ]
            [ a owl:Restriction ;
              owl:onProperty link:targetDocument ;
              owl:someValuesFrom rss:item ]
        ) .

    link:WeblogPostLink rdfs:subClassOf link:Link ;
        rdfs:comment "A WeblogPostLink is a Link if and only if both the source and the target documents are weblog posts (RSS items)" ;
        owl:intersectionOf (
            link:Link
            [ a owl:Restriction ;
              owl:onProperty link:sourceDocument ;
              owl:someValuesFrom rss:item ]
            [ a owl:Restriction ;
              owl:onProperty link:targetDocument ;
              owl:someValuesFrom rss:item ]
        ) .
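The class definition above says, in words, that a link is a WeblogPostLink when both its source and its target are weblog posts (RSS items). A minimal procedural sketch of that test, outside any OWL reasoner, is shown below; the function name `is_weblog_post_link` and the example URLs are assumptions for illustration.

```python
def is_weblog_post_link(source, target, post_urls):
    """True if both ends of the link are known weblog posts (RSS items)."""
    return source in post_urls and target in post_urls


# Hypothetical set of post permalinks, for illustration only.
post_urls = {"http://example.org/a/post1", "http://example.org/b/post7"}

post_to_post = is_weblog_post_link(
    "http://example.org/a/post1", "http://example.org/b/post7", post_urls)
post_to_other = is_weblog_post_link(
    "http://example.org/a/post1", "http://example.org/about.html", post_urls)
```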
12. Some Weblog Research Questions
- Weblog communities
- Do they exist?
- How can they be defined and found?
- What is the social structure?
- What are the conventions in the community?
- Text analysis of weblogs
- What do people blog about (terms, topics)?
- Do they share terminology?
- Can personal conceptualisations be extracted?
- Conversations
- Can linked weblog posts be seen as conversations?
- Can we identify when there is a knowledge flow?
13. Implementations and Papers
- Weblog communities
- Visual Settlements
- Graphically displays weblog community linkage based on a "weblog is a city" metaphor
- Community determined by Virtual Settlements paper (Efimova & Hendrick, 2005)
- Text analysis of weblogs
- Sigmund (Anjewierden, Brussee & Efimova, 2004)
- Co-occurrence based statistical algorithm that identifies concepts and their relations for a weblog
- Conversations
- Knowledge flows (Anjewierden, De Hoog, Brussee & Efimova, 2005)
- Hypothesis: the chance of a knowledge flow is greater when the sender and receiver share conceptualisations
14. Visual Settlements
- Idea
- Can we compress a weblog to a single picture?
- Such that we can use the picture to compare it to other weblogs in a community
- And, of course, learn something
- Inspiration
- Maps in general
- Books by Edward Tufte on Information Design
- The Visual Display of Quantitative Information (1983)
- Envisioning Information (1990)
- Beautiful Evidence (2005, forthcoming)
- (Discovered Tufte by blog reading)
15. My blog as a Visual Settlement
16. Anatomy of Visual Settlements
Without links in the community (house)
I link to someone (I'm at work)
Someone links to me (I'm in the park)
Size: number of words in the post
Layout: if I link to earlier posts, they are placed close together
Time: early posts in the center, later posts radiate outwards
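The mapping from posts to city elements can be sketched as a small classification. This is only an illustration of the legend above, not the Visual Settlements code: the `Post` fields and the `zone` function are hypothetical, and the slide does not say what happens when a post both links out and is linked to, so the precedence used here (outgoing links win) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Post:
    words: int       # drives the drawn size of the post
    links_out: int   # links from this post to others in the community
    links_in: int    # links from others in the community to this post


def zone(post):
    """Place a post at work, in the park, or in a house."""
    if post.links_out > 0:
        return "work"    # I link to someone
    if post.links_in > 0:
        return "park"    # someone links to me
    return "house"       # no links in the community


# Hypothetical posts, for illustration only.
zones = [zone(Post(120, 0, 0)), zone(Post(300, 2, 0)), zone(Post(80, 0, 1))]
```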
20. Sigmund
- Idea
- Using co-occurrence to determine whether terms are related
- Related terms might point to conceptualisations of the blogger
- And these conceptualisations might be shared by other bloggers
- Supported by
- Tools that are part of my regular research on methods to support ontology development from documents
- In particular term extraction and named entity recognition
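The co-occurrence idea can be sketched as follows. The slides do not give Sigmund's actual statistic, so this example swaps in a plain Jaccard score over posts (posts containing both terms divided by posts containing either) as a stand-in; the function names and the sample posts are assumptions, and term extraction is taken as already done.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(posts):
    """Count in how many posts each term and each unordered term pair occurs."""
    term_counts = Counter()
    pair_counts = Counter()
    for terms in posts:
        unique = sorted(set(terms))
        term_counts.update(unique)
        # combinations() over a sorted list yields pairs in sorted order.
        pair_counts.update(combinations(unique, 2))
    return term_counts, pair_counts


def relatedness(a, b, term_counts, pair_counts):
    """Jaccard score: posts containing both terms / posts containing either."""
    both = pair_counts[tuple(sorted((a, b)))]
    either = term_counts[a] + term_counts[b] - both
    return both / either if either else 0.0


# Hypothetical posts, each already reduced to its extracted terms.
posts = [
    ["weblog", "community", "link"],
    ["weblog", "community"],
    ["weblog", "ontology"],
]
term_counts, pair_counts = cooccurrence(posts)
score = relatedness("weblog", "community", term_counts, pair_counts)
```

Pairs with a high score are candidates for related terms; whether they reflect a conceptualisation of the blogger still needs human judgement.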
21. Making a Difference
- Idea
- In a community of bloggers it is likely that terminology is shared
- Finding the shared terms is interesting (see Sigmund)
- But a blogger is a person and not a web page
- So, what makes them different?
- Implementation
- Run Sigmund on all blogs in a community
- Find terms that are common for a particular blog and not common for others in the community
- Example: Making a Difference post
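A minimal sketch of the second implementation step, assuming each blog has already been reduced to a list of terms. The slides do not define "common", so the two thresholds (`min_share`, `max_other_share`), the function name `distinctive_terms` and the sample data are all hypothetical choices for illustration.

```python
from collections import Counter


def distinctive_terms(blogs, blog_id, min_share=0.05, max_other_share=0.01):
    """Terms frequent in one blog and rare in the rest of the community."""
    own = Counter(blogs[blog_id])
    others = Counter()
    for other_id, terms in blogs.items():
        if other_id != blog_id:
            others.update(terms)
    own_total = sum(own.values()) or 1
    others_total = sum(others.values()) or 1
    result = []
    for term, count in own.items():
        own_share = count / own_total
        other_share = others[term] / others_total
        if own_share >= min_share and other_share <= max_other_share:
            result.append(term)
    return sorted(result)


# Hypothetical community of two blogs, for illustration only.
blogs = {
    "blog_a": ["weblog", "ontology", "ontology", "tufte"],
    "blog_b": ["weblog", "knowledge", "weblog", "community"],
}
different = distinctive_terms(blogs, "blog_a")
```

Here "weblog" is dropped because the other blog uses it too, while "ontology" and "tufte" remain as what sets the first blog apart.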
22. Knowledge Flows
- Idea and Motivation
- When bloggers link to a post by another blogger, could it be a knowledge flow?
- Motivated by potential use as a knowledge management tool
- Implementation
- Use Sigmund's co-occurrence algorithm
- Term overlap in linked posts is the main metric
- Make a distinction between shared and agreed terms (used by both bloggers) and private terms (used by only one of the bloggers)
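The shared versus private distinction can be sketched with set operations on the terms of two linked posts. The slides name term overlap as the main metric but do not define it, so the Jaccard ratio used here is an assumption, as are the function name `compare_terms` and the sample terms.

```python
def compare_terms(sender_terms, receiver_terms):
    """Split the terms of two linked posts into shared and private sets."""
    sender, receiver = set(sender_terms), set(receiver_terms)
    shared = sender & receiver
    union = sender | receiver
    return {
        "shared": shared,                       # used by both bloggers
        "private_sender": sender - receiver,    # used only by the linked-to post
        "private_receiver": receiver - sender,  # used only by the linking post
        # Assumed overlap measure: shared terms / all terms in either post.
        "overlap": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical terms from a linked-to post and the post that links to it.
result = compare_terms(
    ["weblog", "community", "permalink"],
    ["weblog", "community", "ontology", "rdf"],
)
```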
26. Weblogs for Researchers
- Experiment (Metis project)
- Six researchers (previously non-bloggers) started a weblog to get hands-on experience
- Two gave up rather early
- One thinks about underpants when blogging
- Three (including myself) continued after the experiment finished
- Evaluation
- Posts are not emails (everybody can read them!)
- Posts are not academic papers
- Developing a blogging style (how and about what you blog) is difficult and different for everybody
27. Conclusions (1)
- Blogging as a tool for researchers
- Try it!
- Works for me, both reading and writing
- By sharing ideas on your blog, you may get help!
28. Conclusions (2)
- Enormous amount of data (paradise for someone like me)
- Tempting to continue my own weblog research
- If others have better ideas than I have, and some do, I gladly return to my role of supporting others in their weblog research