Introduction to Xaira Part One: All about Xaira

1
Introduction to Xaira Part One: All about Xaira
  • Andrew Hardie

2
What is Xaira?
  • XML Aware Indexing and Retrieval Architecture
  • The XML-aware version of SARA for the BNC corpus
  • Several programs, including the Index Toolkit and
    the Client

3
How do you pronounce Xaira?
  • Its designers pronounce it like 'Sarah'
  • We pronounce it like 'Zirah'
  • Other pronunciations may vary

4
Why are we talking about it?
  • Andrew and Richard have been beta-testers for
    Xaira for several years
  • Andrew wrote the help file

5
What sort of program is Xaira?
  • Xaira is an analysis program for indexed corpora
  • Searching indexed vs. non-indexed corpora
  • Indexing and retrieval
  • Xaira does both
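
The difference between searching an indexed and a non-indexed corpus can be sketched in a few lines of Python. This is only an illustration of the general idea, not Xaira's actual index format: the function names and the toy two-text corpus are invented for the example.

```python
from collections import defaultdict

def build_index(texts):
    """Indexing step: map each lower-cased token to its (text id, position) pairs."""
    index = defaultdict(list)
    for text_id, text in enumerate(texts):
        for position, token in enumerate(text.lower().split()):
            index[token].append((text_id, position))
    return index

def search_indexed(index, word):
    """Retrieval step: a single lookup, with no re-reading of the corpus."""
    return index.get(word.lower(), [])

def search_unindexed(texts, word):
    """Non-indexed search: scan every token of every text on every query."""
    word = word.lower()
    return [(text_id, position)
            for text_id, text in enumerate(texts)
            for position, token in enumerate(text.lower().split())
            if token == word]

# Toy corpus, invented for illustration
corpus = ["the cat sat on the mat", "a dog chased the cat"]
index = build_index(corpus)          # done once, up front
print(search_indexed(index, "cat"))  # same hits as search_unindexed(corpus, "cat")
```

Both searches return the same hits; the indexed one pays the cost of reading the corpus once, at indexing time, rather than on every query.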

6
Indexing

7
Retrieval

8
Xaira contains
  • The Indexer itself
  • Xaira-tools
      • Easy user interface for corpus set-up and using
        the indexer
  • The Xaira client
      • Sophisticated corpus analysis system
      • Wordlist, concordance, collocation
      • Structured searching

9
Client, server?
  • Why does Xaira describe itself as a client?
  • Xaira splits the work between
      • one program that you use to build the search (the
        client), and
      • one program that actually looks in the index and
        finds the solutions (the server)
  • But you can just use the client like any
    concordancer software
      • the user never deals directly with the server
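
The division of labour described on this slide can be sketched as two small Python classes. The class names, the method names and the dictionary used as an index are all hypothetical; nothing here reflects Xaira's real interfaces.

```python
class Server:
    """Hypothetical stand-in for the server: owns the index and finds the solutions."""

    def __init__(self, index):
        self._index = index

    def solve(self, query):
        # Look the requested word up in the index
        return self._index.get(query["word"], [])


class Client:
    """Hypothetical stand-in for the client: builds the search and hands it over."""

    def __init__(self, server):
        self._server = server

    def find(self, word):
        query = {"word": word.lower()}     # the client builds the search...
        return self._server.solve(query)   # ...and the server looks in the index


# Toy index, invented for illustration: word -> list of (text id, position)
toy_index = {"corpus": [(0, 3)], "index": [(0, 7), (2, 1)]}
client = Client(Server(toy_index))
hits = client.find("Index")
print(hits)
```

The code that calls `client.find` never touches the `Server` object directly, which mirrors the point that the user only ever deals with the client.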

10
What is special about Xaira?
  • Xaira is based on XML
  • XML is based on Unicode
  • Thus Xaira can be used with any language in any
    alphabet
  • Xaira has also been specially designed to aid
    multilingual analysis
      • e.g. it allows Unicode keyboard setup for any
        language

11
Do I need a Unicode corpus?
  • Yes!
  • (but ASCII counts as valid UTF-8)
  • Both UTF-8 and UTF-16 are OK
  • (If in doubt, ask Andrew about variant text
    encodings)
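
The claim that ASCII counts as valid UTF-8 can be checked with Python's built-in codecs. The helper `is_valid_utf8` is a name made up for this sketch; it says nothing about how Xaira itself detects encodings.

```python
def is_valid_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# Pure ASCII text is byte-for-byte identical in ASCII and UTF-8
ascii_bytes = "corpus".encode("ascii")
same_as_utf8 = ascii_bytes == "corpus".encode("utf-8")

# A non-ASCII word can be stored as UTF-8 or UTF-16; both decode to the same text
word = "naïve"
utf8_bytes = word.encode("utf-8")
utf16_bytes = word.encode("utf-16")
round_trip_ok = (utf8_bytes.decode("utf-8") == word
                 and utf16_bytes.decode("utf-16") == word)

# The same word saved in a legacy 8-bit encoding (Latin-1) is not valid UTF-8
latin1_bytes = word.encode("latin-1")

print(same_as_utf8, round_trip_ok, is_valid_utf8(latin1_bytes))
```

A file in a legacy encoding such as Latin-1 would need converting to UTF-8 or UTF-16 first.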

12
Does my corpus need to be XML?
  • No!
  • Xaira can add basic XML to a corpus of plain-text
    files
  • Xaira can also upgrade SGML to XML
  • TEI XML is perfect for Xaira
  • Warning: Xaira will reject ill-formed XML or
    SGML files.
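
To see in advance whether a file is likely to be rejected, well-formedness can be checked with Python's standard library. The sketch below also shows one way plain text might be wrapped in basic XML. The element names `text` and `p` and both function names are assumptions for illustration, not what Xaira's own tools produce.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def is_well_formed(xml_text):
    """Return True if the string parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

def wrap_plain_text(text):
    """Wrap blank-line-separated paragraphs in <p> elements inside one <text> element.

    Element names are hypothetical; '&' and '<' are escaped so the result stays well-formed.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    body = "".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return f"<text>{body}</text>"

good = "<text><p>Hello <hi>world</hi></p></text>"
bad = "<text><p>Hello <hi>world</p></hi></text>"   # tags closed in the wrong order
wrapped = wrap_plain_text("Fish & chips\n\nA < B")

print(is_well_formed(good), is_well_formed(bad), is_well_formed(wrapped))
```

Overlapping or unclosed tags, as in the `bad` string, are the kind of error that makes a file ill-formed.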

13
First, index your corpus
  • Access the commands you need to set up and run
    the indexer from the Tools menu
  • Messages from the different tools appear here
    (you don't need to worry about them)

14
The Tools Menu
  • Tools for preparing your corpus and its header
  • Tools for telling Xaira how to handle the XML
    markup in your corpus
  • The indexer itself

15
Scared?
  • Using Xaira-tools to prepare a corpus manually
    can be a bit complex
  • Instructions: http://www.oucs.ox.ac.uk/rts/xaira/Doc/
  • But don't despair: there is a wizard!
  • File >> Index Wizard

16
The index wizard

17
The index wizard

18
The index wizard

19
Live Indexing!