Send Me a Disk, Ok? - PowerPoint PPT Presentation


Title: Send Me a Disk, Ok?


1
Send Me a Disk, Ok?
  • -Sharing Genealogical Information With Your
    Relatives

Beau Sharbrough
beau_at_sharbrough.net
PO Box 3170
Grapevine TX 76099-3170
2
Thank you.
  • To the CIG. I'm grateful for the invitation to be
    here.
  • To Russ and Birdie Holsclaw. They took care of me
    the past three days, sharing their home, their
    cars, their community, and their son Will.
  • To Roger Ebert. I was starting to worry about my
    weight.

3
General Topics
  • Five steps to understanding what they're saying
  • Discussion of software developers' methods of
    merging files
  • Significance of GENTECH Genealogical Data Model

4
Five Steps to Combining Your Research
5
Step 1. Determine What Form the Data Is in.
  • Which program do they use?
  • What type of disk drives do they have?
  • What general field usage have they adopted?

6
Step 2. Exchange Pedigree and Group Sheet
Examples.
  • Look for detail, accuracy, thoroughness.
  • Are there full or partial dates?
  • Do the citations for US places include counties?
    Streets? Cemetery names?
  • Are nicknames used in place of real names?
  • Are sources cited?

7
Step 3. Agree on Usage of Fields.
  • RESIdes or ADDRess?
  • Will you both use CHRIsten?
  • How will you document sources?
  • How will you document the research of others?
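
Once two relatives have agreed on field usage, the agreement can be written down as a simple tag mapping and applied to one side's records. The following is a minimal sketch under assumptions: the mapping, the record layout (tag to list of values), the BAPM tag, and the function name convert_record are all made up for illustration and are not taken from any particular program.

```python
# Hypothetical agreement: places lived go under RESI, and baptisms
# are recorded under CHR so both files use the same tag.
AGREED_TAGS = {"ADDR": "RESI", "BAPM": "CHR"}


def convert_record(record, mapping=AGREED_TAGS):
    """Return a copy of record with tags renamed per the agreement.

    record maps a tag to a list of values. Values that end up under the
    same tag are combined rather than overwritten.
    """
    converted = {}
    for tag, values in record.items():
        target = mapping.get(tag, tag)  # unmapped tags pass through
        converted.setdefault(target, []).extend(values)
    return converted


# Illustrative record only.
mine = {
    "NAME": ["Jonathan Sharbrough"],
    "ADDR": ["North Carolina"],
    "BAPM": ["circa 1734"],
}
print(convert_record(mine))
```

Keeping the mapping as data makes it easy to show your relative exactly what you intend to convert before you do it.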

8
Step 4. Convert Your Information. Nobody Can
Avoid This Step.
  • Agree with your relative what information you
    will convert and how
  • Normally, this means saying things like, "I'll
    put in the counties after I get it from you"

9
Step 5. Exchange Only the Individuals You Want.
  • NEVER just import the whole family on top of the
    information you already have.
  • Computer routines for merging data are improving,
    but not complete or effective yet.

10
There Are No Effective Routines for Merging Data
Sets at Present.
  • The problems of identity, merging methods, and
    data formats are too new for generalized
    solutions to be available in the marketplace.
  • Good theoretical solutions don't even exist.

11
Merging Data Sets
12
Customers who just assume that someone will know
what they want and have it ready when they
recognize that need had parents that spoilt them
rotten.
13
WHY?
  • Family history record-keeping is increasingly
    becoming a digital process.
  • Linking one's information to the information
    already gathered by other family members and
    researchers is becoming more and more common.

14
We Have to Put Our Information Together Somehow
15
A Few Basics
  • Computer programs store the data that we enter in
    FILES
  • Each genealogical program stores the information
    in its own way, called a PROPRIETARY FORMAT
  • Most programs can also read and write in GEDCOM
    format
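
As a rough illustration of what an import routine sees in a GEDCOM file: each line carries a level number, an optional cross-reference id, a tag, and an optional value. The sketch below parses that line shape only. The sample record and the name parse_gedcom_line are assumptions for illustration, and real files have more cases (continuation lines, character sets) than this handles.

```python
import re

# One GEDCOM line: level, optional @XREF@, tag, optional value.
LINE_RE = re.compile(r"^\s*(\d+)\s+(?:(@[^@]+@)\s+)?(\w+)(?:\s(.*))?$")


def parse_gedcom_line(line):
    """Split one GEDCOM line into (level, xref, tag, value)."""
    m = LINE_RE.match(line.rstrip("\r\n"))
    if m is None:
        raise ValueError(f"not a GEDCOM line: {line!r}")
    level, xref, tag, value = m.groups()
    return int(level), xref, tag, value or ""


# Illustrative record, not taken from a real file.
SAMPLE = """0 @I1@ INDI
1 NAME Jonathan /Sharbrough/
1 BIRT
2 DATE ABT 1734
2 PLAC North Carolina"""

records = [parse_gedcom_line(line) for line in SAMPLE.splitlines()]
for rec in records:
    print(rec)
```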

16
A word about exchange
[Diagram: programs A and B, connected by an Export Routine, a Possible Intermediate Format, and an Import Routine]
17
A Few Basics
  • Merging is copying
  • From a SOURCE
  • To a TARGET
  • Sometimes called the SURVIVING INFORMATION

18
MERGING DATABASES
  • merging the files into a single one
  • merging the duplicated individuals
  • merging the rest
  • sources
  • repositories

19
The database merging process is evolving
  • More input sources
  • More freedom to choose the features you like.
  • GenBridge

20
Freedom has a price
  • Enter a name
  • Program won't break it up
  • Enter a place
  • Program won't break it up

21
Legacy Trick
  • You can open two family files at the same time,
    and copy and paste a person and their descendants
    from one set into another, like grafting a tree
    branch from one tree to another.

22
Making automatic citations
  • Legacy individual level
  • TMG and FTM field level

23
The Current Merging Art
  • Merging Databases
  • Merging Individuals
  • Merging the Rest
  • Spotting Duplicates

24
Merging Individuals
  • If you want to merge duplicates, most programs
    will make you choose which tags to keep and
    throw the rest away.

25
MERGING INDIVIDUALS: The old way
  • Copy the info
  • Delete one of the people
  • Type the info into the new one

26
MERGING INDIVIDUALS: The middle way
  • View both persons
  • Select what you want
  • The program does the rest

27
MERGING INDIVIDUALS: The future way
  • Computer spots likely dups
  • Recommends them to you
  • You control the process

28
Merge Sources for most popular software
  • Their own files
  • GEDCOM
  • In some cases, files from other programs
  • In some cases, CD and internet databases

Still, it ends up being like pouring two cans of
paint together.
29
Merging the Rest
  • Most programs don't even import and merge place
    tables, source tables, etc.
  • I don't know of any program that recognizes the
    same source in two separate datasets.

30
Merging The Rest
  • source citations, master sources, repositories,
    and places
  • Most programs just combine the tables, creating
    duplicates
  • LG will combine a source, with exact spelling
  • UFT and FTM merge master sources
  • PAF and TMG merge master sources and repositories
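
The "combine a source, with exact spelling" behavior can be sketched as follows. This is a minimal illustration, not any listed program's actual routine: the table layout (source id to title), the sample titles, and the name merge_source_tables are assumptions.

```python
def merge_source_tables(target, incoming):
    """Merge incoming sources into target, matching on exact title spelling.

    Both tables map a numeric source id to a title. Returns the merged
    table plus a map from each incoming id to the id it ended up with,
    so citations in the incoming file can be re-pointed.
    """
    merged = dict(target)
    by_title = {title: sid for sid, title in merged.items()}
    next_id = max(merged, default=0) + 1
    id_map = {}
    for old_id, title in incoming.items():
        if title in by_title:        # exact spelling: reuse the existing row
            id_map[old_id] = by_title[title]
        else:                        # anything else becomes a new row
            merged[next_id] = title
            by_title[title] = next_id
            id_map[old_id] = next_id
            next_id += 1
    return merged, id_map


# Made-up titles; note the two spellings of the census source.
target = {1: "1850 U.S. Census", 2: "Family Bible"}
incoming = {1: "Family Bible", 2: "1850 US Census"}
merged, id_map = merge_source_tables(target, incoming)
print(merged)
print(id_map)
```

In this example "Family Bible" is recognized as one source, but the census source ends up in the table twice because the spelling differs by two periods. That is the duplicate problem described above.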

31
Limits to Storage
  • Some programs have really limited storage, and
    only store conclusions
  • If you have two birth dates, they put your
    favorite one in and throw the other away, or
    store it in a note.
  • Some programs have a lot of storage, and let you
    make your own tags such as executrix.
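
The difference between storing only conclusions and storing conflicting data can be shown with a small merge routine that keeps every distinct value instead of throwing one away. This is a sketch under assumptions: the record layout (tag to list of values), the sample values, and the name merge_person are illustrative, not how any particular program stores its data.

```python
def merge_person(target, source):
    """Copy source facts into target without discarding conflicts.

    Each person maps a tag (for example "BIRT") to a list of recorded
    values. Only exact repeats are dropped; differing values are kept
    side by side for the researcher to resolve.
    """
    merged = {tag: list(values) for tag, values in target.items()}
    for tag, values in source.items():
        kept = merged.setdefault(tag, [])
        for value in values:
            if value not in kept:
                kept.append(value)
    return merged


# Made-up values, including a user-defined tag.
target = {"BIRT": ["circa 1734"]}
source = {"BIRT": ["circa 1734", "1735"], "EXECUTRIX": ["named in a will"]}
print(merge_person(target, source))
```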

32
SPOTTING DUPLICATES
  • Some programs have merging routines based on
  • Soundex
  • Spelling of name
  • Birth date
  • TMG and Legacy use a large variety of match
    choices

33
Spotting duplicates
  • Soundex for names (AQ)
  • Exact spelling or soundex (PAF 3.0)
  • Exact spelling and exact birth date (FTM)
  • Many name compares (TMG and UFT)
  • Soundex surname and user choice of # of letters
    in first name (LG)
  • Warn if duplicate name entered (most)
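
For readers who have not seen it, Soundex reduces a surname to a letter plus three digits so that similar-sounding spellings share a code. The sketch below implements the common American Soundex rules, followed by a match test in the spirit of "Soundex surname plus the first N letters of the first name". The function names, the record layout, and the variant spelling "Sharbrow" are assumptions for illustration; this is not the code of any program named above.

```python
# Letter groups for American Soundex; vowels, H, W and Y get no digit.
CODES = {}
for digit, letters in enumerate(["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(name):
    """Return the four-character Soundex code for name ("" if no letters)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue              # H and W do not separate equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code               # a vowel resets prev, so a repeat after it counts
    return (result + "000")[:4]


def surname_soundex_match(a, b, first_name_letters=1):
    """Same surname Soundex and same first N letters of the given name."""
    n = first_name_letters
    return (soundex(a["surname"]) == soundex(b["surname"])
            and a["given"][:n].lower() == b["given"][:n].lower())


p1 = {"given": "Jonathan", "surname": "Sharbrough"}
p2 = {"given": "John", "surname": "Sharbrow"}
print(soundex("Sharbrough"), soundex("Sharbrow"))
print(surname_soundex_match(p1, p2, first_name_letters=2))
```

Raising the number of first-name letters makes the match stricter: with 2 letters the two records above are flagged as possible duplicates, with 4 they are not.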

34
Merging tips
  • Match on parent soundex reduces false positives
    (Gaylon Findlay)
  • If your program won't let you choose initials,
    but has a number-of-letters, try that with 1.
  • Beware of people about whom you know very little.
  • Beware of blank dates.
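
One conservative reading of these tips, sketched in code: require the parents' Soundex codes to agree, and refuse to call two records a match when a birth year is blank or when no parent is known on both sides. The record layout, the field names, and the three result labels are assumptions for illustration; the Soundex codes are taken as already computed.

```python
def compare(a, b):
    """Classify two records as "candidate", "different", or "too little data".

    Each record is a dict with precomputed Soundex codes under
    "surname_code", "father_code", "mother_code" (None when unknown),
    plus "birth_year" (None when blank).
    """
    if a["surname_code"] != b["surname_code"]:
        return "different"
    # A blank date is not evidence of a match.
    if a["birth_year"] is None or b["birth_year"] is None:
        return "too little data"
    if a["birth_year"] != b["birth_year"]:
        return "different"
    # Compare parents only where both records name one.
    parents = [(a[k], b[k]) for k in ("father_code", "mother_code")
               if a[k] and b[k]]
    if not parents:
        return "too little data"
    if any(x != y for x, y in parents):
        return "different"
    return "candidate"


# Made-up records.
r1 = {"surname_code": "S616", "birth_year": 1734,
      "father_code": "S616", "mother_code": None}
r2 = {"surname_code": "S616", "birth_year": 1734,
      "father_code": "S616", "mother_code": "B620"}
r3 = {"surname_code": "S616", "birth_year": None,
      "father_code": "S616", "mother_code": None}
print(compare(r1, r2), "|", compare(r1, r3))
```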

35
Signs that you can merge better today than you
could before
  • More formats allowed
  • Easier individual merging
  • Identifying routines are becoming more
    sophisticated
  • More storage of conflicting data allowed
  • More variety in the software marketplace

36
Signs that we aren't getting there yet
  • No formal studies on known datasets to quantify
    false positives and false negatives
  • No implementation of information sciences in
    commercial products
  • No implementation of AI in commercial products
  • No formal discussion of algorithms
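
A formal study of a duplicate-spotting routine would compare the pairs it proposes against a dataset where the true duplicates are already known. The sketch below shows the bookkeeping only; the pair ids and the name score_matcher are made up, and no real results are implied.

```python
def score_matcher(predicted_pairs, true_pairs):
    """Count false positives and false negatives for a duplicate matcher.

    Pairs are unordered, so (1, 2) and (2, 1) are the same pair.
    """
    predicted = {frozenset(p) for p in predicted_pairs}
    truth = {frozenset(p) for p in true_pairs}
    tp = len(predicted & truth)
    fp = len(predicted - truth)   # proposed merges that are wrong
    fn = len(truth - predicted)   # real duplicates that were missed
    return {
        "false_positives": fp,
        "false_negatives": fn,
        "precision": tp / (tp + fp) if predicted else 0.0,
        "recall": tp / (tp + fn) if truth else 0.0,
    }


# Hypothetical record ids.
report = score_matcher(
    predicted_pairs=[(1, 2), (3, 4), (5, 6)],
    true_pairs=[(2, 1), (7, 8)],
)
print(report)
```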

37
MERGING SUMMARY
  • Users can merge from a wider variety of data
    formats than in the past.
  • Users can merge individuals more easily.

38
MERGING SUMMARY
  • Routines to help identify candidates for merging
    are becoming quite sophisticated.
  • More programs store conflicting data today.

39
It's also encouraging that they are not all doing
the same thing.
  • The resultant diversity and innovation offer us
    more chances to connect Where-We've-Been to
    Where-We're-Going than we've ever had before.

40
The GENTECH Genealogical Data Model
  • Purpose: To define and communicate the
    meanings of family history data.

41
Genealogical Data Model
  • Request for Comment
  • Project by genealogists and developers to
    describe genealogy processes.
  • Describes the relationships between the various
    kinds of family history information.
  • Overview of what genealogists do
  • Not a genealogy program.
  • Not a database design
  • Not a document saying what genealogists SHOULD do.

42
Every genealogist says that they do research
differently.
  • The GDM describes the process that they do
    differently.

43
Stop Starting with Conclusions
  • Don't start with conclusions, start with evidence.

44
Some features of Evidence in the GDM
  • REPOSITORY
  • SOURCE
  • REPRESENTATION TYPE
  • REPRESENTATION
  • CITATION

45
CONCLUSIONS
  • ASSERTIONS about
  • PERSONA
  • EVENTS
  • CHARACTERISTICS
  • GROUPS
  • ASSERTIONS
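
The idea of starting with evidence can be illustrated with a toy structure in which every statement about a person stays attached to the source it came from. This is only a loose sketch inspired by the terms on these slides. It does not reproduce the GDM's actual entity definitions, and the class names, fields, and sample sources are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    title: str
    repository: str


@dataclass(frozen=True)
class Assertion:
    persona: str          # the person as named in the source
    characteristic: str   # what is being stated, e.g. "birth date"
    value: str
    source: Source        # the evidence behind the statement


def evidence_for(assertions, persona):
    """Group what each source says about one persona, by characteristic."""
    found = {}
    for a in assertions:
        if a.persona == persona:
            found.setdefault(a.characteristic, []).append((a.value, a.source.title))
    return found


# Hypothetical sources and statements.
bible = Source("family Bible", "private collection")
letter = Source("family letter", "private collection")
assertions = [
    Assertion("Jonathan Sharbrough", "birth date", "circa 1734", bible),
    Assertion("Jonathan Sharbrough", "birth date", "1735", letter),
]
print(evidence_for(assertions, "Jonathan Sharbrough"))
```

Because both statements are kept along with their sources, a conclusion about the birth date can be drawn later, and revisited, without losing the evidence.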

46
XML is eXtensible Markup Language
  • <TITLE>The Title of My Book</TITLE>
  • <NAME>Jonathan Sharbrough</NAME>
  • <BIRTHDATE>circa 1734</BIRTHDATE>
  • <BIRTH PLACE="North Carolina" DATE="circa 1734">
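
To show why tagged data like this is useful, here is a minimal sketch that reads a small XML fragment with Python's standard library. The enclosing PERSON element and the self-closing form of BIRTH are assumptions added to make the fragment well-formed; they are not part of any published genealogy format.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment using element and attribute styles side by side.
DOC = """<PERSON>
  <NAME>Jonathan Sharbrough</NAME>
  <BIRTH PLACE="North Carolina" DATE="circa 1734"/>
</PERSON>"""

root = ET.fromstring(DOC)
name = root.findtext("NAME")   # text between the NAME tags
birth = root.find("BIRTH")     # element whose data is held in attributes
print(name, "|", birth.get("PLACE"), "|", birth.get("DATE"))
```

Either style lets a program find the birth place without guessing where it sits in a line of free text.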

47
Future digital research
  • programs publish pedigrees and registers in some
    XML format
  • repositories publish records in the same format
  • local links, remote sources
  • external authorities

48
A new culture
  • most quoted sites - authorities
  • many link sites - hubs
  • links define culture, tribe, families

49
The digital future of family history is a virtual
library where it is ...
  • Easy to find the conclusions
  • Easy to identify the evidence
  • Easy to identify the thought process that links
    them.

50
Missing ingredients
  • agreement on LexML standard
  • wide acceptance of LexML standard
  • wide implementation of LexML

51
Send Me A Disk, Ok?
  • Dos and Don'ts
  • Merging Technique
  • GENTECH GDM

Beau Sharbrough
PO Box 3019
Grapevine TX 76099-3019
beau_at_sharbrough.net
www.sharbrough.net