Title: Metadata
1 Metadata: What is it, and why we need it
By Roman Olschanowsky roman2u_at_sdsc.edu
2 Create some metadata
- Take 5 minutes, right now, to think about YOUR
data, and do some brainstorming.
- Write down a definition of metadata, and any
ideas you have about metadata regarding your
files and/or datasets.
3 Metadata - data about data?
- System metadata (most file systems)
- Developed for OS, not very helpful to you
- Size, owner, permissions, timestamps, ...
- Standardized metadata
- File headers: jpeg, mp3, DICOM(s), ...
- Dublin Core: Title, Creator, Subject, Date, ...
- User defined metadata
- XML (Whatever I want !!!)
- Database (Whatever I want !!!)
- SRB (Whatever I want !!!)
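As a rough illustration of the system metadata listed above, the sketch below reads what a typical file system already records, using only Python's standard library. The helper name `system_metadata` and the choice of fields are assumptions made for this example.

```python
import os
import stat
import time


def system_metadata(path):
    """Return the metadata the file system already keeps for `path`."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        "owner_uid": info.st_uid,                    # numeric owner id (0 on Windows)
        "permissions": stat.filemode(info.st_mode),  # e.g. "-rw-r--r--"
        "modified": time.ctime(info.st_mtime),       # last modification time
    }


if __name__ == "__main__":
    # Describe the current directory, which always exists.
    for key, value in system_metadata(os.curdir).items():
        print(f"{key}: {value}")
```

Nothing here says what the file means or contains, which is the sense in which this metadata serves the OS more than it serves you.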
4 System Metadata
- Q: If all I have is a plain file system, how do I
do metadata?
- A: Organization, build a meaningful hierarchy
[Diagram: example directory tree for Patient (Roman). Named entries: label, mri, surf, brain, wm, filled, aseg, norm, transforms, flash, Parameter_maps. File types shown: Log File, Surface File, Label File, Transform File, Slice File (x5), Flash File.]
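A minimal sketch of how a meaningful hierarchy doubles as metadata: if every path follows the same pattern, the directory names can be read back as attributes. The `patient/area/kind/file` convention and the field names are assumptions for illustration, not a layout taken from the slide.

```python
from pathlib import PurePosixPath

# Hypothetical convention: <patient>/<area>/<kind>/<file name>
LEVELS = ("patient", "area", "kind")


def metadata_from_path(path):
    """Read the directory names of a path back as metadata fields."""
    parts = PurePosixPath(path).parts
    if len(parts) != len(LEVELS) + 1:
        raise ValueError(f"path does not follow the agreed hierarchy: {path}")
    record = dict(zip(LEVELS, parts))  # pair each level name with its directory
    record["file"] = parts[-1]
    return record


print(metadata_from_path("roman/mri/flash/slice_001.img"))
```

This only works if everybody sticks to the convention, which is why the hierarchy has to be agreed on before the data arrives.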
5 A good hierarchy - Is this enough?
- I now have 1000s of patients.
- Dr. Suchandsuch asks me: "How many of your
patients have a cranial thickness greater than .5
inches?"
- We can dig through all the images and measure the
thicknesses, but now where to store the results?
- "50 are greater than .5 inches."
- "Great! Now how many of those are male, and were
scanned with a GE system?"
- "Sir, 75 male and GE, other 25 male too but
scanned with different systems." (fictional
numbers)
6 Standardized Metadata
- Dublin Core: What is the bare minimum metadata
that needs to be present?
- Everybody's idea of bare minimum is different
- What's left isn't very useful (Format: Power
Point File)
- File Headers
- Very useful
- (Think of them as system metadata for that file
type)
- Width: 10px, bit rate: 128 Kbps, Scanner: GE
- But, the more files you have the slower it gets!
- Who decides what that header is? Does everybody
actually follow that standard?
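To make the file-header idea concrete, here is a small sketch that pulls the image width and height out of a PNG header with the standard library. PNG is used only because its header is simple; it is not one of the formats named on the slides, and both function names are hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data):
    """Return (width, height) from the first 24 bytes of a PNG file.

    Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type
    ("IHDR"), then width and height as big-endian unsigned 32-bit ints.
    """
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    return struct.unpack(">II", data[16:24])


def read_png_dimensions(path):
    # Each file has to be opened to answer a question about it, which is
    # why header-based queries get slower as the collection grows.
    with open(path, "rb") as handle:
        return png_dimensions(handle.read(24))
```

The parsing only works because the byte layout is fixed by the format's specification, which is the "who decides what that header is" question in practice.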
7 User defined metadata
- Finally, a place to store my cranial thickness
attribute.
- XML
- Great! It's not platform or application specific.
- But, it's usually slow, and with lots of
overhead.
- Database
- Great! It's fast and it gives me my answers, more
flexible (primary / foreign keys)
- But, it's expensive (labor, licenses). Worst: it's
separate from the data, things can become out of
sync.
- SRB
- Great! It's fast and it's a part of the same
system as the data.
- But, what if I take the data out of the system?
How does the metadata leave too?
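A minimal sketch of the database option, using Python's built-in `sqlite3` with an in-memory database. The table name, column names, and the three sample rows are made up for illustration. The `path` column is the only link back to the data, and it is exactly the part that can drift out of sync.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE scan (
        patient_id           TEXT PRIMARY KEY,
        path                 TEXT NOT NULL,  -- where the data lives
        sex                  TEXT,
        scanner              TEXT,
        cranial_thickness_in REAL            -- user-defined attribute
    )
    """
)
conn.executemany(
    "INSERT INTO scan VALUES (?, ?, ?, ?, ?)",
    [
        ("p001", "p001/mri", "M", "GE", 0.62),     # invented sample rows
        ("p002", "p002/mri", "F", "GE", 0.55),
        ("p003", "p003/mri", "M", "Other", 0.41),
    ],
)


def count_thick(conn, min_inches, sex=None, scanner=None):
    """Count scans above a thickness, optionally filtered by sex and scanner."""
    query = "SELECT COUNT(*) FROM scan WHERE cranial_thickness_in > ?"
    params = [min_inches]
    if sex is not None:
        query += " AND sex = ?"
        params.append(sex)
    if scanner is not None:
        query += " AND scanner = ?"
        params.append(scanner)
    return conn.execute(query, params).fetchone()[0]


print(count_thick(conn, 0.5))                         # everyone above .5 inches
print(count_thick(conn, 0.5, sex="M", scanner="GE"))  # male and scanned on GE
```

Once the measurement is stored as an attribute, follow-up questions become one more filter instead of another pass through the images.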
8 BIRN Human Collection and Metadata hierarchy
[Diagram: hierarchy levels, each with attached XML files.]
Analysis levels, top to bottom:
- Analyses on many subjects across institutions
- Analyses on a subject across institutions and
studies
- Analyses on many series of a subject within an
institution
- Analyses on multiple Series done at 1 institution
- Analyses on images from this Series
Metadata labels, top to bottom: BIRN_ID Timestamp,
VisitID?, StudyID?, Image/Scanner Parameters?
Other labels: Freesurfer, LDMM, XML file?
Note: Original is a pointer to the corresponding
original scanner format
9 Table columns: Directory Hierarchy, SRB Metadata,
XML elements (non-structural), HID Database, Notes
BIRN
Note: Should analyses that cross multiple data levels
be split out to a separate hierarchy?
Human
Note: All Analysis collections are writeable so that
users can create their own analysis collections
(snapshots)
Research Project (Name__ID)
SRB Metadata: Project ID
XML element: <project>
HID Database: nc_experiment
Analysis
Subject (BIRN ID)
SRB Metadata: BIRN ID, Timestamp
XML element: <subjectConst>
HID Database: nc_humanSubject
Analysis
Institution Visit (Visit__Site ID_Visit )
SRB Metadata: Visit ID, Institution ID
XML element: <subjectVar>
HID Database: nc_expComponent
Analysis
Study (Study__ID )
SRB Metadata: Study ID
XML element: <scanner>
Analysis
Series (Series__localID)
SRB Metadata: Series Number, Scanner Parameters?
HID Database: nc_expSegment and protocol section
Analysis
Note: Separate the native data and analysis for easier
access control and separation (Brian's email)
Native
Analysis
Note: Native Data: represents an upload of the
original data. Analysis: represents a different
analysis (either partial or full)
research and derived data sections
<acqProtocol> <expProtocol> <datarec>
Image Parameters?
Snapshot 1
Snapshot 1
DICOM
AFNI
Analysis Sub Tree
Analyze
Derived versions of an individual series should
remain with that series?
Snapshot N (Ver__SER)
Snapshot N
10 All problems solved?
- Why are you calling it skull thickness?
- It's supposed to be cranial thickness!
- You have to query on brain, not Purkinje cell
- But, a Purkinje cell is part of the brain,
shouldn't the system know that?
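The naming disagreement above is the kind of problem a shared vocabulary addresses. A minimal sketch, assuming a hand-maintained synonym table (the entries and the function name are illustrative only):

```python
# Hypothetical synonym table: every variant maps to one agreed term.
PREFERRED_TERM = {
    "skull thickness": "cranial thickness",
    "cranial thickness": "cranial thickness",
}


def normalize_attribute(name):
    """Map a user-supplied attribute name onto the agreed vocabulary."""
    key = " ".join(name.lower().split())  # ignore case and extra spaces
    try:
        return PREFERRED_TERM[key]
    except KeyError:
        raise KeyError(f"'{name}' is not in the agreed vocabulary") from None


print(normalize_attribute("Skull  Thickness"))
```

A flat table like this handles synonyms, but it knows nothing about part-of relationships such as Purkinje cell and brain.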
11 Ontologies
- For AI systems, what "exists" is that which can
be represented. When the knowledge about a domain
is represented in a declarative language, the set
of objects that can be represented is called the
universe of discourse.
- We can describe the ontology of a program by
defining a set of representational terms.
Definitions associate the names of entities in
the universe of discourse (e.g. classes,
relations, functions or other objects) with
human-readable text describing what the names
mean, and formal axioms that constrain the
interpretation and well-formed use of these
terms.
- Formally, an ontology is the statement of a
logical theory.
12 Distribution of Ryanodine receptor in cerebellum?
- Navigates down domain map
- Situates result in context of domain map
Brain has a Cerebellum
Cerebellum has a Purkinje Cell Layer
Purkinje Cell Layer has a Purkinje cell
Purkinje cell is a neuron
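A small sketch of what navigating down a domain map could look like in code. The relations are the ones on this slide, stored as plain Python dictionaries. This is an illustration only, not how the actual domain map is implemented.

```python
# "has a" edges from the slide: whole -> parts.
HAS_A = {
    "Brain": ["Cerebellum"],
    "Cerebellum": ["Purkinje Cell Layer"],
    "Purkinje Cell Layer": ["Purkinje cell"],
}
# "is a" edges from the slide: term -> more general term.
IS_A = {"Purkinje cell": "neuron"}


def parts_of(term):
    """Return every term reachable from `term` by following has-a edges."""
    found = []
    stack = list(HAS_A.get(term, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.append(current)
            stack.extend(HAS_A.get(current, []))
    return found


def expand_query(term):
    """Terms a query on `term` should also match: the term and all its parts."""
    return [term] + parts_of(term)


print(expand_query("Brain"))
print(IS_A["Purkinje cell"])
```

With the relations written down, a query on Brain can be widened to include Purkinje cell without the user having to know the anatomy.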
13 ANATOM Domain Map
- Rule-based ontology map
- Encodes conceptual and semantic relationships
using F-logic
14 Integrated Knowledge Map
15 Scared?
- Do
- Design a file hierarchy
- Agree on a Standard Vocabulary
- Add metadata in the right places, and several
places
- You can always add or change things later,
doesn't have to be perfect the first time
- If it's there you will use it!
- What metadata do other people want?
- Automate the process! (scripts and/or workflows)
- Do not
- Wait. It's harder to add metadata after the fact.
- Do things manually, see 7 above
- Attempt an ontology, professionals are working on
them already! (Unless it's already in your
approved grant)
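As one possible starting point for automating the process, the sketch below walks a directory tree and collects a simple metadata record per file, ready to be written out as JSON. The record fields and the `build_manifest` name are assumptions; a real script would add whatever attributes your group has agreed on.

```python
import json
import os
import tempfile


def build_manifest(root):
    """Collect one metadata record per file found under `root`."""
    records = []
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            full_path = os.path.join(folder, name)
            relative = os.path.relpath(full_path, root)
            records.append({
                "path": relative.replace(os.sep, "/"),
                # Directory names double as metadata.
                "levels": relative.split(os.sep)[:-1],
                "size_bytes": os.path.getsize(full_path),
            })
    return sorted(records, key=lambda record: record["path"])


if __name__ == "__main__":
    # Build a tiny throwaway tree so the example runs anywhere.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "roman", "mri"))
        with open(os.path.join(root, "roman", "mri", "slice_001.img"), "wb") as handle:
            handle.write(b"\x00" * 16)
        print(json.dumps(build_manifest(root), indent=2))
```

Run from a scheduled job or an upload workflow, a script like this captures metadata when the data arrives instead of after the fact.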
16 Review your notes
- Take another 5 minutes to go over your notes
about metadata
- Any big changes you would make?
- Write down any changes, additions, and
revelations.
- Share with us some of your discoveries.
17 Thanks!
- Questions?
- www.sdsc.edu/srb
- srb_at_sdsc.edu