Ethic Core August 19th, 2004 - PowerPoint PPT Presentation

1
Ethic Core August 19th, 2004
Jan-Eric Litton, Karolinska Institutet,
Stockholm, Sweden
2
Karolinska Institutet Dept. of Medical
Epidemiology and Biostatistics and KI-Biobank
3
Sharing Data
ID   MURA_BACSU STANDARD PRT 429 AA.
DE   PROBABLE UDP-N-ACETYLGLUCOSAMINE 1-CARBOXYVINYLTRANSFERASE
DE   (EC 2.5.1.7) (ENOYLPYRUVATE TRANSFERASE) (UDP-N-ACETYLGLUCOSAMINE
DE   ENOLPYRUVYL TRANSFERASE) (EPT).
GN   MURA OR MURZ.
OS   BACILLUS SUBTILIS.
OC   BACTERIA FIRMICUTES BACILLUS/CLOSTRIDIUM GROUP BACILLACEAE
OC   BACILLUS.
KW   PEPTIDOGLYCAN SYNTHESIS CELL WALL TRANSFERASE.
FT   ACT_SITE 116 116 BINDS PEP (BY SIMILARITY).
FT   CONFLICT 374 374 S -> A (IN REF. 3).
SQ   SEQUENCE 429 AA 46016 MW 02018C5C CRC32
     MEKLNIAGGD SLNGTVHISG AKNSAVALIP ATILANSEVT IEGLPEISDI ETLRDLLKEI
     GGNVHFENGE MVVDPTSMIS MPLPNGKVKK LRASYYLMGA MLGRFKQAVI GLPGGCHLGP
     RPIDQHIKGF EALGAEVTNE QGAIYLRAER LRGARIYLDV VSVGATINIM LAAVLAEGKT
     IIENAAKEPE IIDVATLLTS MGAKIKGAGT NVIRIDGVKE LHGCKHTIIP DRIEAGTFMI
4
Do we need standards?
  • There is no need for standards unless/until we
    start to integrate
  • In other words: I don't care about electrical
    plug sockets until I travel and wish to plug my
    laptop into the socket in my hotel

5
A historical essay: The Machine Screw
  • Principle discovered around 400 BC
  • Limited use until machine tools made mass
    production possible (18th cent.)
  • Every machine shop and foundry made unique sizes
    and thread dimensions
  • 1841: Joseph Whitworth presented 'The Uniform
    System of Screw-Threads' to Britain's Institution
    of Civil Engineers
  • 1864: William Sellers proposed 'On a Uniform
    System of Screw Threads' to the Franklin
    Institute, Philadelphia
  • Enabled interchangeable parts and tooling for
    mechanization and mass production
  • 1945: British and American standards merged

6
Point-to-point integration of data
  • The application includes a subprogram for each
    different data source
  • Operations on data must be processed by an
    application
  • Lots of coding effort
  • Fully dependent on the data resources

7
The load-in-database approach to integrating data
  • Data are loaded into the database
  • Data need filtering, cleaning, and transformation
  • Data must be refreshed
  • Scripts must be written
  • Refreshing data is time-consuming
  • Up-to-date data cannot be guaranteed
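The load-in-database steps listed above can be sketched roughly as follows. This is only an illustration under stated assumptions: the CSV export, the table name `measurements`, and the columns `sample_id` and `value` are hypothetical and do not come from the presentation. It shows the cleaning and filtering step, and why the local copy is only as fresh as the last load.

```python
import csv
import io
import sqlite3

# Hypothetical source export; column names are assumptions for illustration.
SOURCE_CSV = "sample_id,value\nA1, 4.2\nA2,\nA3,5.0\n"


def load(conn, text):
    """Reload the local copy from a CSV export, dropping rows with no value."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS measurements "
        "(sample_id TEXT PRIMARY KEY, value REAL)"
    )
    # Full refresh: discard the stale copy before loading again.
    conn.execute("DELETE FROM measurements")
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        raw = rec["value"].strip()  # cleaning: trim stray whitespace
        if raw:                     # filtering: skip missing values
            rows.append((rec["sample_id"].strip(), float(raw)))
    conn.executemany("INSERT INTO measurements VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load(conn, SOURCE_CSV)
```

Any change at the source after `load` runs is invisible until the script runs again, which is the refresh problem the slide points at.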

ODBC - JDBC
8

Federated data approach
  • Data stay untouched
  • Integrates heterogeneous local or remote data
    sources through wrappers
  • You just need to know what data should be
    available to whom, and how to access them
  • Makes all data look like one virtual database,
    hiding the data layer complexity
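The wrapper idea can be illustrated with a toy sketch. It assumes two hypothetical sources (an SQLite table and a CSV text) and hypothetical names (`SqliteWrapper`, `CsvWrapper`, `virtual_table`); none of these come from the presentation, and a real federated server would also push queries down to the sources rather than just concatenate rows.

```python
import csv
import io
import sqlite3


class SqliteWrapper:
    """Exposes rows of a relational table in the common record shape."""

    def __init__(self, conn, table):
        self.conn = conn
        self.table = table  # trusted, hard-coded name in this sketch

    def rows(self):
        cur = self.conn.execute(f"SELECT sample_id, value FROM {self.table}")
        for sample_id, value in cur:
            yield {"sample_id": sample_id, "value": value}


class CsvWrapper:
    """Exposes rows of a flat file (CSV text) in the same record shape."""

    def __init__(self, text):
        self.text = text

    def rows(self):
        for rec in csv.DictReader(io.StringIO(self.text)):
            yield {"sample_id": rec["sample_id"], "value": float(rec["value"])}


def virtual_table(wrappers):
    """Present all wrapped sources as one stream; the data stay where they are."""
    for wrapper in wrappers:
        yield from wrapper.rows()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE m (sample_id TEXT, value REAL)")
conn.execute("INSERT INTO m VALUES ('S1', 1.5)")
sources = [SqliteWrapper(conn, "m"), CsvWrapper("sample_id,value\nH1,2.5\n")]
federated = list(virtual_table(sources))
```

The caller only sees records with `sample_id` and `value`; whether a record came from a database or a flat file is hidden behind the wrapper.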

9

Federated data approach
  • Federated data technology
    • DiscoveryLink (DB2)
    • Distributed SQL (Oracle)
  • Major problems
    • Remote connection
    • Speed
    • Security

10

Federated data approach
  • Relational wrappers list
    • DB2 family
    • Informix (Informix Client SDK)
    • Oracle (SQL*Net or Net8 client)
    • MS SQL Server (ODBC driver version 3.0 or later)
    • Sybase (Sybase Open Client)
    • Teradata (Teradata CLI)
    • ODBC (ODBC driver version 3.x)
  • Non-relational wrappers list
    • Lotus Extended Search
    • Excel
    • Flat file (CSV format)
    • XML (1.0 specification)
    • OLE DB
    • BLAST
    • Documentum (Documentum client API)
    • Entrez (version 1.0)

11
Web vs. File Exchange
  • File Exchange
    • Files pushed and duplicated
    • Multiple data management systems
    • Configuration control issues
    • Sporadic communication
  • Web
    • Data pulled as needed (when and how much)
    • Access via a single data management source
    • Continuous communication

12
Ontologies
  • Controlled vocabulary
    • Only one controlled term is used for a given
      concept
  • Data model
    • Data structuring mechanism in which an ontology
      is expressed
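A controlled vocabulary can be sketched as a lookup that maps every accepted variant onto exactly one term. The terms and synonyms below are hypothetical examples chosen for illustration, not vocabulary from the presentation.

```python
# Hypothetical controlled terms and synonyms, for illustration only.
CONTROLLED_TERMS = {"body mass index", "systolic blood pressure"}
SYNONYMS = {
    "bmi": "body mass index",
    "body-mass index": "body mass index",
    "sbp": "systolic blood pressure",
}


def to_controlled_term(term):
    """Return the single controlled term for a concept, or raise KeyError."""
    key = " ".join(term.lower().split())  # normalize case and whitespace
    if key in CONTROLLED_TERMS:
        return key
    if key in SYNONYMS:
        return SYNONYMS[key]
    raise KeyError(f"no controlled term for {term!r}")
```

Rejecting unknown terms, rather than passing them through, is one way to keep uncontrolled variants from entering shared data.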

13
  • Database completion
    • A common, secure database established in
      Europe for all relevant scientific information
      in GenomEUtwin
  • First ten months
    • A database structure established
  • Shared approaches to
  • Infrastructure
  • Core facilities
  • Bio-informatics
  • Harmonized methods

14
  • EUid number (EUIDNUM): 752000021210
  • The EUid number consists of four parts
    • Country code: 3 digits (ISO 3166)
    • Randomized number: 7 digits
    • Identification number: 1 digit
    • Check sum: 1 digit
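The four-part EUid layout described on this slide (3 + 7 + 1 + 1 digits) can be sketched as a simple splitter. The class and function names are hypothetical. The slides do not say how the check sum is computed, so this sketch only separates the fields and does not validate the check digit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EUId:
    country_code: str    # 3 digits, ISO 3166 numeric code
    random_number: str   # 7 digits
    identification: str  # 1 digit
    check_digit: str     # 1 digit; algorithm not given in the slides


def parse_euid(euid):
    """Split a 12-digit EUid number into its four parts."""
    if len(euid) != 12 or not (euid.isascii() and euid.isdigit()):
        raise ValueError("an EUid number must be exactly 12 digits")
    return EUId(euid[0:3], euid[3:10], euid[10], euid[11])


parsed = parse_euid("752000021210")
```

For the example number on the slide, the country code part is 752, which is the ISO 3166 numeric code for Sweden.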

15

16

17

18
Key requirements
  • 1. Genotype and phenotype data are kept in
    separate databases
  • 2. Phenotype data must be under the full control
    of the national centers
    • No common data repository for the phenotype data
    • Study units can access the data only upon
      agreement
  • 3. Anonymous genotype data are less sensitive and
    can be collected into one repository
    • Only EUTWINIDs can be used (no local person,
      sample, or family identifiers)
    • Limited access
      • National centers can see their own data
      • Study units can access the data as a whole
  • 4. Secure database infrastructure
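The access rule for the shared genotype repository in requirement 3 can be sketched as a filter. The role names, record fields, and the second identifier below are hypothetical, chosen only to illustrate "national centers see their own data, study units see the whole"; the presentation does not describe how access control is actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenotypeRecord:
    eutwinid: str  # only EUTWINIDs, no local person/sample/family identifiers
    center: str    # national center that contributed the record
    genotype: str


@dataclass(frozen=True)
class User:
    role: str         # hypothetical roles: "national_center" or "study_unit"
    center: str = ""  # set only for national-center users


def visible_genotypes(user, records):
    """Return the genotype records this user is allowed to read."""
    if user.role == "study_unit":
        return list(records)  # study units access the data as a whole
    if user.role == "national_center":
        return [r for r in records if r.center == user.center]
    return []  # unknown roles get nothing


records = [
    GenotypeRecord("752000021210", "Stockholm", "AA"),
    GenotypeRecord("246000000000", "Helsinki", "AG"),  # made-up identifier
]
stockholm_view = visible_genotypes(User("national_center", "Stockholm"), records)
study_view = visible_genotypes(User("study_unit"), records)
```

Denying by default for unrecognized roles keeps the sketch consistent with the "limited access" requirement.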

19
[Figure: architecture diagram. Labels: Phenotypes; Genotypes; Distributed SQL; National centers (Stockholm, Helsinki, Uppsala); Genotypes/institution-specific slice; GT; LIMS / Instrument databases; Access control; Samples and data; Samples and sample data; Tracking info]
20
GenomEUtwin NET
SQL
VIRTUAL PRIVATE NETWORK
21

22
What we learned, so far
  • Database core, epi-core, stat-core, and
    ethic-core must work hand in hand
  • Too many of those involved don't know what a
    database is or why we are using it
  • Too many are still thinking flat-file
  • Federated database
    • Must apply for the scholar program from IBM;
      otherwise a full-time skilled DBA is needed at
      each center, working closely with the epi- and
      stat-cores
  • Security
    • VPN vs SSH

23
jan-eric.litton_at_meb.ki.se