Title: Slide template for use in presentation
1 Catalogue Biolink NL Catalogue 2.0
2 Deliverables of Biolink NL
- Improve the access to biomaterials and data by setting up
  - A national infrastructure for linkage of biobanks to registries (including technical, governance, and ethical/legal aspects).
  - A web-based portal including procedures for linkage and a central catalogue of linked biobanks/registries that can be searched for specific patient populations. The catalogue should contain a minimal harmonized dataset and also informed consent information.
3 What level of catalogue do you mean?
Biolink NL
- Level 1: List of Biobank Cohorts
  - available biobank sample collections with a minimal set of properties such as size, data topics and material types.
- Level 2: Data item dictionary
  - information on data items available in each cohort such as questionnaire questions, lab measurements, diagnoses or GWAS
- Level 3: Group level information (chunked)
  - aggregate information such as number of available samples for high blood pressure cases.
- Level 4: Individual level data per Item
  - complete set of (anonymised) data in a biobank, for which most biobanks will have an in-house database.
- Level 5: Identifiable data per Individual
  - services to link to data from other sources and registries such as the national cancer registry, through a pseudonymization service.
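As an illustration of what a level 1 entry and a search over it could look like, here is a minimal Python sketch. The class name `CohortEntry`, its fields and the `search` function are assumptions made for this example; they are not the Biolink NL data model.

```python
from dataclasses import dataclass, field


@dataclass
class CohortEntry:
    """Hypothetical level 1 record: one biobank cohort with minimal properties."""
    name: str
    size: int
    data_topics: set = field(default_factory=set)
    material_types: set = field(default_factory=set)


def search(catalogue, topic=None, material=None, min_size=0):
    """Return the names of cohorts that satisfy all given criteria."""
    return [
        c.name
        for c in catalogue
        if c.size >= min_size
        and (topic is None or topic in c.data_topics)
        and (material is None or material in c.material_types)
    ]
```

A call such as `search(catalogue, topic="blood pressure", material="DNA")` would list the cohorts a researcher could approach next; levels 2 to 5 would then add progressively more detail.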
4 Which catalogues and why
5 Catalogue level 5 feasibilities
[Diagram: subject-level catalogue. Labels: Preconditions; Requirements; Demands bio-/databanks; Privacy; Legal demands; Patient/data selection; Valid results; Flexibility/usability; Data linkage; Semantic integration; Up-to-date]
6 What will the catalogues look like?
Level Catalogue content External access Privacy demands Governance Frequent updates Remarks
2 Contents/linkage possibilities Y 0 ?
3 Key numbers Y Fed by catalogue V
5 Linked subject level data N Few variables per subject 2) Version 1 Mondriaan infrastructure
7 Getting started with RP Catalogue 2.0
- Who does what
  - Programmer (Bart, 1.0 fte)
  - Coordinator and biobank liaison (David, 0.5 fte)
  - Standardization/building blocks, workflow, power user (Linda, 0.3 fte)
  - RP5 / Mondriaan integration and policies (Willem, 0.2 fte)
  - Integration PSI, CTMM, ....
- What data to use
  - What biobanks are willing to share data in public domain
  - Ask biobanks that are in BBMRI-NL catalogue (David/Morris)
  - Ask studies in CTMM (Linda/David)
  - Ask PSI board (Linda, ...)
  - Other sources? (e.g. BBMRI-SE)
- What data items to share
8 First concrete actions
- Onboarding team (David, Willem)
  - Willem contacts the biobanks
  - Assistance in selecting and mapping data items
  - Procedure for level 2, 3, 4 and 5 data transfer
  - Technical facilitation of the import (technician to technician)
- User interaction design team (Morris, Chantal?)
  - Alternative designs (wizard vs filtering)
  - Access restrictions and anonymity
- Harmonisation team (Linda)
  - Agree on data items and their coding / formats
  - Mapping of variables (protocols and features) between sources
- Governance and sustainability team (Willem)
  - Business case for the catalogue
  - What information may or may not be shared
  - Agreements on exchange / collaboration
  - Open source? Project place (NFU sponsor?)
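The harmonisation task above (agreeing on data items and codings, then mapping variables between sources) could be expressed as a simple lookup table. The sketch below is illustrative only: the source names `biobank_a` and `biobank_b`, the variable names and the code lists are invented for this example.

```python
# Hypothetical mapping: source variable -> (harmonised name, code translation or None).
MAPPINGS = {
    "biobank_a": {
        "sex": ("sex", {"M": "male", "F": "female"}),
        "sbp_mmhg": ("systolic_bp", None),
    },
    "biobank_b": {
        "geslacht": ("sex", {"1": "male", "2": "female"}),
        "syst_bloeddruk": ("systolic_bp", None),
    },
}


def harmonise(source, record):
    """Translate one source record into the harmonised names and codes."""
    out = {}
    for name, value in record.items():
        if name not in MAPPINGS[source]:
            continue  # items without an agreed mapping are not shared
        target, codes = MAPPINGS[source][name]
        out[target] = codes[value] if codes else value
    return out
```

Keeping the mapping as data rather than code means a biobank liaison can review and extend it without touching the import logic.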
9 Interest
- Pilot projects
- Loading Mondriaan level 3
- Loading CTMM level 3
10 Participation in level 3 and 5
[Diagram labels: Biobank; periodic export of pseudonymised microdata; Level 5 catalogue; aggregation; Level 3 catalogue]
11 Participation in level 3 only
[Diagram labels: Biobank; DB; aggregation; Level 3 catalogue]
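The aggregation step in the two participation diagrams could look roughly like the Python sketch below: individual records stay where they are, and only counts per value are passed on to the level 3 catalogue. The function name `aggregate_level3` and the record layout are assumptions for illustration.

```python
from collections import Counter


def aggregate_level3(records, variables):
    """Count how many records carry each value of the chosen variables.

    Only these counts would be published in a level 3 catalogue;
    the individual-level records are not part of the result.
    """
    counts = {}
    for var in variables:
        counts[var] = dict(Counter(rec[var] for rec in records if var in rec))
    return counts
```

With level 3 only participation this would run inside the biobank on its own database; with level 3 and 5 participation the same aggregation could run on the pseudonymised microdata held in the level 5 catalogue.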
12 Challenges
- Who manages the catalogue
- Collaboration and deployment of Biolink NL resources in Catalogue 2.0
- Setting up an efficient onboarding procedure
- Building up knowledge of the data
- Minimising the impact for biobanks
- Longer term
- Sustainability
- Collaboration, embedding in NFU/BBMRI
13 Privacy level 5
Compliant with e.g. WBP and case law CBP
14 Governance: result of 2 years of Mondriaan experience with catalogue levels 3 and 5
- Sources set the limits
- Scientific use only
- Independent, non-profit governance
- Different sources have different limits
- Sometimes level V is less scary than level III
- Control ownership
- What's in it for me?
- Catalogue design consequences
- Level 3 catalogue level vs source level
- Level 5
- No public access,
- No release of data without approval by a source
- Release of answers.
- Industry
15 Mondriaan catalogue (level 5)
- In production September 2010
- Access: approx. 10 persons (UMCU/UMCG/RUG)
- Semantic integration
- Compliant with privacy laws
- Established procedures on data handling and ownership
- Data available in catalogue
  - GP data, 0.5 M subjects (Utrecht, Almere)
  - EPIC-NL (40K subjects)
  - IADB (0.5 M subjects, Groningen)
  - Psychiatric data, province of Utrecht
16 Mondriaan catalogue (level 5)
- Data expected 2012-13
  - SFK public pharmacy data of 14 M subjects
  - Achmea (5 M subjects)
  - UPOD
  - Smart
  - VUmc GP network
  - LifeLines
  - LINH
17 Mondriaan Meta-catalogue
[Diagram labels: Researcher; Mondriaan; City/region; Catalogue; Data exchange HDR-Mondriaan; Data delivery. Sources shown: Groningen, Sneek, Zwolle, Almere, VUmc, Twente, Leiden, EPIC, Nijmegen, SFK, Zeeland, Ehoven, S-Limburg]
18 Process: data pseudonymisation/anonymisation and data upload to Mondriaan
The TTP takes care of the pseudonymisation and linkage of data.
1) Name and address (NAW) data are encrypted at the source and sent to the TTP. The medical data are not sent to the TTP.
2) The TTP sends source pseudonyms to the client. Linkage is not yet possible with these pseudonyms, for privacy reasons.
3) The client sends only the source pseudonyms and the medical data to Mondriaan.
[Diagram labels: Encrypted NAW; Source pseudonym; DB; DB; Pseudonymised (medical) data from the source]
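The three numbered steps can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the actual TTP implementation: the slide says the NAW data are encrypted at the source, and a keyed hash (HMAC-SHA256) is used here as a stand-in. All function names, the keys and the 16-character pseudonym length are hypothetical.

```python
import hashlib
import hmac


def encrypt_naw_at_source(naw: str, source_key: bytes) -> str:
    """Step 1: keyed hash of the name/address data, computed at the source."""
    normalised = " ".join(naw.lower().split())
    return hmac.new(source_key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def ttp_source_pseudonym(encrypted_naw: str, source_id: str, ttp_key: bytes) -> str:
    """Step 2: the TTP derives a pseudonym that is specific to one source."""
    message = f"{source_id}:{encrypted_naw}".encode("utf-8")
    return hmac.new(ttp_key, message, hashlib.sha256).hexdigest()[:16]


def build_upload(records, source_id, source_key, ttp_key):
    """Step 3: the client uploads source pseudonym plus medical data, never NAW."""
    upload = []
    for rec in records:
        token = encrypt_naw_at_source(rec["naw"], source_key)
        # In practice this call would happen at the TTP, which never sees rec["medical"].
        pseudonym = ttp_source_pseudonym(token, source_id, ttp_key)
        upload.append({"pseudonym": pseudonym, "medical": rec["medical"]})
    return upload
```

Because the source identifier is part of the pseudonym in this sketch, the same person gets different pseudonyms in different sources, which mirrors the remark that linkage is not yet possible with source pseudonyms.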
19 Catalogue on patient level: be careful!
20 Linkage
- Custodix
  - Probabilistic linkage
  - 3 steps
    - Normalisation of input data, hashing
    - Blocking
    - Matching
  - Matching: combination of Fellegi-Sunter algorithm(s) and scriptable matchers
  - Matches / non-matches
  - Calibration
- ZorgTTP
  - Does normalise and hash the input data
  - No linkage, but uniformly encrypted identifiers or combinations of them
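The three linkage steps (normalisation and hashing, blocking, matching) can be illustrated with a small Python sketch of Fellegi-Sunter style scoring on hashed fields. It is not the Custodix or ZorgTTP implementation: the field names, the m/u probabilities, the salt and the threshold are invented values, and in practice they would come out of the calibration step.

```python
import hashlib
import math
from collections import defaultdict

FIELDS = ("surname", "birth_date", "postcode")
# Hypothetical probabilities that a field agrees for true matches (M) and for non-matches (U).
M = {"surname": 0.95, "birth_date": 0.98, "postcode": 0.90}
U = {"surname": 0.01, "birth_date": 0.001, "postcode": 0.005}


def normalise_and_hash(record, salt):
    """Step 1: strip case and whitespace, then hash every field with a shared salt."""
    out = {}
    for f in FIELDS:
        value = "".join(record[f].lower().split())
        out[f] = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return out


def block(records, key):
    """Step 2: group record indices by one hashed field to limit comparisons."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[rec[key]].append(idx)
    return blocks


def match_weight(a, b):
    """Step 3: sum of log2 agreement/disagreement weights over all fields."""
    weight = 0.0
    for f in FIELDS:
        if a[f] == b[f]:
            weight += math.log2(M[f] / U[f])
        else:
            weight += math.log2((1 - M[f]) / (1 - U[f]))
    return weight


def link(left, right, salt, block_key="birth_date", threshold=5.0):
    """Return index pairs (left, right) whose weight reaches the threshold."""
    hashed_left = [normalise_and_hash(r, salt) for r in left]
    hashed_right = [normalise_and_hash(r, salt) for r in right]
    right_blocks = block(hashed_right, block_key)
    matches = []
    for i, a in enumerate(hashed_left):
        for j in right_blocks.get(a[block_key], []):
            if match_weight(a, hashed_right[j]) >= threshold:
                matches.append((i, j))
    return matches
```

Because the fields are hashed, only exact agreement after normalisation can be compared here; pairs below the threshold are treated as non-matches, and moving the threshold trades missed links against false links.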