Title: Computational Challenges in CMB Data Analysis
1 Computational Challenges in CMB Data Analysis
- Paolo Natoli
- Dipartimento di Fisica
- Università Tor Vergata
- Credits: M. Botti, F. Massaioli (Caspur); G. de Gasperis, A. Traficante (Tor Vergata)
2 The Cosmic Microwave Background
- The CMB is an EM relic from the hot Big Bang
- Its anisotropy traces the evolution of the LSS of the Universe
- Target of many high-precision experiments in the last two decades
- O(10) balloon-borne and ground-based experiments (BOOMERanG, CBI, DASI, ...)
- Two flown NASA satellites: COBE (1989) and WMAP (2001)
- The ultimate CMB satellite, Planck (ESA), is just around the corner...
- It is widely regarded as a goldmine of cosmological information
3 Increasing complexity
COBE, early 1990s, 10 K pixels
WMAP, early 2000s, 1 M pixels
Planck, launch 2008, > 10 M pixels
4 Why is data analysis challenging?
- Large Problem
- Many Gb/channel/day
- 10^5 to 10^7 pixels
- Detectors inject noise
- Noise is often correlated (long memory, or 1/f)
- Systematics (non-statistical noise) play an increasingly important role in high-accuracy experiments.
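To make the "long memory, or 1/f" remark concrete, here is a minimal sketch that draws correlated noise by shaping white noise in Fourier space. The spectral shape 1 + (f_knee / f)^alpha, the function name and all parameter values are assumptions chosen for illustration; they are not taken from the slides.

```python
import numpy as np

def one_over_f_noise(n_samples, f_knee, alpha=1.0, sigma=1.0, rng=None):
    """Draw noise with PSD proportional to 1 + (f_knee / f)**alpha.

    Frequencies are in cycles per sample; the f = 0 mode is set to zero.
    """
    rng = np.random.default_rng() if rng is None else rng
    freq = np.fft.rfftfreq(n_samples)
    # Amplitude filter: square root of the assumed PSD shape, zero at f = 0.
    shape = np.zeros_like(freq)
    shape[1:] = np.sqrt(1.0 + (f_knee / freq[1:]) ** alpha)
    white = rng.normal(scale=sigma, size=n_samples)
    return np.fft.irfft(np.fft.rfft(white) * shape, n=n_samples)

# Hypothetical example: 16384 samples with a knee at 0.01 cycles per sample.
noise = one_over_f_noise(2**14, f_knee=0.01, rng=np.random.default_rng(0))
```

Below the knee frequency the power rises steeply, which is why slow drifts dominate a long timeline unless the scan modulates the sky signal to higher frequencies.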
5 Size of dataset is an issue
- Need to observe large fractions of the sky to decrease sampling variance
- Need to observe down to a resolution of a few arcmin, to gather most cosmological information
Maps with many resolution elements (pixels), typically in the million range
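As a rough check of the "million range" figure, the number of resolution elements can be estimated by dividing the observed solid angle by the area of one pixel. The function name and the square-pixel approximation below are assumptions made for illustration only.

```python
import math

def approx_pixel_count(resolution_arcmin, sky_fraction=1.0):
    """Rough number of square pixels of the given side covering a sky fraction."""
    side_rad = math.radians(resolution_arcmin / 60.0)
    full_sky_sr = 4.0 * math.pi          # solid angle of the whole sphere
    return sky_fraction * full_sky_sr / side_rad**2

for res in (5.0, 60.0):
    print(f"{res:5.1f} arcmin -> {approx_pixel_count(res):.2e} pixels")
```

Under this approximation a full-sky map at 5 arcmin needs about 6 million pixels, consistent with the order of magnitude quoted on the slide.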
6 But noise is the true enemy!
- Good CMB detectors have noise sensitivities in the range 100 to 300
- Experiments must spin (or chop) quickly to modulate the signal (esp. low modes; the dipole is often used for calibration) into the frequency range where amplifiers are stable enough.
- The cosmological signal is largely subdominant in the timestream, so one has to resort to non-trivial statistical techniques to dig it out
7 Pixel Space Map Making
- Go from time-ordered data to pixel values
- Do it (reasonably) quickly, without losing (too much) information
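The simplest way to go from time-ordered data to pixel values is to average all samples that fall in the same pixel. The sketch below shows this naive binning, which is only optimal for white noise and is not the method discussed in the rest of the talk; the function name and the toy data are hypothetical.

```python
import numpy as np

def binned_map(tod, pixel_index, n_pix):
    """Average all time samples falling in each pixel (naive, white-noise map)."""
    signal_sum = np.bincount(pixel_index, weights=tod, minlength=n_pix)
    hits = np.bincount(pixel_index, minlength=n_pix)
    sky_map = np.full(n_pix, np.nan)     # unobserved pixels stay NaN
    observed = hits > 0
    sky_map[observed] = signal_sum[observed] / hits[observed]
    return sky_map, hits

# Toy example: 16 pixels, random pointing, white noise only.
rng = np.random.default_rng(0)
n_pix, n_samples = 16, 10_000
true_sky = rng.normal(size=n_pix)
pixel_index = rng.integers(0, n_pix, size=n_samples)
tod = true_sky[pixel_index] + 0.1 * rng.normal(size=n_samples)
sky_map, hits = binned_map(tod, pixel_index, n_pix)
```

With correlated noise this estimator leaves stripes along the scan direction, which motivates the more careful treatment of the noise.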
8 A simple data model
sky
noise
pointing (sky reading)
signal
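The equation on this slide did not survive extraction; only its four labels remain. They are consistent with the standard linear data model of CMB map making, written here as an assumption (the symbols d, P, m and n are not taken from the slide):

```latex
d_t = \sum_{p} P_{tp}\, m_p + n_t
```

Here d_t is the signal in time sample t, m_p is the sky in pixel p, P_{tp} is the pointing matrix (1 if sample t reads pixel p, 0 otherwise, for an unpolarized experiment) and n_t is the noise.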
9 A trivial solution, in principle
But one that is computationally unfeasible!
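The formula itself is missing from the extracted slide. Assuming a linear data model d = P m + n (timeline d, pointing matrix P, sky map m, noise n) with noise covariance N = <n n^T>, the textbook generalized least squares (GLS) estimate would read:

```latex
\hat{m} = \left( P^{T} N^{-1} P \right)^{-1} P^{T} N^{-1} d
```

For a timeline of N_t samples, N is an N_t x N_t matrix. With 10^8 to 10^9 samples it can be neither stored nor inverted directly, which is presumably the sense in which the solution is unfeasible.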
10 A workable solution
- First implementation, aimed at Planck
- Natoli, de Gasperis, Gheller & Vittorio, 2001, A&A
- Tested at CINECA
- Polarized ROMA
- de Gasperis, Natoli et al., 2005, A&A
- Tested at Caspur/NERSC
- These activities evolved into the ROMA (ROMA Optimal Map Making Algorithm) code, a suite of codes aimed at polarized CMB map making and noise estimation
11 ROMA
- A collaboration with Caspur began in 2005.
- This resulted in an almost complete rewrite of the code
- Written in Fortran 95; uses MPI for message passing, cfitsio for I/O and FFTW3 to perform fast convolutions
- Implemented on a number of architectures
- SGI Origin, IBM SP3/SP5, Linux clusters, up to > 1000 PEs
- At several supercomputing centers
- CINECA, NERSC/LBL, Caspur, LFI DPC
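To illustrate why fast convolutions matter for map making, the sketch below solves the GLS normal equations (P^T N^-1 P) m = P^T N^-1 d iteratively, applying N^-1 as a Fourier-space filter instead of a dense matrix. This is not the ROMA implementation: the conjugate gradient solver, the assumption of stationary (circulant) noise, the serial NumPy code and all names and parameter values are assumptions made for illustration.

```python
import numpy as np

def apply_inverse_noise(x, inv_psd):
    """Apply N^-1 to a timeline, assuming stationary (circulant) noise."""
    return np.fft.irfft(np.fft.rfft(x) * inv_psd, n=x.size)

def gls_map_cg(tod, pix, n_pix, inv_psd, n_iter=200, tol=1e-16):
    """Solve (P^T N^-1 P) m = P^T N^-1 d by plain conjugate gradient."""
    def project(x):                      # P^T x: sum samples into pixels
        return np.bincount(pix, weights=x, minlength=n_pix)

    def lhs(m):                          # P^T N^-1 P m, never formed as a matrix
        return project(apply_inverse_noise(m[pix], inv_psd))

    b = project(apply_inverse_noise(tod, inv_psd))
    m = np.zeros(n_pix)
    r = b - lhs(m)
    p = r.copy()
    rr = r @ r
    b_norm = b @ b
    for _ in range(n_iter):
        if rr <= tol * b_norm:           # squared residual small enough
            break
        Ap = lhs(p)
        alpha = rr / (p @ Ap)
        m += alpha * p
        r -= alpha * Ap
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
    return m

# Toy setup: random pointing and a hypothetical 1/f-like noise spectrum.
rng = np.random.default_rng(1)
n_pix, n_samples = 32, 4096
true_sky = rng.normal(size=n_pix)
pix = rng.integers(0, n_pix, size=n_samples)
freq = np.fft.rfftfreq(n_samples)                        # cycles per sample
f_knee = 0.01                                            # hypothetical knee
psd = 1.0 + f_knee / np.maximum(freq, 1.0 / n_samples)   # finite at f = 0
inv_psd = 1.0 / psd
noise_free_map = gls_map_cg(true_sky[pix], pix, n_pix, inv_psd)
```

Each iteration costs two FFTs of the timeline plus the pointing operations, so memory and time grow with the number of samples rather than with its square.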
12 Scalings 1/2
13 Scalings 2/2
Plot labels: 100 Msamples; 1.9 GHz Pwr5; 2.4 GHz Opteron
14 The Hybrid code
- ROMA requires a substantial number of PEs to work on a year-long dataset, mostly because of the need to store the timeline in memory
- However, long-term correlations are weak and they mostly pollute the common mode (monopole) of a map
- Hence the Hybrid code (Traficante et al., in preparation)
- Divide the dataset into many submaps, each of which is processed by ROMA
- Estimate the common modes and subtract them from the maps
- Co-add the maps, weighting by noise
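The last two steps of the list above can be sketched as follows. The slides do not say how the Hybrid code estimates the common modes; the plain per-submap mean used here, the function name and the toy data are assumptions for illustration, and the submaps are taken as given rather than produced by ROMA.

```python
import numpy as np

def coadd_submaps(submaps, variances):
    """Remove each submap's mean (common mode) and co-add with 1/variance weights.

    submaps, variances: arrays of shape (n_submaps, n_pix); unobserved pixels
    are marked with variance = np.inf.
    """
    submaps = np.asarray(submaps, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)   # inf variance -> weight 0
    observed = weights > 0
    cleaned = np.zeros_like(submaps)
    for k in range(submaps.shape[0]):
        # Assumed common-mode estimate: mean over the pixels this submap observed.
        common_mode = submaps[k, observed[k]].mean()
        cleaned[k, observed[k]] = submaps[k, observed[k]] - common_mode
    total_weight = weights.sum(axis=0)
    coadded = np.full(submaps.shape[1], np.nan)          # never-observed pixels stay NaN
    seen = total_weight > 0
    coadded[seen] = (weights * cleaned).sum(axis=0)[seen] / total_weight[seen]
    return coadded

# Toy example: three noise-free submaps of the same sky with different offsets.
rng = np.random.default_rng(2)
true_sky = rng.normal(size=8)
offsets = np.array([3.0, -1.5, 0.7])                     # hypothetical common modes
submaps = true_sky[None, :] + offsets[:, None]
variances = np.array([1.0, 2.0, 4.0])[:, None] * np.ones((3, 8))
coadded = coadd_submaps(submaps, variances)
```

In this toy case the co-added map equals the true sky minus its own mean, since the monopole is exactly what is discarded. When submaps cover different sky regions a per-submap mean would also remove real signal, so a realistic estimator would need to use the overlaps between submaps.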
15 Planck 30 GHz dataset (4 Gsamples)