Title: Glenn Carver
1. p-TOMCAT training course
- Glenn Carver
- Centre for Atmospheric Science,
- University of Cambridge
2. What you will hear about on this course
- p-TOMCAT
- How to run it
- How it works in parallel
- How to make code changes
- ASAD
- How it works
- How to make changes
3. p-TOMCAT basics
- In this section you will hear about
- What kind of model p-TOMCAT is
4. Basics: History lesson
- The parallel p-TOMCAT model is derived from TOMCAT, the tropospheric offline chemical transport model (CTM)
- TOMCAT was developed over recent years by a number of people at the Centre for Atmospheric Science at the University of Cambridge
- TOMCAT itself was originally derived from SLIMCAT, the stratospheric offline CTM, originally developed by Martyn Chipperfield (now at Leeds).
5. Basics: What is a chemical transport model?
- It's a global three-dimensional model
- A CTM takes wind, temperature and humidity fields as input and transports tracers around the globe
- Wind and temperature input may be from meteorological analyses or from another model
- Global circulation models (e.g. Unified Model) are quite different: these compute their own wind and temperature fields
- CTMs are very good for comparisons with observations
- They are not so good for coupled chemistry-climate research
6. Basics: Why parallel TOMCAT?
- High Performance Computing assumes many processors
- Had no choice but to parallelize TOMCAT
- p-TOMCAT is able to use multiple processors to
- Reduce execution time for long integrations
- Run at higher resolutions horizontally and vertically
- We can also use the increase in compute power to increase the complexity of the model, e.g. addition of more chemistry, better numerical schemes
7. Basics: What's in p-TOMCAT
- p-TOMCAT is made up of different modules
- Each module represents a physical process
- p-TOMCAT comprises about 40,000 lines of Fortran
[Figure: the p-TOMCAT modules: Clouds, Transport, Chemistry, Aircraft, Lightning, Emissions, PBL, Strat, ?]
8. Basics: More information
- p-TOMCAT website: http://www.atm.ch.cam.ac.uk/tomcat/
- Includes
- Job scripts
- Documentation
- Test results
- Performance information
9. p-TOMCAT grid
- In this section you'll learn more about
- The grid the model uses
- The parallelization of the model
10. p-TOMCAT resolutions
- Uses Gaussian grids (from spectral models)
- Longitudes are regular; latitudes are nearly so
- Typical resolutions are
- T21: 64 lon x 32 lat (5.6 degrees)
- T42: 128 lon x 64 lat (2.8 degrees)
- T106: 320 lon x 160 lat (1.1 degrees)
- NLON = (3M + 1 or 2), NLAT = NLON / 2
- TOMCAT is not a spectral model so you should not say you are running the model at T21, T42 etc. Use the grid resolution
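The relation between truncation and grid size can be checked with a few lines of Fortran. This is only a sketch using the three resolutions quoted on this slide; the program and variable names are made up for illustration and are not part of p-TOMCAT.

```fortran
program gaussian_grid_sizes
  implicit none
  ! Truncations M and longitude counts NLON as quoted on the slide.
  integer, parameter :: ntrunc = 3
  integer, parameter :: m(ntrunc)    = (/ 21, 42, 106 /)
  integer, parameter :: nlon(ntrunc) = (/ 64, 128, 320 /)
  integer :: n, nlat
  real    :: dlon

  do n = 1, ntrunc
     nlat = nlon(n) / 2             ! NLAT = NLON / 2
     dlon = 360.0 / real(nlon(n))   ! longitude spacing in degrees
     print '(a,i0,a,i0,a,i0,a,f6.3,a,i0)', 'T', m(n), ': ', nlon(n), &
          ' lon x ', nlat, ' lat, spacing ', dlon, &
          ' deg, NLON - 3M = ', nlon(n) - 3*m(n)
  end do
end program gaussian_grid_sizes
```

The last column shows whether a given resolution corresponds to the "+ 1" or the "+ 2" case.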
11. How does it work in parallel?
- There are several paradigms for parallel computing
- p-TOMCAT uses domain decomposition
- Each cpu runs a copy of the model
- Each cpu works on a lat-lon portion of the globe
- But! More processors does not always imply more
performance
12. Domain decomposition
- Each processor gets same size patch
[Figure: the globe as a Longitude (LON) by Latitude (LAT) rectangle divided into equal patches of nlonmx by nlatmx points]
13. Domain decomposition
- Total number of grid points = LON x LAT x NIV (mod_paradi.f90)
- Each processor has NLONMX x NLATMX x NIV
- Where NLONMX = LON / NPROCI
- And NLATMX = LAT / NPROCK
- NPROCI is no. of cpus along longitudes, NPROCK along latitudes (mod_slcons.f90)
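As a worked example of these formulae, the sketch below computes the patch size for one decomposition. The grid size, number of levels and processor layout are illustrative values chosen here, and the divisibility check is an assumption that follows from every processor getting the same size patch; it is not code from the model.

```fortran
program patch_sizes
  implicit none
  ! Illustrative values only: a 128 x 64 grid, 31 levels, 4 x 4 processors.
  integer, parameter :: lon = 128, lat = 64, niv = 31
  integer, parameter :: nproci = 4, nprock = 4
  integer :: nlonmx, nlatmx

  ! Equal-size patches need the grid to divide exactly between processors.
  if (mod(lon, nproci) /= 0 .or. mod(lat, nprock) /= 0) then
     print *, 'grid does not divide evenly between processors'
     stop
  end if

  nlonmx = lon / nproci     ! NLONMX = LON / NPROCI
  nlatmx = lat / nprock     ! NLATMX = LAT / NPROCK

  print '(a,i0)', 'processors used         : ', nproci * nprock
  print '(a,i0,a,i0,a,i0)', 'patch per processor     : ', &
       nlonmx, ' x ', nlatmx, ' x ', niv
  print '(a,i0)', 'grid points per patch   : ', nlonmx * nlatmx * niv
  print '(a,i0)', 'grid points in the globe: ', lon * lat * niv
end program patch_sizes
```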
14. Domain decomposition
- Problem comes when code requires values from other processors' gridpoints
- E.g. U(i+1) - U(i-1)
- A processor must communicate with its neighbour
[Figure: two neighbouring patches, CPU 1 and CPU 2]
15. Halos
- We deal with this by creating halos rather than communicating whenever it would be needed in the code
- A halo is a copy of data; it is not computed by the processor
[Figure: patches of CPU 1 and CPU 2, each with a halo copied from its neighbour]
16. Halos
- Halos are exchanged when interior points are needed by code
- Different halos are required by different algorithms
- Advection code needs N,S,E,W; special treatment at the poles
- Flight track code needs N,NE,E,SE,S,SW,W,NW
- Tropopause calc needs N,S,E,W
- However, halos introduce communication costs and should be avoided or minimised as much as possible
17. Halo array example
- An array with a halo might be declared ozone(0:nlonmx+1,0:nlatmx+1)
- But! It is always the case that the processor computes ozone(1:nlonmx,1:nlatmx)
- The halos are copies and only for reading
- cpu1: ozone(nlonmx+1,:) = cpu2: ozone(1,:)
[Figure: patches of CPU 1 and CPU 2, each with a halo copied from its neighbour]
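The halo idea can be tried out without any message passing. The sketch below is a serial stand-in with two small patches held in one program; the names cpu1, cpu2 and grad are invented here, and the plain array copy takes the place of the communication that a real parallel run would need.

```fortran
program halo_demo
  implicit none
  integer, parameter :: nlonmx = 4, nlatmx = 2
  ! Two patches side by side in longitude, each with a one-point halo.
  real :: cpu1(0:nlonmx+1, 0:nlatmx+1), cpu2(0:nlonmx+1, 0:nlatmx+1)
  real :: grad(nlonmx, nlatmx)
  integer :: i, j

  cpu1 = 0.0
  cpu2 = 0.0
  grad = 0.0

  ! Each "processor" fills only its own interior points.
  do j = 1, nlatmx
     do i = 1, nlonmx
        cpu1(i, j) = real(i)            ! global longitudes 1..4
        cpu2(i, j) = real(i + nlonmx)   ! global longitudes 5..8
     end do
  end do

  ! Halo exchange (a plain copy here, message passing in a parallel run).
  cpu1(nlonmx+1, :) = cpu2(1, :)        ! east halo of cpu1 <- first column of cpu2
  cpu2(0, :)        = cpu1(nlonmx, :)   ! west halo of cpu2 <- last column of cpu1

  ! U(i+1) - U(i-1): at i = nlonmx the value at i+1 is read from the halo.
  ! The loop starts at 2 because the west halo of cpu1 is not filled here.
  do j = 1, nlatmx
     do i = 2, nlonmx
        grad(i, j) = cpu1(i+1, j) - cpu1(i-1, j)
     end do
  end do

  print *, 'difference at i = nlonmx on cpu1:', grad(nlonmx, 1)
end program halo_demo
```

Because the field increases by one per gridpoint, the printed difference is 2.0, the same value the processor would get if it owned the neighbouring point itself.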
18. Advection halo arrays
- Advection scheme halos are more complicated
- Arrays that need halos are declared as e.g. SM(nimn:nimx,nkmn:nkmx,niv) where nimn = -nhalmx + 1, nimx = nlonmx + nhalmx, nkmn = 0, nkmx = nlatmx + 1
- where nhalmx is greater than 1 and depends on the resolution
- Next we'll see why advection needs nhalmx > 1
19. Model timestep
- Numerical stability normally requires Δt < Δx / Umax
- This presents a problem approaching the poles. As Δx decreases, so must Δt
- This is the well-known pole problem
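To see how quickly the limit tightens towards the poles, the sketch below evaluates Δx and the largest stable Δt at a few latitudes. The Earth radius, the maximum wind speed of 100 m/s, the 128 longitudes and the chosen latitudes are all assumptions made for this illustration; none of them is taken from the model.

```fortran
program pole_problem
  implicit none
  ! Assumed values for illustration only (not taken from the model).
  real, parameter    :: radius = 6.371e6    ! Earth radius (m)
  real, parameter    :: umax   = 100.0      ! assumed maximum wind speed (m/s)
  integer, parameter :: nlon   = 128        ! longitudes around a latitude circle
  real, parameter    :: pi     = 3.1415927
  real, parameter    :: lats(5) = (/ 0.0, 45.0, 60.0, 80.0, 87.0 /)
  real    :: dx, dtmax
  integer :: n

  do n = 1, size(lats)
     ! Zonal grid spacing shrinks with the cosine of latitude.
     dx    = 2.0 * pi * radius * cos(lats(n) * pi / 180.0) / real(nlon)
     dtmax = dx / umax                      ! stability limit: dt < dx / Umax
     print '(a,f5.1,a,f9.1,a,f8.1,a)', 'lat ', lats(n), ': dx = ', &
          dx / 1000.0, ' km, dt < ', dtmax, ' s'
  end do
end program pole_problem
```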
20. Polar latitudes
- To ensure stability for the x-direction advection, the model groups grid cells together near the poles.
[Figure: globe from N to S with grid cells grouped near the poles]
What's the deliberate mistake?
21. Polar latitudes
- Grouping depends on timestep, Umax and resolution
- Grouping implies model loses zonal resolution approaching the poles (kind of)
- Number of cells grouped increases with resolution
- Careful: if timestep is not reduced as resolution is increased, model will start grouping at mid or low latitudes. Check job log!
22. Polar latitude halos
- Cell grouping implies bigger halos at these lats
- E.g. if 8 gridpts are grouped into 1, it means we'll need 8 gridpts either side of our processor
- This will increase cost of communication at poles
- Hence the need for nhalmx. This variable will change with different resolutions and timesteps but we want it as small as possible and the timestep as big as possible!
- Optimum values of nhalmx, nproci & nprock have been set in tom.build. Check calc_slcons program output on website for other configuration possibilities based on choice of Δt
23. p-TOMCAT: Parallel computing
- In this section you'll learn more about
- MPI
- How the model uses MPI
- When you should use it if adding code
- Parallel performance
24. Message passing
- p-TOMCAT uses MPI: the Message Passing Interface
- MPI is an international standard and available on all HPC systems
- MPI assumes processors are completely separate apart from a communications link
- In practice, MPI will use shared memory to speed up comms but this is transparent to the user
[Figure: CPUs, each with its own memory]
25. MPI
- MPI requires the addition of subroutine calls to do the communication
- You will need to use MPI if
- Your scheme needs neighbouring processor gridpoints (e.g. gradients, local searches of gridpoint values)
- Zonal means
- Global sums
- Reading a file and broadcasting to all processors
- You do not need it for any operation in the
vertical. Each processor has all vertical levels
for each gridpoint it deals with
26. Parallel performance
- Rarely get perfect parallel performance due to communication costs, I/O and differences in time through code (load imbalances)
- Speedup: S = T1 / Tp
- Where T1 is time on a single processor, Tp is time on p processors
- Efficiency: E = S / p
- e.g. efficiency of 50% means on average we only used half the processors during the run. Ideally we want > 60-70%.
27. TOMCAT performance
cpus   efficiency (%)
2      95
4      80
8      65
16     55
32     40
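Since E = S / p, the speedup implied by each efficiency figure is S = E x p. The sketch below applies this to the numbers on the TOMCAT performance slide, treating them as efficiencies in percent; the last column (processors minus speedup) is an extra quantity added here to show how much of the machine is effectively idle.

```fortran
program speedup_table
  implicit none
  ! Processor counts and efficiencies (%) from the TOMCAT performance slide.
  integer, parameter :: n = 5
  integer, parameter :: ncpu(n) = (/ 2, 4, 8, 16, 32 /)
  real,    parameter :: eff(n)  = (/ 95.0, 80.0, 65.0, 55.0, 40.0 /)
  real    :: s
  integer :: i

  print '(a)', ' cpus  efficiency(%)  speedup  equivalent idle cpus'
  do i = 1, n
     s = eff(i) / 100.0 * real(ncpu(i))     ! S = E * p, from E = S / p
     print '(i5,f15.1,f9.1,f22.1)', ncpu(i), eff(i), s, real(ncpu(i)) - s
  end do
end program speedup_table
```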
28. Load balancing
- Another reason for efficiency to reduce as processors increase is load balancing
- The time through the code depends on various factors
- Day / night affects time through photolysis and chemistry code
- Polar night / day affects time through photolysis & chemistry code
- Polar x-advection costs less because cells are grouped
- Presence of convection alters cost of convection code
- The net effect is that some processors might be idle whilst waiting for others to catch up
- This is a difficult problem to solve. It is a particular problem at high resolution and may well need restructuring the grid for the model to work efficiently at T106 and above on 128 processors
29. Reproducibility in parallel
- Getting reproducible results is a big issue for parallel models because the order of additions matters on computers: (A + B) + C ≠ A + (B + C)
- Computing a zonal mean requires each processor to compute a mean for nlonmx pts and then sum those means across all processors
- Splitting the domain differently, e.g. 4x4 or 2x8, will therefore change the result
- p-TOMCAT results were sensitive to this. V1.0 uses special subroutines for zonal & global means to avoid this, which you must use
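The effect of summation order is easy to demonstrate with three numbers. The values below are chosen for this sketch so that, in IEEE double precision, the small term is lost to rounding in one grouping but not in the other; they have nothing to do with model data.

```fortran
program sum_order
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: a, b, c, left, right

  a =  1.0e16_dp
  b = -1.0e16_dp
  c =  1.0_dp

  left  = (a + b) + c    ! the large values cancel first, then c is added
  right = a + (b + c)    ! c is added to a large value before the cancellation

  print *, '(a + b) + c =', left
  print *, 'a + (b + c) =', right
  if (left /= right) print *, 'the two orders of summation disagree'
end program sum_order
```

A different domain decomposition changes which partial sums are formed first, which is the same regrouping on a larger scale.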
30. Reproducibility across computers
- p-TOMCAT results are also sensitive to the compiler used
- Optimization levels O2 and O3 on green give the same results
- You will get differences between green & newton
[Figure: relative difference in zonal mean ozone between runs on green vs. newton]
31. How to use p-TOMCAT
- In this section, you'll find out
- How to compile and run the model
- How to add in new code
- How to control what the model does
32. Getting started at Manchester
- Suggest you create an experiment directory in /santmp/user (but remember files on /santmp are deleted if not used for > 14 days)
- /hold/year/user is the tape store where results can be saved
- Use tar and gzip to package up files into one file for more efficient storage, e.g. tar cvf expt.tar myexptdir; gzip expt.tar and then copy it to /hold
33. Compiling p-TOMCAT
- We compile (build) and run p-TOMCAT in separate steps
- Download the tom.build script from the p-TOMCAT website to your experiment directory e.g. /santmp/emgdc/run1
- First make it executable: chmod u+x tom.build
- Then run ./tom.build on wren in the expt directory. It will create a directory src with all the code in and leave an executable tomcat in the expt directory
34. Makefiles
- tom.build will put a Makefile in the src directory for you
- A Makefile is a way of telling the compiler what options to use and in what order the files should be compiled
- If you need to change the Makefile, please ask if you are not sure how they work. You only change it if
- Altering the compiler options
- Adding new files of code to the model
- We use a parallel compilation, compiling several files at once
- On green use the command pmake tomcat
- On newton use the command make -j 4 tomcat
35. tom.build: changes you might make
- Setting the model resolution is all you might need to do if you just want to run a vanilla flavour of the model
- Other changes you might want to make
- Model version number
- Changes to domain decomposition
- Changes to number of tracers, species in chemistry
- tom.build creates a small file called config. Do NOT edit it! It's used to pass model config to tom.run script
    RES=21 NIV=31
These parameters are at the top of the job
36. Editing code: nupdate
- To make changes to the model code we use nupdate, a preprocessor. It takes input of the form
    *id run1
    *d mod_paradi.11,12
          integer, parameter :: lon = 320, lat = 160
    *i mod_paradi.14
    ! This is just a comment
    *b mod_paradi.14
    ! This comment comes before the other comment
- To see the line numbers, check the tomcat listing in /ohome/emtomcat/public/lib/p-tomcat.1.0.list
- tom.build runs nupdate twice, for tomcat & ASAD
37. Editing code: adding your own changes to subroutines
- Put all your nupdate changes in a file(s), rather than include the changes in tom.build directly. Much easier to manage this way.
- Edit tom.build to read
    cat > tomcat.mods <<EOF
    *read newcode.up
    *read newcode2.up
- To alter the ASAD code, similarly add *read after the line
    cat > asad.mods << EOF
- Remember to use f90, free source format
38. Adding new code
- If you have hundreds of lines of new code, nupdate can be cumbersome and error prone (although if you want to..).
- I would suggest you
- Run tom.build once to create the src directory & Makefile
- Edit the code directly (copy the original first!)
- Edit the Makefile if you need to
- Run pmake (or make -j 4 on newton) by hand
- The disadvantage is that it's harder to move to new versions and code management requires more effort
39. Adding new files of code
- If you need to add new .f90 files
- Consider using f90 modules. They have a number of advantages.
- You must edit the Makefile
- Add your .f90 file to the list of files to be compiled
- Add the dependency (if any) to the Makefile.
- New code or bugfixes should be sent to Glenn for inclusion
- It will be expected that the code will be properly tested
- Test results should accompany code
- For more details email or see Glenn or Nick
40. Compiling the code
- Normally you would not change the compiler options in the Makefile apart from when testing. Uncomment the second set of options
    F90FLAGS_IRIX64 = -64 -r8 -extend_source -O2 \
        -DEBUG:trap_uninitialized=ON:verbose_runtime=ON:div_check=3 \
        -LIST:all_options=ON -I/opt/mpt/mpt/usr/include
    # use these while developing
    #F90FLAGS_IRIX64 = -g -64 -r8 -extend_source \
    #    -I/opt/mpt/mpt/usr/include \
    #    -DEBUG:trap_uninitialized=ON:verbose_runtime=ON:div_check=3 -check_bounds
- -check_bounds should be used for brand-new code but will slow the model down a lot. Just use it once.
- After editing the Makefile, do a make clean followed by a pmake tomcat to recompile the model
41. Running p-TOMCAT
- Download the tom.run script from the website to your experiment directory
- You MUST change the line EXPT=<your experiment directory>
- To run, either submit as a batch job by bsub < tom.run (not bsub tom.run!!) or ./tom.run to run interactively on wren (max. of 4 processors only)
42. Files you need
- To run p-TOMCAT you will need the following files in your experiment directory
- chch.d - this lists the chemical species
- ratb.d, ratt.d, ratj.d - the reactions used
- depvel.d - deposition velocities
- henry.d - Henry's law coefficients (wet dep.)
- A binary file with initial fields for the model's tracers
- These are available off the website or on wren in the emtomcat account
- The model reads other files which are held in the emtomcat account. You can supply your own if you wish by changing the namelist variables which set the directory the model looks in for these files. These files will be moved for v1.1
43. Reaction rate data
- The reaction coefficients used in the rat?.d files are collated from various sources
- IUPAC kinetic data (www.iupac-kinetic.ch.cam.ac.uk)
- JPL kinetic data (jpldataeval.jpl.nasa.gov)
- Master Chemical Mechanism (MCM) (www.chem.leeds.ac.uk/Atmospheric/MCM/mcmproj.html)
- Note! Some reaction rates do not have simple Arrhenius forms and are dealt with specifically in the ASAD code (see bimol.f & trimol.f)
- p-TOMCAT uses rates current at 2000
- TOMCAT runs on the Fujitsu used different rates so no long runs have yet been done using these
- The ratefiles will be updated periodically. Due for an update soon
44. p-TOMCAT initial files
- New initial files can be created from past runs
- Very early runs had low methane so use more recent runs by Fiona (ACTO), Nick (POET) or Richard (MOZAIC)
- Note! Reaction rates have been altered since these runs so allow time to spin the model up
- Programs exist to manipulate initial files
- Change the resolution
- Change the date
- Add more tracers
- Default initial file available for 1/3/1997
45. Analysis files
- p-TOMCAT currently uses modified ECMWF operational analyses which are stored on wren in /v/lrkd/ECMWF/T42
- Operational analyses changed levels
- 31 levels up to March 1999
- Then 50 levels up to Oct 1999
- 60 levels thereafter
- We plan on using ERA-40 analyses at some point to avoid inconsistencies in the operational analyses
- Though convection is known to change dramatically
- It is possible to run the model with higher resolution analyses
46. Anatomy of the tom.run script
- Check the batch queue options at the top
    #BSUB -J p-tomcat         # job name
    #BSUB -o tomcat_log.%J    # job log file
    #BSUB -m green            # machine to run on: 'green', 'fermat', 'newt64i'
    #BSUB -N                  # mail me when done
    #BSUB -n 16               # no. of processors
    #BSUB -W 2:00             # run time (wall clock time hrs:mins)
- You can override any of these on the command line: bsub -m fermat < tom.run
- No. of processors must match nproci*nprock in config file
- And don't forget to change EXPT=/santmp/emgdc/run1
- Remember we are charged on the no. of cpus used!
47. Namelist variables
- The model uses namelists (f90) to control its options, unlike old TOMCAT where you had to modify the code
- There are three namelists used by the model: &switches /, &chem_switches /, &pbl_switches /
- Normally you will only need to change &switches to control what the model does
48. How namelists work
- A variable in a namelist has a default value in the model and therefore need not be changed at all
- Namelist variables can have comments after them
    &switches
      nsteps = 480,   ! Number of timesteps
- The model will check user settings are sensible
- For a full description of the switches, see mod_switch.f90.
- For further information about namelists, see a f90 book!
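For readers without an f90 book to hand, here is a minimal sketch of the mechanism. The two variables and their default values are hypothetical stand-ins for model switches, and the fixed unit number is used only to keep the example standalone; model code obtains its unit numbers differently.

```fortran
program namelist_demo
  implicit none
  ! Hypothetical stand-ins for two model switches, with made-up defaults.
  integer :: nsteps = 48
  real    :: dt0    = 1800.0
  ! A fixed unit number is used only because this is a standalone example.
  integer :: iunit = 10, ios
  namelist /switches/ nsteps, dt0

  ! Write a small namelist to a scratch file so the example is self-contained.
  open(iunit, status='scratch', form='formatted')
  write(iunit, '(a)') ' &switches'
  write(iunit, '(a)') '   nsteps = 480,   ! Number of timesteps'
  write(iunit, '(a)') ' /'
  rewind(iunit)

  ! Only nsteps appears in the file, so dt0 keeps its default value.
  read(iunit, nml=switches, iostat=ios)
  close(iunit)

  if (ios /= 0) then
     print *, 'error reading namelist, iostat =', ios
  else
     print *, 'nsteps =', nsteps
     print *, 'dt0    =', dt0
  end if
end program namelist_demo
```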
49. Main model switches
- The main model switches you might want to change are
    &switches
      initial_file = 'EXPT/init_tRES_97030100',
      dt0 = 1800.0,    ! Dynamical timestep (secs)
      nsteps = 4,      ! Number of steps in the run
      nso1 = 12,       ! Frequency of model output in steps
      nfrf = 0,        ! Frequency at which restart files are created
      ifrqnc = -14,    ! Frequency at which new netcdf files are created
- Then there are switches to turn off components such as the emissions, chemistry etc. See mod_switches.f90 for more details
- Note that the frequency at which the netCDF files are created depends on the resolution and the frequency at which you output
- At T21 with 6hrly output, create a new netcdf file every 14 days
- At T42 with 6hrly output, create a new one every day
- Don't create netCDFs bigger than 2 Gbytes
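A rough estimate shows why the file frequency has to change with resolution. The sketch below is arithmetic only: the 31 levels, the 40 three-dimensional fields per output time and the neglect of file overheads are assumptions made for illustration, so the day counts it prints are indicative at best.

```fortran
program netcdf_size
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Assumptions for illustration: 31 levels, 40 three-dimensional fields per
  ! output time, 4 bytes per value (real*4), file overheads ignored.
  integer, parameter  :: niv = 31, nfields = 40, outputs_per_day = 4
  real(dp), parameter :: bytes_per_value = 4.0_dp
  real(dp), parameter :: limit = 2.0_dp * 1024.0_dp**3    ! 2 Gbytes
  integer, parameter  :: nlon(2) = (/ 64, 128 /), nlat(2) = (/ 32, 64 /)
  real(dp) :: per_day
  integer  :: n

  do n = 1, 2
     ! Bytes written per model day with 6-hourly output.
     per_day = real(nlon(n), dp) * real(nlat(n), dp) * real(niv, dp) &
               * real(nfields, dp) * bytes_per_value * real(outputs_per_day, dp)
     print '(i0,a,i0,a,f8.1,a,i0,a)', nlon(n), ' x ', nlat(n), ' grid: ', &
          per_day / 1024.0_dp**2, ' Mbytes per day, 2 Gbyte limit reached after ', &
          int(limit / per_day), ' days'
  end do
end program netcdf_size
```

Doubling the horizontal resolution quadruples the output per day, so a file that is comfortable at the coarser grid fills four times faster at the finer one.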
50. Types of model output
- Restart file
- This is a full precision (real*8) dump of the model's main variables.
- You can use this file to restart the run as if the run had not been stopped
- You can also use a restart file to create an initial file for the model
- Vertical diffusion scheme (PBL scheme) physical restart file
- Netcdf output
- Reduced precision (real*4) for plotting / diagnostics (Ferret, GrADS, IDL)
- Usually contains all the model's main variables & other diagnostic fields
- You can add additional fields to this file by altering the model code
- Binary output (PDG)
- Full precision, Fortran binary output file with the main model variables
- Old versions of TOMCAT used this as the main output file, but netcdf files are smaller and easier to use
- Still useful for doing exact comparisons between different model versions or diagnostics that need high precision
- Can also be used to create initial files for new runs
51. How to restart a model run
- Batch job time limits mean long runs will need to be split
- A restart file contains the same as an initial file but also includes the first and second order moments for the advection scheme to be able to do an exact restart
- To do a restart
- Rename the restart file e.g. p-TOMCAT.RESTART to init_restart otherwise the model will overwrite it
- Set the switch initial_file = 'init_restart'
- Set the switch nrstart = 1 in the namelist
52. p-TOMCAT: ASAD chemistry code
- In this section, you'll learn more about
- ASAD
- How it works
- How to change things
- A user guide for ASAD is available for more details on the ACMSU website (www.acmsu.nerc.ac.uk)
53. ASAD: Introduction
- ASAD is a black box, a piece of software designed to solve a chemical scheme without writing any code (or very little)
- Chemistry is specified using input text files. Changes can be made very quickly
- Though be aware reactions with non-standard rate formulae may need to be coded explicitly in ASAD
- ASAD is used in box models, p-TOMCAT and the UM so chemical schemes can be moved between models by copying the text files
- ASAD was developed some years ago by Glenn, Paul Brown and Oliver Wild and is being used world-wide.
54. ASAD: Chemistry
- Two subroutine calls are needed to use ASAD in p-TOMCAT: asad_init, asad_step
- ASAD uses text files to describe the chemical species and reactions
- chch.d - Chosen Chemistry
- ratb.d - bimolecular reactions
- ratt.d - uni- & termolecular reactions
- ratj.d - photolysis reactions
- rath.d - heterogeneous reactions (not used)
- ASAD code will call non-ASAD routines to compute photolysis rates, wet & dry deposition and emission rates (photo.f, wetnew.f and drydep.f)
55. ASAD: Changing the reactions
- Change the rate coeffs in the appropriate ratefile
    1 HO2 NO OH NO2 3.10E-12 0.00 -270.0
- Reactions can be disabled by commenting out with a # or by setting the rates to zero
- If you change the number of reactions you must change the appropriate parameter in the ASAD code by altering tom.build
    cat > asad.mods << EOF
    *id run
    *d comparamc.84,85
          parameter ( jpctr = 32, jpspec = 52, jpnl = 2*NIV )
          parameter ( jpbk = 89, jptk = 15, jppj = 27, jphk = 0 )
The counts jpbk, jptk, jppj and jphk refer to the files ratb.d, ratt.d, ratj.d and rath.d
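To get a feel for what the three numbers on a ratefile line mean, the sketch below evaluates a rate coefficient at a few temperatures. It assumes the common form k = k0 (T/300)^alpha exp(-beta/T) for the three columns; that interpretation, the variable names and the chosen temperatures are assumptions of this example, so check the ASAD user guide before relying on it.

```fortran
program rate_coefficient
  implicit none
  ! The three numbers from the example ratefile line for HO2 + NO -> OH + NO2.
  real, parameter :: k0 = 3.10e-12, alpha = 0.0, beta = -270.0
  real, parameter :: temps(3) = (/ 220.0, 260.0, 298.0 /)   ! temperatures (K)
  real    :: k
  integer :: n

  do n = 1, size(temps)
     ! Assumed rate form: k = k0 * (T/300)**alpha * exp(-beta/T)
     k = k0 * (temps(n) / 300.0)**alpha * exp(-beta / temps(n))
     print '(a,f6.1,a,es10.3)', 'T = ', temps(n), ' K   k = ', k
  end do
end program rate_coefficient
```

With a negative third column, as here, the assumed form gives a rate that increases as the temperature falls.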
56. ASAD: the chch.d file
     1 'O(3P)'  1 'FM' 'Ox'  F F F 'Atomic oxygen (ground state)'  'pptv'
     2 'O(1D)'  1 'FM' 'Ox'  F F F 'Atomic Oxygen (excited state)' 'pptv'
     3 'O3'     1 'FM' 'Ox'  T F F 'Ozone'                         'ppbv'
     4 'NO'     1 'FM' 'NOx' T F F 'Nitric Oxide'                  'pptv'
     5 'NO3'    1 'FM' 'NOx' T T F 'Nitrate Radical'               'pptv'
     6 'NO2'    1 'FM' 'NOx' T F T 'Nitrogen Dioxide'              'ppbv'
     7 'N2O5'   2 'TR' ' '   T T F 'Dinitrogene Pentoxide'         'ppbv'
     8 'HO2NO2' 1 'TR' ' '   T T F 'Peroxynitric Acid'             'ppbv'
     9 'HONO2'  1 'TR' ' '   T T F 'Nitric Acid',                  'ppbv'
    10 'OH'     1 'SS' ' '   F F F 'Hydroxyl Radical'              'pptv'
    11 'HO2'    1 'SS' ' '   F T F 'Hydroperoxyl Radical'          'pptv'
    12 'H2O2'   1 'TR' ' '   T T F 'Hydrogen Peroxide'             'ppbv'
    13 'CH4'    1 'TR' ' '   F F T 'Methane'                       'ppbv'
    14 'CO'     1 'TR' ' '   T F T 'Carbon Monoxide'               'ppbv'
    15 'HCHO'   1 'TR' ' '   T T T 'Formaldehyde'                  'ppbv'
    16 'MeOO'   1 'SS' ' '   F T F 'CH3OO'                         'ppbv'
    17 'H2O'    1 'CF' ' '   F F F 'Water Vapour'                  'ppbv'
ASAD just needs these
TOMCAT uses these
57. ASAD: the chch.d file
(Same chch.d listing as on slide 56, with these columns annotated.)
- Short species name. Appears in netcdf file
- Long species name. Appears in netcdf file
- Units in which species is output to netcdf file
- ASAD just needs these
- TOMCAT uses these
58. ASAD: the chch.d file
(Same chch.d listing as on slide 56, with the remaining columns annotated.)
- No. of odd atoms for a family species
- Name of family tracer
- Emission, dry & wet dep on/off switches
- Type of species: FM = family member, TR = tracer, SS = steady state, CF = constant field, CT = constant value
- ASAD just needs these
- TOMCAT uses these
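The column layout described on these slides can be explored by reading one line with a Fortran list-directed read. This is a sketch for experimenting with the format, not the reader ASAD itself uses; the variable names and character lengths are chosen here, and the three logical switches are kept as an unlabelled array because the slides do not give their order.

```fortran
program read_chch_line
  implicit none
  character(len=80) :: line
  integer           :: num, nodd, ios
  character(len=10) :: name, ctype, family, units
  character(len=40) :: longname
  logical           :: switch(3)

  ! One line in the layout shown on the chch.d slides.
  line = "3 'O3' 1 'FM' 'Ox' T F F 'Ozone' 'ppbv'"

  ! List-directed read: quoted strings, integers and T/F logicals are
  ! separated by blanks.
  read(line, *, iostat=ios) num, name, nodd, ctype, family, switch, &
       longname, units

  if (ios /= 0) then
     print *, 'could not parse line, iostat =', ios
  else
     print *, 'species  : ', trim(name), ' (', trim(longname), ')'
     print *, 'type     : ', trim(ctype), '   family: ', trim(family)
     print *, 'odd atoms: ', nodd
     print *, 'switches : ', switch
     print *, 'units    : ', trim(units)
  end if
end program read_chch_line
```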
59. Changing the tracers or species
- If you change the number of advected tracers you will need to change NTRA in the tomcat parameters and JPCTR in the ASAD parameters in tom.build AND supply a new initial file
- The advected tracers are the families (e.g. Ox) and TR species listed in the chch.d file. ASAD also prints out the tracers in the job log file
- Other non-ASAD chemistry can be included but the model will still expect a chch.d file for the netcdf output
60. p-TOMCAT model code
- In these next slides we will look at
- Internal structure of the model
- How the parallel aspect affects the code
- How to make code changes
61. p-TOMCAT flowchart
- The main timestep loop is in main.f90
- Top level subroutines for advection, chemistry etc. have outer loops over tracers and latitude
- Inner loops are over longitude
- Note dependency exists between some schemes e.g. dry deposition scheme needs diffusion coefficient from vertical diffusion (PBL) scheme
62. Guidelines for adding code
- All new code must use free format Fortran 90 source (.f90)
- Do NOT use common blocks, use Fortran 90 modules
- Comment your code sensibly
- Use IMPLICIT NONE and declare all your variables
- Use INTENT() to indicate arguments
- Use f90 array operations wherever you can
- Avoid f90 pointers, they do not optimise well
- Large local arrays should have the save attribute to stop them being allocated in memory every time the subroutine is called
- Do NOT use global arrays unless for I/O, in which case allocate on process 0 only to avoid each process having a big array
- F90 derived types can be used to aid clarity
63. Code changes: new arrays
- Arrays should be declared with (nlonmx,nlatmx) NOT (LON,LAT)
- However, loop limits should use mylon, mylat and NOT nlonmx, nlatmx. This is because in the future processors may get different numbers of gridpoints, e.g.
    real :: array(nlonmx,nlatmx,niv)
    do k = 1, niv
      do j = 1, mylat
        do i = 1, mylon
          array(i,j,k) = sm(i,j,k)
        end do
      end do
    end do
- Note ordering of the loops here!
- Better still as
    real :: array(nlonmx,nlatmx,niv)
    array(1:mylon,1:mylat,:) = st(1:mylon,1:mylat,:)
- note st in the model uses a halo so we must specify indices
64. Aborting the model with endrun
- As the model is running in parallel it is important to stop the model cleanly if there is an error or some other reason to stop.
- This also makes sure the netCDF file is closed cleanly and the latest model output is not lost
- A subroutine endrun should be called to stop the model
- e.g. if ( j > nlatmx ) call endrun(.false.)
- You will call it with .false. to indicate something went wrong. The only time it's called with an argument of .true. is at the end of the program
65. p-TOMCAT: Files for input
- The model will always check that a required input file exists and any new code is required to do the same.
- Use the checkexists() logical function to do this e.g.
    if ( .not. checkexists('/santmp/emgdc/input',subname)) call endrun(.false.)
- The arguments to checkexists are
- 1st: filename to check
- 2nd: subroutine name from which it was called (for the error message written by checkexists)
- The function will return .true. if the file is found, .false. otherwise
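For anyone wanting to try the pattern outside the model, a function with the same two arguments can be written around the standard inquire statement. The version below is a hypothetical stand-in written for this example; the model's own checkexists may differ in interface details and in the message it writes.

```fortran
module file_checks
  implicit none
contains
  ! Hypothetical stand-in for the model's checkexists(); the real routine
  ! may differ in its interface and messages.
  logical function checkexists(filename, caller)
    character(len=*), intent(in) :: filename   ! file to look for
    character(len=*), intent(in) :: caller     ! routine name for the message
    inquire(file=filename, exist=checkexists)
    if (.not. checkexists) then
       print *, trim(caller), ': cannot find file ', trim(filename)
    end if
  end function checkexists
end module file_checks

program test_checkexists
  use file_checks
  implicit none
  ! The file name is made up; in the model this is where endrun is called.
  if (.not. checkexists('no_such_file.dat', 'test_checkexists')) then
     print *, 'model code would call endrun(.false.) here'
  end if
end program test_checkexists
```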
66. p-TOMCAT input/output
- By convention, all file based I/O is done on process 0
- Messages can be output from any process but beware of flooding the log file with ncpus' worth of messages, all the same, e.g.
    if ( .not. checkexists('/santmp/emgdc/input',subname)) then
      if ( myproc == 0 ) print *, 'could not find the file'
      call endrun(.false.)
    endif
- What would happen if the if (myproc == 0) was removed?
- Note here that endrun MUST be called by all processes
67. p-TOMCAT channel numbers
- Channel (or unit) numbers are assigned dynamically in p-TOMCAT
- You must not make up your own unit numbers
- Use the function getunit() to find the next free channel number
    if ( myproc == 0 ) iunit = getunit()
- When you've finished with it, return it to the list of free channels by
    if ( myproc == 0 ) ierr = freeunit(iunit)
- If you need to keep a unit number, call getunit() once in an initialisation and never call freeunit, e.g.
    subroutine mysub
      logical, save :: first = .true.
      integer, save :: iunit
      if ( first ) then
        iunit = getunit()
        first = .false.
      endif
68. Input/output: complete example
    if ( .not.checkexists(aircraftems,'readem') ) call endrun(.false.)
    if ( myproc == 0 ) then
      iunit = getunit()
      open(iunit,file=trim(aircraftems),status='old')
    endif
    do m = 1,12
      do k = 1,3
        do l = 1,niv
          call ppreadfm(iunit,csdervn(1,1,l,m,k,1))   ! formatted read.
        end do
      end do
    end do
    if ( myproc == 0 ) then
      close(iunit)
      ierr = freeunit(iunit)
    endif
- ppreadfm and ppread handle reading formatted and unformatted global arrays, and distributing them to the correct processors
69. p-TOMCAT: input/output optimisation
- I/O hits the model's performance
- Read data into saved arrays at the start of the run
- Rather than read a file every, say, week/month, read all of it whenever you can (caching)
- There is always a tradeoff between reading files and storing them in memory, so if you're not sure what's best please ask
70. Zonal & global means
- Use the supplied functions (see mod_zonalmean.f90)
- my_zonalmean - sums along a latitude row. Each process gets a copy of the zonal mean for the latitudes it owns
- global_zonalmean - process 0 only gets the complete zonal mean field
- These functions take care of the MPI calls
- They also give reproducible answers regardless of the domain decomposition
- They reduce loss of precision from rounding by summing smallest values to biggest values
- To compute a global mean use the subroutine
    call mpe_reduce(emlight, temp, 0, mperr)
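The benefit of summing smallest values first can be seen in a small serial test. The sketch below is not the code in mod_zonalmean.f90; it uses made-up single precision data (one large value and many small ones) and a simple insertion sort to compare the sum taken in the order given with the sum taken from smallest to biggest.

```fortran
program ordered_sum
  implicit none
  integer, parameter :: n = 1001
  real    :: x(n), tmp, sum_as_given, sum_small_first
  integer :: i, j

  ! One large value followed by many small ones (single precision).
  x(1)  = 1.0
  x(2:) = 1.0e-8

  ! Summing in the order given: each small value is added to a large total.
  sum_as_given = 0.0
  do i = 1, n
     sum_as_given = sum_as_given + x(i)
  end do

  ! Sort into ascending order of magnitude (simple insertion sort) ...
  do i = 2, n
     tmp = x(i)
     j = i - 1
     do while (j >= 1)
        if (abs(x(j)) <= abs(tmp)) exit
        x(j+1) = x(j)
        j = j - 1
     end do
     x(j+1) = tmp
  end do

  ! ... then sum smallest to biggest, so the small values accumulate first.
  sum_small_first = 0.0
  do i = 1, n
     sum_small_first = sum_small_first + x(i)
  end do

  print *, 'sum in the order given :', sum_as_given
  print *, 'sum smallest to biggest:', sum_small_first
end program ordered_sum
```

On typical single precision hardware the first sum loses the small values entirely, while the second retains them. Sorting also fixes the order of additions, so the result no longer depends on how the values were distributed to begin with.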
71. Adding fields to the netcdf output
- A common requirement might be to add new output fields to the netcdf file
- Consult the code in mod_unicdf.f90. Lots of comments at the top of this file explaining what to do.
- Consists of two steps
- Define the new variable in the netcdf file (once-only step)
- Add code to gather the variable from the processors and write it out
72. Defining new netcdf fields
- First must define the new field and its properties in the netcdf file
- Modify the subroutine inicdf in mod_unicdf.f90
- As an example, we want to add output of a 3D model field for relative humidity. After the definition of the other 3D fields we add
    call unitom_def_var( tomcat_results, 'rh', NF90_FLOAT, dims4d, &
         long_name='Relative humidity', axis='TZYX', units='%' )
- Arguments
- tomcat_results is the variable for the netcdf file (derived type)
- 'rh' is the field name, used as its short name in the netcdf file
- dims4d means it is 3D in space and varies with time
- long_name, axis and units are all attributes associated with this variable in the netcdf file. You do not strictly need them but it is strongly recommended you include them
73. Writing out new netcdf fields
- To write out this field, add some code to write_cdf in mod_unicdf.f90
- First add code to collect the patch arrays into a global array. If possible, use the already declared field array to avoid introducing new global arrays
    do k = 1, niv
      call collf( field(1,1,k), rh(1,1,k), NLONMX, 1, 1, 0 )
    end do
- Then add some code to write it out
    if ( myproc == 0 ) &
      call unitom_write_var( tomcat_results, 'rh', field, timeindex )
74. Debugging
- Debugging in parallel is considerably more complicated than for a serial program
- There are a number of potential problems that can arise just because tomcat is parallel
- Most common is that the model will just completely stop dead (why?)
- If TOMCAT is giving problems
- Check new code by using -check_bounds
- Reduce the number of processors and print out numbers (see mod_utility.f90 for helpful code)
- Debug interactively on wren - particularly if model just stopped
- Seek help!
75. Debugging using Totalview
- Manchester has an interactive, parallel debugger available on wren and newton. Extremely useful!
- To use it
- Get model to run near to point where it crashes and save restart file
- Recompile with debugging options (but not -check_bounds)
- Alter model to use max. of 4 processors
- Edit tom.run script, comment out line beginning mpirun -np and uncomment line beginning totalview mpirun -np
- Set your DISPLAY environment variable then do ./tom.run
- Then ask someone to show you how totalview works if you are not sure
76. Flight tracks
- A facility exists in the model to output the species along a flight track, either real or made up
- The flight track code will interpolate in space but not in time, and works in parallel
- To find out how to use it consult the model user guide
77. Future directions
- Things you might see in future versions of p-TOMCAT
- Ability to run properly at 1x1 or 0.5x0.5 horizontal resolution
- More vertical levels
- Faster chemistry code and other optimisations
- New advection scheme
- New chemistry: isoprene? Halogen?
- New vertical mixing scheme (PBL)
- New diagnostics, possibly also chemical budgets
- Ask if you want something!
78. Final comments
- Any questions, ask Glenn or Nick
- You're encouraged to look at the code for examples of how to do things
- Suggestions for improvements and bug reports are always welcome!
79. Acknowledgements
- p-TOMCAT's parallel code is based on the work Cate Bridgeman did on parallelizing SLIMCAT and TOMCAT. The MPI code is originally derived from portions of the ECMWF semi-Lagrangian model
- Martyn Chipperfield is thanked for help with some aspects of the model code
- Last but certainly not least, thanks to Fiona O'Connor and Nick Savage for all their long hard work on the model and comments on bits of this course