Title: WinBUGS for Bayesian fisheries models
1 x
WinBUGS for Bayesian fisheries models Steve J.
Fleischman Alaska Department of Fish and Game 333
Raspberry Road, Anchorage, Alaska 99518,
USA steve_fleischman_at_fishgame.state.ak.us KEYWORD
S Bayesian statistics, Markov Chain Monte
Carlo, time series, measurement error, mixture
models, missing data
Introduction Five species of Pacific salmon
(Oncorhynchus spp.) spawn in Alaska Rivers.
Important commercial, recreational, and
subsistence fisheries target specific salmon
stocks throughout the state. The Alaska
Department of Fish and Game (ADFG) is charged
with managing these stocks in a sustainable
manner. Often this means making daily decisions
about whether to permit fishing on individual
salmon populations. Sufficient numbers of salmon
must reach the spawning grounds each summer to
ensure continued viability of populations. The
software program WinBUGS1 enables the user to
specify complex Bayesian statistical models in a
concise manner. It then samples from the joint
posterior probability distribution of all
parameters using Markov Chain Monte Carlo (MCMC)
techniques. As a result, powerful Bayesian
analyses which were virtually impossible just a
decade or two ago are now accessible to the
journeyman statistician. Fisheries science is
fertile ground for implementation of these
capabilities. Three Bayesian analyses of salmon
data are presented here a two-stage mixture
model of fisheries sonar data, an age-structured
stock-recruit analysis, and a multi-stock
escapement and run-timing model.
Summary Weve found Bayesian methods in general,
and WinBUGS in particular, to be valuable tools
for analysis of fisheries data. Multiple,
diverse sources of information can be synthesized
in ways which were previously not possible. The
Bayesian framework permits realistic assessment
of uncertainty and formal incorporation of
auxiliary information. Bayesian posteriors can
be directly utilized in decision analyses, which
can help to objectify and improve fisheries
management. WinBUGS permits seamless
consideration of missing data, which can be a
prominent feature of many of our data sets. In
addition, we find that the process of developing
WinBUGS code forces us to think more clearly
about the processes generating the data. This
facilitates identification of information gaps,
and promotes efficient planning for future
research. We are hopeful that significant
advances in understanding will result from
implementation of fisheries models similar to
those illustrated here. WinBUGS code and further
details are available from the author.
(4)
Age-Structured Stock Recruit Model
In Pacific salmon population dynamics, the number
of progeny returning (R) from a given number of
spawning fish (S) is key. Spawner return models
(aka stock recruit models) are fitted to such
data to estimate an optimal population size,
i.e., how many spawning fish are required to
attain maximum sustained yield (SMSY).
Fundamentally, model estimation is
straightforward- in this example a two parameter
model is fit to bivariate data. Ideally, several
dozen pairs of SR data are available to estimate
the parameters of the model. Problems may arise
when S or R are measured with error, or when
unknown time-varying factors introduce serial
correlation into model residuals.
Two-Stage Mixture Model for Hydroacoustic Species
Classification2
Multi-Stock Run-Timing and Escapement Model (IN
PROGRESS)
R
Hydroacoustics (sonar) is commonly used in Alaska
to count salmon migrating upstream to spawn, but
using sonar to distinguish between species of
fish is difficult. When two species are present,
the conventional approach has been to measure a
hydroacoustic correlate of size for each fish and
classify to species depending on whether or not
the measurement meets a minimum threshold value.
This method rarely succeeds because hydroacoustic
measurements are imperfect size predictors, and
because species often overlap in size. Under
these conditions, the resulting species
composition estimates are biased, especially when
one species is dominant. Therefore weve
developed a mixture model approach to estimate
the proportions of sockeye and chinook salmon
returning to the Kenai River in southcentral
Alaska.
ADFG and other agencies operate numerous
research projects (weirs, counting towers, sonar,
mark-recapture) on rivers throughout the state
whose primary purpose is to count the number of
returning salmon. On a daily basis throughout
the summer, information from these projects is
scrutinized and total escapement projected, in
order to guide management decisions about local
fisheries.
S
The observed frequency distribution of the
hydroacoustic variate (ELSD echo length
standard deviation, panel g) is modelled as a
mixture of distributions from sockeye (b) and
chinook salmon (e). Information about the
relationship between ELSD and fish size comes
from independent experiments with tethered fish
(panel c, sockeye, x chinook). Fish size
is observed from independent netting experiments
(a and d). The mixture weights (species
proportions, panel f) are the parameters of
interest.
Another problem is missing data. Because Pacific
salmon do not all mature at the same rate, they
return as several discrete age classes. E.g.,
the total return from chinook salmon spawning in
1990 consists of fish aged 3 to 7 from 1993 to
1997. Thus we require age composition estimates
for 5 years to estimate the total return from the
1990 brood year. Fisheries agencies often have
long time series of S estimates, but a smaller
set of return data, because estimates of R
require expensive scale aging programs. For this
stock, we have spawner counts from a weir dating
back to 1980, but scale aging was not initiated
until 1993. Conventional statistical treatment of
such data leaves much to be desired. In this
example, 23 years of information are available,
but only 6 complete data pairs are available
for estimating model parameters with a
conventional analysis. Imputing values for the
missing quantities is difficult and
underestimates uncertainty.
Sockeye and chinook salmon return from the sea to
spawn at several discrete ages. Thus we model
the fish-size (x) distributions as
three-component normal age mixtures f(x)
q1 f1(x) q2 f2(x) q3 f3(x) where the qa are
age proportions, and fa(x) is normal with mean m
and variance s21. The relationship of the sonar
measurements (y) to fish size (x) is yi
b0 b1xi gzi ei where b0 is the intercept,
b1 the slope, g the difference between species,
zi 1 if fish i is a sockeye and 0 otherwise,
and ei is normally distributed with mean 0 and
variance s22. Finally, the density of the
observed data f(y) has components due to sockeye
fS(y) and chinook fC(y), with species proportions
pS and pC. f(y) pS fS(y) pC fC(y)
Measurement error and serial correlation are
easily incorporated into the WinBUGS model.
Likewise, missing data present no difficulty
they are simply treated as additional unknowns in
the model. The posterior medians and 80
intervals for all SR pairs are plotted to the
left. The filled squares represent the expected
R posterior median from the Bayesian model, which
considers all 23 data points. The fit is quite
different from what would have resulted from
conventional analysis of the 6 complete data
points in red.
Salmon escapement estimation projects on
tributaries of the Kuskokwim River in western
Alaska.
We are currently in the early stages of
developing a general run-timing model in WinBUGS
which will synthesize information from fish
counts across days, years, and tributaries. We
anticipate using such a model to project total
escapement, incorporating information from the
cumulative numbers to date from all related
tributaries, plus historic run-timing data. Such
projections would be in the form of posterior
predictive distributions, and would reflect all
major sources of uncertainty. They would be
robust to missing data, which are common to
escapement projects when water levels suddenly
rise, washing out weirs or reducing visibility at
counting towers. Finally, they could be further
tuned by including auxiliary information in the
form of informative prior distributions. There
are often several sources of such information
which may contribute to the accuracy and utility
of output from a run-timing model. For instance,
fish travel time between the river mouth and
individual tributaries has been investigated with
tagging experiments, and pre-season forecasts of
escapement may be available from spawner-return
models, such as the one described in the center
panel of this poster.
The next graph shows the posterior intervals for
expected R plotted versus time. As expected,
precision is poor early in the study period, when
age composition information is missing. In the
late 1980s, estimates of R become much more
precise until near the end of the study period,
when brood years returns are incomplete.
- The overall design is therefore a mixture of
(transformed) mixtures the observed
hydroacoustic data were modeled as a
two-component mixture of y, each component of
which was transformed from a three-component
normal mixture of x. With conventional software,
such a model is difficult and cumbersome to fit.
WinBUGS makes it possible to - specify the model rigorously and concisely (the
BUGS code is 30 lines long vs several hundred for
a SAS bootstrap approximation), - consider all sources of uncertainty (i.e.,
sampling error associated with observing y with
the sonar, observing x with the nets, and
estimating the relationship between y and x with
the tethered fish experiments), and - objectively incorporate auxiliary information
(other estimates of p, q, m, and s are available
from the netting experiments and associated
research). - Results are shown for 3 weeks of hydroacoustic
data collected during 2002.
We used non-informative priors in our analysis,
but several parameters, including one of the
Ricker parameters, the time series coefficients,
and process error variances could potentially
borrow strength from analyses of other chinook
salmon populations through the use of
informative priors or hierarchical models.
Literature Cited 1) Gilks, W.R., Thomas, A., and
D.J. Spiegelhalter. 1994. A language and
program for complex Bayesian modelling. The
Statistician 43169-178. World wide web
www.mrc-bsu.cam.ac.uk/bugs 2) Fleischman, S. and
Burwen, D. 2003. Mixture models for the
species apportionment of hydroacoustic data, with
echo-envelope length as the discriminatory
variable. ICES J. of Marine Science 60(3)
592-598.
The posterior distribution for the optimal
population size is shown at left. A forecast of
next years run is also available in the form of
a posterior predictive distribution. As the run
comes in, this can provide a useful informative
prior for projecting total run size from the
cumulative run to date.
Acknowledgements Dave Bernard, Debby Burwen, Bob
Clark, Jim Hasbrouck, Doug Molyneaux, and Dan
Reed contributed thoughts or ideas to the
analyses presented here.
2(No Transcript)