Transcript and Presenter's Notes

Title: Modelling data and curve fitting


1
Modelling data and curve fitting
  • Least squares
  • Maximum likelihood
  • Chi squared
  • Confidence limits
  • General linear fits

(Chapter 15, Numerical Recipes, Press et al.)
2
Best fit straight line
  • Assume we measure a parameter y for a set of x
    values, giving a set of data points (x_i, y_i)
  • We want to model the data using a linear relation

y(x_i) = a + b x_i
3
Best fit straight line
  • How do we find the coefficients a and b that give
    the best fit to the data?
  • Given a pair of values for a and b, we need to
    define a measure of the goodness of fit.
  • Then choose the a and b values that give the best
    fit.

4
Least squares fit
  • For each data point x_i, calculate the difference
    between the measured y_i and the model prediction
    a + b x_i
  • Note, Δy_i can be positive or negative, so Σ_i Δy_i
    can be zero even for a poor fit.
  • Minimizing the sum of the squared residuals will
    give a good overall fit
  • Computationally, try a range of values for a and
    b, and for each pair calculate
  • The pair which gives the smallest S is the best
    fit

Δy_i = y_i − (a + b x_i)
S = Σ_i (Δy_i)²
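
A minimal sketch of this brute-force search (the data values and grid ranges below are illustrative assumptions):

```python
import numpy as np

# Illustrative data: any measured (x_i, y_i) arrays would do here.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

# Try a grid of (a, b) pairs and keep the pair with the smallest S.
best = (None, None, np.inf)
for a in np.linspace(-2.0, 4.0, 201):
    for b in np.linspace(0.0, 4.0, 201):
        resid = y - (a + b * x)   # Δy_i for this (a, b)
        S = np.sum(resid**2)      # S = Σ_i (Δy_i)²
        if S < best[2]:
            best = (a, b, S)

a_fit, b_fit, S_min = best
print(f"best fit: a = {a_fit:.2f}, b = {b_fit:.2f}, S = {S_min:.3f}")
```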
5
Maximum likelihood
  • It can be shown that the parameters that minimize
    the sum of the squares are the most likely, given
    the measured data
  • Assume the x values are exact, and the
    measurement errors on the y values are Gaussian,
    with mean zero and standard deviation σ. So
  • Where e_i is a random variable drawn from a
    Gaussian distribution

y_i = y_true(x_i) + e_i
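
A short sketch of this error model, generating synthetic data with assumed true parameters a_0 = 1, b_0 = 2 and σ = 0.5 (all illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

a0, b0, sigma = 1.0, 2.0, 0.5              # assumed "true" parameters
x = np.linspace(0.0, 4.0, 20)              # exact x values
e = rng.normal(0.0, sigma, size=x.size)    # Gaussian errors, mean zero
y = a0 + b0 * x + e                        # y_i = y_true(x_i) + e_i
```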
6
Example Gaussian distribution
7
  • If the true values of a and b are a_0 and b_0, then
    y_i = a_0 + b_0 x_i + e_i
  • So the probability of observing y_i is
    P(y_i) ∝ exp[ −(y_i − a_0 − b_0 x_i)² / (2σ²) ]
  • (assuming σ is the same for all measurements)

8
  • And the probability of observing the whole
    dataset {y_i} is the product of the individual
    probabilities
    P({y_i}) ∝ Π_i exp[ −(y_i − a_0 − b_0 x_i)² / (2σ²) ]
            = exp[ −Σ_i (y_i − a_0 − b_0 x_i)² / (2σ²) ]
  • We can use Bayes theorem to relate this to how
    likely it is that the model parameters are a and
    b

9
Bayes theorem
  • Given two events A and B, then the conditional
    probabilities are related
  • P(A|B) P(B) = P(B|A) P(A)
  • P(A|B) is the probability of A happening, given
    that B has happened
  • P(A) is the probability of A happening,
    independent of B
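
As a worked illustration with made-up numbers: if P(B|A) = 0.9, P(A) = 0.1 and P(B) = 0.2, then P(A|B) = P(B|A) P(A) / P(B) = 0.9 × 0.1 / 0.2 = 0.45.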

10
Application of Bayes theorem
  • Consider a model M and some data D. Then Bayes
    theorem tells you the probability that the model
    is right, given the data that you have observed
  • So the probability of a particular model, given
    the data, depends on the probability of observing
    your data given the model
  • The most probable model is the one for which the
    observed data is most likely
  • Vary a and b to find the maximum P(M(a,b)|D),
    which is the same as the P(a_0, b_0) defined earlier

P(M|D) = P(D|M) P(M) / P(D)
11
  • Maximizing
    P ∝ exp[ −Σ_i (y_i − a − b x_i)² / (2σ²) ]
  • means minimizing
    Σ_i (y_i − a − b x_i)²
  • So for uniform Gaussian errors, maximum
    likelihood is the same as least squares

12
Non-Gaussian errors
  • Sometimes you know errors are not Gaussian, so
    least squares may not be the best method.
  • Minimizing the sum of the moduli of the
    residuals, Σ_i |y_i − y(x_i)|, is very robust, as
    the sketch below illustrates
  • It is equivalent to using the median instead of
    the mean
  • In general, use M-estimates: maximum likelihood
    based on a non-Gaussian error distribution
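
A small demonstration of the median connection, using a constant model y(x) = c and one made-up outlier; scanning c shows the squared loss is minimised near the mean, while the modulus loss is minimised at the median:

```python
import numpy as np

# Illustrative data with a single gross outlier.
y = np.array([1.0, 1.1, 0.9, 1.05, 10.0])

c_grid = np.linspace(0.0, 10.0, 2001)
sq_loss  = [np.sum((y - c) ** 2) for c in c_grid]     # least squares
abs_loss = [np.sum(np.abs(y - c)) for c in c_grid]    # least modulus

print("least squares minimum:", c_grid[np.argmin(sq_loss)])   # ≈ mean(y) = 2.81
print("least modulus minimum:", c_grid[np.argmin(abs_loss)])  # ≈ median(y) = 1.05
```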

13
Chi squared (χ²)
  • If the uncertainty σ_i is different for each
    measurement, then define the quantity
    χ² = Σ_i [ (y_i − a − b x_i) / σ_i ]²
  • If the errors are Gaussian, then minimizing χ²
    will give the maximum likelihood values of the
    parameters.
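
A minimal helper for this quantity (a sketch; x, y and the per-point uncertainties sigma are assumed to be NumPy arrays):

```python
import numpy as np

def chi_squared(a, b, x, y, sigma):
    """χ² = Σ_i [(y_i - (a + b x_i)) / σ_i]² for the straight-line model."""
    return np.sum(((y - (a + b * x)) / sigma) ** 2)
```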

14
Example of minimum χ²
15
Finding minimum of χ² (numerically)
  • Calculate S = Σ_i (Δy_i)² for a grid of a and b
    values and pick the point that gives the minimum

16
Finding minimum of χ² (analytically)
  • Analytically differentiate χ² with respect to a
    and b and set
    ∂χ²/∂a = 0
  • and
    ∂χ²/∂b = 0
  • Leads to (in the Numerical Recipes notation, with
    S = Σ_i 1/σ_i², S_x = Σ_i x_i/σ_i², S_y = Σ_i y_i/σ_i²,
    S_xx = Σ_i x_i²/σ_i², S_xy = Σ_i x_i y_i/σ_i²,
    Δ = S S_xx − S_x²)
    a = (S_xx S_y − S_x S_xy) / Δ
    b = (S S_xy − S_x S_y) / Δ
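
A sketch of these closed-form expressions, following the sums defined above:

```python
import numpy as np

def fit_line(x, y, sigma):
    """Weighted least-squares straight line y = a + b x (cf. Numerical Recipes §15.2)."""
    w = 1.0 / sigma**2
    S, Sx, Sy = np.sum(w), np.sum(w * x), np.sum(w * y)
    Sxx, Sxy = np.sum(w * x * x), np.sum(w * x * y)
    delta = S * Sxx - Sx**2
    a = (Sxx * Sy - Sx * Sxy) / delta
    b = (S * Sxy - Sx * Sy) / delta
    return a, b
```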

17
Confidence interval
  • The minimum value χ²_min has a chi-square
    distribution with N − M degrees of freedom (N data
    points, M fitted parameters).
  • The quantity Δχ² = χ² − χ²_min has a
    chi-square distribution with M degrees of freedom
    (for M parameters).
  • The probability of a given parameter value A being
    the true value is given by the probability of
    getting the observed Δχ² for that value.
  • When Δχ² = 1 (for a single parameter) this
    corresponds to 68%, i.e. 1σ
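
A sketch of bracketing a 1σ interval for the slope b with Δχ² = 1, reusing x, y and the chi_squared / fit_line helpers from the earlier sketches; holding a fixed at its best-fit value is a simplification (strictly one should re-minimise over a at each trial b):

```python
import numpy as np

# Assumes x, y and the helpers above; per-point uncertainties as an array.
sigma = np.full_like(x, 0.5)                 # illustrative constant σ_i

a_best, b_best = fit_line(x, y, sigma)
chi2_min = chi_squared(a_best, b_best, x, y, sigma)

# Scan b around the best fit with a held fixed, and keep the Δχ² ≤ 1 region.
b_scan = np.linspace(b_best - 1.0, b_best + 1.0, 2001)
dchi2 = np.array([chi_squared(a_best, b, x, y, sigma) for b in b_scan]) - chi2_min
inside = b_scan[dchi2 <= 1.0]                # Δχ² ≤ 1  ↔  68% (1σ)

print(f"b = {b_best:.3f}, 1σ interval ≈ [{inside.min():.3f}, {inside.max():.3f}]")
```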

19
The value of χ²
  • The value of χ² tells you more about the model
    and the data
  • If χ² is greater than the number of degrees of
    freedom, either the real errors are greater than
    the σ_i that you used, or the model is not good.
  • If χ² is less than the number of degrees of
    freedom, either the real errors are smaller than
    the σ_i that you used, or the model has too many
    parameters.
20
General linear models
  • Express your model as a sum of basis functions
    with linear coefficients
    y(x) = Σ_k a_k X_k(x)
  • The functions X_k(x) can be arbitrary, but are
    fixed
  • A common example is a polynomial fit, where the
    basis functions are powers of x, X_k(x) = x^(k−1)

21
Finding minimum of χ² (analytically, for a general
model)
  • Differentiate χ² with respect to each parameter
    a_k and set the derivatives to zero
  • Define a matrix α and a vector β
    α_kj = Σ_i X_j(x_i) X_k(x_i) / σ_i²
    β_k = Σ_i y_i X_k(x_i) / σ_i²
  • α is called the curvature matrix

22
  • Then the equations can be written in matrix form
    Σ_j α_kj a_j = β_k, i.e. α a = β
  • And the solutions are given by
    a = α⁻¹ β = C β
  • Where C = α⁻¹ is the inverse of the curvature
    matrix
  • C is also called the covariance matrix
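
A sketch of this construction in NumPy (per-point sigma arrays are assumed; note that Numerical Recipes recommends SVD over direct matrix inversion for better numerical stability):

```python
import numpy as np

def general_linear_fit(x, y, sigma, basis):
    """Fit y(x) = Σ_k a_k X_k(x) by solving the normal equations α a = β.

    `basis` is a list of functions X_k; sigma holds per-point uncertainties.
    """
    X = np.column_stack([f(x) for f in basis])  # design matrix, X[i, k] = X_k(x_i)
    w = 1.0 / sigma**2                          # weights 1/σ_i²
    alpha = X.T @ (w[:, None] * X)              # curvature matrix α
    beta = X.T @ (w * y)                        # vector β
    C = np.linalg.inv(alpha)                    # covariance matrix C = α⁻¹
    return C @ beta, C                          # coefficients a, and C

# Example basis for a quadratic polynomial, X_k(x) = x^(k-1):
# a_fit, C = general_linear_fit(x, y, sigma, [lambda t: t**0, lambda t: t, lambda t: t**2])
```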

23
Non-linear fits
  • Easiest approach is to make the model linear, for
    example by taking logs, as in the sketch below
  • Otherwise use a direct parameter search for the
    minimum χ²
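
For instance, for an assumed exponential model y = A e^(b x), taking logs gives ln y = ln A + b x, which is linear in the parameters ln A and b. A short sketch with made-up data:

```python
import numpy as np

# Made-up data from y = A exp(b x) with small multiplicative noise.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 2.0, 20)
y = 3.0 * np.exp(1.5 * x) * rng.normal(1.0, 0.02, size=x.size)

# Taking logs linearises the model: ln y = ln A + b x.
lnA, b = np.polynomial.polynomial.polyfit(x, np.log(y), 1)
print(f"A ≈ {np.exp(lnA):.3f}, b ≈ {b:.3f}")
```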

24
Workshop
  • Least squares straight line fit, and interpreting
    the resulting χ²
  • Non-linear fit using a simple search for the
    minimum χ²