Page 21-22
Chapter 2
L = log(P)
L1 = the Taylor expansion of L about its maximum, truncated at the quadratic term. ≈ L
-this part is a bit #unclear
P ≈ exp(L1)
GAUSSIAN EQUATION
-look at the parts being exponentiated
You’ll see that the two exponents have the same form (near its peak the posterior is effectively a Gaussian), so we can match them term by term. THEREFORE:
Page 22 - what we’ve done is re-derive the Gaussian equation (i.e., shown that near its maximum the posterior is well approximated by a Gaussian).
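A sketch of those steps in symbols (my reconstruction; X is the quantity being inferred and X0 its best estimate, the position of the maximum):

\[
L(X) = \log P(X),
\qquad
L(X) \approx L(X_0) + \frac{1}{2}\left.\frac{d^2L}{dX^2}\right|_{X_0}(X - X_0)^2
\]

(the linear term vanishes because dL/dX = 0 at the maximum). Exponentiating,

\[
P(X) \approx A \exp\!\left[\frac{1}{2}\left.\frac{d^2L}{dX^2}\right|_{X_0}(X - X_0)^2\right],
\]

and comparing the exponent with the Gaussian

\[
\operatorname{prob}(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\,\exp\!\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]
\]

gives \(\mu \leftrightarrow X_0\) and the error-bar

\[
\sigma = \left(-\left.\frac{d^2L}{dX^2}\right|_{X_0}\right)^{-1/2}.
\]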
Section 2.2.1 - Coin Tossing Example
Section 2.2.2 - Asymmetric Posterior PDFs. = Easy, but cool.
Section 2.2.3 - Multi-modal PDFs.
Section 2.3 - Gaussian Noise and Averages
Section 2.3.1 - Data with Different-Sized Error Bars (a more general case of the earlier section).
Example 3: The Lighthouse Problem
Last section => Generalized probability distributions.
This section => Plug in the Binomial Distribution with a uniform prior.
Here’s where the maximum likelihood (the best estimate H0) is; see below.
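My reconstruction of that step, using the binomial likelihood with a uniform prior (R heads in N flips), so \(\operatorname{prob}(H \mid \text{data}) \propto H^R (1-H)^{N-R}\):

\[
L = \text{const} + R\log H + (N - R)\log(1 - H),
\]
\[
\left.\frac{dL}{dH}\right|_{H_0} = \frac{R}{H_0} - \frac{N - R}{1 - H_0} = 0
\quad\Longrightarrow\quad
H_0 = \frac{R}{N}.
\]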
We’ve solved for H0. We plug this into the formula for the second derivative (to get it in terms of H0).
This lets us rewrite the error-bar purely in terms of H0 and N (see below).
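My reconstruction of that rewritten form (it should correspond to the book’s eqn (2.20), which the passage below refers to):

\[
\left.\frac{d^2L}{dH^2}\right|_{H_0}
= -\frac{R}{H_0^2} - \frac{N - R}{(1 - H_0)^2}
= -\frac{N}{H_0(1 - H_0)}
\qquad (\text{using } R = N H_0),
\]

so the error-bar is

\[
\sigma_H = \left(-\left.\frac{d^2L}{dH^2}\right|_{H_0}\right)^{-1/2}
= \sqrt{\frac{H_0(1 - H_0)}{N}}.
\]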
This formulation has the following properties:
Since H0 does not vary a lot after a moderate amount of data have been analysed, the numerator tends to a constant value; thus the width of the posterior becomes inversely proportional to the square root of the number of data, as can be seen in Fig. 2.1. This formula for the error-bar also confirms our earlier assertion that it is easier to identify a highly biased coin than it is to be confident that it’s fair (because the numerator in eqn (2.20) has its greatest value when H0 = 0.5). The posterior pdf for H, given the data (9 heads in 32 flips), and the Gaussian approximation to it resulting from the above analysis (H = 0.28 ± 0.08), are shown in Fig. 2.4.
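A quick sanity check of those numbers against the formulas above:

\[
H_0 = \frac{9}{32} \approx 0.28,
\qquad
\sqrt{\frac{H_0(1 - H_0)}{N}} = \sqrt{\frac{0.281 \times 0.719}{32}} \approx 0.08,
\]

which matches the quoted H = 0.28 ± 0.08.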
My own summary:
The idea of a best estimate and an error-bar is merely an attempt to summarize the posterior with just two or three numbers; sometimes this just can’t be done, and the posterior pdf itself has to be displayed.
We use Bayes’ rule (this is Section 2.3, Gaussian Noise and Averages):
Do some fancy algebra and get the following result for the LOG of the posterior (reconstructed below).
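My reconstruction of that result, assuming N data points x_k, each with known Gaussian noise of width sigma, and a uniform prior on mu:

\[
L = \log\!\big[\operatorname{prob}(\mu \mid \{x_k\}, \sigma)\big]
= \text{const} - \sum_{k=1}^{N}\frac{(x_k - \mu)^2}{2\sigma^2}.
\]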
We set the derivative = 0 to find the maximum. It turns out that this value is simply the average of the individual data points (which makes sense: with a uniform prior this is just maximum likelihood for Gaussian noise).
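In symbols (same setup as above):

\[
\left.\frac{dL}{d\mu}\right|_{\mu_0}
= \sum_{k=1}^{N}\frac{x_k - \mu_0}{\sigma^2} = 0
\quad\Longrightarrow\quad
\mu_0 = \frac{1}{N}\sum_{k=1}^{N} x_k.
\]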
We solve for the second derivative (in order to get THE ERROR BAR!! (NOT theta!!)), using the error-bar formula from the earlier Gaussian derivation.
IN THE EARLIER DERIVATION, WE GOT ERROR-BAR = THETA.
IN THIS DERIVATION, WE GET
ERROR-BAR = THETA / SQRT(N)
In the earlier derivation this also makes sense, because there N = 1.
ACTUALLY, NO. I’m confused now:
Okay, I figured it out. One theta refers to the error on each individual measurement; the error-bar here is the uncertainty on the estimate given ALL the measurements. (This becomes clear once each measurement gets its own error-bar, in Section 2.3.1…).
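Putting that resolution in symbols (my reconstruction; the “theta” above is what the book calls sigma, the noise width of a single measurement):

\[
\frac{d^2L}{d\mu^2} = -\sum_{k=1}^{N}\frac{1}{\sigma^2} = -\frac{N}{\sigma^2},
\qquad
\text{error-bar on } \mu_0
= \left(-\frac{d^2L}{d\mu^2}\right)^{-1/2}
= \frac{\sigma}{\sqrt{N}}.
\]

So sigma is the spread of a single measurement, while sigma/sqrt(N) is the uncertainty on the inferred mean after N measurements.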