• Page 21-22

• Chapter 2

• CHAPTER 3

• L = log(P)
L1 = Taylor-series expansion of L, to second order, about the peak; L1 ≈ L
-this was a bit #unclear at first: the truncation works because, at the peak, the first-derivative term vanishes, and near the peak the higher-order terms are small
P ≈ exp(L1)

GAUSSIAN EQUATION
-look at the parts being exponentiated: you'll see that the two expressions match (because they're describing the same probability distribution). THEREFORE:

• Page 22 - What we've done is re-derive the Gaussian equation.
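Written out explicitly (my reconstruction of the pp. 21-22 steps; X0 denotes the position of the peak, where dL/dX = 0):

```latex
% Second-order Taylor expansion of L = \ln P about the peak X_0
% (the first-derivative term vanishes at the maximum):
L(X) \approx L(X_0) + \frac{1}{2}\,\frac{d^2 L}{dX^2}\bigg|_{X_0}(X - X_0)^2

% Exponentiating gives a Gaussian shape in X:
P(X) \approx A \exp\!\left[\frac{1}{2}\,\frac{d^2 L}{dX^2}\bigg|_{X_0}(X - X_0)^2\right]

% Matching against \exp\!\left[-\frac{(X-\mu)^2}{2\sigma^2}\right] identifies
% \mu = X_0, \qquad \sigma = \left(-\frac{d^2 L}{dX^2}\bigg|_{X_0}\right)^{-1/2}
```

The second derivative at the peak is negative (it's a maximum), so the bracketed exponent is negative and the width sigma is real.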

• Section 2.2.1-Coin Tossing Example

• Section 2.2.2 - Asymmetric Posterior PDFs. = Easy, but cool.

• Section 2.2.3 - Multi-modal PDFs.

• Section 2.3- Gaussian Noise and Averages

• Section 2.3.1 - Data with different-sized error bars. (A more general case of the earlier section.)

• Example 3: the Lighthouse Problem

• #unclear at first; it makes a lot more sense in this context:

• Last section => generalized probability distributions.

This section => Plug in the Binomial Distribution with a uniform prior.

• Here’s where the maximum likelihood is.

• We’ve solved for H0. We plug this into the formulas for the second derivatives (to get them in terms of H0). This allows us to rewrite the equations accordingly.

• This formulation has the following properties:

Since H0 does not vary a lot after a moderate amount of data have been analysed, the numerator tends to a constant value; thus the width of the posterior becomes inversely proportional to the square root of the number of data, as can be seen in Fig. 2.1. This formula for the error-bar also confirms our earlier assertion that it is easier to identify a highly biased coin than it is to be confident that it’s fair (because the numerator in eqn (2.20) has its greatest value when H0 = 0.5). The posterior pdf for H, given the data 9 heads in 32 flips, and the Gaussian approximation to it resulting from the above analysis (H = 0.28 ± 0.08), are shown in Fig. 2.4.

My own summary:

1. We approximate the posterior with a Gaussian by using a “plain old” Taylor series.
2. We then do some algebra to show that the error-bar is inversely proportional to the square root of the number of data collected. (WHICH IS A COOL RESULT!!)
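A quick numeric check of the figures quoted in the excerpt (9 heads in 32 flips), using the eqn (2.20) width formula sqrt(H0(1 − H0)/N):

```python
import math

# Coin-flip data from the text: R heads in N flips.
R, N = 9, 32

# Peak of the posterior (maximum likelihood with a uniform prior): H0 = R/N.
h0 = R / N

# Width of the Gaussian approximation (eqn 2.20):
#   sigma = sqrt( H0 * (1 - H0) / N )
sigma = math.sqrt(h0 * (1 - h0) / N)

print(f"H = {h0:.2f} +/- {sigma:.2f}")  # H = 0.28 +/- 0.08
```

This reproduces the H = 0.28 ± 0.08 quoted above, and the numerator H0(1 − H0) is indeed largest at H0 = 0.5, which is why a fair coin has the widest error bar.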
• The idea of a

1. best estimate (peak; derivative = 0)
2. error-bar (width; from the second derivative)
3. or even a confidence interval (integral)

is merely an attempt to summarize the posterior with just two or three numbers; sometimes this just can’t be done.

1. We’re measuring something, but we assume our measurements will be flawed according to a “normal”/Gaussian distribution.
-We justify the assumption later.
2. “Given a set of data {xk}, what is the best estimate of μ?”
3. We restrict ourselves to the easy case where we know theta (the per-measurement error).
• We use Bayes’ Rule:

1. Posterior = pdf of the true value μ, given the data.
2. Prior = pdf of μ before seeing the data.
-We assign a uniform prior (to express our ignorance). (Knowing theta doesn’t tell us μ.)
-The uniform prior is normalized so that it integrates to 1 (by setting it to the reciprocal of the range).
3. Likelihood function = probability of the {x_k}’s given μ.
-Equals the product of the probabilities of the individual data points (assuming independence of measurements).
• Do some fancy algebra, and get the following result for the log of the posterior.

• We set the derivative = 0 to find the MLE. It turns out that this value is simply the average of the individual data points. (Which makes sense, since it’s a Gaussian with a uniform prior.)
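A minimal numeric sketch of that claim, on invented data. With a uniform prior, the log-posterior is L(μ) = −Σ(xk − μ)²/(2 theta²) + const; maximizing it over a grid lands on the sample average:

```python
# Hypothetical measurements with a known per-measurement error theta.
data = [4.9, 5.2, 4.7, 5.1, 5.0]
theta = 0.3

def log_posterior(mu):
    # Uniform prior => log-posterior = log-likelihood + const.
    return -sum((x - mu) ** 2 for x in data) / (2 * theta ** 2)

# Crude grid search for the peak, mu in [4.0, 6.0].
grid = [4.0 + i * 0.001 for i in range(2001)]
mu_best = max(grid, key=log_posterior)

mean = sum(data) / len(data)
print(mu_best, mean)  # the peak coincides with the sample average
```

Setting the derivative to zero gives the same thing analytically: dL/dμ = Σ(xk − μ)/theta² = 0 implies μ = (Σxk)/N.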

• We solve for the second derivative (in order to solve for the value of THE ERROR BAR, not theta!), by using an equation we used earlier when we were deriving the Gaussian distribution.

IN THE DERIVATION WE DID EARLIER, WE GOT: ERROR-BAR = THETA.

IN THIS DERIVATION, WE GET:
ERROR-BAR = THETA / SQRT(N)

This also makes sense against the earlier derivation, because there N = 1.

I was confused at first by the two results, but here’s the resolution: one theta refers to the error on each individual measurement; the error-bar here refers to the uncertainty in μ given all the measurements together. (This becomes clear once we get different error-bars for each measurement in Section 2.3.1.)
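A numeric check of that error-bar formula, on invented data (theta = the per-measurement error). The curvature of the log-posterior at its peak gives the width via 1/sqrt(−L''), exactly as in the Gaussian-approximation derivation:

```python
import math

# Hypothetical data with a known per-measurement error theta.
data = [4.9, 5.2, 4.7, 5.1, 5.0]
theta = 0.3
N = len(data)

mu0 = sum(data) / N  # peak of the posterior (the sample average)

def log_post(mu):
    return -sum((x - mu) ** 2 for x in data) / (2 * theta ** 2)

# Numeric second derivative of the log-posterior at the peak.
h = 1e-4
d2 = (log_post(mu0 + h) - 2 * log_post(mu0) + log_post(mu0 - h)) / h ** 2

error_bar = 1 / math.sqrt(-d2)
print(error_bar, theta / math.sqrt(N))  # both ~0.134
```

With N = 1, error_bar = theta, matching the earlier single-measurement derivation; with more data the uncertainty in μ shrinks as 1/sqrt(N) even though each measurement’s theta stays the same.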
