  Does this statistical inference make any sense?

I am trying to measure a quantity from two data sets that come from identical experiments, except that the data were taken at different times. The measurement from the first data set gives \(a\pm\sigma_a\); the second data set gives \(b\pm\sigma_b\), where \(\sigma_a \sim \sigma_b \sim \sigma\) is the statistical uncertainty. But it turned out that \(a-b > 3\sigma\), which means a t-test shows that the two results are highly unlikely to come from the same distribution.

If I combine the two results in the usual way, weighting them by the inverse squares of their uncertainties and calculating the final statistical uncertainty likewise, I get \((a+b)/2 \pm \sigma/\sqrt{2}\). But \((a+b)/2 \pm \sigma/\sqrt{2}\) is not a satisfactory interpretation of my measurement. From a Bayesian point of view, the result from the first data set can be treated as the prior probability distribution of the parameter being measured. Updating this prior with the second data set gives me a final (posterior) probability distribution of the parameter that is entirely different from the one described by \((a+b)/2 \pm \sigma/\sqrt{2}\) in the usual frequentist way.

We are exhausted trying to find the systematic variation between the two data sets, which were supposed to give us the same result. What is the correct inference of the final result in each approach, Bayesian and frequentist? Or how should the final result be presented in this case? Any opinion, as a comment or an answer, will be highly appreciated. Thanks.

asked Jun 23, 2014 in Experimental Physics by Nottherealwigner (135 points)
edited Jun 24, 2014 by Nottherealwigner
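The "usual" combination described in the question can be sketched as follows. This is a minimal illustration, not the asker's analysis: the function names and the numbers (`a = 10.3`, `b = 10.0`, `sigma = 0.1`) are hypothetical, chosen so that the two values differ by exactly \(3\sigma\).

```python
import math

def combine_inverse_variance(a, sigma_a, b, sigma_b):
    """Inverse-variance weighted mean of two measurements and its uncertainty."""
    w_a = 1.0 / sigma_a**2
    w_b = 1.0 / sigma_b**2
    mean = (w_a * a + w_b * b) / (w_a + w_b)
    sigma = math.sqrt(1.0 / (w_a + w_b))
    return mean, sigma

def pull(a, sigma_a, b, sigma_b):
    """Difference a - b in units of the uncertainty on the difference."""
    return (a - b) / math.hypot(sigma_a, sigma_b)

# Hypothetical numbers (assumption): equal uncertainties, a - b = 3 * sigma.
a, b, sigma = 10.3, 10.0, 0.1
mean, err = combine_inverse_variance(a, sigma, b, sigma)
tension = pull(a, sigma, b, sigma)

print(f"combined: {mean:.3f} +/- {err:.3f}")  # (a+b)/2 +/- sigma/sqrt(2)
print(f"pull of the difference: {tension:.2f}")
```

With equal uncertainties this reproduces \((a+b)/2 \pm \sigma/\sqrt{2}\). Note that the uncertainty on the difference \(a-b\) is \(\sqrt{\sigma_a^2+\sigma_b^2}\), so a gap of \(3\sigma\) corresponds to a pull of about 2.1 rather than 3. If one additionally assumes Gaussian likelihoods, taking the first result as the prior and updating with the second yields a Gaussian posterior with this same mean and width.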

As an experimenter, if I got a 3-sigma difference between two experiments with supposedly identical initial conditions, I would question the "identical" first, i.e. check very carefully what has changed between the two times of taking data. If I could find no conceivable input change, I would take a third data set. Playing with probability methods would not be my choice. Maybe the "error" is in the first data set.

Dear Ana, it seems like you did not read the complete question. I know what you mean, but there is this important part in the question: "We are exhausted trying to find the systematic variation between the two data sets". It also means we have no alternative data. That's why it is an important question to ask in POF. Could you please comment on the whole question?

You are saying there is no possibility of getting a third data set from a new experiment, and you have to publish? I would treat the two data sets independently, with their statistical and systematic errors, and publish both values, in a similar way to how the Particle Data Group presents the values from different experiments (for example, for the mass of the Z), and let the reader decide.

Alternatively, and maybe in parallel, I would combine the two sets assuming that the 3 sigma is a fluke of statistics, stating the matter clearly. After all, resonances with three-sigma significance have disappeared before. That is why we set 5 sigma as definitive. (I am using "resonances" as an example.)
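To put numbers on the 3-sigma versus 5-sigma remark, here is a small sketch that converts a significance in sigma into the probability of a fluctuation at least that large. It assumes a Gaussian distribution and a two-sided test; one-sided conventions, which are also in use, give half these values. The function name is hypothetical.

```python
import math

def two_sided_p_value(n_sigma):
    """Probability that a Gaussian variable deviates by at least n_sigma
    standard deviations from its mean, in either direction."""
    return math.erfc(n_sigma / math.sqrt(2.0))

p3 = two_sided_p_value(3)
p5 = two_sided_p_value(5)

print(f"3 sigma: p = {p3:.2e}")
print(f"5 sigma: p = {p5:.2e}")
```

Under these assumptions a 3-sigma fluctuation occurs roughly 0.27% of the time, while a 5-sigma one occurs with probability of order \(10^{-7}\).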

1 Answer

There are two possibilities: Either (1) you have seen a rare 3-sigma event, or (2) you have underestimated the uncertainties in your experiment. Making an accidental systematic change between the two measurements would count as an example of (2), i.e. a source of uncertainty that you didn't take into account.

Experience suggests that (2) is by far the likelier possibility.

Ideally you would figure out how and why you underestimated the uncertainty -- what aspect wasn't controlled properly or whatever. But if it's not terribly important and you're in a hurry and you can't do a third and fourth experiment, you can just say there is an additional source of uncertainty \(\sigma_{other}\) which accounts for the "unknown unknowns".

Now your two measurements are:

\(a \pm \sigma_a \pm \sigma_{other}\)

\(b \pm \sigma_b \pm \sigma_{other}\)

You can guess \(\sigma_{other}\) by setting it to a value that makes the two measurements 1 sigma apart or so (a reasonably probable value). After you do that, you can say that your best guess for the real answer is something like \((a+b)/2 \pm (\sigma_{other}/\sqrt{2})\) (since I gather that \(\sigma_{other}\) is the dominant source of uncertainty). But you can't treat that expression too literally; it's just a very rough guess.

answered Jul 10, 2014 by Steve B (135 points)
edited Jul 10, 2014 by Steve B
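One possible way to carry out the recipe in this answer is sketched below. It rests on assumptions the answer does not state: \(\sigma_{other}\) is taken to be independent between the two measurements and is added in quadrature to the statistical uncertainties, and "1 sigma apart" is read as \(|a-b| = \sqrt{\sigma_a^2+\sigma_b^2+2\sigma_{other}^2}\). The function names and the numbers are hypothetical.

```python
import math

def estimate_sigma_other(a, sigma_a, b, sigma_b, target_sigma=1.0):
    """Extra per-measurement uncertainty that brings a and b to within
    target_sigma standard deviations of each other (quadrature assumption)."""
    needed = ((a - b) / target_sigma) ** 2 - sigma_a**2 - sigma_b**2
    # If the measurements already agree well enough, no inflation is needed.
    return math.sqrt(needed / 2.0) if needed > 0 else 0.0

def combine_with_extra(a, sigma_a, b, sigma_b, sigma_other):
    """Inverse-variance weighted mean after inflating each uncertainty."""
    var_a = sigma_a**2 + sigma_other**2
    var_b = sigma_b**2 + sigma_other**2
    w_a, w_b = 1.0 / var_a, 1.0 / var_b
    mean = (w_a * a + w_b * b) / (w_a + w_b)
    return mean, math.sqrt(1.0 / (w_a + w_b))

# Hypothetical numbers (assumption): a - b = 3 * sigma with sigma = 0.1.
a, b, sigma = 10.3, 10.0, 0.1
sigma_other = estimate_sigma_other(a, sigma, b, sigma)
mean, err = combine_with_extra(a, sigma, b, sigma, sigma_other)

print(f"sigma_other = {sigma_other:.3f}")
print(f"combined: {mean:.3f} +/- {err:.3f}")
```

For these numbers the inflated uncertainty comes out as \(|a-b|/2\), somewhat larger than \(\sigma_{other}/\sqrt{2}\) because the statistical part is kept as well. As the answer says, the result is only a rough guess; a different choice of target separation changes it directly.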

user contributions licensed under cc by-sa 3.0 with attribution required