I really shouldn't, but anyone who continues with this post will now be punished with a tedious explanation of how the opposite case, where the authors claim from the statistics that a real difference probably exists, would in many cases be unjustified as well.

Most scientists and readers believe that when a difference in effect is seen and the statistical analysis reports it as significant at, for example, p < 0.05, this means there is less than a 5% probability that random chance, rather than a real effect, caused the measured difference.

But that is not at all what it means! It has nothing to do with that.

Rather, it deals with a completely different matter: if we generated two sets of numbers that both vary around the same value, by chance alone, in the same manner that the test data was seen to randomly vary, what percent of the time would chance alone produce an apparent effect at least as large as the one that was seen?
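For anyone who wants to see that question asked literally, here is a minimal Python sketch. It is only an illustration under assumptions of my own: the function name `chance_fraction` is made up, the random variation is assumed to be normally distributed, only differences in the "improvement" direction are counted, and the numbers in the usage line are arbitrary rather than taken from any real experiment.

```python
import random
import statistics


def chance_fraction(observed_diff, n_per_group, sd, trials=2000, seed=0):
    """Estimate how often chance alone produces a difference in group
    means at least as large as observed_diff.

    Both groups are drawn from the SAME distribution (mean 0, the given
    standard deviation), so any difference between them is pure chance.
    Normality and the one-sided comparison are assumptions of this sketch.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    hits = 0
    for _ in range(trials):
        group_a = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        diff = statistics.fmean(group_a) - statistics.fmean(group_b)
        if diff >= observed_diff:
            hits += 1
    return hits / trials


# Arbitrary illustrative numbers: 16 per group, SD of 1 unit,
# and an observed difference of 0.6 units between the group means.
print(chance_fraction(observed_diff=0.6, n_per_group=16, sd=1.0))
```

The number that comes out is the thing a p value estimates. Notice that nowhere in the code is there any input for how likely a real effect was in the first place.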

For example, let's say we are medicinal chemists. We've taken on the really-poor-chance-of-success idea of randomly trying isolated compounds from previously unknown plants and seeing if they seem to increase the lifespan of rats.

The company we work for has been doing this for decades now. 1000 compounds have been run and, on giving really thorough investigation to each one that looked hopeful, only one has really panned out as solid.

But we keep going. We are now considering Compound X.

We ran it with, oh I don't know, 16 rats. They lived an average of 33 months with a standard deviation of, oh I don't know, 72 days.

Our established placebo results are average of 30 months, with standard deviation of, I don't know, say 65 days.

We run the statistics, which I'm not actually going to do, and hey!! That 3 month difference is statistically significant at a p of, say, exactly 0.05! In other words, only 5% of the time would chance alone yield an apparent improvement as large as 3 months!

Hooray! Let's publish the article!

But wait a sec, is there really only a 5% probability that in the actual experiment, chance alone was the cause?

By no means. We already know that only 1 time in 1000 (roughly, as our best information) does a so-selected compound work to increase lifespan of rats.

But for every 1000 experiments we do, by chance alone on average 50 of them will turn up an apparent improvement as large as this!

So for every real effect, there are 50 others that will appear merely by chance but still looking just as solid and satisfying the p <= 0.05 requirement.

So the reality is that the odds are roughly 50 to 1 against this compound actually having this effect, rather than the 95% chance that it does have this effect that most readers and scientists would assume.
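The arithmetic behind that 50 to 1 figure can be written out as a short Python sketch. The function name `chance_effect_is_real` is hypothetical, and the `power` parameter is an assumption I am adding: setting it to 1.0 means the one compound in 1000 that really works is assumed to always show up as significant, which is the most generous case for the compound.

```python
def chance_effect_is_real(prior, alpha, power=1.0):
    """Probability that a 'significant' result reflects a real effect.

    prior: fraction of tested compounds that really work (1 in 1000 here)
    alpha: significance threshold, i.e. the false-positive rate (0.05)
    power: chance a real effect is detected; 1.0 is an assumption of
           this sketch, not something measured
    """
    true_positives = prior * power
    false_positives = (1.0 - prior) * alpha
    return true_positives / (true_positives + false_positives)


p_real = chance_effect_is_real(prior=1 / 1000, alpha=0.05)
odds_against = (1.0 - p_real) / p_real

print(f"Chance the effect is real: {p_real:.1%}")
print(f"Odds against: about {odds_against:.0f} to 1")
```

With these inputs the chance that Compound X really works comes out at roughly 2%, or about 50 to 1 against. Any power below 1.0 only makes the picture worse.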

This is far more important than one might think. This fundamental error results in a lot of nonsense being taken as being demonstrated.

When tens or hundreds of thousands of quite-unlikely-to-have-real-effect experiments are done each year in the world, and there most certainly are at least this many, there are inevitably vast numbers that claim outcomes as "statistically significant" when in fact the claimed effect does not exist.

This is a different situation from the article, where it was claimed that a difference probably did not exist when that could not correctly be concluded from the data. But it is the same fundamental problem: coming to undemonstrated or wrong conclusions by misunderstanding statistical fundamentals, in ways that are practically epidemic among scientists, doctors, readers in general, etc.