Wording can sometimes be quite critical to the meaning. “Has not been shown experimentally” is quite different from “Has been shown experimentally not to.”
Exercise science is a difficult area for measurement. With pharmaceuticals, for example, so much money is involved that a study can have, and often will have, thousands of subjects. With that many subjects, random error or “noise” (for example, the way a person’s weight will fluctuate somewhat even with no real change in the body) largely cancels out.
But in exercise science there is much less money, so studies ordinarily use rather few subjects, such as 7 or 10. As a result, noise becomes a large factor.
For example, let’s consider a fat loss study. Say 5 subjects in each group (treatment and placebo) are instructed to start an exercise program and keep their diet the same for 4 weeks, but diet is not controlled.
Just to have an example to look at, let’s say the placebo group’s changes in weight are +4, 0, +1, -4, and -1 lb. The total weight change among all 5 subjects is 0 lb. On the face of it, someone might conclude that exercise alone did nothing.
And let’s say the group receiving the treatment gets results of -4, -2, -5, 0, and +4 lb. The total weight change among all 5 subjects is -7 lb. Looks better than what actually happened with the placebo group, but do the results show that the treatment had any effect?
It’s clear even without grinding through a statistical analysis that we can’t conclude that.
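For anyone who does want to grind through it, here is a minimal sketch of one way to check. It uses an exact permutation test, which is my choice for illustration rather than anything prescribed here: it pools the 10 example weight changes, tries every possible way of calling 5 of them the "treatment" group, and counts how often chance alone gives a result at least as favorable as the one observed. The variable names are made up for the example.

```python
from itertools import combinations

# Weight changes in lb, taken from the example above.
placebo = [4, 0, 1, -4, -1]
treatment = [-4, -2, -5, 0, 4]

pooled = placebo + treatment
n = len(treatment)

# Observed difference in total weight change (treatment minus placebo).
observed = sum(treatment) - sum(placebo)

# Try every way of labelling 5 of the 10 subjects as "treatment".
at_least_as_favorable = 0
total_splits = 0
for idx in combinations(range(len(pooled)), n):
    chosen = set(idx)
    t_sum = sum(pooled[i] for i in chosen)
    c_sum = sum(pooled[i] for i in range(len(pooled)) if i not in chosen)
    # More negative means more weight lost in the "treatment" group.
    if t_sum - c_sum <= observed:
        at_least_as_favorable += 1
    total_splits += 1

# One-sided p-value: share of random labellings that look as good or better.
p_value = at_least_as_favorable / total_splits
print(f"observed difference in totals: {observed} lb")
print(f"splits checked: {total_splits}")
print(f"one-sided p-value: {p_value:.3f}")
```

With these numbers the p-value comes out well above the conventional 0.05 cutoff, which matches the eyeball conclusion: a 7 lb difference in totals is easily produced by chance with groups this small.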
Might it not have happened, by chance, that the placebo group’s +4 lb subject hadn’t been chosen, or had been put into the group receiving the treatment, and that another person like the -4 lb person had taken his place? That could have happened, couldn’t it?
If it had, then the total weight change among all 5 placebo subjects would have been -8 lb instead of 0 lb. We really can’t say at all precisely what weight loss in the placebo group was caused by the exercise. Our measurement is of exercise effect plus noise.
So since the placebo + exercise group might plausibly have lost 8 lb (total), while the treatment + exercise group lost 7 lb total, clearly the treatment “has not been shown experimentally” to aid weight loss.
But might it? Sure. Maybe noise went the other way here. Perhaps if there had been very many subjects, almost none would have had a weight gain result, and there would have been many more cases of large weight loss. Maybe 2 or even 3 out of the 5 subjects in the treatment group were individuals who would have gained considerable weight during the 4 weeks if they hadn’t had the treatment. But we don’t know. There is no way to know from a study like this, with so few subjects and so much random variation.
To actually be able to measure this and have a meaningful result, we’d need a lot of subjects, and we might need much more than 4 weeks.
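To get a feel for how much the number of subjects matters, here is a rough simulation sketch. Everything in it is an assumption made up for illustration: a true effect of -1 lb per subject, normally distributed noise with a standard deviation of 3 lb, and the function name `spread_of_group_mean`. It repeats an imaginary study many times and reports how much the group's average result bounces around from one repeat to the next.

```python
import random
import statistics

def spread_of_group_mean(n_subjects, noise_sd=3.0, true_effect=-1.0,
                         trials=1000, seed=0):
    """Standard deviation of the group-average weight change across
    many simulated repeats of the same study (hypothetical numbers)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(trials):
        # Each subject's measured change = real effect + random noise.
        changes = [true_effect + rng.gauss(0.0, noise_sd)
                   for _ in range(n_subjects)]
        means.append(sum(changes) / n_subjects)
    return statistics.stdev(means)

for n in (5, 50, 500):
    print(f"{n:4d} subjects: group average varies by about "
          f"{spread_of_group_mean(n):.2f} lb between repeats")
```

Under these assumptions the spread shrinks roughly with the square root of the number of subjects, so with 5 subjects the noise in the group average is larger than the assumed 1 lb effect, while with several hundred it is much smaller.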
The above was just illustration.
Muscle mass gain is an even harder thing to measure for most exercise science researchers, because the subjects are typically capable of gaining muscle easily during the course of the study, the treatment period is short, and individual variation is very large.
This is why even a real effect, one that is of use to those who through long, dedicated, and intelligent training have largely plateaued, is usually undetectable in typical exercise science studies that measure muscle mass.
Protein synthesis, on the other hand, is much more readily measured to high statistical significance. See, for example, this study with leucine plus essential amino acids: http://ajcn.nutrition.org/content/94/3/809
More generally, the science can be an excellent basis for ruling things out (where things are shown not to work, as opposed to merely not being shown to work) and for finding promising leads to then try out.
The “lab” of personal experience, for someone trained to the point of knowing pretty accurately what “would” happen without the supplement, can actually be more meaningful than the typical study, which is limited by the above factors. (A study that used athletes in a like situation and allowed sufficient time could be better yet, but there are very few such studies.)