An interesting read over at Quartz focused on p-values and the t-test. Researchers are proposing 0.005 instead of 0.05 as the standard threshold for a p-value, to address a lack of repeatability in research. That would certainly make a lot of Brulosophy, Experimental Brewing, etc. tests “non-significant”, although I’d never be able to convince people at work to use 0.005 anyway. A fun little insight into beer/Guinness history, too.
I read that last week and I don’t know if I’d feel comfortable, in our conditions, moving to that much lower a standard. I also feel like the homebrew community’s use of the p-value is at least somewhat close to Gosset’s original use case.
This makes sense only if you can get like 500-1000 people sampling each experiment. For those who can only get 20-30 people, a p value of like 0.1-0.2 would actually be more reasonable IMHO, and then you can say “well we truly can only place like 80-90% confidence in our results, due to such a low population of samplers… many more experiments with higher populations will be necessary to help support or refute results”.
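To put rough numbers on that trade-off, here is a minimal sketch. It assumes the experiment is a triangle test (each taster picks the odd beer out of three, so pure guessing is right 1/3 of the time) and uses an exact one-sided binomial test; neither assumption comes from the posts above, and the function names are made up for illustration.

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def min_correct(n, alpha, p_chance=1/3):
    """Smallest number of correct picks whose one-sided p-value is <= alpha.

    p_chance=1/3 is the guessing rate of a triangle test (an assumption).
    Returns None if even n correct picks would not reach alpha.
    """
    for k in range(n + 1):
        if binom_tail(n, k, p_chance) <= alpha:
            return k
    return None

if __name__ == "__main__":
    # Panel sizes and thresholds mentioned in the thread.
    for n in (20, 30, 500):
        for alpha in (0.2, 0.05, 0.005):
            k = min_correct(n, alpha)
            print(f"n={n:3d}  alpha={alpha:<5}  need at least {k} correct")
```

Running it shows how many tasters have to get it right at each threshold. With a small panel, a stricter threshold pushes the required count well above what chance alone would give, which is why small panels rarely clear it.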
+10 ^
I’m leaning toward ignoring p value for citizen science experiments.
I lean towards keeping the p-value, with better education and emphasis on what it does or doesn’t mean, and frequent reminders of the various caveats associated with any experiment. Although p-values are not the be-all and end-all of science, they are a consistent way to interpret numerical results. I would argue that results from any experiment are virtually meaningless without some form of quantification. Now the flip side of that is just because you throw a number on it doesn’t mean that number means anything…but I also think that if we aren’t really willing to take homebrew experiments seriously, they’re not really providing anything useful. We’re then back to the school of “meh, I’ll do what I want”, which I guess works in some cases but can result in a lot of wasted time, money, and beer. Which is really why I (as a homebrewer and as a scientist) find homebrew experiments useful: I want to evaluate if step A or ingredient Z or equipment R is worth the trouble and expense!
So where should you ignore (or be extremely cautious with) the p-value? If a power test says your sample size ain’t big enough to reliably detect a difference, you’re probably safe to ignore it. Which to me means just conduct more experiments or recruit more tasters!
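For anyone wondering what such a power check might look like, here is a rough sketch. It assumes a triangle test with a 1/3 guessing rate and a simple model where some fraction of tasters can truly tell the beers apart while everyone else guesses. The model, the function names, and the 30% figure in the example are all assumptions for illustration, not something from the article or this thread.

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def triangle_power(n, alpha, frac_detect, p_chance=1/3):
    """Chance that a panel of n tasters reaches p <= alpha.

    frac_detect is the assumed share of tasters who can really tell the
    beers apart; the rest guess and are right with probability p_chance.
    """
    # Smallest count of correct picks that is significant under pure guessing.
    # range(n + 2) guarantees a hit, because the tail beyond n is 0.
    k_crit = next(k for k in range(n + 2) if binom_tail(n, k, p_chance) <= alpha)
    # Probability that any one taster picks correctly under the assumed model.
    p_true = frac_detect + (1 - frac_detect) * p_chance
    return binom_tail(n, k_crit, p_true)

if __name__ == "__main__":
    # Hypothetical scenario: 30% of tasters can genuinely detect the difference.
    for n in (20, 30, 100, 500):
        for alpha in (0.05, 0.005):
            power = triangle_power(n, alpha, frac_detect=0.3)
            print(f"n={n:3d}  alpha={alpha:<5}  power={power:.2f}")
```

If the power comes out low for your panel size, a non-significant result says very little either way, which is the case for running more trials or recruiting more tasters.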
I guess the older I get and the more experiments I do, the less I think of it as “real” science. So in that regard, I question the value of the p value.
Like Denny, I am old, but unlike Denny, I do not screw around or care about experiments.
However, when I hit a home run, I am willing to share the details.
BREW PACMAN’S BLACK WIDOW KOLSCH, ONLY SWAP OUT THE HONEY MALT FOR MELANOIDIN.
If one brewer pays attention to the above, then my work is done.
For reference: Black Widow Kolsch (All Grain)