Consensus while judging?

For those who judge competitions, have you ever been on a flight where you could not come to a consensus with another judge?  I am not talking about adjusting a score up or down a few points.  I am talking about a score delta that is so large that the judges have to agree to disagree because no consensus score can be reached.

Yes, we brought in another judge from another table to break the tie. He was wrong too.

Yes, this occurs when one or more bullheaded judges don’t know what the hell they are talking about.  Happens a lot, unfortunately.  I am VERY much AGAINST competition organizers FORCING judges to compromise to come within a certain number of points of one another.  Why not allow each individual to speak for himself/herself!?  Taste and desire are a little subjective.  No one on Earth should be allowed to dictate to another person that what they perceive as desirable or undesirable is wrong.  Yes we have style guidelines for a reason.  No they are not perfect or all-encompassing.  And I am sorry, but I have seen Master judges make as many mistakes as Recognized judges.  We are all human and we should all be entitled to our own opinions.  Compromise drives me crazy.  Leave me alone to do my job the best way I know how based on my own experience, and let the entrant be the judge of the judges as to which one is on the mark and which is off base (if any).

It must really be a hoot to organize one of these things.

disagree on which should win a gold medal to the point that a third judge has to come in?  sure.

waaay apart on score and neither can convince the other to budge?  this is where I’d expect myself and the other know-it-alls to come in and do some educatin’…but I’ve never seen it happen.  Sometimes it has happened when a new judge didn’t understand/didn’t like the style but always they admitted their score was too low or high.

So, the lowest is 13. The highest is 49. That’s only 36 points between hell and nirvana. How can they not be within 7 points of each other given the narrow margin?

I guess I have a lot more to learn than I thought.

I had a similar situation at a recent contest. I scored the beer at 41 and my judging partner scored it at 29…didn’t like the hop character. I’m not one to say I’m infallible, so I did drop my score and they compromised a bit too. That beer was eventually pushed to mini BOS and won the category.

An important thing is to make sure that beers that might be good enough to push get their chance to shine in another forum. When you have large contests with multiple flights, there is a greater chance that other palates will have the opportunity to judge it. The beers in question just need the opportunity!

I was judging ciders with a brand new guy who had zero experience. He thought everything was wonderful. In the real world he was correct, but we have guidelines. On one beer he refused to budge and I wasn’t changing. For some reason I remember it as him being low and we were 10 points apart. I rejected his score entirely and awarded my score. Later in the flight he got with the program, but we did not revisit that cider. I cannot remember if it placed or not in the flight, but I did flip the score sheet and explain what happened to the entrant.

FWIW - I like to be within 3 points.

Friday will be my first, if they need me. I’m looking forward to the learning experience.

May the Force be with you, padawan learner!

Here’s the situation.  The delta between the scores was almost twenty points.  It was one of those beers that people either loved or hated (50% of the non-flight judges who tasted the beer loved it whereas the other fifty percent thought that it should be dumped).  I was going to give the beer a courtesy score of 13 before I saw the other judge’s score sheet.  Our comments were so different that it made me believe that we must have tasted different beers.  I bumped my score up to 29, but there was no way that I was going to give a seriously flawed beer a forty.  The other judge would not budge.  The head judge was clearly uncomfortable judging the category.  He did not have an opinion one way or the other, so he adjusted his score up to move the beer on.  I finally reached the point where I told the head judge to throw out my score because there was no way that I was going to give the beer a score anywhere near forty.

I think the only fair thing to do in that situation is to bring in a third judge (or even a second pair of judges if they’re available). Otherwise there’s no real way to know if a flaw that one judge noted, or a positive element that one didn’t, is truly there.

It definitely happens. I entered two beers in the NHC last year that - reading through the pencil erasing - had scoring deltas of 10 and 13. Clearly something was going on between those two judges that goes beyond a minor difference in perception. If one judge calls a beer too malty, and the other says it’s too dry, changing the scores to bring them closer together doesn’t do anything to get me valid feedback on the beer.

As an entrant, I always treat the erased pencil marks as my true scores.  I sometimes wish that the judges wouldn’t erase so hard so that I could actually read their original scores.

I’m seeing a lot of “loved it” and not “thought it fit the style really well”
I would like to believe that is not a problem BJCP judges have, but I’d settle for finding out it is rare.

perhaps you could give us a bit more on the style in question and what was so poor about it that made you consider a 13, and eventually 29?

This is really the most important thing. Ensuring that a medal-worthy beer gets a chance at medaling is more important than worrying about scores.

Usually, if the guy I’m judging with cannot see eye-to-eye with me, I will come down on score with the caveat that the beer will be pushed to mini-BOS. I explain to them that the worst case for that situation is that if the beer is really as bad as they say it is (e.g. a 29 for ‘subdued hop aroma’ in an APA), then the beer will be kicked immediately. No harm, no foul. But if the beer was as good as I think it is, it should place in mini-BOS. I’m usually correct in pushing the “questionable” beer to mini-BOS (in that they usually medal at that point), but I’ve been wrong before and seen a beer I fought for get kicked quickly. Better to err on the safe side though!

Agree completely with Martin and Amanda! I tend to err on the side of caution if I am not sure.  Had a somewhat similar situation this weekend.  Thought we had a beer that was pretty good so I figured I would give it a chance in mini-BOS so that a few other palates could taste the beer and decide its fate.  It placed in the top 3 and will advance to Nationals.

If you gave it a 13 and the other guy gave it a 40, then you thought it was flawed and the other guy didn’t.  What flaw did you find?

I often have to force myself to be objective when judging IPAs as I really don’t like CTZ hops, but a lot of people do, so I have to try to be objective even when I feel like dumping the beer down the drain.

Also curious. 13 to 40 is a pretty large gap, but I’ve seen stranger things. :smiley:

On my BJCP tasting exam, there was a Belgian dubbel.  Very phenolic, tasted exactly like friggin Carmex.  I believe I even used the term “Carmex” on the tasting sheet.  As such, I scored it relatively low, in the 20s.  It was an otherwise okay dubbel, with the dark fruit flavors, etc., but I just couldn’t get past the Carmex.  Meanwhile the Master level proctors all loved it, scored it in the 40s, probably claiming that they loved the rich complex phenols.  = Carmex.  Yuck.  Of course as a result of this disagreement, my exam score was severely impacted, and I remain convinced that I was in the right and they were in the wrong.  I might only be Certified but I don’t care what level they were.  I don’t want friggin Carmex in any beer that I drink, thank you very much.  No way I would have changed my score upwards for that beer.  After the exam, I also came to find out that many of the other test-takers agreed with me.  If only we could have negotiated with those Master judges, perhaps we could have brought them down.  I wonder how many other takers got screwed that day.

I don’t know what the point of all this is, except perhaps to say, taste is subjective, and we should all be entitled to our own opinions.  I have very deep feelings against trying to force anyone to do otherwise.  We can and should compare notes, listen to reason, and adjust scores when appropriate.  However we should also respect those who refuse to budge if they feel very strongly one way or the other.  I think in those cases, we should just let the scoresheets ride as is, and yes, assume that the higher score is the correct one, in fairness to the entrant.

The beer was basically a science experiment that was entered as a specialty beer.  With no claimed “like” beer and no category guidelines to use in judging the beer, I judged the beer based on the ingredients, process, and bugs claimed on the entry form.  The beer had a really harsh middle-of-the-tongue flavor that made it darn near undrinkable for me, which is why I contemplated giving it a 13 (my first score was actually in the low twenties).  I brought my score up because I wanted to reach a consensus.

As I brew mainly to study the behavior of brewing cultures (I have maintained a culture collection for most of the time that I have brewed), I am familiar with the flavors produced by the bugs claimed in the fermentation.  The harsh off-flavor was not a flavor that is produced by any of the bugs claimed under normal circumstances.  The flavor was definitely produced by wild non-brewing microflora pickup, which is a flaw that would prevent any beer from scoring in the forties.