We Have Four Years of NHC Data. What Do You Want to Know?

Hey AHA community!

The AHA Data Task Force has been given access to NHC entry data from 2023 through the present, and we’re working toward a presentation at Homebrew Con 2026 titled “You’re Gonna Need a Bigger Spreadsheet: Analysis of NHC Data 2023 to Today”. Before we get too deep into the numbers, we want to hear from you.

As a small taste of what this data holds: the share of NHC entrants affiliated with a homebrew club has grown from 59.5% in 2023 to 83.5% in 2026. There’s a lot more where that came from.

But here’s where you come in. We need you to answer both of these:

  1. What would you like to know from the data?

  2. Why do you want to know it?

Both questions matter. The first tells us what to look at; the second tells us what you’re actually trying to understand or accomplish. Your answers will shape our analysis, the HBC presentation, and a potential future Zymurgy article. We’re looking forward to hearing what’s on your minds!

top ten styles by medals won as a percentage of submissions - why? identify which styles perform best when they are submitted

top 3 or 5 flaws across all ratings - why? know what is bad

top 5 most common positive comments on all ratings above 45 - why? know what is good

Thank you


My mind immediately goes to Ray Daniels’ book “Designing Great Beers”. Common ingredients in gold medal beers by style.


@freddthecat Thanks for laying those out so clearly, really useful framing for us.

On your first one, top styles by medals as a percentage of submissions, yes, we can absolutely do that. It’s one of the more revealing cuts of this data because raw medal counts are misleading on their own. A style that wins 15 medals but had 200 entries is a very different story from one that wins 8 medals out of 20 entries. Percentage of submissions gets at what you’re actually asking: where do your odds look best?
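To make the arithmetic concrete, here is a minimal sketch of that cut. The two medal and entry counts are the illustrative figures from the paragraph above, not real NHC results, and the style names and the `medal_rate` helper are placeholders made up for this example.

```python
def medal_rate(medals: int, entries: int) -> float:
    """Medals won as a percentage of entries submitted."""
    if entries <= 0:
        raise ValueError("entries must be positive")
    return 100.0 * medals / entries


# Placeholder style names; counts are the illustrative figures above,
# not actual NHC data.
styles = {
    "Style A": (15, 200),  # (medals, entries)
    "Style B": (8, 20),
}

# Rank styles by medal rate, best odds first.
ranked = sorted(styles.items(), key=lambda kv: medal_rate(*kv[1]), reverse=True)

for name, (medals, entries) in ranked:
    print(f"{name}: {medals}/{entries} = {medal_rate(medals, entries):.1f}%")
```

Style B ranks first at 40.0% despite winning fewer medals than Style A at 7.5%, which is the point of looking at rates rather than raw counts.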

On the second and third, I want to be straight with you: judges’ comments are not in the dataset we have access to right now, so we can’t pull the top flaws or most common positive descriptors from scoresheets. I’m going to reach out to see if that data is something we can get our hands on. Fair warning, though: even if we can get it, that kind of text analysis is a heavier lift than the numerical stuff, so no promises on timeline.

@cbeardsl That might be something for the AHA recipe data set analysis. Still worth doing in the long run but a different set of data than the BAP NHC data we’re working with.

Need to go back farther than that. Need to show pre-Covid, when entries were only $15, versus what they cost now. Also need to correlate low site numbers with the insane number of first-round sites. I would also say next time make this a presentation for the main stage instead of having it in a meeting room with a set number of chairs, where it also takes a speaking slot from someone who otherwise could have given an actual educational presentation.

@tj4336 Welcome to the forum! Congrats on your first post.

Our dataset starts at 2023, so the pre-Covid comparison is outside what we’re working with right now. Some data does exist in this 2019 article, though: National Homebrew Competition 2019: Statistics and Trends (American Homebrewers Association).

As for the presentation, we’re pretty confident there’s a rich and useful story in this data for anyone interested in competitive homebrewing. Hope you’ll come check it out.

I know what you mean.

re: comments, sorting them and presenting them.
I actually heard someone explaining developments in “fuzzy” searching using AI the other day. In fact, it’s doable for you now depending on what AI you have access to. Even the right free online one could work pretty well.

You want it to be able to search for concepts rather than exact words à la a search engine. I.e., it’s searching for words in the ballpark of, idk, “tropical aroma”, and would include anything related to pineapple, coconut, mango, or passion fruit. It would give various strength values to related terms like fresh, juicy, sweet, and ripe, other layers of strength to other supporting terms, and negative values to terms that oppose the concept of tropical aroma.
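Here’s a rough sketch of just the weighting idea, using plain keyword matching instead of any AI. Everything in it is an assumption for illustration: the weights, the opposing terms (“stale”, “cardboard”), the `concept_score` helper, and the sample comments are all made up, not taken from real scoresheets. An actual AI-based search would work out which terms are related on its own rather than from a hand-built list.

```python
import re

# Made-up weights: core terms score highest, supporting terms less,
# and terms that oppose "tropical aroma" score negative.
TROPICAL_WEIGHTS = {
    "tropical": 1.0,
    "pineapple": 1.0,
    "coconut": 1.0,
    "mango": 1.0,
    "passion fruit": 1.0,
    "juicy": 0.5,
    "ripe": 0.5,
    "fresh": 0.3,
    "sweet": 0.3,
    "stale": -0.5,       # hypothetical opposing term
    "cardboard": -1.0,   # hypothetical opposing term
}


def concept_score(comment: str, weights: dict[str, float]) -> float:
    """Sum the weights of every whole-word term match in the comment."""
    text = comment.lower()
    score = 0.0
    for term, weight in weights.items():
        hits = re.findall(r"\b" + re.escape(term) + r"\b", text)
        score += weight * len(hits)
    return score


# Made-up sample comments, not from real scoresheets.
comments = [
    "Ripe mango and pineapple on the nose",
    "Stale, cardboard notes dominate",
    "Clean malt, firm bitterness",
]

# Show the comments that best match the concept first.
for comment in sorted(comments, key=lambda c: concept_score(c, TROPICAL_WEIGHTS), reverse=True):
    print(f"{concept_score(comment, TROPICAL_WEIGHTS):+.1f}  {comment}")
```

A hand-built list like this misses synonyms and misspellings, which is where the AI-style approach would earn its keep.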

the presentation of this data could be in a bunch of ways, but yeah - it would be a neat project.

as a start are the judges’ comments in handwritten or digital format?