Reimagining Cupping Competition Scoring, Part 3: Determining the Winner

Daily Coffee News photo/Nick Brown

(Editor’s note: This is Part 3 of an ongoing series exploring the start-to-finish development of a new cupping competition scoring system, to be used in practice at the Kona Coffee Cupping Competition in November. Two of the primary goals of the new system are objectivity and local adaptability. Part 1. Part 2.)

The first two parts of this series laid out some of the problems with current competition scoring systems, and ways in which we might better define the winning coffee in a competition. With a clear, understandable definition, contest entrants and observers can better follow and engage with the competition. A definition also allows a much more objective assessment of all the coffees entered in the contest. This post describes how the entrants can be assessed and how the winner is chosen.

In coming up with a definition of a winning coffee, specific characteristics must be selected, and the exact intensity enumerated. For example, if acidity is chosen as part of the definition, then it must have a specific intensity value, as well, such as 7. For the value of 7 to have any meaning, the intensity scale must be established and the judges must be trained on how to use it. Once they learn it, they can measure any other coffee using that scale, while removing variables such as how they may feel about the coffee, or how any given characteristic relates to another. When the judges are assessing competition coffees, all they have to do is score the intensity of each characteristic.

Ranking the coffees against the winner definition requires some simple algebra. Please note, the choice of algebraic functions is arbitrary and can be tweaked to fit the needs or wants of competition organizers. In the debut of this system for the Kona Coffee Cultural Festival cupping competition, we’re going to determine the rankings by calculating the absolute difference of the scores for each characteristic, squaring those differences, then adding them together.

Here’s an example: Let’s say the definition of our winning coffee has three components: acidity = 7, body = 4, and floral = 9. A coffee that is assessed ends up having the following scores: acidity = 3, body = 2, and floral = 10. To calculate the ranking of the coffee, we’ll first calculate the absolute differences:

  • Acidity: |7 – 3| = 4
  • Body: |4 – 2| = 2
  • Floral: |9 – 10| = 1

Then, we square each of those numbers:

  • Acidity: 4² = 16
  • Body: 2² = 4
  • Floral: 1² = 1

Finally, we add up the squares: 16 + 4 + 1 = 21. This coffee is 21 points away from the ideal, according to the winner definition. A perfect coffee would have no deviation from the definition as every characteristic would have an absolute difference of 0. The coffee with the lowest score, then, is the winner.
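
For readers who like to see the arithmetic spelled out, here is a minimal Python sketch of the calculation above. The function name and the dictionary layout are our own illustrative choices, not part of any official scoring tool, and the second coffee is invented purely to show the ranking step.

```python
def distance_from_ideal(ideal, scores):
    """Sum of squared differences between a coffee's scores and the winner definition."""
    return sum(abs(ideal[name] - scores[name]) ** 2 for name in ideal)

# Winner definition and assessed coffee from the example in the text.
ideal = {"acidity": 7, "body": 4, "floral": 9}
coffee_a = {"acidity": 3, "body": 2, "floral": 10}

# Hypothetical second entry, made up only to illustrate ranking.
coffee_b = {"acidity": 6, "body": 5, "floral": 8}

entries = {"Coffee A": coffee_a, "Coffee B": coffee_b}
results = {name: distance_from_ideal(ideal, s) for name, s in entries.items()}

# The lowest total deviation wins, so sort ascending by score.
ranking = sorted(results, key=results.get)

print(results)  # {'Coffee A': 21, 'Coffee B': 3}
print(ranking)  # ['Coffee B', 'Coffee A']
```

Coffee A reproduces the 21 points worked out by hand; the made-up Coffee B misses by one point on each characteristic and so ranks ahead of it.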

We take the difference because we’re interested in how similar the coffee is to the ideal. Subtracting characteristic by characteristic shows how far the coffee misses the definition on each chosen characteristic. The differences are then squared so as to penalize coffees that are aligned closely with the ideal on many characteristics but are very different in just one or two characteristics.

In our example, the assessed coffee’s acidity is quite different from the ideal, whereas the other two characteristics are fairly close. One could imagine, with many more characteristics in the definition, that two coffees could end up with a similar sum of differences, where one coffee misses the definition by a point or two in each characteristic and the other matches many characteristics but has one or two that are wildly off. In our competition, we value coffees that are mostly aligned in all characteristics over ones with one or two large deviations, so we square the differences before summing. Another competition could just as easily not square the numbers before summing them.
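
To make the effect of squaring concrete, here is a small Python sketch comparing the two kinds of coffee just described. The six-characteristic definition and both sets of scores are hypothetical numbers chosen for illustration, and the `power` parameter is simply our way of switching squaring on and off.

```python
def score(ideal, cup, power=2):
    """Sum of absolute differences raised to `power` (1 = no squaring, 2 = squared)."""
    return sum(abs(i - c) ** power for i, c in zip(ideal, cup))

# Hypothetical definition with six characteristics.
ideal = [7, 4, 9, 5, 6, 3]

# Misses by one point on every characteristic.
slightly_off = [6, 5, 8, 6, 5, 4]

# Matches five characteristics exactly, misses the last by six points.
one_big_miss = [7, 4, 9, 5, 6, 9]

print(score(ideal, slightly_off, power=1))  # 6
print(score(ideal, one_big_miss, power=1))  # 6
print(score(ideal, slightly_off, power=2))  # 6
print(score(ideal, one_big_miss, power=2))  # 36
```

Without squaring, the two coffees tie at 6. With squaring, the coffee with the single large deviation falls well behind, which is the behavior we want for our competition.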

In our example, floral was a characteristic. However, it is possible to choose only standard characteristics — acidity, sweetness and body, for example — and create two additional categories to help capture the gestalt of the experience: complexity and defects.

These categories would not be scored on intensity but instead on their absolute values. For example, the definition may include a complexity value of 3, which would mean three unique, positive descriptors — fruit, floral and caramel, for example — could be identified by the judges. Desired defects would almost certainly be set at 0, and every unique defect identified would add a point to the score. Weights could also be given to defect occurrence (Is it in one cup or all 5? Should defects incur a larger numerical penalty per occurrence?) as well as cut-offs for how many judges need to detect the descriptors and the defects.
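
As a rough sketch of how such counts and cut-offs might work, the Python below tallies how many unique descriptors or defects were noted by at least a minimum number of judges. The function name, the judges' notes and the cut-off of two judges are all assumptions for illustration; the actual rules are still up to the competition organizers.

```python
from collections import Counter

def count_agreed(notes_per_judge, min_judges):
    """Count unique terms noted by at least `min_judges` judges."""
    # set(notes) ensures a judge who repeats a term is only counted once.
    tally = Counter(term for notes in notes_per_judge for term in set(notes))
    return sum(1 for judges in tally.values() if judges >= min_judges)

# Hypothetical notes from three judges for a single coffee.
descriptors = [
    ["fruit", "floral", "caramel"],
    ["fruit", "floral"],
    ["fruit", "caramel", "nutty"],
]
defects = [
    ["ferment"],
    [],
    ["ferment", "phenolic"],
]

complexity = count_agreed(descriptors, min_judges=2)
defect_points = count_agreed(defects, min_judges=2)

print(complexity)     # 3 (fruit, floral and caramel; nutty was noted by only one judge)
print(defect_points)  # 1 (ferment; phenolic was noted by only one judge)
```

Here the complexity count of 3 would match a definition that calls for three positive descriptors, and the single agreed defect would add one point to the coffee's score.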

In the next post, we plan to reveal the definition that the group came up with for our winning coffee. We’ll also talk a bit about how we’re going to train the judges.