A Solid update to Style+

Ben Sammis, June 09, 2015

As Eno mentioned in last Friday's Beer Chat, we made a pretty significant update to Style+ last week, and therefore to Solid% as well.

I'd noticed Solid% fluctuating oddly before, but that seemed to be a consequence of a disparity between the number of beers a brewery offers and the number of those offerings appearing on our leaderboards.  Essentially, only the most popular beers from a number of small breweries were showing up, so we were only considering the best and most popular beers they produced, which artificially inflated their overall ranking.

Something different was happening this time though.  A little over a year ago, 221 breweries had achieved a Solid% of 100.  Last week, that number was 4215 (of 6081), or just over 69%.  I somehow find myself doubting that over two-thirds of breweries worldwide don't make a single beer that's below average for its style.
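For anyone who wants to check my math, here's a rough sketch of the arithmetic. The `solid_pct` function is my shorthand and an assumption, not our production code: it treats Solid% as the share of a brewery's beers with a Style+ of at least 100, which is what "no beer below average for its style" implies at a Solid% of 100.

```python
def solid_pct(style_plus_scores, threshold=100.0):
    """Share of a brewery's beers at or above the style average, as a percentage.

    Assumption: a beer counts as "solid" when its Style+ is >= threshold.
    """
    if not style_plus_scores:
        raise ValueError("brewery has no rated beers")
    solid = sum(1 for score in style_plus_scores if score >= threshold)
    return 100.0 * solid / len(style_plus_scores)


# Figures from the post: breweries at Solid% 100 versus all breweries.
perfect_breweries = 4215
all_breweries = 6081
share_perfect = 100.0 * perfect_breweries / all_breweries

print(f"{share_perfect:.1f}% of breweries had a Solid% of 100")
```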

So what was causing the score inflation this time?  Style+ was.  The graph in the header image is the distribution of Style+ scores from 0 to 200.  At a glance it seems to be what we'd expect, a more or less normal distribution with lengthy tails, but a closer examination reveals that the vast majority of beers fall to the right of the nominal average score of 100.  Or to put it another way:

Over 90% of all beers in the database qualified as above average for their style.  Within each style there's some variation, but for the most part that 90% number was consistent across all styles.  Hell, many of the beers that qualified as above average within their style also have a negative BAR, which shouldn't have been possible.

After some investigation by fearless leaders Eno and Matt, we discovered that even though Style+ was intended to be calculated independently of checkin quantity, we were averaging across all checkins rather than across each beer's average score.  The original method was simply to total the scores of every checkin for that style, divide by the number of checkins and voila, average.  So we've done a bit of tinkering with the way the average for each style is calculated.

However, it appears that the most frequently checked-in beers for each style are also among the lower rated.  Their many checkins dragged the style average down, which artificially inflated the Style+ scores for all beers and the Solid% scores for all breweries.  Now, the average score for each beer is calculated first, and we then base Style+ on the median, rather than the average, of all beer scores within the style.  As of last Wednesday, the new numbers are live and they all make much more sense.  Now the midpoint of every style is a Style+ of 100, with an equal number of beers above and below that number.
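To make the difference concrete, here's a minimal sketch with made-up checkins for a single style. Everything in it is illustrative: the beer names and ratings are invented, and the formula Style+ = 100 × (beer average ÷ style baseline) is my assumption about how the score is scaled, not a copy of our actual code.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical checkins for one style: (beer, rating).
# One heavily checked-in, lower-rated beer plus four less popular ones.
checkins = (
    [("Popular Lager", 3.0)] * 8
    + [("Craft A", 4.0), ("Craft B", 4.2), ("Craft C", 3.8), ("Craft D", 4.4)]
)


def beer_averages(checkins):
    """Average rating per beer, so each beer counts once."""
    ratings = defaultdict(list)
    for beer, rating in checkins:
        ratings[beer].append(rating)
    return {beer: mean(values) for beer, values in ratings.items()}


def old_baseline(checkins):
    """Old method: mean of every checkin, weighted by checkin volume."""
    return mean(rating for _, rating in checkins)


def new_baseline(checkins):
    """New method: median of the per-beer averages."""
    return median(beer_averages(checkins).values())


def style_plus(checkins, baseline):
    """Assumed scaling: 100 means the beer matches the style baseline."""
    return {
        beer: 100 * avg / baseline
        for beer, avg in beer_averages(checkins).items()
    }


old_scores = style_plus(checkins, old_baseline(checkins))
new_scores = style_plus(checkins, new_baseline(checkins))

for beer in old_scores:
    print(f"{beer:14s} old={old_scores[beer]:6.1f} new={new_scores[beer]:6.1f}")
```

With the old baseline, the eight checkins of the popular beer pull the style average down to about 3.37, so four of the five beers land above 100. With the median of per-beer averages, the baseline is 4.0 and the beers split evenly around it.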

Here's a histogram of the new Style+ distribution for IPAs.



There's still something a bit odd happening on the Solid% leaderboards, about which more will be forthcoming.  For now, this change to Style+ makes me very happy, because numbers that don't make sense give me hives.  Hopefully it makes the rest of you happy as well.
