For my first article, I chose craft beer as the subject, combining my interests in craft beer and artificial intelligence. I prefer fruity and hoppy craft beers, which are not so bitter and don't have much alcohol. I am curious to taste new exotic craft beers, especially on holidays abroad. I like Pale Ales and prefer Session IPAs because they are fruity, hoppy, less bitter and contain less alcohol than regular IPAs.
In this article, I applied linear regression, multiclass classification and unsupervised clustering to analyse craft beers based on two features, international bittering units (ibu) and alcohol by volume (abv). The whole analysis was done with Scikit-learn: Machine Learning in Python.
When choosing a new craft beer, I make my decision based on the international bittering units (ibu) and on the amount of alcohol (abv). Unfortunately, not all beers are labelled with ibu. But every beer is labelled with abv. So I have to base my decision simply on abv, and of course a nicely designed label also affects my decision 😅
My experience is that IPAs have more alcohol and are more bitter than other craft beers.
I found a data set about craft beers on Kaggle: "Hould, J.N. (2017, January). Craft Beers Dataset, Version 1.0. Retrieved 2022, January from https://www.kaggle.com/nickhould/craft-cans." This data set is used to analyse craft beers.
The data set contains data on 2,410 craft beers. The features used for the analysis are alcohol by volume (abv), the international bittering units (ibu) and the style of the craft beer, like IPA, Pale Ale, and so on. Unfortunately, not all beers in the set have ibu data. Furthermore, many styles are represented by only a few samples. I had to preprocess the data set and remove all beers without ibu as well as styles with too few samples. After the preprocessing, the data set consists of 8 styles of craft beer with 38 samples each.
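The preprocessing can be sketched with pandas. The tiny data frame below is a hypothetical stand-in for the real Kaggle CSV (which has, among others, columns like 'abv', 'ibu' and 'style'), so the column names and style counts here are assumptions for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Kaggle craft-beers data: some missing
# ibu values and one style with too few samples.
rng = np.random.default_rng(0)
styles = ["American IPA"] * 60 + ["American Pale Ale (APA)"] * 50 + ["Rare Style"] * 3
df = pd.DataFrame({
    "style": styles,
    "abv": rng.uniform(0.04, 0.10, len(styles)),
    "ibu": rng.uniform(20, 120, len(styles)),
})
df.loc[df.sample(frac=0.1, random_state=0).index, "ibu"] = np.nan  # simulate missing ibu

# 1) Drop all beers without an ibu value.
df = df.dropna(subset=["ibu"])

# 2) Keep only styles with enough samples and draw the same number from
#    each, so every style is represented equally (38 in the article).
n_per_style = 38
counts = df["style"].value_counts()
frequent = counts[counts >= n_per_style].index
balanced = (
    df[df["style"].isin(frequent)]
    .groupby("style", group_keys=False)
    .sample(n=n_per_style, random_state=0)
)
print(balanced["style"].value_counts())
```

The rare style is dropped entirely, and the remaining styles end up with exactly 38 samples each.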
Figure 1 shows ibu plotted over abv for the 8 styles of craft beer. This is the basis for the following analysis. After the preprocessing, there are only American craft beer styles left. Therefore, I will omit the prefix 'American' when I speak about a style. Let's analyse Figure 1.
Generally, the range for abv is from 0 to 1. 0 means no alcohol and 1 means pure alcohol. A value of abv=0.05 means 5% alcohol. The range of ibu for the analysed beers is from 0 to 160. I am not sure what the maximum value for ibu is, but craft beers with more than 1000 ibu are available. The ibu value is defined by the amount of bitter alpha acids released from the hops during brewing. A value of ibu=1 means that 1 litre of beer contains 1 mg of isomerised alpha acid.
Figure 1 shows that Double IPAs surpass other craft beers in alcohol and bitterness. They can have up to 10% alcohol and over 100 ibu. I have never had a beer with more than 80 bittering units, and even that much tasted really bitter to me. One of my favourite styles, Pale Ale, can have up to 60 ibu and 6% alcohol.
Generally, many parameters determine ibu, including the type and amount of hops. The temperature and duration of the boiling process also determine how much of the bitter substances from the hops remain in the beer. Figure 1 shows that an IPA with about 9% alcohol can have an ibu anywhere in the range from 60 to 130. But the tendency is that ibu increases with abv. This matches my experience. That's why I check abv if ibu is not labelled and avoid beers with over 7% alcohol. Above this limit, I rarely find a beer I like. So if you like craft beers that are less bitter, choose one with less alcohol.
The data suggest that ibu increases with abv. A linear model can describe the relationship between ibu and abv: the input of the model is abv and the output is ibu. Figure 2 shows the data and the prediction of a linear model.
To quantify the quality of the linear regression, the "R-squared" score is used as a test score. The "R-squared" score indicates how much of the variation of the dependent variable is explained by the independent variable. The best possible test score is 1.0, but it can also be negative for models that perform worse than simply predicting the mean. The test score for the trained linear model is 0.63. That means that only 63% of the variation of ibu can be predicted from abv. Still, it shows that the amount of alcohol is a dominant factor for bitterness.
The model is not very accurate with a test score of 0.63, but it is nevertheless practical in the real world. You can take the abv, which is always labelled on the bottle, look up the corresponding point on the black line and get an estimate of the ibu. As a rule of thumb, ibu increases by about 15 points for every additional percentage point of alcohol, and vice versa.
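The regression step can be sketched with scikit-learn. Since I don't reproduce the real data here, the sketch uses synthetic (abv, ibu) pairs generated to roughly follow the rule of thumb above (a slope of ~1500 ibu per unit abv, i.e. ~15 ibu per percentage point); the numbers are assumptions, not the article's results:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: ibu rises linearly with abv plus noise.
rng = np.random.default_rng(42)
abv = rng.uniform(0.04, 0.10, 300)
ibu = 1500 * abv - 30 + rng.normal(0, 15, 300)

# scikit-learn expects a 2-D feature matrix, hence the reshape.
X = abv.reshape(-1, 1)
X_train, X_test, y_train, y_test = train_test_split(X, ibu, random_state=0)

model = LinearRegression().fit(X_train, y_train)
score = r2_score(y_test, model.predict(X_test))

print(f"slope: {model.coef_[0]:.0f} ibu per unit abv")
print(f"R-squared on the test set: {score:.2f}")
```

With the real data the article reports an R-squared of 0.63; here the score depends on the noise level chosen for the synthetic data.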
In this chapter, different models and algorithms are used to classify the craft beers. What does it mean to classify? The data are already labelled with the 8 craft beer styles. The goal of classification is to separate the data into classes by creating a region for every class.
The inputs for the classification are the features abv, ibu and the style, and the output is an area for every style. See the results of the classification in Figure 3. The colour of each area corresponds to the respective style: the dark blue area is classified as Double IPA, the green area as Pale Wheat Ale, and so on.
Besides the very popular neural networks, there are many other algorithms and models worth checking out, which are faster to train and easier to interpret: Support Vector Machine, Gaussian Process Classifier, Linear Discriminant Analysis, Quadratic Discriminant Analysis, Decision Tree Classifier, Gaussian Naive Bayes Classifier, Linear Classifier, Logistic Regression and Nearest Neighbours Classifier. I wanted to compare these models and see how they perform. Go through the slideshow and check the results of the different algorithms. The test score is shown at the top of each figure.
To quantify the quality of classification, the 'Accuracy score' is used as the test score. The best possible test score is 1.0 and the worst is 0.0. A test score of 1.0 means 100% of the samples were classified correctly. A test score of 0.0 means no sample was classified correctly.
The test scores of the used algorithms are in a similar range, from 0.44 to 0.56. That means that at most 56% of the craft beers were classified correctly. Due to the structure of the data, the results are not very accurate: because many styles overlap, it is not possible to separate all styles correctly. Styles that overlap less, like IPA and Double IPA, can be separated better. The model with the best test score of 0.56 is the neural network, but other algorithms like logistic regression and nearest neighbours show similarly good results and are faster to train.
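The comparison can be sketched with scikit-learn. The sketch below uses only three of the listed classifiers and synthetic 2-D data with 8 overlapping classes of 38 samples each, standing in for the real (abv, ibu, style) data; the accuracies it prints are therefore not the article's numbers:

```python
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for (abv, ibu): 8 classes with 38 samples each.
# The large cluster_std makes the classes overlap like the beer styles.
X, y = make_blobs(n_samples=8 * 38, centers=8, cluster_std=3.0, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0, stratify=y
)

classifiers = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Nearest Neighbours": KNeighborsClassifier(n_neighbors=5),
    "Neural Network": MLPClassifier(hidden_layer_sizes=(50,), max_iter=2000, random_state=0),
}

results = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    results[name] = accuracy_score(y_test, clf.predict(X_test))
    print(f"{name}: accuracy = {results[name]:.2f}")
```

All three classifiers share the same fit/predict interface, which is what makes this kind of side-by-side comparison so short to write.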
So what can we do with the trained classifiers? We can input abv and ibu and predict the style of the beer. Since the style is usually labelled on the bottle, this is not very useful in real life, unlike other applications such as image classification.
So far, I used models and algorithms from the field of supervised learning for linear regression and for classification. For the last analysis, models and algorithms from the field of unsupervised learning were applied to separate craft beers by clustering.
The intention of clustering is similar to classification: to separate data into classes. The inputs are the features abv and ibu, and the output is a predicted style for every craft beer. The difference is that the style is not an input feature. That means the algorithms can't train a model to best fit the input data; they can't minimise the difference between the style and the predicted style. The algorithms don't know the style, they don't know the ground truth. That is why it is called unsupervised learning.
I was not familiar with algorithms from the field of unsupervised learning, so I wanted to compare different algorithms and see how they perform. Therefore, seven different algorithms were used for clustering: DBSCAN (Density-Based Spatial Clustering of Applications with Noise), OPTICS (Ordering Points To Identify the Clustering Structure), MeanShift Clustering, Agglomerative Clustering, Gaussian Mixture Clustering, K-means Clustering and Spectral Clustering.
The results of clustering are shown in Figure 4. The test score and the number of formed clusters are shown at the top of each figure. Go through the slideshow and check the different algorithms.
Every formed cluster has a different symbol and colour. The original data were added as a small sub-figure, so it is easier to compare the styles in the data with the formed clusters. Furthermore, the clusters formed by DBSCAN and OPTICS contain black points. These points are considered noise; they can't be assigned to a cluster. Except for DBSCAN, OPTICS and MeanShift, it is necessary to specify the number of clusters to form. To quantify the quality of clustering, the "Silhouette score" is used as the test score. The range of this test score is from -1 for incorrect clustering to +1 for highly dense clustering. Scores around zero indicate overlapping clusters.
Except for DBSCAN and OPTICS, the test scores are similar and in the range from 0.61 to 0.68. The number of formed clusters ranges from one to three. The majority of the algorithms form two clusters, with Double IPAs and IPAs in one cluster and the remaining styles in the other.
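Two of the clustering algorithms can be sketched as follows. The data are again a synthetic stand-in for (abv, ibu): one dense group for the "lighter" styles and a separate group standing in for the IPAs and Double IPAs, so the cluster counts and scores below are illustrative, not the article's results:

```python
import numpy as np
from sklearn.cluster import KMeans, MeanShift
from sklearn.metrics import silhouette_score

# Synthetic stand-in for (abv, ibu): two groups of beers.
rng = np.random.default_rng(3)
light = rng.normal(loc=[0.05, 30.0], scale=[0.005, 8.0], size=(200, 2))
ipas = rng.normal(loc=[0.09, 100.0], scale=[0.005, 15.0], size=(80, 2))
X = np.vstack([light, ipas])

# abv and ibu live on very different scales, so standardise the
# features before distance-based clustering.
X = (X - X.mean(axis=0)) / X.std(axis=0)

results = {}
for model in (KMeans(n_clusters=2, n_init=10, random_state=0), MeanShift()):
    labels = model.fit_predict(X)
    n_clusters = len(set(labels))
    # The silhouette score is only defined for at least two clusters.
    score = silhouette_score(X, labels) if n_clusters > 1 else float("nan")
    results[type(model).__name__] = (n_clusters, score)
    print(f"{type(model).__name__}: {n_clusters} clusters, silhouette = {score:.2f}")
```

Note the difference in the constructors: KMeans must be told the number of clusters up front, while MeanShift estimates it from the data, which is exactly why I prefer it below.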
I prefer the MeanShift algorithm because it doesn't need the number of clusters as an input parameter. That's helpful when you don't know the ground truth of the data. Furthermore, the formed clusters are more intuitive to me, because the MeanShift algorithm puts the non-overlapping data, the Double IPAs, into one cluster.
Let's answer the question of what AI would think about craft beer, from the perspective of an unsupervised clustering algorithm. The regression and classification models were trained to fit the input data as well as possible. Clustering models don't know the style, so they are not forced to fit it. It's up to them to structure the data. And for these models, only a maximum of three styles of craft beer exist.
The analysis of the craft beer data shows that, by tendency, international bittering units increase with increasing alcohol by volume. This is the most important finding. This correlation can be described by a linear model.
Classification of craft beers with different algorithms leads to an accuracy in the range of 44% - 56%.
Unsupervised clustering of craft beers results in a maximum of three styles.
If you read the article to the end and still want to know more, check the Jupyter Notebook on my GitHub repo. There you will also find a bonus analysis of supervised clustering.