This month, I am learning machine learning in R through this DataCamp track. In the “Supervised Learning for Classification” course, the second chapter is Bayesian Methods, my topic for this week.
This week’s learnings:
- Bayesian statistics estimates the probability of an event from past information.
- When one event is predictive of another event, they are considered dependent. For example, the probability of me sleeping will vary widely based on the time of day. Conditional probability formulas can be used to mathematically describe this relationship.
- The naivebayes package in R fits a Naïve Bayes model from a formula of the form Y ~ X. The fitted model can then be used to make predictions and return predicted probabilities.
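To make the conditional probability idea concrete, here is a minimal sketch in base R. The data frame `visits` and all of its values are hypothetical, made up purely for illustration. It computes P(office | weekday) = P(office and weekday) / P(weekday) directly from counts.

```r
# Hypothetical data: 9am location on ten days (values made up for illustration)
visits <- data.frame(
  daytype  = c("weekday", "weekday", "weekday", "weekday", "weekday",
               "weekday", "weekday", "weekend", "weekend", "weekend"),
  location = c("office", "office", "office", "office", "office",
               "home", "home", "office", "home", "home")
)

# P(office and weekday): share of all days that are weekday office days
p_joint <- mean(visits$location == "office" & visits$daytype == "weekday")

# P(weekday): share of all days that are weekdays
p_weekday <- mean(visits$daytype == "weekday")

# Conditional probability: P(office | weekday) = P(office and weekday) / P(weekday)
p_office_given_weekday <- p_joint / p_weekday
print(p_office_given_weekday)

# The same idea for every combination: proportions of location within each day type
print(prop.table(table(visits$daytype, visits$location), margin = 1))
```

With these made-up counts, five of the seven weekdays were spent at the office, so P(office | weekday) is 5/7, or about 0.71.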
Thankfully, R makes Naïve Bayes very simple. We load one package and set up our model, then view the predictions and the predicted probabilities.
# Load the naivebayes package
library(naivebayes)
# Build the location prediction model
locmodel <- naive_bayes(location ~ daytype, data = where9am)
# Predict next Thursday’s location based on the prior 90 days of data
predict(locmodel, thursday9am)
# Obtain the predicted probabilities for Thursday at 9am
predict(locmodel, thursday9am, type = "prob")
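The code above relies on `where9am` and `thursday9am`, which are not defined in this post. As a rough, self-contained sketch, the version below builds hypothetical stand-ins for both (every value is made up) so the same calls can be run end to end. It assumes the naivebayes package is installed; `pred_class` and `pred_prob` are names introduced here for illustration.

```r
library(naivebayes)

# Hypothetical stand-in for where9am: 9am location on ten days (made-up values)
where9am <- data.frame(
  daytype  = factor(c("weekday", "weekday", "weekday", "weekday", "weekday",
                      "weekday", "weekday", "weekend", "weekend", "weekend")),
  location = factor(c("office", "office", "office", "office", "office",
                      "home", "home", "office", "home", "home"))
)

# Hypothetical stand-in for thursday9am: a single weekday observation,
# using the same factor levels as the training data
thursday9am <- data.frame(
  daytype = factor("weekday", levels = levels(where9am$daytype))
)

# Fit the model: location is the outcome, daytype the only predictor
locmodel <- naive_bayes(location ~ daytype, data = where9am)

# Most likely location for the new observation
pred_class <- predict(locmodel, thursday9am)
print(pred_class)

# Predicted probability of each location for the new observation
pred_prob <- predict(locmodel, thursday9am, type = "prob")
print(pred_prob)
```

With a single predictor like this, the predicted probabilities should track the share of each location within the given day type in the training data.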
Another week, another introduction to a machine learning model. On to the next.