Predicting Sports Winners with Decision Trees

In this chapter, we will look at predicting the winner of sports matches using a type of classification algorithm different from the ones we have seen so far: decision trees. Decision trees have a number of advantages over other classification algorithms. One of the main advantages is that a trained tree can be read by humans, which allows it to be used in human-driven decision making. In this way, a decision tree can learn a procedure that could then be handed to a person to carry out if needed. Another advantage is that decision trees work with a variety of feature types, including categorical features, which we will see in this chapter.
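
To make the readability claim concrete, here is a minimal sketch of a decision tree written out by hand in plain Python. It is not learned from data, and the feature names (`home_won_last`, `visitor_ranked_higher`) and class labels are hypothetical, chosen only for illustration. Both features are categorical, and the tree can be printed as a set of if-then steps that a person could follow.

```python
# Hypothetical tree, written by hand for illustration (not learned from data).
# Internal nodes test one categorical feature; leaves hold the predicted class.
tree = {
    "feature": "home_won_last",  # assumed values: "yes" / "no"
    "branches": {
        "yes": {"leaf": "home win"},
        "no": {
            "feature": "visitor_ranked_higher",  # assumed values: "yes" / "no"
            "branches": {
                "yes": {"leaf": "visitor win"},
                "no": {"leaf": "home win"},
            },
        },
    },
}


def predict(node, sample):
    """Follow the branch matching each tested feature until a leaf is reached."""
    while "leaf" not in node:
        node = node["branches"][sample[node["feature"]]]
    return node["leaf"]


def describe(node, indent=0):
    """Return the tree as indented if-then lines that a person could follow."""
    pad = "  " * indent
    if "leaf" in node:
        return [f"{pad}predict {node['leaf']}"]
    lines = []
    for value, child in node["branches"].items():
        lines.append(f"{pad}if {node['feature']} is {value}:")
        lines.extend(describe(child, indent + 1))
    return lines


sample = {"home_won_last": "no", "visitor_ranked_higher": "yes"}
print(predict(tree, sample))
print("\n".join(describe(tree)))
```

The `describe` function is the point of the sketch: the same structure that makes predictions can be read aloud as a procedure, which is much harder to do with many other classifiers.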

We will cover the following topics in this chapter:

  • Using the pandas library for loading and manipulating data
  • Decision trees for classification
  • Random forests to improve upon decision trees
  • Using real-world datasets in data mining
  • Creating new features and testing them in a robust framework