Downloading the example code

You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you could visit http://www.packtpub.com/support and register to have the files e-mailed directly to you. I've also setup a GitHub repository that contains a live version of the code, along with new fixes, updates and so on. You can retrieve the code and datasets at the repository here: https://github.com/dataPipelineAU/LearningDataMiningWithPython2

You can read the dataset can by looking at each row (horizontal line) at a time. The first row (0, 1, 0, 0, 0) shows the items purchased in the first transaction. Each column (vertical row) represents each of the items. They are bread, milk, cheese, apples, and bananas, respectively. Therefore, in the first transaction, the person bought cheese, apples, and bananas, but not bread or milk. Add the following line in a new cell to allow us to turn these feature numbers into actual words:

features = ["bread", "milk", "cheese", "apples", "bananas"]

Each of these features contains binary values, stating only whether the items were purchased and not how many of them were purchased. A1 indicates that at least 1 item was bought of this type, while a 0 indicates that absolutely none of that item was purchased. For a real world dataset, using exact figures or a larger threshold would be required.