An Overview of Statistics

In this section, we will briefly discuss the goal of the overarching field of statistics and talk about some of its fundamental ideas. This conversation will set the context for the subsequent topics in this chapter and this book.

Generally speaking, statistics is all about working with data, be it processing the data, analyzing it, or drawing conclusions from it. In the context of a given dataset, statistics has two main goals: describing the data and drawing conclusions from it. These goals correspond to the two main categories of statistics, descriptive statistics and inferential statistics, respectively.

In descriptive statistics, questions are asked about the general characteristics of a dataset: What is the average value? What is the difference between the maximum and the minimum? What value appears the most? And so forth. The answers to these questions give us an idea of what the dataset consists of and what it is about. We saw brief examples of this in the previous chapter.
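As a quick illustration, the three questions above (the average, the range, and the most frequent value) can be answered with Python's built-in `statistics` module. This is only a minimal sketch: the `data` list is a made-up sample invented for this example, not a dataset from this book.

```python
import statistics

# Hypothetical sample, invented purely for illustration
data = [12, 15, 12, 18, 20, 15, 12, 22]

average = statistics.mean(data)        # What is the average value?
value_range = max(data) - min(data)    # Difference between the maximum and the minimum
most_common = statistics.mode(data)    # What value appears the most?

print(f"Mean:  {average}")
print(f"Range: {value_range}")
print(f"Mode:  {most_common}")
```

Each of these numbers summarizes the dataset itself. None of them makes any claim about data we have not observed.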

In inferential statistics, the goal is to go a step further: after extracting appropriate insights from a given dataset, we'd like to use that information to make inferences about data we have not observed. One example of this is making predictions for the future from observed data. This is typically done via various statistical and machine learning models, each of which is only applicable to certain types of data. This is why it is important to understand what types of data there are in statistics; these types are described in the next section.
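To make the idea of predicting from observed data concrete, here is a minimal sketch that fits a straight line to a handful of observations using ordinary least squares and then extrapolates one step ahead. The data, the helper name `fit_line`, and the choice of a linear model are all assumptions made for this illustration. A straight line is just one of many possible models, and it is only appropriate when the data is numerical and roughly linear.

```python
# Hypothetical observations, invented for illustration:
# month index and number of units sold in that month
months = [1, 2, 3, 4, 5]
units = [10.0, 13.0, 13.0, 17.0, 18.0]

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(months, units)

# Use the fitted line to predict the unobserved sixth month
prediction = slope * 6 + intercept
print(f"Fitted line: y = {slope:.2f} * x + {intercept:.2f}")
print(f"Predicted units for month 6: {prediction:.2f}")
```

Unlike a mean or a mode, the prediction for month 6 is a statement about data we do not have, which is what separates inference from description.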

Overall, statistics can be thought of as the field that studies data, which is why it is the foundation for data science and machine learning. Using statistics, we can understand the state of the world from our sometimes-limited datasets, and from there make appropriate, actionable decisions based on the knowledge we obtain from the data. This is why statistics is used widely across fields of study, from the natural sciences to the social sciences, and sometimes even the humanities, when the research involves analytical elements.

With that said, let's begin our first technical topic of this chapter: distinguishing between data types.