Labeling data points

In this recipe, we will learn how to label individual or multiple data points with text.

Getting ready

For this recipe, we don't need to load any additional libraries. We just need to type the code at the R prompt or run it as a script.

How to do it...

Let's say we want to highlight one data point in the mtcars scatter plot that we used in the previous few recipes. We can label it using the text() function:

plot(mpg~disp, data=mtcars)
text(258,22,"Hornet")

How it works...

In the preceding example, we first plotted the graph and then used the text() function to overlay a label at a specific location. The text() function takes the x and y coordinates and the text of the label as arguments. We specified the location as (258,22) and the label text as Hornet. By default, the label is centered on the given coordinates, so it is drawn on top of any point at that location. This function is especially useful when we want to label outliers.
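Rather than typing the coordinates by hand, we can look them up in the dataset. The following is a minimal sketch, assuming that the point we want to highlight is the mtcars row named "Hornet 4 Drive"; the variable names car, x, and y are our own. The pos and offset arguments of text() place the label beside the point instead of on top of it:

```r
# Draw the scatter plot used in the recipe
plot(mpg ~ disp, data = mtcars)

# Look up the coordinates of one car by its row name
# (assumption: "Hornet 4 Drive" is the point we want to highlight)
car <- "Hornet 4 Drive"
x <- mtcars[car, "disp"]
y <- mtcars[car, "mpg"]

# pos = 4 draws the label to the right of the point;
# offset sets the gap between point and label in character widths
text(x, y, labels = "Hornet", pos = 4, offset = 0.5)
```

The pos argument accepts 1, 2, 3, or 4 for below, left, above, and right of the coordinates, respectively.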

There's more...

We can also use the text() function to label all the data points in a graph instead of just one or two. Let's look at another example where we wish to plot life expectancy in different countries against their health expenditure. Instead of representing the data as points, let's use the names of the countries to represent the values. We will use the HealthExpenditure.csv example dataset:

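# Read the example dataset
health <- read.csv("HealthExpenditure.csv", header = TRUE)
# Set up the axes only; type = "n" suppresses the points
plot(health$Expenditure, health$Life_Expectancy, type = "n")
# Draw each country name at its (x, y) location
text(health$Expenditure, health$Life_Expectancy, health$Country)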

We first use the plot() function to create a graph of life expectancy versus expenditure. Note that we set type equal to "n", which means that only the graph layout and axes are drawn but no data points are drawn. Then, we use the text() function to place country names as labels at the x-y locations of all the data points. This works because text() accepts vectors for x, y, and the labels, so a single call labels every data point with the corresponding country name. If the text labels overlap, we can use the jitter() function, which adds a small amount of random noise to the coordinates, or remove some labels to reduce the overlap.
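One way to remove some labels is to label only a subset of the points. The following is a minimal sketch that uses the built-in mtcars dataset rather than HealthExpenditure.csv, so that it runs without any external file; the threshold of 30 miles per gallon and the variable name high are arbitrary choices made for illustration:

```r
# Draw all the points as usual
plot(mpg ~ disp, data = mtcars)

# Keep only the cars with unusually high fuel economy
# (assumption: mpg > 30 is an arbitrary cutoff chosen for this example)
high <- mtcars[mtcars$mpg > 30, ]

# Label just those points with their row names;
# cex = 0.7 shrinks the text and pos = 4 puts it to the right of each point
text(high$disp, high$mpg, labels = rownames(high), pos = 4, cex = 0.7)
```

Because text() is vectorized, the same call works whether the subset contains one row or many. Reducing cex is another simple way to limit overlap when many labels are drawn.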