Chapter 12 Handy Solutions

12.1 Tea solution

First ensure that "tea_df" is loaded (your working directory must be set to the correct location first). It also needs to be preprocessed with the gsub() function.
tea_df <- read.csv("Chapter_10-11/tea_consumption.csv", check.names=FALSE)
tea_df$lb <- tea_df$KG_LB_annual_per_capita
tea_df$lb <- gsub(pattern = ".*_", replacement = "", tea_df$lb)
colnames(tea_df)[3] <- "kg"
tea_df$kg <- gsub(pattern = "_.*", replacement = "", tea_df$kg)
tea_df$kg <- as.numeric(tea_df$kg)
tea_df$lb <- as.numeric(tea_df$lb)
There are many ways to carry this out; here is one.
First create a vector with the names of the countries we want:
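The country list from the exercise is not restated in this solution, so the names below are placeholders. A minimal sketch of such a vector might look like this:

```r
# Hypothetical selection: swap in the countries named in the exercise.
countries_of_interest <- c("Turkey", "Ireland", "Japan")
countries_of_interest
```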
Set the row names to the countries for easy indexing:
Note: Row names must be unique, which is the case here.
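As a rough illustration of setting row names and checking uniqueness first, the sketch below uses a toy stand-in for "tea_df". The column name "Country" and all values are assumptions made up for the example, not taken from the data file.

```r
# Toy stand-in for tea_df; "Country" and the kg values are made up.
toy_tea_df <- data.frame(
  Country = c("Turkey", "Ireland", "Japan"),
  kg = c(3.16, 2.19, 0.97)
)

# anyDuplicated() returns 0 when every value is unique,
# so it is a cheap check before using a column as row names.
stopifnot(anyDuplicated(toy_tea_df$Country) == 0)

row.names(toy_tea_df) <- toy_tea_df$Country

# Rows can now be indexed by country name.
toy_tea_df["Ireland", "kg"]
```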
Create a data frame that only contains our countries of interest. We use the vector as an index for the rows.
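Indexing rows with a character vector could be sketched as follows. The data frame, country names and values are all hypothetical stand-ins.

```r
# Toy stand-in with countries as row names (made-up values).
toy_tea_df <- data.frame(
  kg = c(3.16, 2.19, 1.94, 0.97),
  row.names = c("Turkey", "Ireland", "United Kingdom", "Japan")
)

# Hypothetical countries of interest.
countries_of_interest <- c("Ireland", "Japan")

# The character vector selects rows by name, in the order given.
# drop = FALSE keeps a data frame even though there is a single column.
temp_df <- toy_tea_df[countries_of_interest, , drop = FALSE]
temp_df
```

Rows come back in the order of the index vector, not the order of the original data frame.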
Because we are working with a temporary variable, we can overwrite the kg column so that the values contain only one decimal place.
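One way to do this is with round(), sketched here on a made-up temporary data frame (the name "temp_df" and the values are assumptions):

```r
# Toy temporary data frame with made-up values.
temp_df <- data.frame(
  kg = c(2.19, 0.968),
  row.names = c("Ireland", "Japan")
)

# Overwrite the column with values rounded to one decimal place.
temp_df$kg <- round(temp_df$kg, digits = 1)
temp_df
```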
The last step is to print out the statement. We will use paste0(), which is exactly like paste() but with the sep = option set to "".
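The exact wording of the statement is not given in this solution, so the sentence below is a hypothetical example. The sketch shows paste0() working element-wise over a made-up data frame and confirms that paste() with sep = "" gives the same result.

```r
# Toy temporary data frame with made-up values.
temp_df <- data.frame(
  kg = c(2.2, 1.5),
  row.names = c("Ireland", "Japan")
)

# paste0() is vectorised: one string is built per row.
statement <- paste0(
  row.names(temp_df), " consumes ", temp_df$kg,
  " kg of tea per person per year."
)
print(statement)

# The same strings built with paste() and an empty separator.
same_statement <- paste(
  row.names(temp_df), " consumes ", temp_df$kg,
  " kg of tea per person per year.",
  sep = ""
)
identical(statement, same_statement)
```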
12.2 English speakers across the world solution

First make sure the data frame is created. Remember to set your working directory to where the file is.
english_df <- read.csv(
  "Chapter_10-11/english_speaking_population_of_countries.tsv",
  sep = "\t",
  row.names = 1,
  check.names = FALSE)
# Replace missing values with 0 so the sums below are defined
english_df[is.na(english_df)] <- 0
# Keep only rows where first + additional language speakers equal the total
english_complete_datasets_df <-
  english_df[
    (english_df$`As first language` + english_df$`As an additional language`) ==
      english_df$`Total English speakers`,
  ]
Create a new data frame containing only countries with an eligible population of more than 100 million.
english_100mil_df <- english_complete_datasets_df[
  english_complete_datasets_df$`Eligible population` > 100000000,
]
Create a column with the fraction of the eligible population that are English speakers.
english_100mil_df$`Fraction of population that are English speakers` <-
  english_100mil_df$`Total English speakers` /
  english_100mil_df$`Eligible population`
Create a row with the mean values.
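A mean row might be added with colMeans(), assuming every column is numeric (the countries are row names, not a column). The sketch below uses a toy data frame with made-up country names and numbers. The row name "Mean" is also an assumption.

```r
# Toy stand-in for english_100mil_df (made-up values).
toy_df <- data.frame(
  `Eligible population` = c(200, 400),
  `Total English speakers` = c(100, 100),
  check.names = FALSE,
  row.names = c("Country A", "Country B")
)

# colMeans() returns one mean per column; assigning to a new
# row name appends it as an extra row.
toy_df["Mean", ] <- colMeans(toy_df)
toy_df
```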
Create a row with the totals.
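A totals row could be built with colSums(). If a summary row such as "Mean" already exists, it should be left out of the sum. This sketch does so with setdiff() on a toy data frame. All names and values are made up.

```r
# Toy stand-in that already contains a "Mean" summary row (made-up values).
toy_df <- data.frame(
  `Eligible population` = c(200, 400, 300),
  `Total English speakers` = c(100, 100, 100),
  check.names = FALSE,
  row.names = c("Country A", "Country B", "Mean")
)

# Only sum the real country rows, not existing summary rows.
country_rows <- setdiff(row.names(toy_df), c("Mean", "Total"))
toy_df["Total", ] <- colSums(toy_df[country_rows, ])
toy_df
```

Summing a fraction column this way is not meaningful, which is why the total fraction is recomputed from the two totals in the next step.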
Create the total fraction of English speakers.
english_100mil_df["Total","Fraction of population that are English speakers"] <-
  english_100mil_df["Total","Total English speakers"] /
  english_100mil_df["Total","Eligible population"]
Write the data to a file.
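The output file name and format are not stated here. As one possibility, the sketch below writes a toy data frame as a tab-separated file with write.table() and reads it back to check the result. The temporary path and the values are assumptions.

```r
# Toy stand-in for the final data frame (made-up values).
toy_df <- data.frame(
  `Eligible population` = c(200, 400),
  `Total English speakers` = c(100, 100),
  check.names = FALSE,
  row.names = c("Country A", "Country B")
)

# Hypothetical output path; replace with a path in your working directory.
out_file <- tempfile(fileext = ".tsv")

# col.names = NA writes an empty header cell above the row names,
# so the file can be read back with row.names = 1.
write.table(toy_df, file = out_file, sep = "\t",
            quote = FALSE, col.names = NA)

# Read the file back to confirm the round trip.
check_df <- read.csv(out_file, sep = "\t", row.names = 1,
                     check.names = FALSE)
check_df
```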