2.4 Loops and maps

2.4.1 For-loops

To perform computation steps iteratively, R has a special syntax for for-loops. Here is an (again, stupid, but illustrative) example of a for-loop in R:

# fix a vector to transform
input_vector <- 1:6

# create output vector for memory allocation
output_vector <- integer(length(input_vector))

# iterate over length of input
for (i in 1:length(input_vector)) {
  # multiply by 10 if even
  if (input_vector[i] %% 2 == 0) {
    output_vector[i] <- input_vector[i] * 10
  }
  # otherwise leave unchanged
  else {
    output_vector[i] <- input_vector[i]
  }
}

output_vector

## [1]  1 20  3 40  5 60
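
A note on the loop index: a header of the form for (i in 1:length(x)) only behaves as intended when x is non-empty. The following minimal sketch (the name empty_input is made up for illustration) shows what happens for a vector of length 0, and how base R's seq_along avoids the problem:

```r
# an empty input vector: its length is 0
empty_input <- integer(0)

# 1:length(x) counts DOWN from 1 to 0, so a loop over it would run twice
one_to_length <- 1:length(empty_input)
one_to_length
## [1] 1 0

# seq_along(x) yields an empty index sequence instead
along <- seq_along(empty_input)
along
## integer(0)

# a loop over seq_along() therefore runs zero times
n_iterations <- 0
for (i in seq_along(empty_input)) {
  n_iterations <- n_iterations + 1
}
n_iterations
## [1] 0
```

For non-empty vectors, such as input_vector in the example, both ways of writing the index give the same sequence.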

Exercise 2.12

Let’s practice for-loops and if/else statements! Create a vector a with 10 random integers from the range 1:50. Create a second vector b that has the same length as a. Then fill b such that the \(i\)th entry in b is the mean of a[(i-1):(i+1)]. Do that using a for-loop. Entries that fall outside the vector (the neighbour before the first entry and the neighbour after the last entry) count as 0 (see the example below). Print the result as a tibble whose columns are a and b.

Example: If a has the values [25, 39, 12, 33, 47, 3, 48, 14, 45, 8], then vector b should contain the values [21, 25, 28, 31, 28, 33, 22, 36, 22, 18] when rounded to whole numbers. The value in the fourth position of b (value 31) is obtained with (a[3] + a[4] + a[5])/3. The value in the first position of b (value 21) is obtained with (0 + a[1] + a[2])/3 and, similarly, the last value with (a[9] + a[10] + 0)/3. (Hint: use the conditional statements if, else if and else to deal specifically with the edge cases, i.e., the first and last entry in the vectors.)

# draw 10 random integers between 1 and 50 (with replacement)
a <- sample(1:50, 10, replace = TRUE)
# pre-allocate the output vector
b <- numeric(length(a))

for (i in 1:length(a)){
  if (i == 1) {
    b[i] <- (sum(a[i:(i+1)])/3)
  } 
  else if (i == length(a)) {
    b[i] <- (sum((a[(i-1):i]))/3)
  }
  else {
    b[i] <- (mean(a[(i-1):(i+1)]))
  }
}

tibble(a, b)

## # A tibble: 10 × 2
##        a     b
##    <int> <dbl>
##  1    27  18.3
##  2    28  24  
##  3    17  28.7
##  4    41  28.3
##  5    27  26.3
##  6    11  13.3
##  7     2  19.7
##  8    46  27.7
##  9    35  43.3
## 10    49  28
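
As a cross-check on the worked example from the exercise text (not the for-loop solution the exercise asks for), the same numbers can be reproduced without a loop by padding the vector with a zero on each side. This is only a sketch; the names a_example, padded and b_check are made up for illustration:

```r
# example values from the exercise text
a_example <- c(25, 39, 12, 33, 47, 3, 48, 14, 45, 8)
n <- length(a_example)

# pad with a zero on each side, so that every entry has two neighbours
padded <- c(0, a_example, 0)

# add left neighbour, entry itself and right neighbour, then divide by 3
b_check <- (padded[1:n] + padded[2:(n + 1)] + padded[3:(n + 2)]) / 3

round(b_check)
## [1] 21 25 28 31 28 33 22 36 22 18
```

The padding plays the role of the two edge cases that the for-loop handles with if and else if.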

2.4.2 Functional iterators

Base R provides functional iterators (e.g., apply), but we will use the functional iterators from the purrr package. The main functional iterator from purrr is map, which takes a vector and a function, applies the function to each element of the vector and returns a list with the results. There are also typed versions of map, such as map_dbl (double), map_lgl (logical) or map_df (data frame), which return a vector of doubles, a vector of logicals or a data frame, respectively. The following code repeats the previous example, which used a for-loop, in a functional style using the functional iterator map_dbl:

input_vector <- 1:6
map_dbl(
  input_vector,
  # x is the current element of input_vector (not its index)
  function(x) {
    if (x %% 2 == 0) {
      return(x * 10)
    }
    else {
      return(x)
    }
  }
)

## [1]  1 20  3 40  5 60
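
To make the difference between map and its typed variants concrete, here is a small sketch, assuming the purrr package is installed. The variable names as_list, doubles and logicals are chosen for illustration only:

```r
library(purrr)

input_vector <- 1:6

# map() always returns a list with one entry per input element
as_list <- map(input_vector, function(x) x * 10)
is.list(as_list)
## [1] TRUE

# map_dbl() returns a vector of doubles
doubles <- map_dbl(input_vector, function(x) x * 10)
doubles
## [1] 10 20 30 40 50 60

# map_lgl() returns a logical vector
logicals <- map_lgl(input_vector, function(x) x %% 2 == 0)
logicals
## [1] FALSE  TRUE FALSE  TRUE FALSE  TRUE
```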

The map_dbl call on input_vector can be written even more compactly, using purrr’s short-hand notation for functions:¹¹

input_vector <- 1:6
map_dbl(
  input_vector,
  ~ ifelse(.x %% 2 == 0, .x * 10, .x) 
)

## [1]  1 20  3 40  5 60

The leading ~ indicates that we define an anonymous function. It therefore replaces the usual function(...) header, which declares the arguments the anonymous function expects. To make up for this, after the ~ we can use .x to refer to the first (and only) argument of our anonymous function.
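
The following sketch, assuming purrr is installed, puts the notations side by side: the ~ short-hand, the written-out function(x) form, and the lambda short-hand \(x) that R itself offers from version 4.1 on (the latter is an addition here, not something the text above relies on). The variable names are illustrative:

```r
library(purrr)

input_vector <- 1:6

# purrr's short-hand: .x refers to the current element
short_hand <- map_dbl(input_vector, ~ ifelse(.x %% 2 == 0, .x * 10, .x))

# the same anonymous function written out with function(x)
written_out <- map_dbl(
  input_vector,
  function(x) ifelse(x %% 2 == 0, x * 10, x)
)

# R's own lambda short-hand (requires R >= 4.1)
base_lambda <- map_dbl(input_vector, \(x) ifelse(x %% 2 == 0, x * 10, x))

# all three give the same result
identical(short_hand, written_out)
## [1] TRUE
identical(short_hand, base_lambda)
## [1] TRUE
```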

To apply a function to more than one input vector, element by element, we can use pmap and its derivatives, like pmap_dbl etc. pmap takes a list of vectors and a function. In short-hand notation, we can define an anonymous function with ~ and refer to its first, second, … argument with ..1, ..2, etc. For example:

x <- 1:3
y <- 4:6
z <- 7:9

pmap_dbl(
  list(x, y, z),
  ~ ..1 - ..2 + ..3
)

## [1] 4 5 6
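
For comparison, here is a sketch of two related ways of writing such calls, assuming purrr is installed. The argument names first, second and third are made up for illustration; map2_dbl is purrr's iterator for exactly two input vectors, where .x and .y refer to the two arguments:

```r
library(purrr)

x <- 1:3
y <- 4:6
z <- 7:9

# the same computation with a named list and an explicit function;
# list names are matched to the argument names of the function
named_result <- pmap_dbl(
  list(first = x, second = y, third = z),
  function(first, second, third) first - second + third
)
named_result
## [1] 4 5 6

# with exactly two input vectors, map2_dbl() can be used instead
two_inputs <- map2_dbl(x, y, ~ .x - .y)
two_inputs
## [1] -3 -3 -3
```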

Exercise 2.13

Use map_dbl and an anonymous function to take the following input vector and return a vector whose \(i\)th element is the cumulative product of input up to the \(i\)th position divided by the cumulative sum of input up to that position. (Hint: the cumulative product up to position \(i\) is produced by prod(input[1:i]); notice that you need to “loop over”, so to speak, the index \(i\), not the elements of the vector input.)

input <- c(12, 6, 18)

map_dbl(
  1:length(input),
  function(i) {
    prod(input[1:i]) / sum(input[1:i])
  }
)

## [1]  1  4 36
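
As a cross-check (not the map_dbl solution the exercise asks for), the same numbers can be obtained with base R's vectorized cumprod and cumsum. The name check is made up for illustration:

```r
input <- c(12, 6, 18)

# element-wise ratio of cumulative products to cumulative sums
check <- cumprod(input) / cumsum(input)
check
## [1]  1  4 36
```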

¹¹ Just for the record, we can also achieve the same result with ifelse(input_vector %% 2 == 0, input_vector * 10, input_vector).