2.5 Piping

When you use a functional style of programming, piping is your best friend. Consider the standard way of composing function calls, which resembles what linguists call “center-embedding”. We start with the input (written inside the innermost brackets), then apply the first function, round, then the second, mean, writing each next function call “around” the previous one.

# define input
input_vector <- c(0.4, 0.5, 0.6)

# first round, then take mean
mean(round(input_vector))
## [1] 0.3333333

Things quickly get out of hand when more commands are nested. A common practice is to store intermediate results of computations in new variables which are only used to pass the result into the next step.

# define input
input_vector <- c(0.4, 0.5, 0.6)

# rounded input
rounded_input <- round(input_vector)

# mean of rounded input
mean(rounded_input)
## [1] 0.3333333

Piping lets you pass the result of a previous function call into the next. The magrittr package supplies a special infix operator %>% for piping.¹ The pipe %>% takes the result of evaluating the expression on its left-hand side and passes it as the first argument to the function on its right-hand side. So x %>% f is equivalent to f(x). Or, to continue the example from above, we can now write:

input_vector %>% round %>% mean
## [1] 0.3333333
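
The equivalence also holds when the function on the right-hand side takes further arguments: x %>% f(y) is the same as f(x, y). The following is a minimal sketch of this, assuming the magrittr package is installed; the values in x are made up for illustration.

```r
library(magrittr)

# made-up input for illustration
x <- c(1.234, 2.345, 3.456)

# piped version: x is inserted as the first argument of round()
piped <- x %>% round(digits = 1) %>% mean()

# nested version of the same computation
nested <- mean(round(x, digits = 1))

identical(piped, nested)
## [1] TRUE
```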

The functions defined as part of the tidyverse are all constructed in such a way that the first argument is the most likely input you would like to pipe into them. If you want to pipe the left-hand side into an argument slot other than the first, you can use the . notation to mark the slot that the left-hand side should fill: y %>% f(x, .) is equivalent to f(x, y).
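
As a small illustration of the dot notation, the sketch below pipes a value once into the first and once into the second argument of round(). It assumes the magrittr package is installed; the variable names first_slot and second_slot are made up for this example.

```r
library(magrittr)

# default: the left-hand side becomes the first argument
first_slot <- 3.14159 %>% round(2)       # same as round(3.14159, 2)

# with the dot: the left-hand side fills the marked slot instead
second_slot <- 2 %>% round(3.14159, .)   # also round(3.14159, 2)

print(first_slot)
## [1] 3.14
print(second_slot)
## [1] 3.14
```

When the dot appears as a direct argument of the function call, as here, the left-hand side is not additionally inserted as the first argument.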

Exercise 2.14

A friendly colleague has sent reaction time data in a weird format:

weird_RTs <- c("RT = 323", "RT = 345", "RT = 421", "RT = 50")

Starting with that vector, use a chain of pipes to: extract the numeric information from the string, cast the information into a vector of type numeric, take the log, take the mean, round to 2 significant digits. (Hint: to get the numeric information use stringr::str_sub, which works in this case because the numeric information starts after the exact same number of characters.)

weird_RTs %>%
  stringr::str_sub(start = 6) %>%
  as.numeric() %>%
  log %>%
  mean %>% 
  signif(digits = 2)
## [1] 5.4

2.5.1 Excursion: More on pipes in R

When you load the tidyverse package, the pipe operator %>% is automatically imported from the magrittr package, but the rest of the magrittr package is not. The magrittr package has three more useful pipe operators, which are only available if you also explicitly load it.

library(magrittr)

The tee pipe %T>% from the magrittr package evaluates the function on its RHS, but then passes on whatever it was fed on the LHS, thus discarding the return value of the current command in the piping chain. This is particularly useful for printing or plotting intermediate results:

input_vector <- c(0.4, 0.5, 0.6)
input_vector %>%
  # get the mean
  mean %T>%
  # output intermediate result
  print %>%
  # do more computations
  sum(3)
## [1] 0.5
## [1] 3.5
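
Since print returns its argument (invisibly), the difference between %>% and %T>% is easier to see with a function that returns something else. The sketch below uses cat, which prints its input but returns NULL. It assumes the magrittr package is installed; the variable names without_tee and with_tee are made up for this example.

```r
library(magrittr)

input_vector <- c(0.4, 0.5, 0.6)

# normal pipe: cat() prints the mean but returns NULL,
# so sum() only receives NULL and 3
without_tee <- input_vector %>% mean %>% cat("\n") %>% sum(3)

# tee pipe: cat() prints the mean, and the mean itself is passed on
with_tee <- input_vector %>% mean %T>% cat("\n") %>% sum(3)

print(without_tee)
## [1] 3
print(with_tee)
## [1] 3.5
```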

The exposition pipe %$% from the magrittr package is like the normal pipe %>% but makes the names (e.g., columns in a data frame) in the LHS available on the RHS, even when the function on the RHS normally does not allow for this. So, this does not work with the normal pipe:

tibble(
  x = 1:3
) %>%    # normal pipe 
  sum(x) # error: object `x` not found

But it works with the exposition pipe:

tibble(
  x = 1:3
) %$%    # exposition pipe makes 'x' available
  sum(x) # works!
## [1] 6
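
For comparison, the exposition pipe behaves much like base R's with(). The sketch below computes the same sum in three ways. It assumes the magrittr package is installed and uses a plain data frame instead of a tibble to avoid further dependencies; the variable names are made up for this example.

```r
library(magrittr)

# a plain data frame stands in for the tibble used above
d <- data.frame(x = 1:3)

via_exposition <- d %$% sum(x)        # exposition pipe
via_with       <- with(d, sum(x))     # base R equivalent
via_pipe_with  <- d %>% with(sum(x))  # normal pipe into with()

print(c(via_exposition, via_with, via_pipe_with))
## [1] 6 6 6
```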

Finally, the assignment pipe %<>% from the magrittr package pipes the LHS into a chain of computations, as usual, but then assigns the final value back to the LHS.

x <- c(0.4, 0.5, 0.6)
# x is changed in place
x %<>%
  sum(3) %>%
  mean
print(x)
## [1] 4.5
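
In other words, the assignment pipe is shorthand for piping a variable through a chain and assigning the result back to that same variable. A minimal sketch of this equivalence, assuming the magrittr package is installed (the copy y is introduced only for this comparison):

```r
library(magrittr)

x <- c(0.4, 0.5, 0.6)
y <- x

# assignment pipe
x %<>%
  sum(3) %>%
  mean

# explicit re-assignment with the normal pipe
y <- y %>%
  sum(3) %>%
  mean

identical(x, y)
## [1] TRUE
```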

Base R introduced a native pipe operator |> in version 4.1.0. It differs slightly from the magrittr version, e.g., in that the function on its right-hand side must be written as a call with parentheses:

1:10 |> mean    # error!
1:10 |> mean()  # 5.5
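
The running example from this section can be written with the native pipe as follows. This is a sketch that assumes R version 4.1.0 or later; no package needs to be loaded, and the variable names native_result and shifted are made up for this example.

```r
input_vector <- c(0.4, 0.5, 0.6)

# every step on the right-hand side is an explicit function call
native_result <- input_vector |> round() |> mean()

# an anonymous function must be wrapped in parentheses and called
shifted <- input_vector |> (\(v) v + 1)() |> mean()

# note: the native pipe does not support the `.` placeholder of magrittr;
# from R 4.2.0 on, `_` can be used instead, but only for a named argument

print(native_result)
## [1] 0.3333333
print(shifted)
## [1] 1.5
```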

You can read more about the history of the pipe in R in this blog post.

2.5.2 Excursion: Multiple assignments, or “unpacking”

The zeallot package provides a “multiple assignment” operator %<-%, which looks like a pipe but isn’t.

library(zeallot)

It allows for several variables to be instantiated at the same time:

c(x, y) %<-% c(3, "huhu")
print(x)
## [1] "3"
print(y)
## [1] "huhu"
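
Note that x is the string "3" here, not the number 3, because c(3, "huhu") is a character vector. If the types should be preserved, the values can be wrapped in a list instead. A minimal sketch, assuming the zeallot package is installed (the names a and b are made up for this example):

```r
library(zeallot)

# a list keeps the type of each element
c(a, b) %<-% list(3, "huhu")

class(a)
## [1] "numeric"
class(b)
## [1] "character"
```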

This is particularly helpful for functions that return several outputs in a list or vector:

input_vector <- c(0.4, 0.5, 0.6)
some_function <- function(input) {
  return(list(sum = sum(input), mean = mean(input)))
}
c(x, y) %<-% some_function(input_vector)
print(x)
## [1] 1.5
print(y)
## [1] 0.5

  1. The pipe symbol %>% can be inserted in RStudio with Ctrl+Shift+M (Win/Linux) or Cmd+Shift+M (Mac).