2 Basics of R



R is a specialized programming language for data science. Though old, it is heavily supported by an active community. New tools for data handling, visualization, and statistical analysis are provided in the form of packages.¹ Other programming languages widely used for scientific computing, like Python or Julia, also lend themselves beautifully to data analysis, but this book uses R because R’s raison d’être is data analysis. Also, some of the R packages used in this book provide cutting-edge methods that are not (yet) as conveniently available in other programming languages.

In a manner of speaking, there are two flavors of R: base R and the tidyverse. Base R is what you have when you do not load any packages. We enter the tidyverse by loading the package tidyverse (see below for information on how to do that). The tidyverse consists of several components, which are actually stand-alone packages that can be loaded separately if needed. All of them supply extra functionality for data analysis, based on a unifying philosophy and representation format. Although the two flavors are ultimately interchangeable, the look and feel of base R and the tidyverse are quite different. Figure 2.1 lists a selection of packages from the tidyverse in relation to their role at different stages of the process of data analysis.

Figure 2.1: Overview of selected packages from the tidyverse. The image is taken from [this introduction to the tidyverse](https://rviews.rstudio.com/2017/06/08/what-is-the-tidyverse/).
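To give a first impression of the difference in look and feel, here is a minimal sketch that computes the same group means once in base R and once with dplyr, one of the tidyverse packages. The data frame `scores` and its columns `group` and `value` are made up for illustration and do not appear elsewhere in this book. The tidyverse part is guarded so that the script also runs on a machine where dplyr is not installed.

```r
# A small, made-up data set (hypothetical values, for illustration only)
scores <- data.frame(
  group = c("a", "a", "b", "b"),
  value = c(1, 3, 10, 20)
)

# Base R: no packages loaded.
# aggregate() computes the mean of `value` separately for each `group`.
base_result <- aggregate(value ~ group, data = scores, FUN = mean)
print(base_result)

# Tidyverse: the same computation written as a pipeline of dplyr verbs.
# library(tidyverse) would attach dplyr together with the other core
# packages; attaching dplyr alone is enough for this sketch.
if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)
  tidy_result <- scores %>%
    group_by(group) %>%
    summarise(value = mean(value))
  print(tidy_result)
}
```

Both versions yield a mean of 2 for group `a` and 15 for group `b`. The base R call packs everything into one function with a formula argument, whereas the tidyverse version reads as a sequence of steps applied to the data.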

The official documentation for base R is “An Introduction to R”. The standard reference for using the tidyverse is “R for Data Science (R4DS)”. There are some very useful cheat sheets which you should definitely check out! There are pointers to further material in Appendix A.

The learning goals for this chapter are:

  • become familiar with R, its syntax and basic notions
  • become familiar with the key functionality of the tidyverse
  • understand and write simple R scripts
  • be able to write documents in R Markdown

  1. Packages live in the official package repository CRAN, or are supplied in less standardized forms, e.g., via open repositories, such as GitHub.