1.2 Course structure

The course consists of five parts. After giving a more detailed overview of the course in this chapter, Part I introduces R, the main programming language that we will use. Part II covers what is often called descriptive statistics. It also gives us room to learn more about R when we massage data into shape, compute summary statistics, and plot different data types in different ways. Part III covers the basic theoretical concepts of Bayesian data analysis. Part IV introduces regression modeling. Part V introduces basic ideas from frequentist data analysis and compares the frequentist and the Bayesian approaches.

A number of characteristic features distinguish this course from the bulk of its cousins out there.

  1. First, we use a model-centric approach, i.e., we are going to explicitly represent and talk about statistical models as formalized sets of the assumptions which underlie a specific analysis.
  2. Second, we will use a computational approach, i.e., we foster an understanding of mathematical notions with computer simulations or other variants of helpful code.
  3. Third, this course takes a dual approach in that it introduces both the frequentist and the Bayesian approach to statistical inference. We will start off with the Bayesian approach, because it is arguably more intuitive. In addition, a model-centric Bayesian approach helps with understanding basic concepts from the frequentist paradigm.
  4. Fourth, the course focuses on generalized linear models, a class of models that has become the new standard for analyses of experimental data in the social and psychological sciences. They are also very useful for data exploration in other domains (such as machine learning).

There are also appendices with additional information:

  • Further useful material (textbooks, manuals, etc.) is provided in Appendix A.
  • Appendix B covers the most important probability distributions used in this book.
  • An excursion providing more information about the important Exponential Family of probability distributions and the Maximum Entropy Principle is given in Appendix C.
  • The data sets which recur throughout the book as “running examples” are succinctly summarized in Appendix D.
  • Appendix E surveys and motivates ideas for good scientific practice and principles of open science.