rmarkdown
in order to ‘knit’ the HTML output. The first part of the homework should help you get comfortable with R Markdown. After that, we focus on some of the R basics you have learned so far.
In the following we will work with some “scientifically reasonable data” concerning the properties of some selected brain regions. The variables included in this data set are: “brain region”, “average cortical thickness”, “surface area” and “hemisphere” (left or right). The following graphic depicts where the selected brain regions are located:
Desikan Killiany atlas
Your first task is to start creating the Rmarkdown document, namely to:
# Task n
(Note: Use the R Markdown cheat sheets provided by RStudio to look up how to make headers, create a table, etc. in Rmd.)
Variable | Variable type | Values |
---|---|---|
brain regions | ? | (?, Precentral, Lateral Occipital, Transverse temporal, Temporal pole ) |
surface area | ? | (5941.8, 4718.8, 4672.9, 799.48, 443.3) |
thickness | ? | (2.59, 2.74, 2.3, 2.52, 3.66) |
hemisphere | ? | (?, ?, L, R, R) |
After having introduced the variables included in the data set, let us work with the data.
Before starting to work in R, we should load the relevant packages that we need in order to work with the data.
Load the package tidyverse (if you have not installed the package yet, you first have to install it using install.packages("tidyverse")).
< R CODE HERE >
Now, we can start to create a tibble in R.
Create a tibble named brain_data with the columns brain_regions, surface_area, thickness and hemisphere, where brain_regions is a factor (use the function factor when creating this column). (Hint: your tibble should eventually have 4 columns and 5 rows.)
< R CODE HERE >
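As a rough illustration of how a tibble with a factor column can be built, here is a minimal sketch on hypothetical toy data. The names fruit_data, fruit, weight and colour are invented for this example and are not part of the homework data.

```r
library(tidyverse)

# Hypothetical toy data, unrelated to the brain data set
fruit_data <- tibble(
  fruit  = factor(c("apple", "banana", "cherry")),  # stored as a factor
  weight = c(182, 118, 8),                          # numeric (double)
  colour = c("red", "yellow", "red")                # character
)

fruit_data       # printing shows each column's type below its name
dim(fruit_data)  # number of rows, then number of columns
```

Printing the tibble is a quick way to check that each column has the type you intended.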
Oh no, we have made a small mistake! The brain region “Transverse temporal” should actually be represented as “Banks superior temporal”. We have to change this level in the factor brain_regions.
Give R code that changes the level “Transverse temporal” of brain_regions into the right value “Banks superior temporal” using fct_recode from the forcats package (you can access the factor/column using brain_data$brain_regions), and then prints brain_data again. The column brain_regions should now have the value “Banks superior temporal” instead of “Transverse temporal”.
< R CODE HERE >
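For orientation, fct_recode is provided by the forcats package, which is attached together with the tidyverse. The sketch below shows the general pattern on a hypothetical factor called sizes (invented for this example): the new level name goes on the left of the equals sign and the old one on the right.

```r
library(tidyverse)

# Hypothetical factor with a misspelled level
sizes <- factor(c("small", "medum", "large"))

# Pattern: fct_recode(factor, "new level" = "old level")
sizes <- fct_recode(sizes, "medium" = "medum")

levels(sizes)
as.character(sizes)
```

Because fct_recode returns a new factor, the result has to be assigned back for the change to persist.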
Using indexing (like in a matrix), extract the thickness for the brain region “Precentral” from the tibble.
< R CODE HERE >
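Matrix-style indexing on a tibble follows the pattern data[rows, columns]. The following sketch uses a hypothetical tibble fruit_data (not part of the homework) to show indexing by position and by a logical condition.

```r
library(tidyverse)

# Hypothetical toy data
fruit_data <- tibble(
  fruit  = c("apple", "banana", "cherry"),
  weight = c(182, 118, 8)
)

# Row 2, column "weight": the result is still a (1 x 1) tibble
fruit_data[2, "weight"]

# Rows selected by a logical condition instead of a position
fruit_data[fruit_data$fruit == "banana", "weight"]

# Using $ first gives a plain numeric vector rather than a tibble
banana_weight <- fruit_data$weight[fruit_data$fruit == "banana"]
banana_weight
```

Note that single-bracket indexing of a tibble always returns a tibble, whereas a base data frame would drop to a vector when only one column is selected.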
Consider the data set again. Suppose you learn that the volume of a brain region can be calculated from the given variables with the formula: volume = surface_area * thickness.
You therefore decide to create a new tibble, named brain_data2, from scratch.
It should include a column volume whose values are calculated by multiplying surface_area with thickness.
< R CODE HERE >
Why would it not work to use the same code to create a data frame instead of a tibble?
< ANSWER HERE >
In what follows, we’ll play around with brain_data2 to get used to R.
Write R code to extract brain_regions from the data and store it as regions_v1.
< R CODE HERE >
Write R code that returns the data type of regions_v1.
< R CODE HERE >
Suppose you do not know the values of the variable regions_v1 but you want to know how many brain regions are included. Write some R code that returns the length of the vector regions_v1.
< R CODE HERE >
Let us now consider the whole data set brain_data2 again. Give R code which returns the number of different elements in the vector stored in column hemisphere. For this, first call the function unique (which returns a vector containing each distinct element exactly once) and then the function length (to get the length of that vector). Use the pipe operator %>% to do this.
< R CODE HERE >
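The pipe passes the result on its left as the first argument of the function on its right, so x %>% f() %>% g() reads as g(f(x)). A small sketch on a hypothetical vector colours (invented for this example):

```r
library(tidyverse)

# Hypothetical vector with repeated values
colours <- c("red", "yellow", "red", "green", "yellow")

# Read left to right: take colours, keep distinct values, count them
n_colours <- colours %>% unique() %>% length()
n_colours

# The same computation written as nested calls
length(unique(colours))
```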
Describe in no more than 10 words what the goal of this code is:
brain_data2[str_detect(brain_data2$hemisphere, "R"),"brain_regions"]
< ANSWER HERE >
Let us consider again the variable volume in the data set brain_data2. We can also calculate the variable volume by using a function, which we will do now.
13.1 First, extract surface_area from the data set brain_data2 and store it as variable x, then extract thickness and store it as y.
< R CODE HERE >
13.2 Second, write a function named volume_calc that takes as arguments x and y and gives as output the product of both.
< R CODE HERE >
13.3 Third, create a list named volume_compare; the two list elements are volume1, which is the column volume from the data set brain_data2, and volume2, which is the volume calculated by your function.
< R CODE HERE >
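The general pattern of steps 13.1 to 13.3 (define a two-argument function, then collect two versions of a result in a named list) can be sketched on unrelated toy values. All names here (rect_area, widths, heights, areas) are hypothetical and chosen only for illustration.

```r
# A function with two arguments; the last evaluated expression is returned
rect_area <- function(width, height) {
  width * height
}

# Hypothetical input vectors; arithmetic in R is vectorised
widths  <- c(2, 3, 4)
heights <- c(5, 6, 7)

# A named list with two elements
areas <- list(
  by_hand     = widths * heights,
  by_function = rect_area(widths, heights)
)

areas$by_function                             # access an element by name
identical(areas$by_hand, areas$by_function)   # do both versions agree?
```

List elements can be retrieved with $ or with double brackets, for example areas[["by_hand"]].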
Suppose that we have the chance to measure the volume of each brain region directly. Here are the directly measured values for volume:
14.1 Create a new variable volume_true with the given values.
< R CODE HERE >
14.2 Extract volume1 from your created list volume_compare and store it as volume_calculated.
< R CODE HERE >
14.3 Write a new function named volume_diff that takes as arguments x and y and returns the difference between them.
< R CODE HERE >
Create a tibble with two variables: brain_regions (use regions_v1) and calculated_volume, which indicates the difference between the variables volume_calculated and volume_true (make use of your custom-built function). Store the tibble as “brain_data3” and print it.
< R CODE HERE >
Look at the following formula and the output that it returns. What does this formula do? Can you give a question for which this formula returns the answer?
Hint: Use, for example, the help facility in R (?which.min) to understand what which.min does.
brain_data3[which.min(brain_data3$calculated_volume),1]
## # A tibble: 1 x 1
## brain_regions
## <chr>
## 1 Rostral middle frontal
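To experiment with which.min outside the homework data, a small sketch on hypothetical vectors (scores and players are invented for this example) may help:

```r
# Hypothetical values
scores  <- c(7.5, 2.1, 9.3, 4.8)
players <- c("Ann", "Ben", "Cem", "Dee")

# which.min returns the position of the smallest value, not the value itself
smallest_pos <- which.min(scores)
smallest_pos

# That position can then be used to index another vector of the same length
players[smallest_pos]
```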
Here’s a vector with some interesting names:
family <- c("Gomez", "Morticia", "Pugsley", "Wednesday", "Uncle Fester", "Grandma")
Use the function map_chr (which is an iterator that returns a character vector) and a custom-made anonymous function which uses the function str_c (which concatenates strings) to append the string " Adams" to all family members, except Uncle Fester and Grandma. The output should look like this:
## [1] "Gomez Adams" "Morticia Adams" "Pugsley Adams" "Wednesday Adams"
## [5] "Uncle Fester" "Grandma"
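The combination of map_chr, an anonymous function and str_c can be tried out on a different vector first. The sketch below uses a hypothetical vector cities and appends a suffix only to selected elements; the names cities, french and labelled are invented for this example.

```r
library(tidyverse)

# Hypothetical input
cities <- c("Paris", "Lyon", "Berlin")
french <- c("Paris", "Lyon")

# map_chr applies the anonymous function to each element and
# collects the results in a character vector
labelled <- map_chr(cities, function(city) {
  if (city %in% french) {
    str_c(city, ", France")  # concatenate the suffix
  } else {
    city                     # leave the element unchanged
  }
})

labelled
```

The anonymous function must return exactly one string per element, otherwise map_chr raises an error.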