15 Final week (R markdown tutorial)

For our final week of instruction we read the Xuereb et al. paper in depth, since by this point we have done many analyses similar to those presented there. We thank the authors for providing a well-documented and useful dataset.

15.1 R markdown tutorial

For your final assignments we’ve asked you to prepare your code in R markdown format. You may work in Jetstream for this class, but it may also be more convenient to work on your own laptop for this lesson and for making your final R markdown document for your final assignment.

This is a brief tutorial showing you how to use R markdown. There are many other ways to extend your R markdown skills, including these guides here and a helpful cheatsheet here.

R Markdown comes pre-installed with RStudio, but if you’re using base R (not RStudio) you can install it with install.packages("rmarkdown"). Note that the package name is all lowercase; package names in R are case-sensitive.

To launch an R Markdown script you go to the new file icon on the upper left of your screen and select File > New File > R Markdown…

This will open up a page as below.

You can leave it untitled for now and keep the output as HTML, so just click OK.

You should now have a template that looks like this:

R markdown has two phases: a build (or edit) phase, where you write the kind of document you want, and a render phase, where you turn that document into a PDF or HTML file.

We’ll spend a little bit of time editing our R Markdown template and then we’ll use the package knitr to make our final document.

R markdown lets you mix two kinds of sections: text sections, where you can use Markdown formatting (or HTML) to make headers or make text bold or italicized, and code sections (called chunks), which R markdown runs each time you render your document.

For the normal text sections you can just start writing. Note that you can modify headers or make words or phrases bold using the tips in the cheatsheet here.

For the code sections you need to start each section with three backticks and end it with three more. The backtick shares a key with the ~ (tilde) character (top left on a PC keyboard). You also need to tell it what kind of code you’re writing, so for R this will typically look something like:
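One way to see those delimiters as plain text is to build a tiny R markdown file from the console. This is only a minimal sketch: the file name chunk_example.Rmd, the text section, and the summary(cars) chunk are illustrative assumptions, not part of the class template.

```r
# Illustrative only: build the lines of a tiny R markdown document.
# A code section opens with three backticks plus {r ...} and closes
# with three backticks on their own line.
fence <- strrep("`", 3)

rmd_lines <- c(
  "## A text section",
  "",
  "Plain text with **bold** and *italic* words.",
  "",
  paste0(fence, "{r, echo=TRUE}"),
  "summary(cars)",
  fence
)

# Write to a temporary file (hypothetical name) and print it back
rmd_file <- file.path(tempdir(), "chunk_example.Rmd")
writeLines(rmd_lines, rmd_file)
cat(readLines(rmd_file), sep = "\n")
```

Running this prints a text section followed by a chunk that opens with three backticks and {r, echo=TRUE}, contains summary(cars), and closes with three backticks. In RStudio you would normally type the chunk directly into the .Rmd file rather than generate it this way.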

You can now start editing your R markdown document using the example below.

15.2 Example with the baby name dataset

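You can write an introduction in your R markdown stating that this is an R markdown tutorial using the baby names dataset, and then include the following sections as {r, echo=T} code chunks. The echo=T option displays the code itself, along with its output, in the rendered document.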

15.3 Load the packages and data


#load the two required packages

library(ggplot2)
library(babynames)

#load the data
data("babynames")

15.4 Explore the data

By running the code below you will find that the babynames dataset lists, for each year, every baby name that was given to at least 5 babies born in that year.


head(babynames)
#> # A tibble: 6 × 5
#>    year sex   name          n   prop
#>   <dbl> <chr> <chr>     <int>  <dbl>
#> 1  1880 F     Mary       7065 0.0724
#> 2  1880 F     Anna       2604 0.0267
#> 3  1880 F     Emma       2003 0.0205
#> 4  1880 F     Elizabeth  1939 0.0199
#> 5  1880 F     Minnie     1746 0.0179
#> 6  1880 F     Margaret   1578 0.0162

?babynames
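To check the "at least 5 babies" description against the data itself, one option is to look at the range of the n column and of the years covered. This is a small sketch that assumes the babynames package is installed and loaded as above.

```r
library(babynames)

# Smallest and largest yearly counts for any single name/sex combination
range(babynames$n)

# First and last years covered by the dataset
range(babynames$year)

# Number of distinct names recorded
length(unique(babynames$name))
```

If the smallest value of n is 5 or more, the data agree with the description in the help page.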

15.5 Make a plot

This plot shows, for a name of your choosing, the proportion of babies given that name in each year. It uses facet_wrap and theme to turn the name into the plot title: facet_wrap puts the name in the facet strip, and theme styles that strip so it looks like a title. To plot a different name, change it in the subsetting command; the title updates to match.


ggplot(data=babynames[babynames$name == "Serena",], aes(x=year, y=prop, group=sex, color=sex))+
  geom_point()+
  facet_wrap(~name)+
  theme(strip.background = element_blank(),
        strip.text = element_text(hjust = 0, size = 14))

I always like to plot my name because we know the recent history behind the peak in the early 2000s (it wasn’t anything that I was up to!). Many congrats to Serena Williams (and Venus!) for winning a gold medal in the 2000 Summer Olympics and starting a naming trend.
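Because the name appears only in the subsetting step, the plot is easy to wrap in a small function so that you can try several names quickly. The sketch below is one way to do that; plot_name is a hypothetical helper written for this example, not a function from ggplot2 or babynames.

```r
library(ggplot2)
library(babynames)

# Hypothetical helper: plot the yearly proportion for one name, split by sex
plot_name <- function(nm) {
  one_name <- babynames[babynames$name == nm, ]
  if (nrow(one_name) == 0) {
    stop("No rows found for name: ", nm)
  }
  ggplot(data = one_name, aes(x = year, y = prop, group = sex, color = sex)) +
    geom_point() +
    facet_wrap(~name) +
    theme(strip.background = element_blank(),
          strip.text = element_text(hjust = 0, size = 14))
}

# Example usage with the same name as above
p <- plot_name("Serena")
print(p)
```

The early stop() is a design choice: a misspelled name would otherwise fail later inside ggplot with a less helpful message, or produce an empty plot.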

Now let’s render our R markdown by selecting Knit from the upper left options in RStudio. Note that you will need to save your script somewhere to do this. Make sure you save it as an .Rmd file.
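The same rendering step can also be run from the R console with rmarkdown::render(), which can be handy if you are not working in RStudio. The sketch below assumes the rmarkdown package and pandoc (bundled with RStudio) are available, and it uses a throwaway file in a temporary directory (a hypothetical knit_example.Rmd) rather than your actual assignment.

```r
# Write a tiny stand-in document (hypothetical content) to a temp file
fence <- strrep("`", 3)
rmd_file <- file.path(tempdir(), "knit_example.Rmd")
writeLines(c(
  "---",
  "title: \"Knit example\"",
  "output: html_document",
  "---",
  "",
  "Some text.",
  "",
  paste0(fence, "{r}"),
  "1 + 1",
  fence
), rmd_file)

# Render only if rmarkdown and pandoc can be found
html_file <- NULL
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  html_file <- rmarkdown::render(rmd_file, quiet = TRUE)
  print(html_file)  # path to the rendered .html file
}
```

For your own document, replace rmd_file with the path to the .Rmd file you saved.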

This should be what you see as an html file (with all your own modifications of course!)

You can now go back to your R markdown script and do the exercises below. Create a new header called exercises and start a new code chunk.

15.6 To export as a pdf

In order to export your R Markdown file as a pdf from RStudio you will need to install a LaTeX distribution. You can do this from your RStudio console with the following command:


tinytex::install_tinytex()

Once the installation finishes, you can select Knit to PDF from the Knit pulldown menu.
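If knitting to PDF fails, it can help to confirm that TinyTeX was actually installed. The sketch below checks for it and then renders to PDF from the console; my_assignment.Rmd is a placeholder file name, so the render step only runs if a file with that name exists in your working directory.

```r
# Check whether a TinyTeX installation is present before trying to make a PDF
has_tinytex <- requireNamespace("tinytex", quietly = TRUE) &&
  tinytex::is_tinytex()
has_tinytex

# "my_assignment.Rmd" is a placeholder; replace it with your own saved file
rmd_file <- "my_assignment.Rmd"
if (has_tinytex && file.exists(rmd_file)) {
  rmarkdown::render(rmd_file, output_format = "pdf_document")
}
```

A FALSE here may simply mean that you use a different LaTeX distribution, in which case knitting to PDF can still work.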

15.7 To transfer your Rmarkdown files and submit your final assignment

Once you’ve generated an HTML file from your R markdown script and confirmed that the code for each plot shows up and generates the intended plot (the number of plots depends on the final video prompt you select, since some prompts ask for more than one), you’re ready to transfer it from Jetstream to your own computer.

The easiest way to transfer a relatively small file from Jetstream is to use the Firefox web browser to log into your CSULA (or other, e.g. Gmail or Hotmail) email service and attach the files to an email addressed to yourself (so that you can, for example, review the .pdf file on your own computer). You can then upload it to Canvas or submit it directly to Dr. Krug via email.

15.8 Exercises

Add the following exercises to the example R Markdown document that we’ve been working on. You can also test the HTML output and try emailing an HTML or PDF version to yourself through Jetstream.

For the same name you chose for your plot above, find the year in which the number of babies given that name peaked (reached its maximum). Does that year match the year in which the proportion is greatest?

Solution


# what was the maximum number given in a single year?

max(babynames[babynames$name=="Serena",]$n)
#> [1] 1504

# what year was it?

babynames[babynames$n==1504 & babynames$name =="Serena",]
#> # A tibble: 1 × 5
#>    year sex   name       n     prop
#>   <dbl> <chr> <chr>  <int>    <dbl>
#> 1  2000 F     Serena  1504 0.000754

# this matches the plot above! (not all names do though!)
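The solution above finds the year with the largest count and compares it with the plot by eye. To answer the second half of the question in code, one possible sketch uses which.max() on both the n and prop columns. It restricts the data to one sex so that each year appears at most once; the variable names are illustrative.

```r
library(babynames)

# Keep one name and one sex so that each year appears at most once
one_name <- babynames[babynames$name == "Serena" & babynames$sex == "F", ]

# Year with the largest count, and year with the largest proportion
peak_n_year    <- one_name$year[which.max(one_name$n)]
peak_prop_year <- one_name$year[which.max(one_name$prop)]

c(peak_n = peak_n_year, peak_prop = peak_prop_year)

# TRUE if both peaks fall in the same year
peak_n_year == peak_prop_year
```

The two years can differ for other names because prop is divided by the number of births in that year, which changes over time.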

15.9 Final tips and tricks

R markdown has a ton of flexibility. In fact, the class website was made in R markdown (via bookdown). For example, if you’re curious about how to make the pull-down solution option, we use the code below:


<details><summary><span style="color: SeaGreen;">Solution</span></summary>
<p>


##the code you want hidden behind a "solutions" tab


</p>
</details>

Which gives something like this after you knit the document:

Solution

##the code you want hidden behind a "solutions" tab

You can see all the code for each week on our GitHub.