1 Introduction to Marine Genomics

Welcome to Marine Genomics! This course teaches undergraduate students how to analyze genomic data. We focused on marine organisms, but these lessons can easily be applied to any taxon. We required that students have a background in biology and have completed the general biology series. It is also helpful if students have taken a course in evolution, though this was not a requirement. We did not require any coding experience in R or Bash/UNIX.

We have now taught this course several times. A 15-week semester version is available here, and our version from Spring 2022 is available here.

For most weeks, our course consists of one 1.5-hour lecture and one 50-minute coding session per week over a 10-week quarter. The lectures begin with a PowerPoint presentation in which we introduce the week's concept (for example, PCA), followed by a live coding demonstration carried out by the instructor. The coding sessions give students an opportunity either to repeat the coding demonstration from the lecture or to extend it through coding exercises that we have written for each week. Solutions to the exercises are available on our website (under each exercise).

The first 5 weeks are primarily training in Bash/UNIX and R basics and are structured as 2 lectures a week, broken up with exercises that let students practice each skill they are shown in the live-coding demonstrations. The exercises typically take 5-15 minutes to complete. The remaining weeks focus on performing a specific analysis (PCA, GWAS, Fst outlier analyses, etc.).

We provide all the data the students need to complete each week's coding activities, along with instructions for how to download the data. Additionally, the students found the website a phenomenally helpful resource.

1.1 Cloud computing versus individual laptops

This course was designed to be carried out on Jetstream, a cloud computing resource of the NSF-funded Extreme Science and Engineering Discovery Environment (XSEDE).

Jetstream is a user-friendly cloud computing resource that provides students with an environment with root privileges, making program installation less complicated. Jetstream also provides a Desktop simulator that allows students to move between Bash and R using exactly the same directory paths as the instructor. This reduces the confusion caused by each student having unique file paths and provides a more streamlined approach to what is, for many students, their first coding class.

Jetstream educational allocations can be applied for through XSEDE. The process of applying for a Jetstream allocation for educational purposes is relatively straightforward and requires a course description, the number of students anticipated to participate, a calculation of the computing resources required, and the instructor's CV. Allocations are typically approved in two to three days, and applications can be submitted about 2 weeks prior to the start of the course.
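As a rough illustration of the resource calculation mentioned above, the following Python sketch estimates the compute needed for scheduled class time only. It is a minimal sketch under stated assumptions: the function name `estimate_service_units`, the class size of 30, the 2 virtual CPUs per student instance, and the charging rule of one service unit per vCPU-hour are all hypothetical and should be replaced with the figures from the current allocation guidelines. Only the 10-week length and the weekly contact time (one 1.5-hour lecture plus one 50-minute coding session) come from the course description.

```python
def estimate_service_units(students, weeks, hours_per_week, vcpus_per_instance):
    """Estimate compute for a course.

    Assumes (hypothetically) that one service unit is charged per
    vCPU-hour and that every student runs one instance for the whole
    of each scheduled session.
    """
    # Total hours that student instances are running across the course.
    instance_hours = students * weeks * hours_per_week
    # Convert instance-hours to vCPU-hours (service units under our assumption).
    return instance_hours * vcpus_per_instance


if __name__ == "__main__":
    # Weekly contact time from the course description:
    # one 1.5-hour lecture plus one 50-minute coding session.
    contact_hours = 1.5 + 50 / 60

    total = estimate_service_units(
        students=30,            # hypothetical class size
        weeks=10,               # quarter length from the course description
        hours_per_week=contact_hours,
        vcpus_per_instance=2,   # hypothetical instance size
    )
    print(f"Estimated service units: {total:.0f}")
```

This estimate covers in-class use only. Time that students spend on exercises outside of class, and any instructor testing, would need to be added on top.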

Alternatively, these course materials can be used on students' and instructors' individual laptops. The file paths will need to be modified, and we cannot guarantee that the installation of programs and R packages will proceed as it did on Jetstream. Thus, the instructor will need to verify program installation for each week.
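One way an instructor might check a laptop before each week is to confirm that the required command-line programs can be found on the system PATH. The Python sketch below is only an illustration: the helper name `find_missing_programs` and the list of programs are hypothetical placeholders, not the course's actual requirements, and the check does not cover R packages.

```python
import shutil


def find_missing_programs(programs):
    """Return the names in `programs` that cannot be found on the PATH."""
    # shutil.which returns None when no matching executable is found.
    return [name for name in programs if shutil.which(name) is None]


# Hypothetical list; replace with the programs needed for a given week.
REQUIRED_PROGRAMS = ["bash", "Rscript"]

if __name__ == "__main__":
    missing = find_missing_programs(REQUIRED_PROGRAMS)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed programs were found.")
```

Finding a program on the PATH only shows that it is installed, not that it is the same version used on Jetstream, so running each week's demonstration once on a laptop is still advisable.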

1.2 Meet our instructors!

Rachael Bay, Assistant Professor, co-instructor (Website)

Serena Caplins, Postdoc, lead instructor and course designer (Website)

Vanessa Guerra, Postdoc, co-instructor

Madison Armstrong, Graduate student, Teaching Assistant, creature of the week content generator and website designer (Website)

1.3 Schedule

Lecture: Tuesdays 12:10-1:30 PM. Coding session: Thursdays 12:10-1:00 PM.

Office hours (Serena Caplins): Thursdays 1:10-2:00 PM