Marine Genomics 2021
1
Introduction to Marine Genomics
1.1
Cloud computing versus individual laptops
1.2
Meet our instructors!
1.3
Schedule
2
Week 1- Welcome!
2.1
Introduction to shell computing via the data carpentry tutorial
2.2
How to access the shell
2.3
Week 1 Objectives
2.4
Navigating your file system
2.5
Shortcut: Tab Completion
2.6
Summary & Key Points
2.7
Navigating Files and Directories
2.8
Moving around the file system
2.9
Examining the contents of other directories
2.10
Full vs Relative Paths
2.11
Navigational shortcuts
2.12
Key Points
2.13
Creature of the Week!
3
Week 2-Working With Files
4
Before we start, make sure to clone or update the GitHub folder MarineGenomics in the user directory
4.1
Our data set: FASTQ files
4.2
Wildcards
4.3
Command History
4.4
Examining Files
4.5
Details on the FASTQ format
4.6
Creating, moving, copying, and removing
4.6.1
Copying Files
4.6.2
Creating Directories
4.6.3
Moving / Renaming
4.6.4
File Permissions
4.6.5
Removing
4.7
Redirections
4.8
Searching files
4.9
Redirecting output
4.10
Writing for loops
4.11
Using Basename in for loops
4.12
Writing Scripts and Working with Data
4.13
Writing files
4.14
Writing scripts
4.15
Moving and Downloading Data
4.15.1
Getting data from the cloud
4.15.2
Uploading and Downloading Data to your Virtual Machine with scp - UNIX
4.16
Awk
4.17
Creature of the Week!
5
Week 3- What is a Genetic Variant?
5.1
To get started, let's download the data and install a few programs
5.2
Raw read quality control
5.3
Trimming to remove adapters
5.4
Building an index of our genome
5.5
Map reads to the genome
5.6
sam to bam file conversion
5.7
Genotype likelihoods
5.8
Creature of the Week!
6
Week 4- The R environment
6.1
Lesson 1: Orientation to R
6.2
Manipulating a vector object
6.3
Operations act on each element of a vector:
6.4
Operations can also work with two vectors:
6.5
A few tips below for working with objects:
6.6
Practice R Operations
6.7
1.2 Characterizing a dataframe
6.8
How to access parts of the data:
6.9
Data Manipulation
6.10
Practice exploring a dataframe
6.11
1.3 Subsetting datasets & logicals
6.12
Practice Subsetting datasets/logicals
6.13
Creature of the Week!
7
Week 5- R Continued
7.1
2.1 Plotting
7.2
Scatterplots
7.3
Customizing your plot
7.4
Practice Problems 2.1
7.5
2.2 plotting with ggplot2
7.6
Practice Problems 2.2
7.7
For loops and the apply family of functions
7.8
apply family
7.8.1
sapply
7.8.2
apply
7.8.3
lapply – “list” apply
7.8.4
tapply - “per Type” apply
7.9
Exercise 2.3 apply and tapply
7.10
Creature of the Week!
8
Week 6- Principal Component Analyses
8.1
Download the data
8.2
Installing programs
8.3
Run pcangsd on our data
8.4
Setting up a new project in R
8.5
Reading data into R
8.6
Exercises
8.7
Creature of the Week!
9
Week 7- Fst and outlier analysis
9.1
Download the data
9.2
Getting R set up
9.3
Finding outliers using pcadapt
9.4
pcadapt Exercises
9.5
Using Fst to find outliers
9.6
OutFLANK Practice
9.7
Creature of the Week!
10
Week 8: Genome wide association study (GWAS)
10.1
load in the data
10.2
install packages in R
10.3
install angsd and pcangsd (again?)
10.3.1
install angsd
10.3.2
for pcangsd
10.4
The data
10.5
take Beagle file and generate lrt file
10.6
take lrt file and make a manhattan plot
10.7
Exercises
10.8
Creature of the Week!
11
Week 9- Population Structure using NGSadmix
11.1
Download the data
11.2
Using NGSadmix
11.3
Plotting structure in R
11.4
How do we know which K to pick?
11.5
Exercises
11.6
Creature of the Week!
12
Week 10- Genomics is a BLAST
12.1
Genome Annotation and BLAST searches
12.2
Download the data
12.3
Install Transdecoder
12.4
Run Transdecoder
12.5
Run NCBI-BLAST
12.6
What did we find?
12.7
The importance of Genomic Architecture
12.8
Exercises
12.9
Creature of the Week!
3
Week 2-Working With Files