2 Week 1- Welcome!

Welcome to Introduction to Genomics in Natural Populations at UC Davis!

This class has code modified from the Data Carpentry’s introduction to the shell tutorial, which can be found here https://datacarpentry.github.io/shell-genomics/01-introduction.html This tutorial is also modified from previous tutorials compiled by Serena Caplins: https://baylab.github.io/MarineGenomics/

2.1 Main Objectives

  • Take Pre-class assessment (link on syllabus)
  • Introduction to genomics & shell computing (via slideshow here: Week 1 Slides
  • Learn how to access terminal via Farm OnDemand
  • Learn how to interact with the command line interface and move around in your file system

2.2 How to access the shell via Farm OnDemand

Instructions on how to get your terminal window running are on this PDF and we will walk through this together in class: https://canvas.ucdavis.edu/courses/978599/files?preview=27151921

2.4 Shortcut: Tab Completion

Typing out file or directory names can waste a lot of time and it’s easy to make typing mistakes. Instead we can use tab complete as a shortcut. When you start typing out the name of a directory or file, then hit the Tab key, the shell will try to fill in the rest of the directory or file name.

Since we are already in our home directory type ‘cd gr’ and then hit tab:

$ cd /gr

The shell will fill in the rest of the directory name for group.

Now continue into the rbaygrp directory and then the eve198-genomics directory by using tab to fill in the rest of the names:

$ cd rbayg
$ cd ev

Using tab complete can be very helpful. However, it will only autocomplete a file or directory name if you’ve typed enough characters to provide a unique identifier for the file or directory you are trying to access.

2.5 Creating our own directories

We will now each make our own directory in the eve198-genomics directory so that everyone can run code individually.

To make a new directory type the command mkdir followed by the name of the directory, in this case ‘mlarmstrong’. You can do your ucdavis email name or your first intial and last name, just make sure it is specific to you.

$ mkdir mlamstrong

Check that it’s there with ls. As other people make directories those will pop up too!

$ ls
mlarmstrong week1

2.6 Full vs Relative Paths

The cd command takes an argument which is a directory name. Directories can be specified using either a relative path or a full absolute path. The directories on the computer are arranged into a hierarchy. The full path tells you where a directory is in that hierarchy.

Navigate to the home directory, then enter the pwd command.

$ cd
$ pwd

You should see home/‘yourusername’:

/home/madarm11

This is the full name of your home directory. This tells me that I are in a directory called madarm11, and yours will be named with your username. This directory sits inside a directory called home which sits inside the very top directory in the hierarchy. The very top of the hierarchy is a directory called / which is usually referred to as the root directory. So, to summarize: madarm11 is a directory in home which is a directory in /. More on root and home in the next section.

Now enter the following command:

$ cd /group/rbaygrp/eve198-genomics

This jumps several levels to the class directory. Now go back to the home directory.

$ cd 

But you can’t navigate to the class directory using this command:

$ cd group/rbaygrp/eve198-genomics

Why not? The first uses the absolute path, giving the full address from the home directory. The second uses a relative path, giving only the address from the working directory. A full path always starts with a /. A relative path does not.

A relative path is like getting directions from someone on the street. They tell you to “go right at the stop sign, and then turn left on Main Street”. That works great if you’re standing there together, but not so well if you’re trying to tell someone how to get there from another country. A full path is like GPS coordinates. It tells you exactly where something is no matter where you are right now. We need to have a full path since we need to be able to navigate to the broadest directory using cd from our home directory, and then move into the group directory. using cd /group

Now navigate inside of your individual directory and run the command ‘ls’.

$ cd mlarmstrong
$ ls

It should be empty because we just created it and haven’t put anything in it yet. Next class we will learn to work with files!

Let’s look at some of the options for the ls function using the man command (note this will print out several lines of text). man (short for manual) displays detailed documentation (also referred as man page or man file) for bash commands. It is a powerful resource to explore bash commands, understand their usage and flags. Some manual files are very long. You can scroll through the file using your keyboard’s down arrow or use the Space key to go forward one page and the b key to go backwards one page. When you are done reading, hit q to quit.

$ man ls

The -a option is short for all and says that it causes ls to “not ignore entries starting with .” This is the option we want.

$ ls -a

You’ll see there are many more files shown now that we can look at the hidden ones.

In most commands the flags can be combined together in no particular order to obtain the desired results/output.

$ ls -Fa 
$ ls -laF

2.7 Examining the contents of other directories

By default, the ls commands lists the contents of the working directory (i.e. the directory you are in). You can always find the directory you are in using the pwd command. However, you can also give ls the names of other directories to view. Navigate to your directory if you are not already there.

$ cd /group/rbaygrp/eve198-genomics/

Then enter the command:

$ ls 'yourdirectory'

data

This will list the contents of the your directory without you needing to navigate there.

If you want to move around multiple directories without necessarily looking in each one, you can chain these together like so:

$ cd ../../

moves you back two directories. If you just want to look in that directory but not move there you can change the command to:

$ ls ../../

We now know how to move around our file system using the command line. This gives us an advantage over interacting with the file system through a GUI as it allows us to work on a remote server, carry out the same set of operations on a large number of files quickly, and opens up many opportunities for using bioinformatic software that is only available in command line versions.

2.8 Group Work Activity- Treasure Hunt!

Let’s practice moving around in directories. Form a small group with nearby classmates but make sure everyone practices navigating on their own computer.

Start in your home directory. Then navigate to the week1_activity directory in the eve198-genomics directory. Move around directories until you find a directory with the scientific name of the Wolverine (which you can google the name)! Once you find it, copy the full path to that directory and screenshot the contents of the directory. Submit both the full path and the screenshot on canvas under the ‘Assignments’ tab for ‘Week 1: Treasure Hunt’.

2.9 Key Points

  • Most commands take options (flags) which begin with a -.

  • The shell gives you the ability to work more efficiently by using keyboard commands rather than a GUI.

  • Useful commands for navigating your file system include: ls, pwd, and cd.

  • Tab completion can reduce errors from mistyping and make work more efficient in the shell.

  • Relative paths specify a location starting from the current location, while absolute paths specify a location from the root of the file system.