Topics in Systematics and Evolution: Bioinformatics for Evolutionary Biology
In this tutorial we will be going through parts of several Software Carpentry workshops. For now, just code along with me; after the lesson you can go through the tutorial on your own to clear up anything you did not understand.
Here are some good tutorials if you’re interested in learning a programming language
Most programs come with a manual page explaining all of their options. You can get help about an individual command as follows:
For bash itself and its builtins: man bash, help, and help <cmd>. For example, help on for loops: help for; help on conditionals: help if; help on changing directory: help cd; etc.
For programs such as ls, grep, sed, and awk, you have to look at their manual pages, e.g. man sed.
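As a rough illustration of when to reach for help and when for man, the sketch below asks bash what kind of command a name refers to. It assumes bash (type -t is a bash builtin), and the two names checked, cd and sed, are only examples.

```shell
# Assumes bash; cd and sed are example names, swap in any command.
for cmd in cd sed; do
    # type -t prints "builtin", "keyword", "file", "alias" or "function"
    kind=$(type -t "$cmd")
    if [ "$kind" = "builtin" ] || [ "$kind" = "keyword" ]; then
        echo "$cmd: shell $kind, try: help $cmd"
    else
        echo "$cmd: not a shell builtin, try: man $cmd"
    fi
done
```

On a typical system cd is reported as a builtin and sed as an external program, so help cd and man sed are the places to look.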
less
Programs like man and less show an on-screen navigation system:
/[dD]og + enter will search for “dog” or “Dog”. Then press n and N to go to the next and previous occurrence.
We’ll have to edit files often in the course. You can edit files locally on your computer and copy them over (we show you how to copy files to the server in this topic). If you don’t have an editor on your laptop, we suggest Sublime Text or Visual Studio Code (VS Code). A simple text editor built into your OS will also do, e.g. WordPad or gedit. Avoid Notepad or Word.
We also have several editors which you can run directly on the server. Editing directly on the server is faster because you’ll be debugging iteratively.
nano filename: a barebones editor with key bindings similar to Notepad. Good for small edits. The easiest option.
nano -l filename shows line numbers.
nano -z filename allows suspending the editor with CTRL+z.
vim filename: i enters insert (edit) mode.
emacs filename
You will mostly be asked to type commands interactively, but in later topics you will be asked to create scripts. Here is an example of creating a bash script, whose name by convention ends with .sh.
# here we use nano, but you could use any other editor of choice
nano my_first_script.sh
If the file doesn’t exist, you will see an empty file. You can then type content (i.e. a series of bash commands) in the file.
Save the file, and exit. You can then run this script with:
bash my_first_script.sh
If you add the special line #!/bin/bash (aka “hashbang”) at the top of your script and mark the script executable (chmod +x my_first_script.sh), you will be able to run it more easily:
./my_first_script.sh
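To make the steps above concrete, here is a minimal sketch that creates a script, runs it with bash, then marks it executable and runs it directly. The script content (a greeting and a short count) is hypothetical, and a heredoc stands in for typing the file in nano. The file is written to a temporary directory so nothing in your working directory is overwritten.

```shell
# Hypothetical example content; the script is written to a temporary directory.
demo_dir=$(mktemp -d)
script="$demo_dir/my_first_script.sh"

# Stand-in for typing the content in nano and saving the file.
cat > "$script" <<'EOF'
#!/bin/bash
# print a greeting, then count to 3
echo "Hello from my first script"
seq 3
EOF

# Run it by passing the file to bash.
output_bash=$(bash "$script")

# Mark it executable and run it directly; the #!/bin/bash line selects the interpreter.
chmod +x "$script"
output_direct=$("$script")

echo "$output_direct"
```

Both ways of running the script should print the same four lines: the greeting followed by 1, 2 and 3.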
If you have X11 Forwarding enabled, you can use graphical editors installed on the server:
# emacs supports both terminal based and window (x11) based
emacs my_first_script.sh
If you see a window come up, then your X forwarding is configured correctly. Otherwise the terminal version will come up. In graphical emacs, hit q to remove the welcome screen.
You can use cp to copy files from and to the same computer. To copy across computers, you have to rely on networking tools. We have collected information on copying files in “Copying across machines”.
A key feature of command line use is piping the output of one command to the input of another command. This means that large files can be analyzed in multiple scripts without having to write to disk repeatedly.
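| pipes the standard output of one command into the standard input of the next.
2> redirects standard error (stderr), the stream where commands write error messages. It is also conventionally used to provide progress information to the user.
sed: stream editor. It parses and transforms text using regular expressions. Very powerful, but most easily used to reformat files based on patterns. Examples: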
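seq 10 | sed 's/1/one/g'      # replace every "1" with "one" (10 becomes one0)
seq 10 | sed 's/^1$/one/g'    # replace only lines that are exactly "1"
seq 1 10 41 | sed -n '3,5p'   # print only lines 3 to 5 (21, 31, 41)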
grep: search using regular expressions (regex). This command searches for patterns and prints the lines that match your pattern. Examples:
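seq 10 | grep 1              # lines containing "1": 1 and 10
seq 10 | grep -v 1           # -v inverts the match: 2 through 9
seq 10 | grep 1 | grep 0     # lines containing both "1" and "0": 10
seq 10 | grep "1\|2"         # lines containing "1" or "2": 1, 2 and 10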
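The sketch below combines piping, stderr redirection, grep and sed in one pipeline. The function name generate_numbers and the file names are hypothetical, chosen only for illustration. The numbers travel through the pipe on standard output, while the progress message goes to standard error and ends up in a separate log.

```shell
# Hypothetical names; output files go to a temporary directory.
workdir=$(mktemp -d)

# Writes data to stdout and a progress message to stderr.
generate_numbers() {
    seq 1 20
    echo "progress: generated 20 numbers" >&2
}

# stderr goes to progress.log; stdout flows through grep and sed into numbers.txt
generate_numbers 2> "$workdir/progress.log" \
    | grep -v 1 \
    | sed 's/^/n=/' > "$workdir/numbers.txt"

cat "$workdir/numbers.txt"
```

Because the progress message travels on stderr, it never passes through grep or sed, so the data file contains only the filtered numbers.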
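Exercise 1 – build a pipeline that:
generates the even numbers from 2 to 100 (hint: man seq)
removes every number containing a 0
appends “!” to every number ending in 2
keeps only the numbers containing “!” or “3”
writes the result to a file
Solution: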
seq 2 2 100 | grep -v 0 | sed 's/2$/2!/g' | grep '!\|3' > exercise_3.txt
Often you will run commands that take hours or days to finish. If you run them normally, your connection needs to be maintained for the whole time, which can be impossible. Using screen/tmux/byobu allows you to keep a session open while you’re logged out and then reconnect to it later without loss of data.
byobu is a layer of veneer on top of screen/tmux. screen and tmux are equally powerful, but can be unintuitive to use.
Cancel command = ctrl-c. This will cancel a script that is currently running. Example:
> seq 1000000
ctrl-c to cancel
Byobu can create multiple levels.
We also provide the underlying command which performs the same action in tmux, in case you experience difficulties with your terminal and function keys.
Configuration menu (byobu-config). (CTRL+b is the typical prefix key for tmux; CTRL+a is typical for screen.)
Horizontal split (tmux split-window -h)
Vertical split (tmux split-window -v)
Troubleshoot: Function keys broken: Byobu is tailored to Linux terminal emulators (esp. gnome-terminal). If you find that the function keys don’t behave as expected when you’re logged in to the server, you might have to configure your terminal parameters to pass the correct escape codes. This is covered in Topic 1: finalize tool config.
Troubleshoot: Strange characters pop up: The font in your terminal emulator needs to support Unicode characters. The font Ubuntu Mono is known to work well.
If you find the lower bar distracting, you may run the command byobu-quiet. This can be undone with byobu-quiet --undo.
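If the function keys give you trouble, the detach-and-reconnect workflow can also be driven with plain tmux commands. This is a minimal sketch, assuming tmux is installed; the session name long_job_demo is hypothetical. It starts a long-running command in a detached session, checks that the session is still alive with no terminal attached, and then cleans up.

```shell
session="long_job_demo"   # hypothetical session name

# Start a detached session running a long command; skip the demo if tmux is unavailable.
if command -v tmux >/dev/null 2>&1 \
    && tmux new-session -d -s "$session" 'seq 10000000 > /dev/null; sleep 60'; then
    # The job keeps running even though no terminal is attached to the session.
    tmux has-session -t "$session" && echo "session $session is running"
    # To reconnect interactively you would run: tmux attach -t "$session"
    # Clean up the demo session.
    tmux kill-session -t "$session"
else
    echo "tmux not available; skipping demo"
fi
```

In real use you would leave the session running, log out, and later reconnect with tmux attach instead of killing it.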
Exercise 2:
> byobu
F2
> seq 10000000
F3
F3
> exit
F6
> byobu
> exit