Meet Your Local Terminal

Terminal

Lecture Notes

⬇︎ Terminal

Helpful

⬇︎ Linux Cheatsheet

⬇︎ Linux Pocket Guide

If you are new to the command line you might find these links useful:

In case you prefer a movie ...

This is only a small and limited selection. There is more, much more.

Get Started

First, a few words about this manual. The gray boxes are code chunks. Click on the page icon (⎘) at the top-right corner of a code block to copy its content to the clipboard. Everything after the octothorpe # (also called hashtag, pound or number sign) is a comment and will not be executed. The number of octothorpes does not matter. Use comments to explain and document your code. They can also be used to format and structure your code to make it more readable. Here is an example:

# --------------
#   Get Ready 
# --------------

mkdir -p ${HOME}/GDA21/terminal # Create a working directory
cd ${HOME}/GDA21/terminal       # Go to the newly created directory

# --------------
#   ASCII Art
# --------------

# (a) Print an ASCII-Ant to a file

echo " \(¨)/ " >  ant.txt # Write the head and the front legs
echo " -( )- " >> ant.txt # Write the thorax and the middle legs
echo " /(_)\ " >> ant.txt # Write the abdomen and the hind legs
cat ant.txt               # Print the ant 

# (b) Print an ASCII-Bee to a file

echo "     _  _      " >  bee.txt # Write first line to new file
echo "    | )/ )     " >> bee.txt # Write second line to existing file
echo " \\ |// /__    " >> bee.txt # Write line #3 to existing file
echo " (¨)(_)(()))=- " >> bee.txt # Write 4th line 
echo "    <\\        " >> bee.txt # Write last line
cat bee.txt                       # Show content of text file

# (!) I am not particularly happy with the head of the bee.  
# (?) Is there a better way to generate ASCII art?

This is a simple example, and we will talk more about comments and structure later. For now, remember to use comments to explain your code; you can also use them to improve readability.

We start slowly and explore different aspects of the Linux command terminal. I encourage more experienced students to hop from challenge to challenge and to read the theory above a challenge only if they struggle with it.

Command Line Syntax

The general syntax pattern of a command line is:

# prompt> command [option(s)] (File)

Command example: List (ls) content of a folder.

ls         # show content of the current folder
ls test/   # show content of folder test
ls -l -h   # list (option -l) content of current folder and use human-readable (option -h) file size 
ls -lh     # the same but shorter

Note

Most problems have multiple solutions. Finding the shortest possible way (e.g. -lh instead of -l -h) is very Linux. It is almost an obsession to find a faster and shorter solution to any Linux terminal command. In the beginning you are stuck with what you know, and that is fine too.

You might have noticed that the list command ls with the option -l provides more than just the directory content. It shows the rights, the owner, the group, the file or folder size, the date of the last change and of course the content (e.g., files and folders).

# - rw- r-- r-- jwalser users 161K Jan 17 12:22 p117_Help.txt
# - rw- r-- r-- jwalser users  92M Jan 17 16:51 p117_Metafile.txt
# d rwx r-x r-x jwalser users  16G Feb 19 15:43 data
# | --- --- ---  |       |     |    |            |
# |  |   |   |   |       |     |    |            └⊳ files / folders
# |  |   |   |   |       |     |    └⊳ date and time
# |  |   |   |   |       |     └⊳ size
# |  |   |   |   |       └⊳ group
# |  |   |   |   └⊳ owner
# |  |   |   └⊳ rights others (read/write/execute)
# |  |   └⊳ rights group (read/write/execute)
# |  └⊳ rights owner (read/write/execute)
# └⊳ d directory / - file

The rights are important and have to be set correctly. You will not be able to change a file if you only have read rights (r--). Have a look at the Guru99 tutorial for more details about file permissions. This is an advanced topic and we will talk about it later. For now, remember that the command ls can do more than just show the contents of folders.
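The rights column can be explored safely on a throw-away file. The following is a minimal sketch: the command chmod is not introduced in this section and is used here only as an illustration, and the folder created by mktemp as well as the file name note.txt are made up for the example.

```shell
# Sketch: inspect and change the rights of a throw-away file
demo="$(mktemp -d)"                  # temporary folder (its name is generated)
echo "read me" > "${demo}/note.txt"  # create a small file

chmod 644 "${demo}/note.txt"         # owner rw-, group r--, others r--
ls -l "${demo}/note.txt"             # first column starts with -rw-r--r--

chmod a-w "${demo}/note.txt"         # take the write right away from everybody
perms="$(ls -l "${demo}/note.txt" | cut -c1-10)"
echo "${perms}"                      # -r--r--r--
```

With only read rights left, an attempt to redirect output into note.txt is normally refused until the write right is given back (for example with chmod u+w).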

Basic Commands

Here is a list of some basic commands.

pwd.................: absolute pathname of the current working directory
man <command>.......: manual page for command (exit with q)
file <file>.........: determine file type
cd <where>..........: change directory/folder
cd .. ..............: go up one directory
cd .................: go home
mkdir <dir>.........: create directory
rmdir <dir>.........: remove directory (if empty)
ls <dir>............: list content of directory
ls -alh <dir>.......: more detailed list (including hidden files)
echo "message"......: prints content or message
cat <file>..........: print and concatenate files
head -n 5 <file>....: show first n lines
tail -n 5 <file>....: show last n lines
more <file>.........: read file (exit with q)
less <file>.........: similar to more but newer
> & >>..............: re-direct output (e.g. pwd > file.txt)
cp <ori> <copy>.....: copy file
mv <old> <new>......: move and/or rename file
rm <file>...........: remove file - careful!!!
wc <file>...........: word, line, character, and byte count
grep "query" <file>.: search file(s) for lines matching query
find................: find files based on characteristics (e.g. name)
sed.................: transform content of text files
clear...............: clear terminal 
history.............: show terminal history
date................: display date
cal.................: display calendar

▻ Have a look at the LinuxCheatSheet for more commands and ideas.

Help

No worries if you forgot the options of a command, help is at your fingertips! A command line manual is built in. The command man <command> will help you find the option you are looking for. It can also be used to explore a command. There are alternatives, and a web search works too, but you might get information that does not fit your system or version.

## Access the manual (Syntax: man <command>)
man cat # example

## Available commands (and aliases, functions, builtins and keywords)
compgen -c | less # scroll with [↑] and [↓] and exit with [q]

## Search for commands
apropos list | grep "directory"

## Help (Syntax: help <command>)
help cd # works only for shell builtins such as cd; use man for programs such as cat

## Help
cat --help # might not work for all commands

## Version
cat --version # might not work for all commands

## Where is binary file for the command located?
which cat
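Whether help or man is the right place to look depends on whether a command is built into the shell or is a separate program. A small sketch of how to tell them apart with command -v (the path printed for cat differs between systems):

```shell
command -v cd     # prints "cd": a shell builtin, so try: help cd
command -v cat    # prints a path such as /bin/cat: a program, so try: man cat

kind_cd="$(command -v cd)"     # keep the answers in variables
path_cat="$(command -v cat)"
echo "cd  -> ${kind_cd}"
echo "cat -> ${path_cat}"
```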

System Variables

Built-in shell variables are written in uppercase characters by convention. These are internal and reserved variables. Use them but do not overwrite them!

echo ${BASH}          # Bash binary
echo ${BASH_VERSION}  # Bash version
echo ${SHELL}         # Gives present shell
echo ${USER}          # Displays username
echo ${HOME}          # Home directory of User
echo ${RANDOM}        # To get a random number
echo ${PWD}           # Current directory

User Defined Variables

You can define your own (local) variables if you like. Remember, they are local and not permanent.

## Define Variables
MyRegistry="NCC-1701"
MyName="USS_Enterprise"
## Define Resource Location 
RSSLocation="http://www.gdc-docs.ethz.ch/GeneticDiversityAnalysis/GDA21/"
## Download Text Files
curl -o NCC.txt "${RSSLocation}images/NCC.txt" # RSSLocation already ends with a slash
## Print variables and text file
clear; echo ""; echo "${MyName} (${MyRegistry})"; cat NCC.txt
# Remark: We can *chain* independent commands together on one line using a semi-colon.
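What "local" means can be made visible with a child shell. The sketch below assumes that a program started from your terminal only sees variables that were marked with export; the variable name MyShip is made up for the example.

```shell
unset MyShip                                   # start from a clean state
MyShip="USS_Enterprise"                        # plain variable of this shell
before="$(sh -c 'printf "%s" "${MyShip}"')"    # child shell sees nothing
export MyShip                                  # pass the variable on to child processes
after="$(sh -c 'printf "%s" "${MyShip}"')"     # child shell sees the value
echo "before export: [${before}]  after export: [${after}]"
```

Even an exported variable is gone once you close the terminal, which is the "not permanent" part.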

Home Sweet Home

When you open the terminal, you should be in your home directory.

## Where am I?
pwd # > current directory name
## Where is my HOME
echo ${HOME}
## Go HOME
cd ${HOME}
## Tilde is equal to HOME
cd ~ # > go HOME with tilde
## Shortcut HOME
cd

Directories / Folders

## Create a working directory/folder in your HOME
mkdir ${HOME}/GDA          # Main folder (directory) in your home directory
mkdir ${HOME}/GDA/Terminal # Subfolder (sub-directory) inside your main folder

# Alternative:
# mkdir ~/GDA
# mkdir ~/GDA/Terminal

# HOME < you are here
# └── GDA < created main folder
#     └── Terminal < created subfolder

## "Go to" the working directory

cd ${HOME}/GDA/Terminal            # Change directory

# Alternatives:
# cd ~/GDA/Terminal                # Tilde alternative
# cd GDA/Terminal                  # alternative if you are already in your HOME directory
# cd ${HOME}; cd GDA; cd Terminal  # step-by-step

# HOME
# └── GDA
#     └── Terminal < now you are here

The windows of your graphical desktop are called directories (folders) in the terminal. In fact, every window corresponds to a directory. We will learn to move from directory to directory along a path.

# Tree           Path
# A              A
# ├── B          A/B
# │   ├── C      A/B/C
# │   └── D      A/B/D
# └── E          A/E

Summary:

# cd        > HOME 
# cd <path> > Move to folder
# cd ..     > Up one folder
# cd ../..  > Up two folders

In the previous example, we used mkdir multiple times to create subfolders. Would it not be convenient to do this with less typing?

## Test Folders
mkdir -p TestFolder_A/TestFolder_B/TestFolder_C
# ➜ Option -p creates all intermediate folders

# WD < you are here
# └── TestFolder_A
#     └── TestFolder_B
#         └── TestFolder_C

# WD: working directory

## Go down and go up / step-by-step 
cd TestFolder_A/TestFolder_B/TestFolder_C

# WD
# └── TestFolder_A
#     └── TestFolder_B
#         └── TestFolder_C < you are here

cd .. # go one folder up

# WD
# └── TestFolder_A
#     └── TestFolder_B < you are here
#         └── TestFolder_C

cd ../.. # go two folders up

# WD < you are back here now
# └── TestFolder_A
#     └── TestFolder_B
#         └── TestFolder_C 

## Remove test folders
rmdir TestFolder_A

# ✘ does not work because there are sub-folders inside TestFolder_A
# ☛ rmdir deletes only empty folders
# ✔︎ we have to delete subfolder by subfolder

rmdir TestFolder_A/TestFolder_B/TestFolder_C

# WD < you are still here but subfolder C is gone
# └── TestFolder_A
#     └── TestFolder_B

rmdir TestFolder_A/TestFolder_B

# WD < you did not move but you deleted sub-folder B
# └── TestFolder_A

rmdir TestFolder_A/

# WD < you are still here and all sub-folders are gone

❖ Challenge #1: The command rmdir removes only empty folders. Can you find an alternative way to remove all subfolders at once? Tip: Use the remove command rm and have a look at the manual page.

Suggestion #1

We can use the remove rm command with the recursive option. Careful with the force option - there is no undo and gone is gone!


  mkdir -p TestFolder_X/TestFolder_XX/TestFolder_XXX
  rm -fr TestFolder_X   # removes TestFolder_X and everything inside it
  # ➜ Option -f, --force: ignore nonexistent files and arguments, never prompt
  # ➜ Option -r, -R, --recursive: remove directories and their contents recursively

❖ Challenge #2: Does it matter if you use capital letters or not? In other words, is test.txt == TEST.TXT? Find a way to test if your local terminal is case sensitive.

Suggestion #2

Some file systems are case insensitive. It is important to know whether TEST and test are the same on your computer.


  mkdir TEST
  mkdir test
  # mkdir: test: File exists ➜ file system is not case sensitive (no error ➜ it is case sensitive)
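The same test can be done with files instead of folders, without leaving anything behind. This is only a sketch: it probes the file system that holds the temporary folder created by mktemp, which may differ from the one holding your HOME, and the file names are made up.

```shell
probe="$(mktemp -d)"                                     # temporary folder for the test
touch "${probe}/case_test.txt" "${probe}/CASE_TEST.TXT"  # one file or two?
n_files=$(( $(ls "${probe}" | wc -l) ))                  # count what was created

if [ "${n_files}" -eq 2 ]; then
  echo "case sensitive: two separate files"
else
  echo "not case sensitive: both names point to one file"
fi
rm -r "${probe}"                                         # clean up
```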

❖ Challenge #3: Do the following:

Create two directories (RunA_210415 and RunB_210412) in your HOME.
Create a subdirectory (infoA and infoB) for each directory.
Switch between the two subdirectories.

Suggestion #3


  pathA="${HOME}/RunA_210415/infoA/"
  pathB="${HOME}/RunB_210412/infoB/"
  mkdir -p ${pathA} ${pathB}
  cd ${pathA}
  pwd
  cd ${pathB}
  pwd
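For switching back and forth, cd - jumps to the directory you were in before. The sketch below uses a temporary folder as a stand-in for HOME so that it can be tried without cluttering your home directory; the variable names base, start and here are made up.

```shell
base="$(mktemp -d)"                   # stand-in for ${HOME}
pathA="${base}/RunA_210415/infoA"
pathB="${base}/RunB_210412/infoB"
mkdir -p "${pathA}" "${pathB}"
start="${PWD}"                        # remember where we started

cd "${pathA}"
cd "${pathB}"
cd - > /dev/null                      # back to infoA (cd - prints the path, hidden here)
here="$(basename "${PWD}")"
echo "${here}"                        # infoA
cd "${start}"                         # return to the starting directory
```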

Data Streams

Every program we run on the command line has three data streams. We can alter these data streams in interesting and useful ways.

## Data streams
# STDIN  (0) - Standard input
# STDOUT (1) - Standard output (by default printed to the terminal)
# STDERR (2) - Standard error (by default printed to the terminal)

Redirect Output (overwrite)

We change the default of STDOUT / STDERR from terminal to file.

## Print message on terminal
echo "Hello Terminal #1" # works and prints result to terminal
icho "Hello Terminal #1" # typo, does not work and prints error to terminal

## Redirect outputs (STDOUT and STDERR) to files
echo "Hello Terminal" 1> test1.txt 2> errors.txt
more test1.txt  # you could also use [less] or [cat] instead of [more]
more errors.txt # empty because there was no error

## Redirect outputs (STDOUT and STDERR) to files
icho "Hello Terminal #1" 1> test1.txt 2> errors.txt
cat test1.txt  # empty because there was a typo in the command
cat errors.txt # error message

## Redirect outputs (STDOUT and STDERR) to one file
echo "Hello Terminal #2" > test2.txt 2>&1
cat test2.txt

## Redirect only output (STDOUT) in a file
echo "Hello Terminal #3" > test3.txt # you do not need 1> if you only print STDOUT
cat test3.txt
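A command that succeeds for one argument and fails for another shows both streams at once. A minimal sketch, assuming a temporary folder and made-up file names (present.txt exists, missing.txt does not):

```shell
demo="$(mktemp -d)"
touch "${demo}/present.txt"

# The listing goes to STDOUT, the complaint about missing.txt goes to STDERR.
# ls reports a failure status here; "|| true" keeps that from stopping a script.
ls "${demo}/present.txt" "${demo}/missing.txt" > "${demo}/out.txt" 2> "${demo}/err.txt" || true
out="$(cat "${demo}/out.txt")"
echo "${out}"               # path of present.txt
cat "${demo}/err.txt"       # error message (wording depends on the system)

# Both streams in one file: first send STDOUT to the file, then STDERR to the same place
ls "${demo}/present.txt" "${demo}/missing.txt" > "${demo}/both.txt" 2>&1 || true
n_both=$(( $(wc -l < "${demo}/both.txt") ))
echo "${n_both}"            # 2
```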

Careful, you overwrite a file if you redirect STDOUT or STDERR into an existing file with the same name.

## Write a message to a file (an existing text1.txt would be overwritten)
echo "Nothing in life is to be feared." > text1.txt
more text1.txt
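If overwriting by accident worries you, the shell option noclobber (set -C) makes > refuse to replace existing files. This goes slightly beyond the material above and is meant as an optional sketch; the folder and the file name note.txt are made up.

```shell
demo="$(mktemp -d)"
echo "first"  > "${demo}/note.txt"
echo "second" > "${demo}/note.txt"     # silently replaces "first"

set -C                                 # noclobber on: > refuses to overwrite existing files
( echo "third" > "${demo}/note.txt" ) 2> /dev/null || echo "overwrite refused"
set +C                                 # noclobber off again

content="$(cat "${demo}/note.txt")"
echo "${content}"                      # second
```

Appending with >> still works while noclobber is switched on.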

You can print to different files and combine (concatenate) the files.

## Print another message to a different file
echo "It is only to be understood." > text2.txt
## Merge content of files
cat text1.txt text2.txt > text12.txt
cat text12.txt

Redirect Output (append)

There is an alternative to merging multiple output files. We can use the double greater-than operator (>>) to append the output to an existing file.

## Add (>>) a third line to the combined file
echo "Marie Curie" >> text12.txt
cat text12.txt

Redirecting from a File

We can use the less than operator (<) to change input direction.

<command> file.txt   # do something with the file
<command> < file.txt # feed the file to the command

It looks similar but there is a subtle difference.

## Count the number of lines in a text file,
## but use two different approaches: 
wc -l text12.txt   >  count.txt # Version #1
wc -l < text12.txt >> count.txt # Version #2

## What was different?
cat count.txt
# 3 text12.txt
# 3 <file name is missing>

## Alternative Solution (more later)
cat text12.txt | wc -l

When we redirect STDIN we send the data "anonymously". The program does not know where the data is coming from. This is a handy trick to avoid unwanted ancillary information.

## This is ugly
wc -l text12.txt > count.txt
echo "We have $(cat count.txt) lines."
## This is better
wc -l < text12.txt > count.txt
echo "We have $(cat count.txt) lines."
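The detour via count.txt is not strictly needed. As a sketch, the count can go straight into a variable; the arithmetic expansion $(( )) is used here because some versions of wc pad the number with spaces. The temporary folder and the variable name n_lines are made up.

```shell
demo="$(mktemp -d)"
printf '%s\n' "Nothing in life is to be feared." \
              "It is only to be understood." \
              "Marie Curie" > "${demo}/text12.txt"

n_lines=$(( $(wc -l < "${demo}/text12.txt") ))   # $(( )) also strips padding spaces
echo "We have ${n_lines} lines."                 # We have 3 lines.
```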

❖ Challenge #4: In the previous example, the file information was unwanted. Can you think of an example where it would be useful to have the filename with the line count?

Suggestion #4

Assume you have multiple files and you need to count the lines of each file.


  wc -l text1.txt text2.txt text12.txt
  # 1 text1.txt
  # 1 text2.txt
  # 3 text12.txt
  # 5 total

Piping

Sending data from one program (STDOUT) to another one (STDIN) is called piping. We use the vertical bar | to feed the output from one command to the next.

## Create a multi-line message 
echo -e "Think Like a Proton\nStay Positive" > proton.txt
# ➜ \n stands for newline and divides the string into two lines
cat proton.txt

## Print the first / last line (add > newfile.txt to copy it to a new file)
cat proton.txt | head -n 1
cat proton.txt | tail -n 1

## Count the number of lines (alternative)
cat text12.txt | wc -l
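Pipes can be chained, with each command working on the output of the one before. A small sketch that picks out one specific line and counts matching lines; the third line of the message is made up for the example.

```shell
demo="$(mktemp -d)"
printf '%s\n' "Think Like a Proton" "Stay Positive" "Never Lose an Electron" > "${demo}/proton.txt"

# Keep the first two lines, then keep the last of those: this is line 2
second="$(cat "${demo}/proton.txt" | head -n 2 | tail -n 1)"
echo "${second}"                                            # Stay Positive

# Keep the lines containing "on", then count them
n_on=$(( $(grep "on" "${demo}/proton.txt" | wc -l) ))
echo "${n_on}"                                              # 2
```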

Log-Files

You can use redirection to create log files of e.g. your terminal sessions. Some applications have a verbose (-v) option that you should not ignore during your testing. You can redirect the output to a file and search for errors or warnings.

## Create an empty file and fill it
rm -f LOG.txt; touch LOG.txt # -f: no error message if LOG.txt does not exist yet
echo "Test Log File"                    >> LOG.txt
echo "User: ${USER}"                    >> LOG.txt
echo "-------------------------------"  >> LOG.txt
echo "My working directory:${PWD}"      >> LOG.txt
echo "My tree version:"                 >> LOG.txt             
tree --version | cut -c1-12             >> LOG.txt
echo "-------------------------------"  >> LOG.txt
tree                                    >> LOG.txt
echo "-------------------------------"  >> LOG.txt
env | grep "TERM_P"                     >> LOG.txt
echo "-------------------------------"  >> LOG.txt
date +"%d/%m/%y"                        >> LOG.txt
echo "-------------------------------"  >> LOG.txt
clear; cat LOG.txt
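Searching a log for problems could look like the sketch below. The "run" is simulated with ls on a file that does not exist (missing_input.txt is a made-up name), and LC_ALL=C is set so that the error message is in English regardless of your language settings.

```shell
demo="$(mktemp -d)"
{
  echo "Run started"
  LC_ALL=C ls "${demo}/missing_input.txt"   # hypothetical input file, expected to fail
  echo "Run finished"
} > "${demo}/run.log" 2>&1 || true          # STDOUT and STDERR of all three commands

grep -i "no such file" "${demo}/run.log"    # show only the problem lines
n_problems="$(grep -i -c "no such file" "${demo}/run.log")"
echo "Problems found: ${n_problems}"        # Problems found: 1
```

Grouping the commands in curly brackets means the redirection has to be written only once instead of after every line.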

❖ Challenge #5: Let us create a virtual dice and save the results of three throws in a text file.

For this purpose we use the internal Bash variable $RANDOM. It returns a (pseudo)random integer.

## Default use
echo ${RANDOM}

# > The default range is too large (0 - 32767)

## Restrict the range
echo $(( RANDOM % 7))

# > Better but we need a range from 1-6

## Range between 1-6
echo $((1 + RANDOM % 6))
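If you do not trust the formula, you can let the terminal check it. The sketch below rolls the dice 600 times and counts how many throws fall between 1 and 6; it assumes Bash (RANDOM is a Bash feature), and the variable names are made up.

```shell
rolls=0
in_range=0
while [ "${rolls}" -lt 600 ]; do
  throw=$((1 + RANDOM % 6))          # RANDOM % 6 gives 0-5, adding 1 shifts it to 1-6
  if [ "${throw}" -ge 1 ] && [ "${throw}" -le 6 ]; then
    in_range=$((in_range + 1))       # count throws inside the expected range
  fi
  rolls=$((rolls + 1))
done
echo "${in_range} of ${rolls} throws are between 1 and 6"
```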

So far so good. Now, we need to roll our dice three times and save the results in a file.

Suggestion #5

5.1 Step-by-Step


   echo $((1 + RANDOM % 6)) > random_number_1.tmp
   echo $((1 + RANDOM % 6)) > random_number_2.tmp
   echo $((1 + RANDOM % 6)) > random_number_3.tmp
   cat random_number_[123].tmp > random_numbers_S1.txt
   rm random_number_[123].tmp
   cat random_numbers_S1.txt

5.2 No temporary files


   echo $((1 + RANDOM % 6)) >  random_numbers_S2.txt
   echo $((1 + RANDOM % 6)) >> random_numbers_S2.txt
   echo $((1 + RANDOM % 6)) >> random_numbers_S2.txt
   cat random_numbers_S2.txt

5.3 Using a FOR Loop (for more advanced users)


   for i in 1 2 3
   do
     echo "Random Number ${i}: $((1 + RANDOM % 6))"
   done > random_numbers_S3.txt
   cat random_numbers_S3.txt

Copy, Rename and Remove

## Copy a file - original is kept
cp text12.txt Marie_Curie.txt
ls -l
cat text12.txt Marie_Curie.txt

## Rename (move) file - original is lost
mv LOG.txt logfile.txt
ls -l

## Remove file(s)
rm text12.txt Marie_Curie.txt
ls -l

❖ Challenge #6.1: What is the difference between the two commands?

cp file.txt newfile.txt
cat file.txt > newfile.txt

Suggestion #6.1

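- The first line creates a copy of the file. It is possible to copy any file, and cp can also preserve timestamps and access rights (option -p) or copy whole folders (option -r).


  cp file.pdf newfile.pdf # ✔︎

- The second command reads the content of the file and the shell writes it into a new file. The content ends up identical, even for binary files such as PDFs, because cat passes the bytes on unchanged. The difference is that the new file is created by the shell with default access rights and a new timestamp, and that this approach cannot copy folders.


  cat file.pdf > newfile.pdf # works, but access rights and timestamps are not carried over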

❖ Challenge #6.2: When would you use mv instead of cp?

mv file.txt newfile.txt

Suggestion #6.2

Use the move command to rename and/or move a file. Copying a file is safer because you keep the original, but large files might take a while to copy and the copy uses additional disk space.
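The difference is easy to see on a throw-away file. A minimal sketch, with a temporary folder and made-up file names:

```shell
demo="$(mktemp -d)"
echo "precious data" > "${demo}/original.txt"

cp "${demo}/original.txt" "${demo}/copy.txt"      # original.txt and copy.txt both exist
mv "${demo}/original.txt" "${demo}/renamed.txt"   # original.txt is gone, renamed.txt took its place
ls "${demo}"                                      # lists copy.txt and renamed.txt
```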

Wildcards

In the previous chapter you created various text files. You can list all files in your working directory or select only specific files. Wildcards can be very handy for this task.

## List all text files
ls *.txt          # list all files with ending .txt
ls text?.txt      # list all files starting with text, followed by exactly one character, and ending with .txt
ls text[123].txt  # list all files starting with text, followed by 1, 2 or 3, and ending with .txt
# ➜ * any number of characters (including none)
# ➜ ? exactly one character
# ➜ [123] a group - meaning 1, 2, or 3

## Remove multiple files
rm text1.txt text2.txt text3.txt  # explicit list of file names
rm text[123].txt                  # alternative: the same files selected with a wildcard
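Wildcards are expanded by the shell before the command runs, so echo is a harmless way to see what a pattern will match. A sketch on empty dummy files in a temporary folder (all names are made up for the example):

```shell
start="${PWD}"                     # remember where we started
demo="$(mktemp -d)"
cd "${demo}"
touch text1.txt text2.txt text3.txt text12.txt notes.md

one_char="$(echo text?.txt)"       # exactly one character after "text"
group="$(echo text[12].txt)"       # only 1 or 2 after "text"
echo "${one_char}"                 # text1.txt text2.txt text3.txt
echo "${group}"                    # text1.txt text2.txt
cd "${start}"                      # go back
```

Note that text12.txt is matched by neither pattern because two characters follow "text".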

❖ Challenge #7.1: Can you find a command line to delete the index (I1 or I2) files but keep the forward (R1) and reverse (R2) reads?

Sample_GX0I1_R1.fq.gz
Sample_GX0I1_R2.fq.gz
Sample_GX0I1_I1.fq.gz
Sample_GX0I1_I2.fq.gz
Sample_GX0I2_R1.fq.gz
Sample_GX0I2_R2.fq.gz
Sample_GX0I2_I1.fq.gz
Sample_GX0I2_I2.fq.gz

Suggestion #7.1

There is usually more than one possible solution. Some might be better (e.g. faster, more secure) than others, but it is paramount that you understand what you do.


  ## Suggestion 7.1a
  rm -i Sample_GX0I1_I1.fq.gz Sample_GX0I1_I2.fq.gz Sample_GX0I2_I1.fq.gz Sample_GX0I2_I2.fq.gz
  # ➜ Safe and it works but imagine you have a few hundred files.

  ## Suggestion 7.1b
  rm -i Sample_GX0I?_I?.fq.gz
  # ➜ Also safe and would work just fine as long as all samples follow the same name structure.

  ## Suggestion 7.1c
  rm -i *_I[12].*
  # ➜ Short and precise but can be dangerous. 

  ## Tip: Test your wildcards with ls before you delete anything.
  ls -ah *_I[12].*
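A dry run on empty dummy files shows what a pattern matches before any real data is touched. The sketch below recreates the eight file names of the challenge in a temporary folder; the loop variables sample and kind are made up.

```shell
start="${PWD}"                                 # remember where we started
demo="$(mktemp -d)"
cd "${demo}"
for sample in GX0I1 GX0I2; do
  for kind in R1 R2 I1 I2; do
    touch "Sample_${sample}_${kind}.fq.gz"     # empty dummy file
  done
done

ls *_I[12].fq.gz          # dry run: lists the four index files only
rm *_I[12].fq.gz          # same pattern, now for real
remaining="$(echo *)"     # what is left?
echo "${remaining}"
cd "${start}"             # go back
```

The sample names themselves contain I1 and I2, which is why the pattern includes the underscore in front of the I.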

❖ Challenge #7.2: What is the problem with the following command? Can you correct it?

# cat sequence*.fa >> sequence_all.fa # ✖︎✖︎✖︎ Do not use!

Suggestion #7.2

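If sequence_all.fa already exists, the wildcard also matches the output file, so cat reads the very file it is writing to. Depending on your version of cat, the command either stops with an error or keeps appending in a "never-ending" loop until your hard drive is full. Better/correct solutions would include: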


  cat sequence*.fa >> all_sequence.fa
  cat sequence*.fa >> different_path/sequence_all.fa

Terminal History

You might be familiar with the history of your internet browser. The terminal has a history too. This is great because you can search past commands and you do not have to retype them. Use the arrow up and arrow down keys to travel within your history. You can also access it:

history

With no options (default), you will see a list of previous commands with line numbers. Lines prefixed with a ‘*’ have been modified. An argument of n lists only the last n lines.

-c Clear the history list.
-d offset Delete the history entry at position offset.
-d start-end Delete the history entries between positions start and end.
-a Append the new history lines to the history file.
-n Append the history lines not already read from the history file to the current history list.
-r Read the history file and append its contents to the history list.
-w Write out the current history list to the history file.

Syntax examples:

history [n]
history -c
history -d offset
history -d start-end
history [-anrw] [filename]

❖ Challenge #8.1: Why would you need a history of your commands?

Suggestion #8.1

There are many good reasons. Let me list a few that are obvious to me. I am happy to learn new ones if you like to share.
- You might have noticed that you can navigate the history with the arrow keys. This saves a lot of time when you need the same or a similar command again.
- You can also use the history for troubleshooting or to keep a log file of your session.

❖ Challenge #8.2: Preserve your history!

Create a text file with a title and your username.
Add the last 20 command lines you used to the file.
Add a date to the bottom of the file.

Suggestion #8.2


  echo "=== Save My History ===" >  MyHistory.txt
  echo "${USER}"                 >> MyHistory.txt
  echo "-----------------------" >> MyHistory.txt
  history | tail -n 20           >> MyHistory.txt
  echo "-----------------------" >> MyHistory.txt
  date "+%A, %d.%B %Y"           >> MyHistory.txt
  clear; cat MyHistory.txt

Some Sequence Examples

# Download a sequence fasta file
pwd # make sure this is the right place for the download
curl -O http://gdc-docs.ethz.ch/GeneticDiversityAnalysis/GDA20/data/RDP_16S_Archaea_Subset.fasta
ls -lh RDP_16S_Archaea_Subset.fasta

# Count the number of lines in the file 
wc -l RDP_16S_Archaea_Subset.fasta

# Have a look at the fasta file
less RDP_16S_Archaea_Subset.fasta

# Have a look at the first 15 lines
head -n 15 RDP_16S_Archaea_Subset.fasta

# Count the number of sequences
grep -c ">" RDP_16S_Archaea_Subset.fasta

# Find a specific motif and highlight it
grep --color "cggattagatacccg" RDP_16S_Archaea_Subset.fasta

# Count how many lines contain the motif
grep -c "cggattagatacccg" RDP_16S_Archaea_Subset.fasta

# Find alternatives step-by-step
grep -c "cgggaggc" RDP_16S_Archaea_Subset.fasta
grep -c "cgggtggc" RDP_16S_Archaea_Subset.fasta
grep -c "cgggcggc" RDP_16S_Archaea_Subset.fasta
grep -c "cggggggc" RDP_16S_Archaea_Subset.fasta

# Find all alternatives in one step
grep -c "cggg[atcg]ggc" RDP_16S_Archaea_Subset.fasta
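The same commands can be tried offline on a tiny made-up FASTA file where the expected counts are easy to verify by eye. This is only a sketch: the file toy.fasta and its three sequences are invented for the example.

```shell
demo="$(mktemp -d)"
printf '%s\n' ">seq1" "acgggaggcta" \
              ">seq2" "ttcgggtggcaa" \
              ">seq3" "acgtacgtacgt" > "${demo}/toy.fasta"

n_seq="$(grep -c "^>" "${demo}/toy.fasta")"             # header lines start with >
n_hit="$(grep -c "cggg[atcg]ggc" "${demo}/toy.fasta")"  # [atcg] accepts any base at this position
echo "Sequences: ${n_seq}"                              # Sequences: 3
echo "Lines with motif: ${n_hit}"                       # Lines with motif: 2
```

Keep in mind that grep -c counts matching lines, not matches: a line containing the motif twice is counted once, and a motif that is split across two lines of a wrapped sequence is not found.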