Scripts, functions, and variables

Shell scripts

We now know a lot of UNIX commands! Wouldn’t it be great if we could save certain commands so that we could run them later without typing them out again? As it turns out, this is extremely easy to do. A file containing a list of commands is called a “shell script”. Shell scripts can be run whenever we want, and are a great way to automate our work.

$ cd ~/Desktop/data-shell/molecules
$ nano
	#!/bin/bash         # this is called sha-bang; can be omitted for generic (bash/csh/tcsh) commands
	echo Looking into file octane.pdb
	head -15 octane.pdb | tail -5       # what does it do?
$ bash   # the script ran!

Alternatively, you can make the script executable by changing its file permissions, and then run it directly:

$ chmod u+x
$ ./

Let’s pass an arbitrary file to it:

$ nano
	echo Looking into file $1       # $1 means the first argument to the script
	head -15 $1 | tail -5
$ ./process cubane.pdb
$ ./process propane.pdb
  • head -15 "$1" | tail -5 # placing $1 in double quotes lets us pass filenames with spaces
  • head $2 $1 | tail $3 # what will this do?
  • $# holds the number of command-line arguments
  • $@ means all command-line arguments to the script (words in a string)
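
To see $#, $@ and the effect of double quotes side by side, here is a minimal sketch. The script name show_args.sh and the file name "my file.pdb" are made up for this illustration, and the script is written into a temporary directory so that nothing in the molecules directory is touched.

```shell
# Write a small script into a throwaway directory (a here-document is used
# instead of nano only so that the example can be pasted in one go).
workdir=$(mktemp -d)
cat > "$workdir/show_args.sh" << 'EOF'
#!/bin/bash
echo "number of arguments: $#"
for arg in "$@"; do          # "$@" keeps every argument intact, spaces included
  echo "argument: $arg"
done
EOF

# Two arguments are passed; the second one contains a space.
bash "$workdir/show_args.sh" cubane.pdb "my file.pdb"
```

The script should report two arguments and print "my file.pdb" on a single line. With an unquoted $@ in the loop, that name would be split into two words.
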
Question `file permissions` Let’s talk more about file permissions.
Question 34

In the molecules directory (download link mentioned here), create a shell script called containing the following:

head -n $2 $1
tail -n $3 $1

While you are in that directory, you type the following command (with a space between the two 1s):

./  '*.pdb'  1  1

What output would you expect to see?

  1. All of the lines between the first and the last lines of each file ending in .pdb in the current directory
  2. The first and the last line of each file ending in .pdb in the current directory
  3. The first and the last line of each file in the current directory
  4. An error because of the quotes around *.pdb

You can watch a video for this topic after the workshop.


If statements

Let’s write and run the following script:

$ nano
    for f in "$@"; do
      if [ -e "$f" ]; then      # make sure to have spaces around each bracket!
        echo $f exists
      else
        echo $f does not exist
      fi
    done
$ chmod u+x
$ ./ a b c
  • Full syntax is:
if [ condition1 ]; then
  command 1
  command 2
  command 3
elif [ condition2 ]; then
  command 4
  command 5
else
  default command
fi

Some examples of conditions (make sure to have spaces around each bracket!):

  • [ $myvar == 'text' ] checks if variable is equal to 'text'
  • [ $myvar == number ] checks if variable is equal to number (compared as a string; use -eq for a numeric comparison)
  • [ -e fileOrDirName ] checks if fileOrDirName exists
  • [ -d name ] checks if name is a directory
  • [ -f name ] checks if name is a regular file
  • [ -s name ] checks if file name exists and has size greater than 0
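
The file tests above can be combined in one if/elif/else chain. The following is a sketch only: the directory and file names (results, notes.txt, empty.txt, missing) are invented for the illustration and are created in a temporary directory.

```shell
# Set up a few test objects in a throwaway directory.
workdir=$(mktemp -d)
mkdir "$workdir/results"                    # a directory
echo "some text" > "$workdir/notes.txt"     # a non-empty regular file
touch "$workdir/empty.txt"                  # an empty regular file
                                            # "missing" is deliberately not created

for name in "$workdir/results" "$workdir/notes.txt" "$workdir/empty.txt" "$workdir/missing"; do
  if [ -d "$name" ]; then                   # directories are checked first
    echo "$name: directory"
  elif [ -s "$name" ]; then                 # exists and has size greater than 0
    echo "$name: non-empty file"
  elif [ -f "$name" ]; then                 # regular file, so it must be empty
    echo "$name: empty file"
  else
    echo "$name: does not exist"
  fi
done
```

The order of the tests matters here: -s is also true for a directory, which is why -d is checked before it.
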
Question 23 Write a script that complains when it does not receive arguments.


Variables

We already saw variables that were specific to scripts ($1, $@, …) and to loops ($file). Variables can be used outside of scripts:

$ myvar=3        # no spaces permitted around the equality sign!
$ echo myvar     # will print the string 'myvar'
$ echo $myvar    # will print the value of myvar

Sometimes you can see the notation:

$ export myvar=3

Using ‘export’ makes the variable available to all child processes started from this shell. Try defining the variable newvar first without and then with ‘export’, and each time run the following script:

$ nano
    echo $newvar
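
The same effect can be observed without a script file by starting a child shell with bash -c. This is a small sketch; the square brackets in the output are there only to make an empty value visible.

```shell
unset newvar                                 # start from a clean state
newvar=3                                     # plain shell variable, not exported
bash -c 'echo "child sees: [$newvar]"'       # the child shell gets no value
export newvar                                # mark the variable for export
bash -c 'echo "child sees: [$newvar]"'       # now the child shell sees 3
```

The single quotes matter: they stop the current shell from expanding $newvar itself, so the value really is looked up inside the child process.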

You can assign a command’s output to a variable to use in another command (this is called command substitution); we’ll see this later when we play with the ‘find’ command.
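
As a quick preview of command substitution, here is a minimal sketch using the $(command) form. The file names are placeholders created in a temporary directory, not files from the lesson data.

```shell
workdir=$(mktemp -d)          # the output of mktemp becomes the value of workdir
touch "$workdir/cubane.pdb" "$workdir/octane.pdb" "$workdir/notes.txt"

first=$(ls "$workdir" | head -n 1)     # first entry in the listing
count=$(ls "$workdir" | wc -l)         # number of entries
echo "first entry: $first"
echo "number of entries:" $count       # unquoted so any padding from wc is dropped
```

Each $(...) is replaced by whatever the command inside it prints, so the result can be stored in a variable or used directly as an argument to another command.
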

$ printenv    # print all environment (exported) variables
$ env         # same
$ unset myvar   # unset a variable
Question `using a variable inside a string`
echo $varshine
echo ${var}shine
echo "$var"shine
Question `variable manipulation`
echo $myvar
echo ${myvar:offset}
echo ${myvar:offset:length}
echo ${myvar:2:3}    # 3 characters starting at offset 2 (offsets count from 0)
echo ${myvar/l/L}    # replace the first match of a pattern
echo ${myvar//l/L}   # replace all matches of a pattern

Environment variables are those that affect the behaviour of the shell and user interface:

$ echo $HOME
$ echo $PATH
$ echo $PWD
$ echo $PS1

It is best to define custom environment variables inside your ~/.bashrc file, which is read every time you start a new interactive shell.

Question 22 Play with variables and their values. Change the prompt, e.g. PS1="\u@\h \w> ".

You can watch a video for this topic after the workshop.


Functions

Functions are similar to scripts, but there are some differences. A bash script is an executable file sitting at a given path. A bash function is defined in your environment. Therefore, when running a script, you need to prepend its path to its name (unless its directory is listed in $PATH), whereas a function, once defined in your environment, can be called by its name without a need for a path. Both scripts and functions can take command-line arguments.

A convenient place to put all your function definitions is the ~/.bashrc file, which is read every time you start a new shell (local or remote).

Like in any programming language, in bash a function is a block of code that you can access by its name. The syntax is:

functionName() {
  command 1
  command 2
}

Inside a function you can access its arguments with the variables $1 $2 … $# $@, exactly the same as in scripts. Functions are very convenient because you can define them inside your ~/.bashrc file. Alternatively, you can place them into a file and then source that file whenever needed:

$ source
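
Here is a sketch of that workflow. The file name myfunctions.sh and the function shout() are hypothetical, chosen only for this illustration. The file is loaded with ‘.’, which in bash is equivalent to ‘source’.

```shell
workdir=$(mktemp -d)

# Put a function definition into a file (hypothetical name: myfunctions.sh).
cat > "$workdir/myfunctions.sh" << 'EOF'
shout() {
  echo "$@" | tr '[:lower:]' '[:upper:]'    # print all arguments in upper case
}
EOF

. "$workdir/myfunctions.sh"     # in bash, '. file' and 'source file' do the same thing
shout hello world               # called by name, no path needed
```

After the file has been sourced, shout is part of the current shell's environment, so it is called like any other command. This should print HELLO WORLD.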

Here is our first function:

greetings() {
  echo hello
}

Let’s write a function ‘combine()’ that takes all the files we pass to it, copies them into a randomly-named directory and prints that directory to the screen:

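combine() {
  if [ $# -eq 0 ]; then
    echo "No arguments specified. Usage: combine file1 [file2 ...]"
    return 1        # return a non-zero error code
  fi
  dir=$RANDOM-$RANDOM   # assumed: one way to build a random name (the original definition of dir is missing)
  mkdir $dir
  cp "$@" $dir
  echo look in the directory $dir
}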
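
As a possible variant (not the lesson's own version), the random directory can be created with mktemp -d, which picks an unused name and creates the directory in one step. The function name combine_tmp and the template combined.XXXXXX are assumptions made for this sketch.

```shell
combine_tmp() {
  if [ $# -eq 0 ]; then
    echo "No arguments specified. Usage: combine_tmp file1 [file2 ...]"
    return 1                                     # return a non-zero error code
  fi
  # mktemp replaces the XXXXXX with random characters and creates the directory
  dir=$(mktemp -d ./combined.XXXXXX) || return 1
  cp "$@" "$dir"                                 # quotes keep names with spaces intact
  echo look in the directory $dir
}
```

It would be called in the same way, for example combine_tmp cubane.pdb octane.pdb from inside the molecules directory.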
Question `swap file names` Write a function to swap two file names. Add a check that both files exist, before renaming them.

Question `archive()`

Write a function archive() to replace directories with their gzipped archives.

$ ls -F
chapter1/  chapter2/  notes/
$ archive chapter* notes/
$ ls
chapter1.tar.gz  chapter2.tar.gz  notes.tar.gz
Question `countfiles()`

Write a function countfiles() to count files in all directories passed to it as arguments (need to loop through all arguments). At the beginning add the check:

    if [ $# -eq 0 ]; then
        echo "No arguments given. Usage: countfiles dir1 dir2 ..."
        return 1
    fi

You can watch a video for this topic after the workshop.

Scripts in other languages

As a side note, it is possible to incorporate scripts in other languages into your bash code, e.g. consider this:

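function test() {    # note: this name shadows the shell's built-in 'test' command
    randomFile=$RANDOM.py    # assumed name; the original definition is missing
    cat << EOF > $randomFile
#!/usr/bin/env python3
print("do something in Python")
EOF
    chmod u+x $randomFile
    ./$randomFile    # assumed run step, implied by the chmod above
    /bin/rm $randomFile
}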

Here EOF is an arbitrary delimiter string, and << tells bash to keep reading input until it reaches a line containing only that delimiter. For example, try the following:

cat << the_end
This text will be
printed in the terminal.
the_end
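
One detail worth knowing when embedding code from another language: whether the delimiter is quoted decides if bash expands variables inside the here-document. A minimal sketch, where the variable name is chosen only for this illustration:

```shell
name=world

# Unquoted delimiter: bash expands $name before cat sees the text.
cat << EOF
Hello, $name
EOF

# Quoted delimiter: the text is passed through literally, $ signs included.
cat << 'EOF'
Hello, $name
EOF
```

The first command prints "Hello, world" and the second prints "Hello, $name". Quoting the delimiter is the safer choice when the embedded code itself contains $ characters.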