When talking about scripting, we cannot avoid mentioning what is probably the most famous of the shells: the Bourne Again SHell (BASH). Explaining it thoroughly would require a whole book, so, as usual, this post explores only the features the reader is most likely to need. The post is not intended to be easily understood by newbies: it is structured as a cheat sheet, so the reader can use it as a quick reference when needed, with the drawback that there is not much room to elaborate on each topic.

Dealing with locales

The system locales are a set of variables that contain all the information required to process data in compliance with the formats of the region of the world the user wants to use. By default, they are often set to the region the user lives in, but they can be set to whatever region the user prefers. This set of variables can be listed as follows:

locale

on my system the output is:

LANG=it_CH.UTF-8
LC_CTYPE="it_CH.UTF-8"
LC_NUMERIC="it_CH.UTF-8"
LC_TIME="it_CH.UTF-8"
LC_COLLATE="it_CH.UTF-8"
LC_MONETARY="it_CH.UTF-8"
LC_MESSAGES="it_CH.UTF-8"
LC_PAPER="it_CH.UTF-8"
LC_NAME="it_CH.UTF-8"
LC_ADDRESS="it_CH.UTF-8"
LC_TELEPHONE="it_CH.UTF-8"
LC_MEASUREMENT="it_CH.UTF-8"
LC_IDENTIFICATION="it_CH.UTF-8"
LC_ALL=

since I'm in southern Switzerland, my LANG variable is set to “it_CH.UTF-8”:

  • it stands for Italian, the language we speak here (in Switzerland three different languages are spoken: German, French and Italian)
  • CH is the ISO 3166-1 alpha-2 code for Switzerland
  • UTF-8 is the character encoding

Other variables affect specific aspects of formatting: for example, LC_NUMERIC controls numeric values, such as decimal and thousands separators, while LC_CTYPE controls character classification and case conversion.
All of this can lead to unexpected behavior in scripts, so if you want to make them as portable as possible, the very first thing you should do is to set the LC_ALL variable (a special variable that overrides all the other variables in the list) to C (a very simple locale specifically designed to ease things).

export LC_ALL=C

You can then reset the locale variables to their original values when it is time to display values to the user.
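
If changing the locale for the whole script is too invasive, a per-command override is another option. The following is only a minimal sketch, assuming the sort and tr utilities are available; the sample data is made up for illustration:

```shell
#!/bin/bash
# Hypothetical sample data: mixed-case letters, one per line.
letters=$(printf '%s\n' b A a B)

# Prefixing a command with LC_ALL=C changes the locale for that command only.
sorted=$(printf '%s\n' "${letters}" | LC_ALL=C sort | tr '\n' ' ')

# In the C locale sorting is done by byte value, so uppercase comes first.
echo "${sorted}"
```

The rest of the script keeps running with the locale of the user, so nothing has to be restored afterwards.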

Globbing

BASH does not use REGular EXpressions to match filenames (they are only available through the =~ operator of [[ ]]): it implements globbing instead, that is, the shell expands filenames using the following rules:

?

Matches exactly one character. You can use it multiple times: for example, ??? matches a name that is exactly three characters long

*

Matches any sequence of characters, including an empty one

[]

Defines a set or a range of matching characters, for example:

  • [ab] matches character a or character b
  • [a-c] matches character a, character b or character c

for example:

ls -d /usr/share/doc/[a-c]*

You can combine more than one globbing pattern by using braces (strictly speaking this is brace expansion, which happens before globbing): for example

ls -d /usr/share/doc/{a*,d*}
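
The rules above can be tried safely in a scratch directory. This is just a sketch: the file names are invented for the example and mktemp is assumed to be available:

```shell
#!/bin/bash
# Work in a throwaway directory (file names are made up for illustration).
tmpdir=$(mktemp -d)
touch "${tmpdir}/apple" "${tmpdir}/avocado" "${tmpdir}/banana" "${tmpdir}/cherry" "${tmpdir}/fig"
cd "${tmpdir}" || exit 1

three_chars=$(echo ???)        # names of exactly three characters
starts_a_or_b=$(echo [ab]*)    # names starting with a or b
starts_a_to_c=$(echo [a-c]*)   # names starting with a, b or c
braces=$(echo {a*,f*})         # braces are expanded first, then each pattern is globbed

echo "${three_chars}"
echo "${starts_a_or_b}"
echo "${starts_a_to_c}"
echo "${braces}"

# Go back and clean up.
cd - >/dev/null && rm -rf "${tmpdir}"
```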

List and Sequences

Braces can also be used to generate lists or sequences at the command line or in a shell script: the syntax consists of putting between curly braces "{}" either

  • a comma separated list of items, such as {aa,bb,cc,dd}:

echo {aa,bb,cc,dd}
aa bb cc dd

  • a sequence specification (a starting and an ending item separated by two periods ".."), such as {0..12}:

echo {0..12}
0 1 2 3 4 5 6 7 8 9 10 11 12

other examples of sequences are

{3..-2}

3 2 1 0 -1 -2

{a..g}

a b c d e f g

{g..a}

g f e d c b a
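
Lists and sequences become more useful when combined with a prefix, a suffix or a loop. A small sketch follows; the backup_ names are invented for the example, and the increment syntax is assumed to require BASH 4 or later:

```shell
#!/bin/bash
# A prefix and a suffix around the braces are repeated for every generated item.
names=$(echo backup_{mon,tue,wed}.tar)

# Sequences are handy to drive a loop.
total=0
for i in {1..5}; do
  total=$((total + i))
done

# A third field sets the increment (assumed to need BASH 4 or later).
evens=$(echo {0..10..2})

echo "${names}"
echo "${total}"
echo "${evens}"
```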

Setting options

BASH behavior can be modified by specifying options.

For example, issuing

set -o pipefail

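makes a whole pipeline (a sequence of commands connected by pipes) fail if any of its commands fails: the exit status of the pipeline becomes that of the last command that failed, instead of that of the last command. A best practice is to set this option in the RUN statements of a Dockerfile.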
Another very useful option is nounset:

set -o nounset

this makes scripts fail when trying to expand an undefined variable, avoiding a lot of headaches due to conditionals that check a variable you think exists but is actually undefined.
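
The effect of both options can be observed by looking at exit statuses. This is a hedged sketch: UNDEFINED_VARIABLE is a hypothetical name assumed not to exist in your environment, and the nounset test runs in a sub-shell so that the script itself survives:

```shell
#!/bin/bash
# Without pipefail, the status of the pipeline is that of the last command only.
false | true
status_default=$?

# With pipefail, the failure of any command in the pipeline is reported.
set -o pipefail
status_pipefail=0
false | true || status_pipefail=$?
set +o pipefail

# With nounset, expanding an undefined variable aborts the (sub-)shell.
status_nounset=0
( set -o nounset; echo "${UNDEFINED_VARIABLE}" ) 2>/dev/null || status_nounset=$?

echo "default=${status_default} pipefail=${status_pipefail} nounset=${status_nounset}"
```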
You can list all the available options by issuing:

set -o

The following table highlights most of them:

Shell option | Set option | Description
-a | allexport | Each variable or function that is created or modified is given the export attribute and marked for export to the environment of subsequent commands.
-B | braceexpand | Use brace expansion.
(none) | emacs | Use an emacs-style line editing interface. This also affects the editing interface used for read -e.
-H | histexpand | Enable ‘!’ style history substitution. This option is on by default for interactive shells.
-C | noclobber | Prevent output redirection using ‘>’, ‘>&’, and ‘<>’ from overwriting existing files.
-n | noexec | Read commands but do not execute them. This may be used to check a script for syntax errors.
-f | noglob | Disable filename expansion (globbing).
-u | nounset | Treat unset variables and parameters other than the special parameters ‘@’ or ‘*’ as an error when performing parameter expansion. An error message will be written to the standard error, and a non-interactive shell will exit.
-P | physical | Do not resolve symbolic links when performing commands such as cd which change the current directory.
-v | verbose | Print shell input lines as they are read.
(none) | vi | Use a vi-style line editing interface. This also affects the editing interface used for read -e.
-x | xtrace | Print a trace of simple commands, for commands, case commands, select commands, and arithmetic for commands and their arguments or associated word lists after they are expanded and before they are executed.

The value of the PS4 variable is expanded and the resultant value is printed before the command and its expanded arguments.

For example:


export PS4='+(${BASH_SOURCE}:${LINENO}): ${FUNCNAME[0]:+${FUNCNAME[0]}(): }'

Processing variables

You can define a variable and its value as follows:

NUMBER=1

or, when dealing with strings,

MYVAR="itsvalue"

to print the value of the variable:

echo ${MYVAR}

you can unset the previous variable by issuing

unset MYVAR

Verify if a variable is set

you can test if a variable is actually set as follows:

if [[ -n ${VAR+x} ]]; then
echo "VAR is set and its value is ${VAR}"
fi

WARNING: use the double square brackets [[ here. With a single square bracket the expansion must be quoted ("${VAR+x}"), otherwise the test is true even when VAR is unset.
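
The same expansion also lets you tell an unset variable from an empty one. A minimal sketch, where describe_var is a hypothetical helper written only for this example:

```shell
#!/bin/bash
# Report whether the variable named VAR is unset, empty or has a value.
describe_var() {
  if [[ -z ${VAR+x} ]]; then
    echo "unset"
  elif [[ -z ${VAR} ]]; then
    echo "empty"
  else
    echo "set"
  fi
}

unset VAR
state_unset=$(describe_var)
VAR=""
state_empty=$(describe_var)
VAR="value"
state_set=$(describe_var)

echo "${state_unset} ${state_empty} ${state_set}"
```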

Numbers - add

To sum two variables containing numbers:

VAR_A=1
VAR_B=2
let VAR_C=VAR_A+VAR_B

Numbers - subtract

To subtract two variables containing numbers:

VAR_A=3
VAR_B=2
let VAR_C=VAR_A-VAR_B

Numbers - multiply

To multiply two variables containing numbers:

VAR_A=3
VAR_B=2
# quoted so that the shell does not treat * as a globbing character
let "VAR_C=VAR_A*VAR_B"

Numbers - divide

To divide two variables containing numbers (BASH only handles integers, so the result is truncated):

VAR_A=6
VAR_B=2
let VAR_C=VAR_A/VAR_B
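
As an alternative to let, the same operations can be written with arithmetic expansion, $(( )). A short sketch with made-up values:

```shell
#!/bin/bash
VAR_A=7
VAR_B=2

sum=$((VAR_A + VAR_B))
difference=$((VAR_A - VAR_B))
product=$((VAR_A * VAR_B))     # no quoting needed: * is not globbed inside $(( ))
quotient=$((VAR_A / VAR_B))    # integer division: the fractional part is dropped
remainder=$((VAR_A % VAR_B))

echo "${sum} ${difference} ${product} ${quotient} ${remainder}"
```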

Numbers - rounding

Here are some examples; they should be self-explanatory:

printf '%.*f\n' 0 6.66
7
printf '%.*f\n' 1 6.66
6.7
printf '%.*f\n' 2 6.66
6.66
printf '%.*f\n' 3 6.66
6.660
printf '%.*f\n' 3 6.666
6.666
printf '%.*f\n' 3 6.6666
6.667

Numbers – using bc

bc is a very handy utility you can use to perform accurate math - you can simply pipe the operation to it as follows:

VAR_A=3
VAR_B=2
echo "${VAR_A}/${VAR_B}"|bc -l

if you want to limit to 2 decimal digits:

echo "scale=2; ${VAR_A}/${VAR_B}"|bc -l

you can use bc for modulo operation:

echo "scale=0; 3%5;" |bc -l

String - getting the length

To get the length of a string:

echo ${#STRING}

String - testing if it is empty

[[ -z ${STRING} ]] && echo "empty" 

beware that -z tests the length of the string, so spaces do count: for example, if the string is " ", it is not empty and -z is false (its exit status is 1).

WARNING: use the double square brackets [[ here. With a single square bracket the expansion must be quoted ("${STRING}").

String - testing if it is not empty

[[ -n ${STRING} ]] && echo "has a value"

beware that -n tests the length of the string, so spaces do count: for example, if the string is " ", it is not empty and -n is true (its exit status is 0).

WARNING: use the double square brackets [[ here. With a single square bracket the expansion must be quoted ("${STRING}"), otherwise -n is true even for an empty string.
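
If a string made only of whitespace should count as empty, one possible approach is to strip the whitespace before testing. A sketch, with variable names chosen only for this example:

```shell
#!/bin/bash
STRING="   "

# The plain test sees three characters, so the string is not empty.
[[ -z ${STRING} ]] && plain="empty" || plain="not empty"

# Remove every whitespace character, then test what is left.
stripped=${STRING//[[:space:]]/}
[[ -z ${stripped} ]] && blank="blank" || blank="not blank"

echo "${plain} / ${blank}"
```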

String - removing spaces

echo ${var}|tr -d " "

String - counting words

set ${string}
echo $#

beware that set overwrites the positional parameters of the script! A safer approach is:

echo ${string} | wc -w

String - to uppercase

tr '[:lower:]' '[:upper:]'

String - to lowercase

tr '[:upper:]' '[:lower:]'
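
tr works as a filter, so the string has to be piped into it. The sketch below shows that, together with the built-in case conversion, which is assumed to require BASH 4 or later:

```shell
#!/bin/bash
STRING="Hello World"

# tr reads from standard input, so the value is piped into it.
upper_tr=$(echo "${STRING}" | tr '[:lower:]' '[:upper:]')
lower_tr=$(echo "${STRING}" | tr '[:upper:]' '[:lower:]')

# Built-in alternative (assumed to need BASH 4 or later): no external process.
upper_builtin=${STRING^^}
lower_builtin=${STRING,,}

echo "${upper_tr} ${lower_tr} ${upper_builtin} ${lower_builtin}"
```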

Substring - extracting

${variable:start:length}

Substring - last 5 characters

${variable: -5}

BEWARE!!! the space is mandatory: without it, :- is interpreted as the "use a default value" operator

Substring - search for

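each test is true if $STRING matches the pattern on the right: the first one checks whether it starts with "This", the second one whether it contains "is"

STRING="This is a string"
[[ $STRING == This* ]] && echo "matches"
[[ $STRING == *is* ]] && echo "matches"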

String - replace text

echo ${variable/pattern/replacement}
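
A few of the substring expansions side by side, as a quick sketch on a made-up value. Note that a single slash replaces only the first match, while a double slash replaces all of them:

```shell
#!/bin/bash
variable="one two one two"

first_three=${variable:0:3}        # start at offset 0, take 3 characters
last_three=${variable: -3}         # the space before -3 is mandatory
replace_first=${variable/one/1}    # only the first match is replaced
replace_all=${variable//one/1}     # every match is replaced

echo "${first_three} | ${last_three} | ${replace_first} | ${replace_all}"
```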

Arrays - Declaring

declare -a numbers=("one" "two" "three")

Arrays - Length

${#numbers[@]}

Arrays - Print whole array (values)

echo ${numbers[@]}

Arrays - Print whole array (indexes)

echo ${!numbers[@]}

Arrays - Adding elements

numbers[4]="four"

Arrays - Removing elements

unset 'numbers[4]'
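
Putting the array operations together, here is a short sketch. It also shows that removing an element leaves a gap in the indexes, and that quoting "${numbers[@]}" keeps each element intact while looping:

```shell
#!/bin/bash
numbers=("one" "two" "three")

numbers+=("four")     # append: the new element gets index 3
unset 'numbers[1]'    # remove "two": the other indexes do not shift

count=${#numbers[@]}
indexes="${!numbers[*]}"

# Quoting the expansion yields exactly one word per element.
joined=""
for n in "${numbers[@]}"; do
  joined+="<${n}>"
done

echo "${count} | ${indexes} | ${joined}"
```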

Conditionals

if..elif..else

#!/bin/bash
if [[ -n ${1+x} ]]; then
   if [[ -z "${1//[[:digit:]]}" ]]; then
     echo "${1} is a number"
   elif [[ -z "${1//[[:alpha:]]}" ]]; then
     echo "${1} is a string of only letters"
   elif [[ -z "${1//[[:alnum:]]}" ]]; then
     echo "${1} is a string of numbers and letters"
   else
     echo "${1} is a string that can contain numbers, letters and punctuation"
   fi
else
  echo "You should provide an argument"
fi

Case

case "$1" in
  start)
    echo "start"
    ;;
  stop)
    echo "stop"
    ;;
  *)
    echo $"Usage: $0 {start|stop}"
    exit 1
    ;;
esac

Loops

For

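It can be used to iterate over a list of items. A common use is with a command substitution that generates the list, such as the following (beware that the output of ls is split on whitespace, so this breaks with file names containing spaces):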

for i in $( ls )
do
  echo item: $i
done

the following example generates a list of numbers using the seq command in a command substitution:

for i in `seq 1 10`
do
  echo $i
done

For can also use a C-like syntax, such as the following:

for ((i=0; i<10; i++))
do
  echo $i
done
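
When the list is made of file names, iterating over a glob is usually more robust than iterating over the output of ls. The sketch below compares the two in a scratch directory; the file names are invented for the example:

```shell
#!/bin/bash
tmpdir=$(mktemp -d)
touch "${tmpdir}/plain.txt" "${tmpdir}/with space.txt"

# The output of ls is split on whitespace: "with space.txt" becomes two items.
count_ls=0
for i in $( ls "${tmpdir}" ); do
  count_ls=$((count_ls + 1))
done

# A glob yields exactly one item per file, spaces included.
count_glob=0
for path in "${tmpdir}"/*; do
  count_glob=$((count_glob + 1))
done

echo "ls: ${count_ls} items, glob: ${count_glob} items"
rm -rf "${tmpdir}"
```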

While

The condition must be true to enter the loop, and the loop keeps running as long as it remains true

COUNTER=0
while [ ${COUNTER} -lt 10 ]
do
  echo counter is ${COUNTER}
  let COUNTER=COUNTER+1
done

Infinite Loop

while true
do
  sleep 30
done

Until

The code loops until the condition becomes true

COUNTER=20
until [ $COUNTER -lt 10 ]
do
  echo COUNTER $COUNTER
  let COUNTER-=1 
done

a real-life example: client is the JBoss Fuse CLI binary. We can loop until it manages to connect

until /opt/fuse/bin/client -u admin -p aGoodPassword "info"
do 
  echo "."
  sleep 5
done

Processing the input of the script

Most command line programs expect input parameters (also called arguments), sometimes optional, other times mandatory. For example, the date command issued without parameters behaves as follows:

date
Wed May 27 16:01:42 UTC 2020

but you can easily customize its output using the “+” parameter followed by format modifiers such as %Y, %m or %d,

date +"%Y-%m-%d"
2020-05-27

for more information on date command and its parameters:

man date

When writing scripts, it's very likely that you have to parse input parameters: the first thing to know is that there are some special variables that are related to parameters:

$0

Name of the script

$1,$2, ..

Positional parameters passed to the script

$*

The whole list of parameters passed to the script. When quoted ("$*"), it expands to a single word with the parameters joined by the first character of IFS. For example: "$1 $2"

$@

The whole list of parameters passed to the script. When quoted ("$@"), each parameter expands to a separate word, as with the elements of an array. For example: "$1" "$2"

$#

The number of parameters passed to the script
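
The difference between "$*" and "$@" is easier to see by counting the words each one produces. A sketch, where count_args and demo are hypothetical helpers written only for this example:

```shell
#!/bin/bash
# Print how many arguments were received.
count_args() { echo $#; }

demo() {
  star=$(count_args "$*")       # one single word
  at=$(count_args "$@")         # one word per parameter
  joined=$(IFS=,; echo "$*")    # joined by the first character of IFS
}

demo "first arg" "second arg"
echo "${star} ${at} ${joined}"
```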

Options are said to be:

short

they start with a single dash (-) and consist of one character only. For example: -a -P -v

long

they start with a double dash (--) and consist of one or more words joined by dashes. For example: --all --pretty-print --verbose

A double dash (--) on its own tells the program to stop interpreting the following arguments as options. For example, let's say we want to search for occurrences of "-v" in file.txt:

grep -- -v file.txt

-- prevents grep from considering -v an option, so that grep treats "-v" as the first argument (what we want to search for) and file.txt as the second argument (where we want to search)
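
The grep case above can be reproduced with a temporary file. This is only a sketch and the file content is made up:

```shell
#!/bin/bash
tmpfile=$(mktemp)
printf '%s\n' 'run with -v for verbose output' 'nothing to see here' > "${tmpfile}"

# Everything after -- is an operand, so -v is the pattern and not an option.
matches=$(grep -- -v "${tmpfile}")

echo "${matches}"
rm -f "${tmpfile}"
```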
When writing your script, you can rely on the getopt utility to parse the options passed to the script. I share this snippet with you: I use it as a template when writing bash scripts

if [[ -z ${1} ]]; then
   # no argument supplied, just print the usage by calling usage function
   usage
   exit 1
elif [[ $1 == "user" ]]; then
   # the first argument is “user”, let's shift the pointer to the current argument by one
   shift 1
   if [[ "$1" == "add" ]]; then
      # after shifting, current argument is “add”
      shift 1
      if [[ -z ${1} ]]; then
         # no other arguments provided, let's print the usage by calling usage function
         user_add_usage
         exit 1
      fi
      # get the remaining argument list parsed by getopt utility and store it into TEMP variable 
      TEMP=$(getopt -o u:g:h --long uid:,gid:,help -n 'ldap.sh user' -- "$@")
      # evaluate the contents of TEMP variable 
      eval set -- "$TEMP"
      # this infinite loop is used to extract options and their arguments into variables.
      # when -- is reached we exit the loop
      while true ; do
         case "$1" in
            -u|--uid)
               USERID=$2 ; shift 2 ;;
            -g|--gid)
               GROUPID=$2 ; shift 2 ;;
            -h|--help)
               user_add_usage ; exit 1 ;;
            --) shift ; break ;;
            *)  user_add_usage ; exit 1 ;;
         esac
      done
      # here we can perform input validation
      if [[ "${USERID}" == "" ]] || [[ "${GROUPID}" == "" ]]; then
         # no value has been specified for the user id or for the group id
         echo
         echo "USERID and GROUPID are mandatory options!"
         user_add_usage
         exit 1
      fi
   elif [[ "$1" == "mod" ]]; then
      # after shifting, current argument is “mod”
      shift 1
      if [[ -z ${1} ]]; then
         # no other arguments provided, let's print the usage by calling usage function
         user_mod_usage
         exit 1
      fi
      # here you put other code in a similar way of the previous we used for the “add” option
      # to fulfill everything needed by the “user” usecase
   fi
elif [[ $1 == "group" ]]; then
   # the first argument is “group”, let's shift the pointer to the current argument by one
   shift 1
   if [[ "$1" == "add" ]]; then
      # other code to continue processing, I hope you get the whole logic now
      :  # placeholder: BASH does not allow an empty "then" block
   fi
else
   # the first argument is invalid, let's print the usage by calling usage function
   usage
   exit 1
fi

Explaining it in detail would take very long, but it is not too difficult to work out how it works, so I leave that to you.
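
When only short options are needed, the getopts builtin is a possible alternative to the external getopt utility. Note that this is a different technique from the template above. The following is a minimal sketch: parse_user_add is a hypothetical function, and the option letters simply mirror the ones used in the template:

```shell
#!/bin/bash
# Hypothetical helper: parses -u <uid> -g <gid> -h with the getopts builtin.
parse_user_add() {
  local opt OPTIND=1
  USERID=""
  GROUPID=""
  # The leading colon enables silent error reporting.
  while getopts ":u:g:h" opt; do
    case "${opt}" in
      u) USERID=${OPTARG} ;;
      g) GROUPID=${OPTARG} ;;
      h) echo "usage: user add -u UID -g GID"; return 0 ;;
      *) echo "invalid or incomplete option: -${OPTARG}" >&2; return 1 ;;
    esac
  done
  # Both values are mandatory in this sketch.
  [[ -n ${USERID} && -n ${GROUPID} ]]
}

parse_user_add -u 1001 -g 100 && echo "uid=${USERID} gid=${GROUPID}"
```

getopts does not understand long options such as --uid, which is the main reason to prefer getopt when they are required.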

Return values

Every program, your scripts included, is supposed to return an exit code and leaves other traces such as its PID: you can easily get these values from the following special variables:

$?

Exit status of the last executed command

$!

PID of the last command started in the background

$$

PID of the current shell
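
A quick sketch showing the three variables in action; sleep is used only as a stand-in for a real background job:

```shell
#!/bin/bash
true
status_true=$?

status_false=0
false || status_false=$?

# Start a background job, remember its PID and wait for it to finish.
sleep 0 &
background_pid=$!
wait "${background_pid}"
wait_status=$?

echo "shell PID: $$, background PID: ${background_pid}"
echo "statuses: ${status_true} ${status_false} ${wait_status}"
```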

Testing Files and Directories

File existence

if [[ -f /var/log/messages ]]; then
  echo "exists"
fi

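Other operators can be used in place of -f (which is true if the file exists and is a regular file):

-s: exists and has a size greater than zero (is not empty)
-r: exists and running user has read access
-w: exists and running user has write access
-x: exists and running user has execution access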

Directory existence

if [[ -d /var/log ]]; then
  echo "exists"
fi
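
The tests can be tried on throwaway files. A sketch, assuming mktemp is available; the file names are invented for the example:

```shell
#!/bin/bash
tmpdir=$(mktemp -d)
touch "${tmpdir}/empty.txt"
echo "data" > "${tmpdir}/full.txt"

[[ -d ${tmpdir} ]] && is_dir="yes" || is_dir="no"
[[ -f ${tmpdir}/empty.txt ]] && is_file="yes" || is_file="no"
[[ -s ${tmpdir}/empty.txt ]] && empty_has_size="yes" || empty_has_size="no"
[[ -s ${tmpdir}/full.txt ]] && full_has_size="yes" || full_has_size="no"
[[ -f ${tmpdir}/missing.txt ]] && missing_exists="yes" || missing_exists="no"

echo "${is_dir} ${is_file} ${empty_has_size} ${full_has_size} ${missing_exists}"
rm -rf "${tmpdir}"
```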

Menu

Reading input from tty

read -p "Enter your name: " name

Reading input from tty disabling echoes (e.g password)

read -s -p "Enter your password: " password

Reading Yes or No input from tty

done="processing"
until [ "$done" == "done" ]
do
  read -p "Do you want to proceed (Y/n)? " choice
  # an empty answer defaults to yes
  [[ -z ${choice} ]] && choice="Y"
  case $choice in
    [Yy1])
      echo "You chose yes"
      done="done"
      ;;
    [Nn0])
      echo "You chose no"
      done="done"
      ;;
    *)
      echo "invalid option"
      ;;
  esac
done

Select

OPTIONS="Hello Quit"
select opt in $OPTIONS
do
  if [ "$opt" = "Quit" ]; then 
    echo done
    exit
  elif [ "$opt" = "Hello" ]; then
    echo Hello World
  else
    clear
    echo bad option
  fi
done

Logging

Custom file descriptors

Opening a new custom file descriptor

exec 3>mylog.log

Writing to the previously opened custom file descriptor

echo hi everybody >&3

Closing the previously opened custom file descriptor

exec 3>&-

Input/Output

Reading from a custom file descriptor

#open the custom input file descriptor
exec 3< input.txt
read -u 3 LINE
# or
read LINE <&3
#close file descriptor
exec 3<&-
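
A complete round trip may help: write through one descriptor, then read the lines back through another. This is a sketch that uses a temporary file and assumes descriptors 3 and 4 are free:

```shell
#!/bin/bash
logfile=$(mktemp)

# Open descriptor 3 for writing, write two lines, then close it.
exec 3> "${logfile}"
echo "first line" >&3
echo "second line" >&3
exec 3>&-

# Open descriptor 4 for reading: each read consumes the next line.
exec 4< "${logfile}"
read -r -u 4 first
read -r -u 4 second
exec 4<&-

echo "${first} / ${second}"
rm -f "${logfile}"
```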

> vs |tee

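Consider a directory containing files 1 and 2

ls > files.txt

files.txt contains

  • 1
  • 2
  • files.txt

because the shell creates files.txt before running ls.

ls |tee files.txt

files.txt usually contains

  • 1
  • 2

because tee creates files.txt while ls is already running; since the two commands run concurrently, files.txt can still occasionally show up in the list.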

xargs

Consider a directory containing files 1 and 2

ls |rm
usage: rm [-f | -i] [-dPRrvW] file ...
unlink file

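rm does not read the list of files from its standard input: it expects them as arguments. xargs reads the items from standard input (separated by blanks or newlines) and passes them to rm as arguments: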

ls |xargs rm
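
Since xargs splits its input on blanks, file names containing spaces need extra care. One common approach, sketched here, is to separate the names with null characters; it assumes that your find and xargs support -print0 and -0 (the GNU and BSD versions do), and the file names are made up:

```shell
#!/bin/bash
tmpdir=$(mktemp -d)
touch "${tmpdir}/1" "${tmpdir}/2" "${tmpdir}/with space"

# Names are separated by null characters, so spaces inside them are harmless.
find "${tmpdir}" -type f -print0 | xargs -0 rm --

# Count what is left in the directory.
remaining=$(ls -A "${tmpdir}" | wc -l | tr -d ' ')
echo "files left: ${remaining}"
rmdir "${tmpdir}"
```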

Footnotes

Here the quick guide ends: although I often code using BASH I cannot remember everything, so I wrote it for my own needs, but as it grew I thought that it had become quite mature and that somebody else might benefit if I published it. So here it is: I hope you enjoyed it.

Writing a post like this takes a lot of hours. I'm doing it for the only pleasure of sharing knowledge and thoughts, but all of this does not come for free: it is a time consuming volunteering task. This blog is not affiliated to anybody, does not show advertisements nor sells data of visitors. The only goal of this blog is to make ideas flow. So please, if you liked this post, spend a little of your time to share it on Linkedin or Twitter using the buttons below: seeing that posts are actually read is the only way I have to understand if I'm really sharing thought or if I'm just wasting time and I'd better give up.
