TITLE(«
Write programs that do one thing and do it well. Write programs
to work together. Write programs to handle text streams,
because that is a universal interface. -- Doug McIlroy
», __file__)
SECTION(«CMD(«sh») versus CMD(«bash»)»)
- Bash scripts begin with a sha-bang: CMD(«#!/bin/bash»)
- sh is the POSIX-defined shell specification.
- Bash is one implementation of the sh specification.
- /bin/sh links to the default shell of your system.
- This can be different from your user shell!
- Each shell has its idiosyncrasies.
- Using a sha-bang pointing to bash is safer: CMD(«#!/bin/bash»)
- Before version 10.3, the default Mac OS X shell was tcsh (bad!)
- Scripts need to be executable (chmod u+x script).
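As a minimal sketch of why the interpreter line matters, the script below reports which shell is actually running it. The function name CMD(«running_shell») is made up for this illustration. It relies only on the fact that bash sets CMD(«BASH_VERSION») while other shells leave it unset.

```shell
#!/bin/bash
# running_shell is a hypothetical helper, not part of any standard tool.
# bash sets BASH_VERSION; other sh implementations leave it unset.
running_shell() {
    if [ -n "${BASH_VERSION:-}" ]; then
        echo "bash $BASH_VERSION"
    else
        echo "not bash"
    fi
}

running_shell
```

Save it as a file, make it executable and run it directly, then run it again as an argument to CMD(«sh»). On systems where /bin/sh is a different shell (dash, for example) the second run should print "not bash" because the sha-bang line is ignored in that case.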
EXERCISES()
- Your current shell is stored in the CMD($SHELL) environment variable.
Use echo to find out what it is.
- Similarly, find out your CMD($BASH_VERSION).
- Use readlink to find out what CMD(/bin/sh) points to in your system.
HOMEWORK(«
Explain why bash does not exit when you type CMD(«Ctrl+C») on the
command line.
»)
SECTION(«Variables»)
- Variables are defined using CMD(«=») (no spaces!).
- Variables are read using CMD(«$»).
- Spaces are the enemy of variables. Spaces break variables.
- Double quotes CMD(«"») are the defense against spaces.
- Braces (curly brackets) CMD(«{}») can also protect variables from
ambiguity, e.g. CMD(«${foo}»)bar. They also group commands.
- Single quotes CMD(«'») are like double quotes, but are literal.
- Bash scripts have special variables:
- CMD(«$0»): name of the script as it was invoked (may include a path).
- CMD(«$1»): first command line argument.
- CMD(«$2»): second argument ... etc.
- CMD(«$#»): number of command line arguments.
- CMD(«$*»): list of arguments as a single string (when double-quoted).
- CMD(«$@»): list of arguments as a delimited list (when double-quoted).
- Parentheses CMD(«()») execute a command in a sub-shell.
- Double parentheses, as in CMD(«$((...))»), return the result of
arithmetic expansion (integers only, negative values included).
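The quoting rules above can be sketched in a few lines. The helpers CMD(«count_args») and CMD(«show_args») are hypothetical names introduced only to make the effect of word splitting visible.

```shell
#!/bin/bash
# count_args and show_args are hypothetical helpers for this demonstration.
count_args() { echo "$#"; }           # prints how many arguments it received

greeting="Hello   World"              # assignment: no spaces around "="
count_args $greeting                  # unquoted: the value is split into 2 words
count_args "$greeting"                # double quotes keep it as 1 word
echo '$greeting'                      # single quotes: prints the literal text $greeting

fruit=apple
echo "${fruit}s"                      # braces end the variable name: apples

show_args() {
    count_args "$*"                   # all arguments joined into 1 string
    count_args "$@"                   # one word per original argument
}
show_args a "b c" d                   # prints 1, then 3

x=outer
(x=inner)                             # the assignment happens in a sub-shell
echo "$x"                             # still prints outer
echo "$((7 - 10))"                    # arithmetic expansion: -3
```

The sub-shell line shows why parentheses are not a way to set variables for the rest of the script: whatever is assigned inside them is gone when the sub-shell exits.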
EXERCISES()
- Write a simple script in which you define a variable with the string
"Hello World!". echo this variable without quotes, with single and
double quotes, and with braces (again with and without different
quotes). Become comfortable with the results.
- How do you return the results of a sub-shell ()?
- Write a simple script to add two positive integers.
- Write a simple script to add two positive integers supplied as
arguments to the script.
»)
HOMEWORK(«
Write a script using your favorite editor. The script should display
the path to your home directory and the terminal type that you
are using.
»)
SECTION(«Tests»)
- CMD(«[...]») is the POSIX sh test function.
- CMD(«[[...]]») is the Bash test function (more powerful).
- These tests are logical: they return TRUE or FALSE.
- Tests use logical operators.
- Spaces are a must!
- There are three types of operators: File, String and Integer.
- A few single file operators eg: CMD(«[[ -e somefile ]]»)
- CMD(«-e»): file exists
- CMD(«-s»): file not zero size
- CMD(«-d»): file is a directory
- CMD(«-r»): you have read permission
- CMD(«-O»): you are the owner of the file
- A few multiple file operators eg: CMD(«[[ file1 -nt file2 ]]»)
- CMD(«-nt»): first file newer than second
- CMD(«-ot»): first file older than second
- A few integer operators:
- CMD(«-eq»): equal to
- CMD(«-ne»): not equal to
- CMD(«-gt»): greater than
- CMD(«-ge»): greater than or equal to
- CMD(«-lt»): less than
- CMD(«-le»): less than or equal to
- A few string operators:
- CMD(«==»): equal to
- CMD(«!=»): not equal to
- CMD(«=~»): regex match (Bash specific)
- CMD(«-z»): string is null (zero length)
- CMD(«-n»): string is not null (non-zero length)
- When you understand how these operators work, you will have a good
idea of the kinds of things you can do in Bash.
- Tests can be combined with CMD(«&&») ("and") and CMD(«||») ("or").
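A few of the operators listed above in action. This is only a sketch: the scratch directory and file names are arbitrary, and every test is written so that its message is printed.

```shell
#!/bin/bash
# Scratch files for the demonstration; the names are arbitrary.
dir=$(mktemp -d)
touch "$dir/empty"
echo data > "$dir/full"

[[ -e $dir/empty ]] && echo "empty exists"
[[ -s $dir/empty ]] || echo "empty has zero size"
[[ -d $dir && -s $dir/full ]] && echo "dir is a directory and full is not empty"

[[ 3 -lt 10 ]] && echo "3 is less than 10 as integers"
[[ 3 < 10 ]] || echo "as strings, 3 does not sort before 10"
[[ abc123 =~ ^[a-z]+[0-9]+$ ]] && echo "abc123 matches the regex"

empty_string=""
[[ -z $empty_string ]] && echo "empty_string has zero length"

rm -r "$dir"
```

The pair of comparisons with 3 and 10 is the reason integer and string operators are kept apart: CMD(«-lt») compares numbers, while CMD(«<») inside CMD(«[[...]]») compares strings character by character.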
HOMEWORK(«
Write a script that checks whether or not you own a file, and reports
back if you do not. (This is useful if you are working on multi-user
systems and need your script to remove many files.)
»)
SECTION(«Conditionals»)
- The most commonly used Bash conditional structure is: CMD(«if»)
... CMD(«then») ... CMD(«fi»)
- A shorter version uses logic in place of if..then..fi, e.g. CMD(«[[
test ]] && { execute if TRUE; also execute }»)
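Both forms side by side, as a small sketch. The function CMD(«sign_of») is a hypothetical example and not part of the course material.

```shell
#!/bin/bash
# sign_of is a hypothetical example function.
sign_of() {
    if [[ $1 -lt 0 ]]; then
        echo negative
    elif [[ $1 -eq 0 ]]; then
        echo zero
    else
        echo positive
    fi
}

sign_of -5                            # negative
sign_of 0                             # zero
sign_of 42                            # positive

# Short form: the braces group the commands that run when the test is TRUE.
[[ -d / ]] && { echo "/ is a directory"; echo "both commands ran"; }
```

Note that CMD(«test && a || b») is not a full replacement for if/else: CMD(«b») also runs when the test succeeds but CMD(«a») fails.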
EXERCISES()
- Modify your calculator script to check for valid inputs:
- There must be two.
- They should not have letters.
- Write a script to check for a directory, and create it if it
doesn't exist.
- Write a script to remove a file specified as an argument, but only
if you are its owner.
HOMEWORK(«
Write a script that takes exactly one argument, a directory name. If
the number of arguments is more or less than one, print a usage
message. If the argument is not a directory, print another message. For
the given directory, print the five biggest files and the five files
that were most recently modified.
»)
SECTION(«Loops»)
- The most commonly used Bash loop structure is: CMD(for ... do
... done)
- The CMD(for) statement behaves a lot like a variable assignment.
- File globbing works: CMD(«for file in *.txt; do ...; done»)
- Sequences are also useful:
- CMD(«for num in {1..5}; do echo $num; done») prints 1 2 3 4 5 (one number per line).
- CMD(«for num in {1..10..2}; do echo $num; done») prints 1 3 5 7 9.
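The two kinds of loops above, written out as a runnable sketch. The function names CMD(«list_txt») and CMD(«odd_numbers») and the scratch files are assumptions made for the example.

```shell
#!/bin/bash
# list_txt and odd_numbers are hypothetical helpers for this demonstration.
list_txt() {
    local file
    for file in "$1"/*.txt; do        # the glob is expanded before the loop starts
        echo "${file##*/}"            # strip the directory part of the path
    done
}

odd_numbers() {
    local num
    for num in {1..10..2}; do         # start..end..step needs bash 4 or later
        echo "$num"
    done
}

dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.txt" "$dir/notes.md"
list_txt "$dir"                       # prints a.txt and b.txt
odd_numbers                           # prints 1 3 5 7 9, one per line
rm -r "$dir"
```

Quoting the directory but not the CMD(«*.txt») part keeps file names with spaces intact while still letting the shell expand the pattern.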
EXERCISES()
- Write a script that stores the results of an arithmetic operation in files
named by the inputs.
HOMEWORK(«
Come up with a for loop which prints the first 10 squares (1, 4,
9, ...).
»)
SECTION(«Pipes and Here Strings»)
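- Here strings (CMD(«<<<»)) are an alternative to conventional piping.
- Because they do not spawn a sub-shell, they retain variables in
the shell the script is running in, e.g. instead of CMD(«echo "$line" |
cut -f 1») write CMD(«cut -f 1 <<< "$line"»).
- A here string supplies the string itself as standard input, not the
contents of a file with that name.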
- They can be easier or more difficult to read, depending on your
taste.
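The difference is easiest to see with the CMD(«read») builtin. The sketch below assumes the default bash settings (in particular, the CMD(«lastpipe») option is off). The variable names are arbitrary.

```shell
#!/bin/bash
line="alpha beta gamma"

# Pipe: read runs in a sub-shell, so its variables vanish with it.
echo "$line" | read -r piped rest
echo "after the pipe: '${piped:-}'"           # prints ''

# Here string: read runs in the current shell and the variable survives.
read -r first rest <<< "$line"
echo "after the here string: '$first'"        # prints 'alpha'
```

The first CMD(«echo») prints an empty string because CMD(«piped») was set only in the sub-shell created for the right-hand side of the pipe.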
EXERCISES()
- Write a script that uses pipes, then change it to use a here string.
HOMEWORK(«
Tie all the above together with the following task:
Let's say that you want to perform an analysis by population
(k-means) cluster of some accessions (ecotypes). You want to
generate a separate bed file with the SNPs of the members of
each cluster (which you have previously calculated).
The relevant plink argument is: CMD(«--keep "$keeplist"») where
keeplist is a two column file specifying family and ecotype
(made for human data). We will just duplicate the ecotype
in the two columns. e.g.:
> cat keeplist
88 88
107 107
etc.
I have provided a comma-separated file of which cluster each ecotype
belongs to: CMD(«/tmp/cluster_course/admix_pop.csv») Take a look at
this file. You will see that it is a comma separated file with
ecotype ID numbers in the first column and the cluster assignment
in the second.
Use the 1001_cluster_course files as your test dataset.
You suspect that your clusters might change (in assignment
and number), so you want to write a Bash script to generate
separate bed files for a given clustering.
Hints:
- Your script will be called something like this:
sep_clust.sh all_snps_bedfile_root cluster_assignment.csv
- You will have a loop.
- You will generate a list of cluster numbers from the
CMD(«cluster_assignment.csv») file.
- The cluster file has a header! CMD(«tail -n+2») will skip to the
second line.
- CMD(«grep "something$"») matches CMD(«something») at the
end of a line.
- You will generate a “keep” list for each cluster and supply
that to plink.
- CMD(«cut») expects tab-delimited input, but ours is
comma-delimited. Use CMD(«-d ","»).
- The keep list needs the ecotypes printed twice per line. The easiest
thing to use in this case is CMD(«awk»):
awk -v OFS='\t' '{print $1, $1}'
- Here, CMD(«-v») sets an awk variable (CMD(«OFS»), the
“Output Field Separator”), CMD(«\t») specifies the delimiter
(a tab), CMD(«{...}») encloses the action, and CMD(«print $1, $1»)
prints column 1 twice.
- Remember:
- CMD(«uniq») requires sorted input.
- CMD(«sort -n») specifies a numerical sort.
- Generate as few temporary files as possible.
»)
HOMEWORK(«
If important data have been copied from one system to another,
one might want to check that both copies are identical. This is
fairly easy if they both live on the same system, but can be quite
tricky if they don't. For example, imagine files copied from an
NFS-mounted directory to the local hard drive.
- Write a bash script which checks that all files have been
copied and that all copies have the same size as the original. Hint:
You might want to check out CMD(«find»). Generate two almost
identical folders with CMD(«mkdir a b && touch {a,b}/{1..10} &&
rm b/4 && echo foo > a/7») to try out the script.
- Run CMD(«echo bar > b/7»). Will the script detect that CMD(«a/7»)
and CMD(«b/7») are different?
- Come up with a different script which runs CMD(«cmp»)
to detect content changes. Analyze and compare the running time of
both scripts.
- Now suppose the two directories are stored on different
systems. Assume further that they are so large (or the network
so slow) that transferring the file contents over the network
would take too long. Argue how a cryptographic hash function can
be employed to detect content changes. Write a script which runs
CMD(«sha1sum») to implement this idea and analyze its running
time.
», «
- Script which prints file names and sizes of all regular files
in the given two directories and compares the result.
#!/bin/bash
list_files() { (cd "$1" && find -type f -printf '%h/%f\t%s\n' | sort); }
(($# != 2)) && exit 1
{ list_files "$1" && list_files "$2"; } | sort | uniq -u
- The above script will not look at file contents. Since CMD(«a/7»)
and CMD(«b/7») have the same size and the same
base name, the script won't notice they are different.
- Script which compares the contents of all regular files
in directory CMD(«$1») against the files in directory CMD(«$2»):
#!/bin/bash
(($# != 2)) && exit 1
{ cd "$1" && find -type f; } | while read -r file; do
cmp "$1/$file" "$2/$file"
done
- The running time of the CMD(«find») command in the first script is proportional
to the number of files, n. The CMD(«sort») command runs in time
proportional to n * log(n). Since the two commands run
in parallel, the total running time is the maximum of the two. In
practice, since the CMD(«find») command needs at least one
system call per file (CMD(«stat(2)»)) to get the metadata, and the
kernel must load this information from disk, the running time of the
pipeline is dominated by the running time of CMD(«find»). Note
that it is independent of the file sizes. The running time of the
second script is proportional to the number of files, plus the sum of
all file sizes, since in the worst case CMD(«cmp») must read
all files completely to perform its task. In the common situation
where files are much bigger than a few hundred bytes, the sum of all
file sizes dominates the running time. The second script might easily
run orders of magnitude slower than the first.
- Script which computes the sha1 hash of all regular files in
directory CMD(«$1») and checks the hashes against the files in
directory CMD(«$2») on host CMD(«$3»):
#!/bin/bash
(($# != 3)) && exit 1
cd "$1" && find -type f -print0 \
| xargs -0 sha1sum \
| ssh "$3" "cd $2 && sha1sum -c"
The CMD(«sha1sum») command reads the full contents of each file,
so the running time is the same as for the script that executed
CMD(«cmp»). However, only the hashes but no contents are
transferred over the network, and the hashes are computed locally on
each system. Therefore, this approach performs best in practice.
»)
SECTION(«Substitution and Expansion»)
- expansion is performed on the command line after it has been split
into words
- several kinds of expansion: tilde, brace, arithmetic, pathname,
parameter and variable, history
- command substitution
EXERCISES()
- Give an example for each type of expansion.
- Which expansions can change the number of words?
- Create a list of "words" with CMD(«apg -c 42 -n 10000 >
foo»). Transform each word to upper case, using the case-modification
operator CMD(«^^») as follows: CMD(«while read a; do echo ${a^^};
done < foo > bar»). Compare the running time of this command with (a)
CMD(«tr [a-z] [A-Z] < foo > bar») and (b) CMD(«while read a; do tr
[a-z] [A-Z] <<< "$a"; done < foo > bar»). Try to beat the fastest
implementation using your favorite tool (CMD(«sed»), CMD(«perl»),
CMD(«python»), ...).
- The command CMD(«find . -maxdepth 1 -mindepth 1 -type d») lists
all directories in the CWD. Describe an CMD(«ls») command which
does the same.
- Scripts often contain code like CMD(«find . | while read f; do
something with "$f"; done»). While this code works for file names
which contain spaces or tab characters, it is not bullet-proof because
file names may also contain the newline character. The only character
that cannot be part of a file name is the null character. This is
why CMD(«find(1)») has the CMD(«-print0») option to separate the
file names in its output by null characters rather than the newline
character. Find a way to make the CMD(«while») loop work when it
is fed a file list produced with CMD(«find -print0»). Check the
correctness of your command by using CMD(«printf 'file\n1\0file 2'»)
as the left hand side of the pipe.
»)
HOMEWORK(«
- Write a shell script CMD(«varstate») which takes the name of a
variable as CMD(«$1») and determines whether the named variable
is (1) set to a non-empty string, (2) set to the empty string, or
(3) unset. Verify the correctness of your script with the following
commands:
- CMD(«foo=bar ./varstate foo # case (1)»)
- CMD(«foo= ./varstate foo # case (2)»)
- CMD(«unset foo; ./varstate foo # case (3)»)
»)
SECTION(«Functions»)
- code block that implements a set of operations
- for tasks which repeat with only slight variations
- syntax: CMD(«f() { commands; }») (the space after the opening brace is required)
- positional parameters (CMD(«$1»), CMD(«$2»), ...)
- special parameters: CMD(«$*»), CMD(«$@»), CMD(«$#»)
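A short sketch of how arguments, output and exit status work for functions. Both CMD(«greet») and CMD(«is_even») are hypothetical example functions.

```shell
#!/bin/bash
# greet and is_even are hypothetical example functions.
greet() {
    local name=${1:-world}            # $1 is the first argument of the function
    echo "Hello, $name! ($# given)"
}

is_even() {
    (( $1 % 2 == 0 ))                 # the exit status of the last command is
}                                     # the exit status of the function

greet                                 # Hello, world! (0 given)
greet Alice Bob                       # Hello, Alice! (2 given)
message=$(greet Carol)                # command substitution captures the output
echo "$message"
if is_even 4; then echo "4 is even"; fi
```

Inside a function the positional and special parameters refer to the arguments of the function, not to those of the script. Text is handed back through standard output, and success or failure through the exit status.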
EXERCISES()
- Understand your smiley of the day (and run it if you are brave):
CMD(«:() { :& :& };:»)
- Write a function which checks whether the passed string is a
decimal number.
- Consider this simple function which returns its first argument:
CMD(«foo() { return $1; }»). Find the largest positive integer this
function can return.
- Write a function which returns the sum of the first and the 10th
argument.
SECTION(«Arrays and Hashes»)
- bash-2: arrays, bash-4: hashes (associative arrays)
- zero-based, one-dimensional only
- three ways of assigning values
- negative indices in arrays and string extraction (bash-4.2)
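A minimal sketch of both array types. The variable names and values are made up for the example, and it assumes bash 4.2 or later for the negative index.

```shell
#!/bin/bash
fruits=(apple banana cherry)          # indexed array, indices start at 0
fruits+=(date)                        # append an element
echo "first: ${fruits[0]}, last: ${fruits[-1]}"

declare -A capital=([France]=Paris [Japan]=Tokyo)   # associative array (hash)
capital[Kenya]=Nairobi
for country in "${!capital[@]}"; do   # ${!name[@]} expands to the keys
    echo "$country -> ${capital[$country]}"
done                                  # the order of the keys is unspecified
```

Associative arrays must be declared with CMD(«declare -A») before use. Without the declaration bash treats the variable as an indexed array and evaluates the keys as arithmetic expressions.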
EXERCISES()
- The following three array assignments are equivalent:
CMD(arr=(one two three)), CMD(«arr=([0]=one [1]=two [2]=three)»),
CMD(«arr[0]=one; arr[1]=two; arr[2]=three»). Discuss the pros and
cons of each version.
- Define an array with CMD(«arr=(one two three)»).
- Learn how to determine the number of elements that have been assigned
(three in this example).
- Convert all entries to upper case without iterating.
- Print all entries which do not contain an CMD(«"o"») character,
again without iterating (result: CMD(«three»)).
- Use arrays to write a bash script that lists itself, including
line numbers, and does not call any external command (CMD(«sed,
awk, perl, python, ...»)). Try to get rid of the loop in this
REFERENCE(«self-list.bash», «solution»).
- CMD(«rot13») is a simple "encryption" algorithm which shifts each
letter in the alphabet a-z by 13 characters and leaves non-letter
characters unchanged. That is, CMD(«a») maps to CMD(«n»),
CMD(«b») maps to CMD(«o»), ..., CMD(«m») maps to CMD(«z»),
CMD(«n») maps to CMD(«a»), and so on. Implement CMD(«rot13»)
using an associative array. Compare your solution with this
REFERENCE(«rot13.bash», «implementation») which reads from
stdin and writes to stdout. Verify that "encrypting" twice with
CMD(«rot13») is a no-op.
- Examine the CMD(BASH_VERSINFO) array variable to check whether the
running bash instance supports negative array indices.
- Write a bash script which reads from stdin and prints the last word
of the input.
HOMEWORK(«
Bash-4.2 added support for negative array indices and string
extraction (count backward from the last element). Apply this feature
to print the first and the last character of the last word of
each input line.
», «
The script below implements a loop which reads lines from stdin into
an array. In each iteration of the loop we use CMD(«${arr[-1]}»)
to get the last word of the line. Substring expansion with -1 as the
offset value refers to the last character within the word.
#!/bin/bash
# The -a option assigns the words of each line to an array variable.
while read -a arr; do
#
# If the input line contains only whitespace, there is
# nothing to do.
((${#arr[@]} == 0)) && continue
#
# Negative array indices count back from the end of the
# array. In particular the index -1 references the last
# element. Hence ${arr[-1]} is the last word.
#
# To print the first and the last character of the last
# word, we use substring expansion:
# ${parameter:offset:length} expands to up to length
# characters of the value of parameter starting at the
# character specified by offset. As for array indices,
# negative offsets are allowed and instruct bash to use
# the value as an offset from the *end* of the value,
# with -1 being the last character.
#
# A negative offset must be separated from the colon by
# a space to avoid confusion with the :- expansion (use
# default values). For example, ${var:-1:1} expands to
# the string "1:1" if var is unset or empty (and the value
# of var otherwise).
echo "${arr[-1]: 0: 1} ${arr[-1]: -1: 1}"
done
»)
SECTION(«Signals»)
- trap
- exit code 128 + n
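Both bullet points can be tried without touching the keyboard. The sketch below uses CMD(«SIGTERM») (signal number 15) and child shells started with CMD(«bash -c»). The messages are arbitrary.

```shell
#!/bin/bash
# A child bash kills itself with SIGTERM (signal number 15).
status=0
bash -c 'kill -TERM $$' || status=$?
echo "killed by SIGTERM: exit code $status"     # 128 + 15 = 143

# With a trap installed, the handler runs instead of the default action.
output=$(bash -c 'trap "echo caught SIGTERM; exit 0" TERM; kill -TERM $$; echo "not reached"')
echo "with trap: $output"                        # with trap: caught SIGTERM
```

The calling shell may additionally print a "Terminated" notice on stderr for the first child. The second child exits with code 0 because the handler calls CMD(«exit 0») explicitly.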
EXERCISES()
- Run CMD(«sleep 10»), interrupt the command with CMD(«CTRL+C») and
examine CMD(«$?»). Hint: CMD(«trap -l») prints all signal numbers.
- The REFERENCE(«stale_tmpfile.bash», «script») below is flawed
in that it leaves a stale temporary file when interrupted with
CMD(«CTRL+C»). Fix this flaw by trapping CMD(«SIGINT»).
SECTION(«Shell Options»)
- Confusing:
- _many_ options, some really weird ones
- two ways to set options: CMD(«set»), CMD(«shopt»)
- CMD(«set +option») _disables_ CMD(«option»)
- aim: Introduce examples for the most useful options
- CMD(«-x»): debugging
- CMD(«-u»): expanding an unset variable is treated as an error
- CMD(«-e»): exit on first error
- pipefail: a pipeline fails if _any_ of its commands fails, not only the last one
- nullglob: avoid common pitfalls with pathname expansion
- extglob: activate extended pattern matching features
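A compact sketch of three of these options. Each demonstration runs in its own CMD(«bash -c») process so that the options do not leak into the calling shell. The variable name CMD(«no_such_variable») is assumed to be unset.

```shell
#!/bin/bash
# Each demonstration runs in its own bash process.
bash -c 'false | true; echo "default: pipeline status $?"'                    # 0
bash -c 'set -o pipefail; false | true; echo "pipefail: pipeline status $?"'  # 1

if ! bash -c 'set -u; echo "value: $no_such_variable"' 2>/dev/null; then
    echo "-u: expanding an unset variable aborted the script"
fi

if ! bash -c 'set -e; false; echo "not reached"'; then
    echo "-e: the script stopped at the first failing command"
fi

# A plus sign switches an option off again.
bash -c 'set -e; set +e; false; echo "after set +e: still running"'
```

Running the option in a child process is also a convenient way to try out an option on an existing script, for example with CMD(«bash -u script»), before adding it to the script itself.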
EXERCISES()
- Find at least two bugs in the REFERENCE(«catch_the_bug.bash»,
«script») below. Run the script twice, once with
CMD(«bash catch_the_bug.bash») and once with CMD(«bash -x
catch_the_bug.bash»). Compare the output.
- There is a subtle bug in the
REFERENCE(«HungarianCamelSquare.bash», «HungarianCamelSquare.bash»)
script below. Run the script with and without bash's CMD(«-u») option
and compare the error messages. Discuss whether it is reasonable to
add CMD(«set -u») to existing scripts.
- What's the exit code of the pipeline CMD(«/notthere | wc -l»)?
Run CMD(«set -o pipefail»), then repeat the command. Search the bash
man page for CMD(«pipefail») and learn about the CMD(«PIPESTATUS»)
array variable. Repeat the above command and examine the contents
of CMD(«PIPESTATUS»).
- Assume that CMD(«/etc») contains only "reasonable" file
names (without space or other "funny" characters). Yet the
REFERENCE(«count_config_files.bash», «count_config_files.bash»)
script is buggy. Point out the flaw _before_ you try it out, then run
it to confirm. Insert CMD(«shopt -s nullglob») before the loop and
run the script again. Search the bash manual page for CMD(«nullglob»)
for an explanation.
HOMEWORK(«
The REFERENCE(«rm_tmp.bash», «rm_tmp.bash») script is seriously
flawed and would sooner or later create major grief if the CMD(«rm»)
command was not commented out. Find at least three bugs in it. Run
CMD(«bash rm_tmp.bash /notthere») and CMD(«bash -e rm_tmp.bash
/notthere») to see the CMD(«-e») option in action.
», «
- If the CMD(«cd») command fails, the CMD(«rm») command will be
executed in the current directory. This can happen for several reasons:
- CMD(«$1») does not exist,
- CMD(«$1») is not a directory,
- The executing user has no permissions to change into CMD(«$1»),
- CMD(«$1») contains whitespace characters,
- CMD(«$1») is a directory on a network share which is currently
unavailable. This does not happen with NFS, but may happen with CIFS
(Microsoft's Common Internet File System).
- If no argument is given, the CMD(«rm») command will be executed
in the home directory.
- The CMD(«rm») command does not remove all files: filenames starting
with a dot will be omitted.
- If the directory contains more files than the maximal number of
arguments in a command line, the CMD(«rm») command fails. The limit
depends on the system, but is often as low as 32768.
- If the directory contains a file named CMD(«-r»), the directory
will be removed recursively.
- If CMD(«$1») is an empty directory, the command fails because
there is no file named CMD(«"*"»). See the CMD(«nullglob») shell
option if you don't know why.
- The command fails if CMD(«$1») contains subdirectories.
- Even the CMD(«echo») command is buggy: If there is a file
CMD(«-n»), it will be treated as an option to CMD(«echo»).
»)
HOMEWORK(«
- Suppose you'd like to remove all leading occurrences of the character
CMD(«"a"») from each input line. The script should read input lines
from CMD(«stdin») and write its output to CMD(«stdout»). For
example, the input line CMD(«aabba») should be transformed into
CMD(«bba»).
- Write a bash script that runs a suitable external command of
your choice (e.g., CMD(«sed»), CMD(«awk»), CMD(«perl») or
CMD(«python»)) for each input line.
- Come up with an alternative script that does not run any commands.
- Implement yet another version that uses extended globbing.
- Create a suitable input file with 100000 lines by running
CMD(«base64 < /dev/urandom | head -n 100000 > foo»). Test the
performance of the three implementations of the above script by
executing CMD(«time script < foo») and discuss the result.
», «
- Bash script with external command:
#!/bin/bash
while read line; do
sed -e 's/^a\+//' <<< "$line"
done
- Bash script without external command (note that CMD(«printf»)
is a shell builtin):
#!/bin/bash
while read line; do
n=0
while [[ "${line:$n:1}" == 'a' ]]; do
let n++
done
printf '%s\n' "${line:$n}"
done
- Bash script with extended globbing:
#!/bin/bash
shopt -s extglob
while read line; do
printf '%s\n' "${line/*(a)}"
done
- Running times:
- external command: 289s
- without external command, without extglob: 4s
- extglob: 8s
- Discussion: External commands hurt plenty. Try to avoid them
inside of loops which execute many times. The extglob feature is
handy but is still twice as expensive as the open-coded version
which avoids pattern matching altogether. Note that the simple
CMD(«sed -e 's/^a\+//' foo») also does the job, and is even two
orders of magnitude faster than the fastest bash version. However,
this approach is not very flexible, hence unsuitable for real world
applications which do more than just write the transformed string
to stdout.
»)
SECTION(«Miscellaneous»)
- IFS
- read -ie
- ** (globbing)
- prompt
- Indirect variable referencing (eval, ${!x}, nameref)
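For the first item, a small sketch of how CMD(«IFS») controls word splitting in the CMD(«read») builtin. The colon-separated sample value is an assumption chosen for the example.

```shell
#!/bin/bash
path_list="/usr/local/bin:/usr/bin:/bin"     # sample value for this example

# The IFS assignment in front of read applies to this one command only.
IFS=: read -ra dirs <<< "$path_list"
for dir in "${dirs[@]}"; do
    echo "directory: $dir"
done

# The global IFS is untouched: default splitting does not break at ":".
printf 'unsplit: %s\n' $path_list            # prints a single line
```

Prefixing the assignment to the command avoids having to save and restore the global CMD(«IFS») value.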
EXERCISES()
- Write a bash script which prints the username and login shell of
each user defined in CMD(«/etc/passwd»). Hint: Set CMD(«IFS»)
and employ the bash CMD(«read») builtin with suitable options to
read each line of CMD(«/etc/passwd») into an array. Compare your
solution with this REFERENCE(«print_login_shells.bash», «script»).
- Run CMD(«read -p "> " -ei "kill -9 -1" t; echo "you entered:
$t"») and note how it provides nice readline-based editing. Check the
CMD(«bash») man page for other options to the CMD(«read») builtin,
like CMD(«-s») and CMD(«-t»).
- Run CMD(«ls ~/**/*.pdf»). Search the bash manual page for
CMD(«**») and CMD(«globstar») to understand the meaning of the
CMD(«**») pattern in pathname expansion. Next, run CMD(«shopt -s
globstar && ls ~/**/*.pdf») and marvel.
- Is there a way in bash to distinguish between undefined variables
and variables which have been set to the empty string? Hint: examine
the difference between CMD(«${x-42}») and CMD(«${x:-42}»).
- Setting the environment variable CMD(«PROMPT_COMMAND»)
to a function instructs bash to call this function prior to
issuing the prompt. Run CMD(«prompt_command() { PS1="$PWD > ";
}; PROMPT_COMMAND=prompt_command») to change your prompt. Modify
the function to replace the middle part of the path by '...' if
CMD(«$PWD») exceeds 10 characters.
- During parameter expansion, if the first character of a parameter
is an exclamation point (!), bash uses the value of the variable
formed from the rest of parameter as the name of the variable rather
than the value of the parameter itself. This is known as _indirect
expansion_. Run CMD(«a=42; x=a; echo ${!x}») to see the effect.
- Examine and run the REFERENCE(«minmax.bash», «minmax script»)
whose CMD(«minmax()») function is given the _name_ CMD(«X») of a
variable, and a sequence of positive integers. The function computes
the minimum and the maximum given value and sets the variables
CMD(«X_min») and CMD(«X_max») accordingly.
HOMEWORK(«
Get rid of the CMD(«eval») statement in the
REFERENCE(«minmax.bash», «minmax script») by using
variables declared with CMD(«-n»), which assigns the CMD(«nameref»)
attribute. Hint: search for (nameref) in the bash manual.
»)
HOMEWORK(«
Read the CMD(«bashbug») manual page and discuss
under which circumstances one should file a bug report.
Download the source code of the latest version of bash from
XREFERENCE(«ftp://ftp.gnu.org/pub/gnu/bash», «gnu ftp server»),
apply all patches found in the CMD(«bash-4.3-patches») subdirectory
and compile the package. Run the compiled executable and execute
CMD(«echo ${BASH_VERSINFO[@]}»).
»)
SECTION(«Job Control»)
- suspend/resume selected processes
- POSIX.1 (1988)
- aim: understand foreground/background jobs, Ctrl+Z, Ctrl+C,
CMD(«fg»), CMD(«bg»)
- job <=> process group <=> pipeline (+descendants) <=> PGID
- (interactive) session := collection of process groups
- setsid() syscall creates new session, PGID := PID of calling process
- session leader: process which called setsid(), SID: PID of session
leader
- terminal's current process group (TPGID)
- TPGID determines foreground PG = CMD(«{P: PGID(P) == TPGID}»)
EXERCISES()
- Examine all fields in the output of CMD(«ps j»).
- Assume a typical scenario with one background process and another
process running in the foreground. How many sessions are there? Which
of the three processes are session leaders? Determine all process
groups. Verify your result by running CMD(«sleep 100 & ps j»).
- What happens if a background process tries to read from
CMD(«stdin»)? Verify your answer by executing CMD(«cat &»).
- What happens if the session leader terminates while there are
still processes running in a background process group? To find out,
open a terminal, run CMD(«sleep 100&») and kill the session leader
(the shell) with CMD(«kill -9 $$»). Open another terminal and
execute CMD(«ps -aj») and examine the row that corresponds to the
CMD(«sleep») process.
- Look at how bash handles a pipeline by executing CMD(«ps xj |
cat»).
- Verify that in the output of CMD(«ps j») the TPGID and the PID
columns coincide while the two columns differ if the command is run
in the background (CMD(«ps j &»)). Determine the foreground process
group in both cases.
- Read the section on job control in the bash manual and make yourself
familiar with the various ways to refer to a job in bash (CMD(«%»),
CMD(«%n»), CMD(«%-»), CMD(«%+»)).
SUPPLEMENTS()
SUBSECTION(«stale_tmpfile.bash»)
#!/bin/bash
f=$(mktemp) || exit 1
echo "starting analysis, temporary file: $f"
sleep 100
echo "done, removing $f"
rm -f "$f"
SUBSECTION(«self-list.bash»)
#!/bin/bash
IFS='
'
a=($(cat $0))
for ((i = 0; i < ${#a[@]}; i++)); do
echo "$((i + 1)): ${a[$i]}"
done
SUBSECTION(«rot13.bash»)
#!/bin/bash
declare -A h=(
[a]=n [b]=o [c]=p [d]=q [e]=r [f]=s [g]=t [h]=u [i]=v [j]=w [k]=x
[l]=y [m]=z [n]=a [o]=b [p]=c [q]=d [r]=e [s]=f [t]=g [u]=h [v]=i
[w]=j [x]=k [y]=l [z]=m
)
while IFS= read -r line; do
for ((i =0; i < ${#line}; i++)); do
c="${line:$i:1}"
# Quote the expansion so that spaces and glob characters survive.
echo -n "${h[$c]:-$c}"
done
echo
done
SUBSECTION(«catch_the_bug.bash»)
#!/bin/bash
if (($# == 0)); then
# no argument given, choose a random number instead
x=$(($RANDOM / 3276 + 1)) # between 1 and 10
else
x=$1
fi
echo "1/$x is approximately $((100 / $x))%"
SUBSECTION(«HungarianCamelSquare.bash»)
#!/bin/bash
declare -i ThisVariableIsATemporaryCounter
for ((ThisVariableIsATemporaryCounter=0; ThisVariableIsATemporaryCounter < 10; ThisVariableIsATemporaryCounter++)); do
echo "$ThisVariableIsATemporaryCounter * $ThisVariableIsATemporaryCounter is $(($ThisVariableIsATenporaryCounter * $ThisVariableIsATemporaryCounter))"
done
SUBSECTION(«rm_tmp.bash»)
#!/bin/bash
echo "removing all temporary files in $1"
cd $1
echo removing *
# rm *
SUBSECTION(«count_config_files.bash»)
#!/bin/bash
for c in {a..z}; do
files=(/etc/$c*.conf)
echo "There are ${#files[@]} config files in /etc that start with $c: ${files[@]}"
done
SUBSECTION(«print_login_shells.bash»)
#!/bin/bash
while IFS=: read -ra a; do
echo "${a[0]} ${a[6]}"
done < /etc/passwd
SUBSECTION(«minmax.bash»)
minmax()
{
local var min max
var="$1"
shift
min=$1
max=$1
shift
while (($#)); do
(($1 < $min)) && min=$1
(($1 > $max)) && max=$1
shift
done
eval ${var}_min=$min
eval ${var}_max=$max
}
print_minmax()
{
local var="$1"
local min="${var}_min" max="${var}_max"
echo "min: ${!min}, max: ${!max}"
}
minmax a 3 4 2 9 4
print_minmax a