| Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. --Brian Kernighan | 
The Bash shell contains no built-in debugger, and only bare-bones debugging-specific commands and constructs. Syntax errors or outright typos in the script generate cryptic error messages that are often of no help in debugging a non-functional script.
Example 29-1. A buggy script
| #!/bin/bash
# ex74.sh

# This is a buggy script.
# Where, oh where is the error?

a=37

if [$a -gt 27 ]
then
  echo $a
fi

exit 0 |
Output from script:
| ./ex74.sh: [37: command not found | 
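The culprit is the missing space after the left bracket. [ is a command in its own right (a synonym for test), so Bash reads [$a as the single word [37 and goes looking for a command by that name. A corrected version of the test might look like the sketch below. The file name is made up, and quoting $a is an added precaution, not part of the original script.

```shell
#!/bin/bash
# ex74-fixed.sh (hypothetical name): the same test, with the space restored.

a=37

if [ "$a" -gt 27 ]    # Note the whitespace after '[' and before ']'.
then
  echo "$a"
fi
```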
Example 29-2. Missing keyword
|  1 #!/bin/bash
 2 # missing-keyword.sh: What error message will this generate?
 3
 4 for a in 1 2 3
 5 do
 6   echo "$a"
 7 # done     # Required keyword 'done' commented out in line 7.
 8
 9 exit 0 |
Output from script:
| missing-keyword.sh: line 10: syntax error: unexpected end of file | 
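Error messages may disregard comment lines in a script when reporting the line number of a syntax error. Note also that the reported line is the one at which Bash finally becomes aware of the error (here, the end of the file), not necessarily the line on which the error actually occurs.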
What if the script executes, but does not work as expected? This is the all too familiar logic error.
Example 29-3. test24: another buggy script
| #!/bin/bash

# This script is supposed to delete all filenames in current directory
#+ containing embedded spaces.
# It doesn't work.
# Why not?


badname=`ls | grep ' '`

# Try this:
# echo "$badname"

rm "$badname"

exit 0 |
Try to find out what's wrong with Example 29-3 by uncommenting the echo "$badname" line. Echo statements are useful for seeing whether what you expect is actually what you get.
In this particular case, rm "$badname" will not give the desired results because $badname should not be quoted. Placing it in quotes ensures that rm has only one argument (it will match only one filename). A partial fix is to remove the quotes from $badname and to reset $IFS to contain only a newline, IFS=$'\n'. However, there are simpler ways of going about it.
| # Correct methods of deleting filenames containing spaces.
rm *\ *
rm *" "*
rm *' '*
# Thank you. S.C. |
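These globs work because the shell expands a pattern into one word per matching filename, so embedded spaces never get a chance to split a name. As a rough sketch of the same idea with a message for each deletion, the loop below works in a scratch directory. The directory and the file names in it are invented for illustration.

```shell
#!/bin/bash
demo_dir=$(mktemp -d)               # Scratch directory (hypothetical).
touch "$demo_dir/plain" "$demo_dir/has space" "$demo_dir/two  words"

for f in "$demo_dir"/*' '*          # One word per matching filename.
do
  [ -e "$f" ] || continue           # Pattern matched nothing: skip.
  echo "Removing: $f"
  rm -- "$f"                        # Quoted, so the name stays in one piece.
done

remaining=$(ls "$demo_dir")         # What is left after the loop.
echo "Left behind: $remaining"
rm -r "$demo_dir"                   # Clean up the scratch directory.
```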
Summarizing the symptoms of a buggy script,
It bombs with a "syntax error" message, or
It runs, but does not work as expected (logic error).
It runs, works as expected, but has nasty side effects (logic bomb).
Tools for debugging non-working scripts include
Inserting echo statements at critical points in the script to trace the variables, and otherwise give a snapshot of what is going on.
| Even better is an echo that echoes only when debug is on. |
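One way to arrange this, as a minimal sketch: wrap echo in a small function that prints only when a flag variable is set. The names debecho and DEBUG are assumptions chosen for illustration. Output goes to stderr so that it does not mix with the script's normal output.

```shell
#!/bin/bash
# debecho (hypothetical helper): echo its arguments only if DEBUG is non-empty.
debecho ()
{
  if [ -n "$DEBUG" ]
  then
    echo "DEBUG: $*" >&2            # To stderr.
  fi
}

count=3

DEBUG=on
debecho "count = $count"            # Message appears.

DEBUG=
debecho "count = $count"            # Nothing appears.
```

Setting DEBUG from the command line (DEBUG=on ./scriptname) would then switch the messages on for a single run without editing the script, provided the script does not assign DEBUG itself.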
Using the tee filter to check processes or data flows at critical points.
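As a minimal sketch, tee can be spliced into the middle of a pipeline to save a copy of the data at that stage without disturbing the flow. The pipeline and the scratch file below are invented for illustration.

```shell
#!/bin/bash
stage_file=$(mktemp)      # Scratch file for the snapshot (hypothetical).

printf '%s\n' pear apple pear fig |
  sort |
  tee "$stage_file" |     # A copy of the sorted data goes to $stage_file ...
  uniq -c                 # ... while the pipeline carries on as before.

echo "--- Snapshot taken after 'sort' ---"
cat "$stage_file"
snapshot_lines=$(wc -l < "$stage_file")
rm -f "$stage_file"
```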
Setting option flags -n -v -x
sh -n scriptname checks for syntax errors without actually running the script. This is the equivalent of inserting set -n or set -o noexec into the script. Note that certain types of syntax errors can slip past this check.
sh -v scriptname echoes each command before executing it. This is the equivalent of inserting set -v or set -o verbose in the script.
The -n and -v flags work well together. sh -nv scriptname gives a verbose syntax check.
sh -x scriptname echoes the result of each command, but in an abbreviated manner. This is the equivalent of inserting set -x or set -o xtrace in the script.
Inserting set -u or set -o nounset in the script runs it, but gives an unbound variable error message, and aborts the script, at the first attempt to use an unset variable.
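These flags can also be switched on and off from within a script, which keeps the trace confined to a suspect region. The fragment below is only a sketch with made-up variable names. set +x turns tracing back off, and the set -u test runs in a subshell so that the resulting error does not end the enclosing script.

```shell
#!/bin/bash
a=5
b=4

set -x                    # Start tracing: commands are echoed to stderr.
sum=$((a + b))
set +x                    # Stop tracing.

echo "sum = $sum"

#  With 'set -u', expanding an unset variable is an error.
#  The subshell exits with a non-zero status when that happens.
if ( set -u; : "$no_such_variable" ) 2>/dev/null
then
  echo "No error reported."
else
  echo "Unset variable detected."
fi
```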
Using an "assert" function to test a variable or condition at critical points in a script. (This is an idea borrowed from C.)
Example 29-4. Testing a condition with an assert
| #!/bin/bash
# assert.sh

#######################################################################
assert ()                 #  If condition false,
{                         #+ exit from script
                          #+ with appropriate error message.
  E_PARAM_ERR=98
  E_ASSERT_FAILED=99


  if [ -z "$2" ]          #  Not enough parameters passed
  then                    #+ to assert() function.
    return $E_PARAM_ERR   #  No damage done.
  fi

  lineno=$2

  if [ ! $1 ]
  then
    echo "Assertion failed:  \"$1\""
    echo "File \"$0\", line $lineno"    # Give name of file and line number.
    exit $E_ASSERT_FAILED
  # else
  #   return
  #   and continue executing the script.
  fi
} # Insert a similar assert() function into a script you need to debug.
#######################################################################


a=5
b=4
condition="$a -lt $b"     #  Error message and exit from script.
                          #  Try setting "condition" to something else
                          #+ and see what happens.

assert "$condition" $LINENO
# The remainder of the script executes only if the "assert" does not fail.


# Some commands.
# Some more commands . . .
echo "This statement echoes only if the \"assert\" does not fail."
# . . .
# More commands . . .

exit $? |
The exit command in a script triggers a signal 0, terminating the process, that is, the script itself. [1] It is often useful to trap the exit, forcing a "printout" of variables, for example. The trap must be the first command in the script.
The trap command specifies an action on receipt of a signal; it is also useful for debugging.
| A signal is a message sent to a process, either by the kernel or another process, telling it to take some specified action (usually to terminate). For example, hitting a Control-C sends a user interrupt, an INT signal, to a running program. | 
A simple instance:
| trap '' 2
# Ignore interrupt 2 (Control-C), with no action specified.

trap 'echo "Control-C disabled."' 2
# Message when Control-C pressed. |
Example 29-5. Trapping at exit
| #!/bin/bash
# Hunting variables with a trap.

trap 'echo Variable Listing --- a = $a b = $b' EXIT
#  EXIT is the name of the signal generated upon exit from a script.
#
#  The command specified by the "trap" doesn't execute until
#+ the appropriate signal is sent.

echo "This prints before the \"trap\" --"
echo "even though the script sees the \"trap\" first."
echo

a=39

b=36

exit 0
#  Note that commenting out the 'exit' command makes no difference,
#+ since the script exits in any case after running out of commands. |
Example 29-6. Cleaning up after Control-C
| #!/bin/bash
# logon.sh: A quick 'n dirty script to check whether you are on-line yet.

umask 177  # Make sure temp files are not world readable.


TRUE=1
LOGFILE=/var/log/messages
#  Note that $LOGFILE must be readable
#+ (as root, chmod 644 /var/log/messages).
TEMPFILE=temp.$$
#  Create a "unique" temp file name, using process id of the script.
#     Using 'mktemp' is an alternative.
#     For example:
#     TEMPFILE=`mktemp temp.XXXXXX`
KEYWORD=address
#  At logon, the line "remote IP address xxx.xxx.xxx.xxx"
#                      appended to /var/log/messages.
ONLINE=22
USER_INTERRUPT=13
CHECK_LINES=100
#  How many lines in log file to check.

trap 'rm -f $TEMPFILE; exit $USER_INTERRUPT' TERM INT
#  Cleans up the temp file if script interrupted by control-c.

echo

while [ $TRUE ]  #Endless loop.
do
  tail -n $CHECK_LINES $LOGFILE> $TEMPFILE
  #  Saves last 100 lines of system log file as temp file.
  #  Necessary, since newer kernels generate many log messages at log on.
  search=`grep $KEYWORD $TEMPFILE`
  #  Checks for presence of the "IP address" phrase,
  #+ indicating a successful logon.

  if [ ! -z "$search" ] #  Quotes necessary because of possible spaces.
  then
     echo "On-line"
     rm -f $TEMPFILE    #  Clean up temp file.
     exit $ONLINE
  else
     echo -n "."        #  The -n option to echo suppresses newline,
                        #+ so you get continuous rows of dots.
  fi

  sleep 1
done


#  Note: if you change the KEYWORD variable to "Exit",
#+ this script can be used while on-line
#+ to check for an unexpected logoff.

# Exercise: Change the script, per the above note,
#           and prettify it.

exit 0


# Nick Drage suggests an alternate method:

while true
  do ifconfig ppp0 | grep UP 1> /dev/null && echo "connected" && exit 0
  echo -n "."   # Prints dots (.....) until connected.
  sleep 2
done

# Problem: Hitting Control-C to terminate this process may be insufficient.
#+         (Dots may keep on echoing.)
# Exercise: Fix this.



# Stephane Chazelas has yet another alternative:

CHECK_INTERVAL=1

while ! tail -n 1 "$LOGFILE" | grep -q "$KEYWORD"
do echo -n .
   sleep $CHECK_INTERVAL
done
echo "On-line"

# Exercise: Discuss the relative strengths and weaknesses
#           of each of these various approaches. |
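The trap in Example 29-6 fires only on TERM and INT, so the normal exit path has to remove the temp file by hand. One possible variation, sketched here and not taken from the example, is to hang the cleanup on EXIT so that it runs however the script ends. The work is wrapped in a subshell purely so that the fragment can be pasted into a larger script without ending it; the variable name is made up.

```shell
#!/bin/bash
(
  tempfile=$(mktemp)                # Scratch file (hypothetical name).
  trap 'rm -f "$tempfile"; echo "Temp file removed."' EXIT

  echo "some data" > "$tempfile"
  echo "Working with $tempfile ..."
)   # Leaving the subshell triggers its EXIT trap.
```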
Of course, the trap command has other uses aside from debugging, such as disabling certain keystrokes within a script (see Example A-45).
Example 29-8. Running multiple processes (on an SMP box)
| #!/bin/bash
# parent.sh
# Running multiple processes on an SMP box.
# Author: Tedman Eng

#  This is the first of two scripts,
#+ both of which must be present in the current working directory.




LIMIT=$1         # Total number of process to start
NUMPROC=4        # Number of concurrent threads (forks?)
PROCID=1         # Starting Process ID
echo "My PID is $$"

function start_thread() {
        if [ $PROCID -le $LIMIT ] ; then
                ./child.sh $PROCID&
                let "PROCID++"
        else
           echo "Limit reached."
           wait
           exit
        fi
}

while [ "$NUMPROC" -gt 0 ]; do
        start_thread;
        let "NUMPROC--"
done


while true
do

trap "start_thread" SIGRTMIN

done

exit 0



# ======== Second script follows ========


#!/bin/bash
# child.sh
# Running multiple processes on an SMP box.
# This script is called by parent.sh.
# Author: Tedman Eng

temp=$RANDOM
index=$1
shift
let "temp %= 5"
let "temp += 4"
echo "Starting $index  Time:$temp" "$@"
sleep ${temp}
echo "Ending $index"
kill -s SIGRTMIN $PPID

exit 0


# ======================= SCRIPT AUTHOR'S NOTES ======================= #
#  It's not completely bug free.
#  I ran it with limit = 500 and after the first few hundred iterations,
#+ one of the concurrent threads disappeared!
#  Not sure if this is collisions from trap signals or something else.
#  Once the trap is received, there's a brief moment while executing the
#+ trap handler but before the next trap is set.  During this time, it may
#+ be possible to miss a trap signal, thus miss spawning a child process.

#  No doubt someone may spot the bug and will be writing
#+ . . . in the future.



# ===================================================================== #



# ----------------------------------------------------------------------#



#################################################################
# The following is the original script written by Vernia Damiano.
# Unfortunately, it doesn't work properly.
#################################################################

#!/bin/bash

#  Must call script with at least one integer parameter
#+ (number of concurrent processes).
#  All other parameters are passed through to the processes started.


INDICE=8        # Total number of process to start
TEMPO=5         # Maximum sleep time per process
E_BADARGS=65    # No arg(s) passed to script.

if [ $# -eq 0 ] # Check for at least one argument passed to script.
then
  echo "Usage: `basename $0` number_of_processes [passed params]"
  exit $E_BADARGS
fi

NUMPROC=$1              # Number of concurrent process
shift
PARAMETRI=( "$@" )      # Parameters of each process

function avvia() {
         local temp
         local index
         temp=$RANDOM
         index=$1
         shift
         let "temp %= $TEMPO"
         let "temp += 1"
         echo "Starting $index Time:$temp" "$@"
         sleep ${temp}
         echo "Ending $index"
         kill -s SIGRTMIN $$
}

function parti() {
         if [ $INDICE -gt 0 ] ; then
              avvia $INDICE "${PARAMETRI[@]}" &
                let "INDICE--"
         else
                trap : SIGRTMIN
         fi
}

trap parti SIGRTMIN

while [ "$NUMPROC" -gt 0 ]; do
         parti;
         let "NUMPROC--"
done

wait
trap - SIGRTMIN

exit $?

: <<SCRIPT_AUTHOR_COMMENTS
I had the need to run a program, with specified options, on a number of
different files, using a SMP machine. So I thought [I'd] keep running
a specified number of processes and start a new one each time . . . one
of these terminates.

The "wait" instruction does not help, since it waits for a given process
or *all* process started in background. So I wrote [this] bash script
that can do the job, using the "trap" instruction.
  --Vernia Damiano
SCRIPT_AUTHOR_COMMENTS |
| trap '' SIGNAL (two adjacent apostrophes) disables SIGNAL for the remainder of the script. trap SIGNAL restores the functioning of SIGNAL once more. This is useful to protect a critical portion of a script from an undesirable interrupt. |
| trap '' 2  # Signal 2 is Control-C, now disabled.
command
command
command
trap 2     # Reenables Control-C |
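A slightly fuller sketch of the same idea follows. It uses the signal name INT in place of the number 2, and the form trap - INT to restore the default handling. The kill -INT $$ line merely simulates a Control-C so that the effect can be seen without touching the keyboard.

```shell
#!/bin/bash
trap '' INT               # Ignore Control-C from here on.

echo "Critical section: start"
kill -INT $$              # Simulated Control-C; it is ignored.
echo "Critical section: still running"

trap - INT                # Restore the default handling of Control-C.
```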
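| Version 3 of Bash adds the following internal variables for use by the debugger: $BASH_ARGC, $BASH_ARGV, $BASH_COMMAND, $BASH_EXECUTION_STRING, $BASH_LINENO, $BASH_SOURCE, and $BASH_SUBSHELL. |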
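These variables can be handy in ordinary scripts, too. The fragment below is a rough sketch of how three of them ($BASH_LINENO, $BASH_SOURCE, and $BASH_COMMAND) plus $BASH_SUBSHELL might be put to work. It requires Bash itself, not a plain sh, and the function name report_caller is made up for illustration.

```shell
#!/bin/bash
# Requires Bash version 3 or later.

report_caller ()          # Hypothetical helper.
{
  #  ${BASH_LINENO[0]} holds the line number from which this function
  #+ was called, ${BASH_SOURCE[1]} the file containing that call.
  echo "Called from ${BASH_SOURCE[1]:-stdin}, line ${BASH_LINENO[0]}." >&2
}

report_caller             # Reports the number of this line.

echo "Subshell level here: $BASH_SUBSHELL"
( echo "Subshell level inside parentheses: $BASH_SUBSHELL" )

#  The DEBUG trap runs before each simple command;
#+ $BASH_COMMAND holds the command about to be executed.
trap 'echo "About to run: $BASH_COMMAND" >&2' DEBUG
sum=$((2 + 3))
trap - DEBUG              # Switch the DEBUG trap off again.
```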
| [1] | By convention, signal 0 is assigned to exit. | 