TITLE(« All software sucks, be it open-source of proprietary. The only question is what can be done with particular instance of suckage, and that's where having the source matters. -- Al Viro (2004) », __file__)
SECTION(«Introduction»)
It's safe to bet that every non-trivial program contains bugs. Bugs in the operating system kernel are often fatal in that they lead to a system crash which requires a reboot. However, thanks to the concept of virtual memory, bugs in applications usually affect neither the operating system nor independent processes which happen to run at the same time. This makes user space programs easier to debug than kernel code because the debugging tools will usually run as a separate process.
We then look at valgrind and gdb, two popular tools which help to locate bugs in application software. valgrind is easy to use but is also limited because it does not alter the target process. On the other hand, gdb is much more powerful but also infamous for being hard to learn. The exercises aim to get the reader started with both tools.
A couple of exercises on gcc, the GNU C compiler, ask the reader to incorporate debugging information into an executable and to see the effect of various diagnostic messages which show up when valid but dubious code is encountered. Warning messages fall into two categories. First, there are warnings which are based on static code analysis. These so-called compile-time warnings are printed by the compiler when the executable is being created from its source code. The second category is based on code instrumentation, which instructs the compiler to add sanity checks to the executable, along with additional code that prints a warning when one of these checks fails. In contrast to the compile-time warnings, the messages of the second category show up at run-time, and are generated by the compiled program rather than by the compiler.
argv[1] will be NULL if the program is invoked with no arguments. What is going to happen in this case? Compile the program (cc deref.c -o deref) and run it (./deref) to confirm.

Run the deref program under strace (strace deref) and discuss the meaning of si_addr at the end of the output.

Suppose the deref program is executed from the wrapper script deref.sh. Explain why strace deref.sh does not show very useful information. Run strace -f deref.sh and compare the output to the output of the first strace command, where you ran the binary without the wrapper script. Discuss the implications of your findings with respect to debugging.

Verify that the strerror.c
program works by compiling and running it, passing 1 as the only argument. Discuss to what extent it is possible to prove the correctness of a program by running it.

The strerror.c program has more flaws than lines. Point out as many as you can.

Although the strerror.c code is full of bugs, the code compiles cleanly. Discuss the reasons and the implications of this fact.

Run pinfo gcc and read Invoking GCC->Warning Options to get an idea about the possible options to activate diagnostic messages.

Compile strerror.c with -Wall and explain all warnings shown.

Run valgrind strerror 1 and explain the message about the "conditional jump".

Run valgrind strerror with no arguments and explain the part about the "invalid read". Why is it clear that this part refers to a NULL pointer dereference?

What do you expect to happen if you run strerror 2 instead of strerror 1? Run valgrind strerror 1 and valgrind strerror 2 to confirm your answer.

The print_arg1.c
program crashes due to a NULL pointer dereference when it is called with no arguments. Compile the program and run it to confirm.

Run gdb print_arg1. At the (gdb) prompt, execute the following commands and explain the output: run, then bt.

Run ls -l print_arg1; size print_arg1 and discuss the meaning of the text, data and bss columns in the output of the size command. Compile the program again, this time with -g to add debugging information to the executable. Run ls -l print_arg1; size print_arg1 again and discuss the difference, in particular the impact on performance due to the presence of the debugging information.

Rerun the above gdb commands and note how the debugging information that has been stored in the executable makes the bt output much more useful. Discuss how debugging could be activated for third-party software for which the source code is available.

Compile the program with -O0 and with -O3 to optimize at different levels (where -O0 means "don't optimize at all"). Rerun the above gdb commands on each executable and note the difference regarding print_it().

Compile ubsan.c
and run the program as follows: ./ubsan 123456 123456. Explain why the result is not the square of 123456. Recompile it with -fsanitize=undefined. Then run the program again
with the same arguments.

SUBSECTION(«deref.c»)
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
	printf("arg has %zu chars\n", strlen(argv[1]));
}

SUBSECTION(«deref.sh»)
#!/bin/sh
./deref

SUBSECTION(«strerror.c»)
#include "stdio.h"
#include "stdlib.h"
#include "string.h"
#include "assert.h"

/* print "system error: ", and the error string of a system call number */
int main(int argc, char **argv)
{
	unsigned errno, i;
	char *result = malloc(25); /* 5 * 5 */

	/* fail early on errors or if no option is given */
	if (errno && argc == 0)
		exit(0);
	errno = atoi(argv[1]);
	sprintf(result, strerror(errno));
	printf("system error %d: %s\n", errno, result, argc);
}

SUBSECTION(«print_arg1.c»)
#include <stdio.h>
#include <stdlib.h>

static int print_it(char *arg)
{
	return printf("arg is %d\n", atoi(arg));
}

int main(int argc, char **argv)
{
	return print_it(argv[1]);
}

SUBSECTION(«ubsan.c»)
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
	int factor1 = atoi(argv[1]), factor2 = atoi(argv[2]),
		product = factor1 * factor2;

	return printf("%d * %d = %d\n", factor1, factor2, product);
}