Last Updated: 2023-03-13 Mon 12:16

CSCI 2021 HW08: Assembly Debugging and Stack Security

CODE DISTRIBUTION: hw08-code.zip

  • Download the code distribution
  • See further setup instructions below

CHANGELOG: Empty

1 Rationale

Code that deals with linked data structures such as lists, trees, and hash tables can be difficult to understand even in high-level programming languages and viewing such algorithms as assembly code only increases this difficulty. This HW covers a debugging scenario that shows some data structure processing as assembly code in a debugging exercise.

The HW also introduces a class of problems that can arise when working with data in the function call stack. Since return addresses for functions are also stored in the stack, modifying stack data incorrectly can "clobber" return addresses. Modern compilers like GCC insert some safety measures to detect this but they usually result in programs crashing. This HW studies such a case.

Associated Reading / Preparation

  • Bryant and O'Hallaron: Ch 3.1-3.9 on assembly code instructions in x86-64 are essential for understanding how to follow assembly code in the debugger.
  • Bryant and O'Hallaron: Ch 3.10 on data layout in the stack can provide insight to the behavior in the smash programs of problem 2.
  • CSCI 2021 Quick Guide to gdb: Run Assembly Without Source Code Available: Tricks to deal with binary only files where no source code is available.

Grading Policy

Credit for this HW is earned by taking the associated HW Quiz which is linked under Gradescope. The quiz will ask similar questions as those that are present in the QUESTIONS.txt file and those that complete all answers in QUESTIONS.txt should have no trouble with the quiz.

Homework and Quizzes are open resource/open collaboration. You must submit your own work but you may freely discuss HW topics with other members of the class.

See the full policies in the course syllabus.

2 Codepack

The codepack for the HW contains the following files:

File Description
QUESTIONS.txt Questions to answer
quote_main.c Problem 1 main() function to analyze
quote_data.o Problem 1 binary file used in debugging
smash1.c Problem 2 erroneous program to analyze
smash2.c Problem 2 erroneous program to analyze

3 What to Understand

Ensure that you understand

  • How to gather information about a compiled a.out or .o file using binary utilities like nm and objdump -d
  • How step through compiled code in the debugger to analyze what assembly code is doing even when no source code is available
  • The basics of stack layout and how return addresses can get clobbered if when writing of the end of stack arrays

4 Questions

Analyze the files in the provided codepack and answer the questions given in QUESTIONS.txt.

                           _________________

                            HW 08 QUESTIONS
                           _________________


Write your answers to the questions below directly in this text file to
prepare for the associated quiz. Credit for the HW is earned by
completing the associated online quiz on Gradescope.


PROBLEM 1: Quote Binary Debugging
=================================

  The two files `quote_main.c' and `quote_data.o' can be compiled
  together to form an executable as in the following.
  ,----
  | > gcc quote_main.c quote_data.o
  | > ./a.out
  | Complete this sentence by C++ creator Bjarne Stroustrup:
  | C makes it easy to shoot yourself in the foot; ...
  | 
  | enter a number from 0 to 15: 2
  | 
  | 2: This is why most programmers are such poor dancers.
  | 
  | Have a nice tall glass of ... NOPE.
  `----
  As in a previous exercise, the intention is to use the debugger to
  detect the correct response. In this case however, the correct
  completion is present in `quote_main.c'. However, one must enter a
  number which selects from several responses in an attempt to match the
  correct completion. It is possible to "brute force" the solution by
  trying all solutions. However, the intent of the activity is to
  explore the running code with the debugger to answer the questions
  below. This will give insight into some stages of the binary bomb
  assignment.


A
~

  Use some binary utility programs to print out the symbols that are
  present in `quote_data.o'.  Review the previous lab if you have
  forgotten programs like `nm' and `objdump' can print symbols in a
  binary object file.  Speculate as to which data might pertain to where
  potential options are stored.


B
~

  The entry point into the assembly code in `quote_data.o' is the
  function `get_it'.  Use either the debugger or a disassembly of the
  object to trace whether this functions performs the entire computation
  or if other functions are also used. Use this along with other
  observations of the content of the `quote_data.o' file to infer what
  kind of data structure the choices are stored in.


C
~

  Use the debugger to step through the functions in `quote_data.o'.
  Keep in mind that the parameters to the function follow the standard
  convention: 1st param in register `%rdi', second in `%rsi', and so
  forth.  You should be able to identify a loop in a critical function
  in which the choices are present.  Use `print' and `x' commands in gdb
  to examine data pointed to be registers to help identify where the
  correct response is located.

  Keep in mind that often the `testX' instruction is used to determine
  truthy/falsey qualities about a register. This takes several forms
  that were discussed in lecture
  - `testl %edx, %edx' may be used to check if `%edx' is 0 or negative
  - `testq %rax, %rax' may be used to check if `%rax' is a `NULL'
    pointer
  You are likely to see some of the above uses for the test instruction
  while examining the assembly code in `quote_data.o'

  Recall that you can examine memory addresses pointed to registers with
  gdb commands like the following.
  ,----
  | (gdb) x/d $rax   # print memory pointed to by rax as a decimal integer
  | (gdb) x/x $rax   # print memory pointed to by rax as a hex number
  | (gdb) x/s $rax   # print memory pointed to by rax as a string
  `----

  Include what debugger techniques you used in your answer along with
  the index of the correct completion.


PROBLEM 2: Stack Smashing
=========================

A
~

  Examine the obviously flawed code in `smash1.c'.
  ,----
  |  1  // smash1.c: demonstrates problems with the function call stack
  |  2  #include <stdio.h>
  |  3  
  |  4  void fill_seq(int *a);
  |  5  void demo();
  |  6  
  |  7  int main(){
  |  8    printf("About to do the demo\n");
  |  9    demo();
  | 10    printf("Demo Complete\n");
  | 11    return 0;
  | 12  }
  | 13  
  | 14  void demo(){
  | 15    int arr[4];
  | 16  
  | 17    fill_seq(arr);
  | 18  
  | 19    for(int i=0; i<4; i++){
  | 20      printf("[%d]: %d\n",i,arr[i]);
  | 21    }
  | 22  }
  | 23  
  | 24  #define END 8
  | 25  void fill_seq(int *a){
  | 26    for(int i=0; i<END; i++){
  | 27      a[i] = (i+1)*2;
  | 28    }
  | 29  }
  `----
  Compiling and running this code with gcc on a Linux system (and
  perhaps in other compiler/OS configurations) will yield the following
  kind of behavior.
  ,----
  | > gcc smash1.c
  | > ./a.out
  | About to do the demo
  | [0]: 2
  | [1]: 4
  | [2]: 6
  | [3]: 8
  | --- stack smashing detected ---: <unknown> terminated
  | Aborted (core dumped)
  `----
  Describe the kind of error that is occurring in this code and why it
  is referred to as "stack smashing." Identify which part of the code is
  causing the problem.


B
~

  Consider the alternative file `smash2.c'.  Note the similarities to
  the previous program but also slight changes in it.  Compile and run
  this program and report what errors you see. Describe if the errors
  are similar or not and relate these to changes in the program.


C
~

  Recompile both smash programs as shown and use Valgrind to run them
  again. Show your output and comment on how much information Valgrind
  provides about each program run and the out-of-bounds array accesses
  that are performed.  Based on your observations, mention which memory
  area Valgrind monitors more effectively.

  ,----
  | > gcc -g smash1.c
  | > valgrind ./a.out
  | ...
  | 
  | > gcc -g smash2.c
  | > valgrind ./a.out
  | ...
  `----


stack smashing detected ***: terminated
---------------------------------------

  `=8832=' `=8832=' Process terminating with default action of signal 6
  (SIGABRT): dumping core `=8832=' at 0x48B1615: raise (in
  /usr/lib/libc-2.32.so) `=8832=' by 0x489A861: abort (in
  /usr/lib/libc-2.32.so) `=8832=' by 0x48F35E7: __libc_message (in
  /usr/lib/libc-2.32.so) `=8832=' by 0x4983A39: __fortify_fail (in
  /usr/lib/libc-2.32.so) `=8832=' by 0x4983A03: __stack_chk_fail (in
  /usr/lib/libc-2.32.so) `=8832=' by 0x1091EF: demo (smash1.c:22)
  `=8832=' by 0x109172: main (smash1.c:9) `=8832=' `=8832=' HEAP
  SUMMARY: `=8832=' in use at exit: 0 bytes in 0 blocks `=8832=' total
  heap usage: 1 allocs, 1 frees, 1,024 bytes allocated `=8832=' `=8832='
  All heap blocks were freed -- no leaks are possible `=8832=' `=8832='
  For lists of detected and suppressed errors, rerun with: -s `=8832='
  ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) Aborted
  (core dumped)


  > gcc -g smash2.c valgrind ./a.out `=8848=' Memcheck, a memory error
  > detector `=8848=' Copyright (C) 2002-2017, and GNU GPL'd, by Julian
  > Seward et al.  `=8848=' Using Valgrind-3.16.1 and LibVEX; rerun with
  > -h for copyright info `=8848=' Command: a.out `=8848=' About to do
  > the demo `=8848=' Invalid write of size 4 `=8848=' at 0x109233:
  > fill_seq (smash2.c:28) `=8848=' by 0x1091B7: demo (smash2.c:17)
  > `=8848=' by 0x109182: main (smash2.c:9) `=8848=' Address 0x4a40490
  > is 0 bytes after a block of size 16 alloc'd `=8848=' at 0x483A77F:
  > malloc (vg_replace_malloc.c:307) `=8848=' by 0x1091A7: demo
  > (smash2.c:15) `=8848=' by 0x109182: main (smash2.c:9) `=8848=' [0]:
  > 2 [1]: 4 [2]: 6 [3]: 8 Demo Complete `=8848=' `=8848=' HEAP SUMMARY:
  > `=8848=' in use at exit: 0 bytes in 0 blocks `=8848=' total heap
  > usage: 2 allocs, 2 frees, 1,040 bytes allocated `=8848=' `=8848='
  > All heap blocks were freed -- no leaks are possible `=8848='
  > `=8848=' For lists of detected and suppressed errors, rerun with: -s
  > `=8848=' ERROR SUMMARY: 4 errors from 1 contexts (suppressed: 0 from
  > 0) #+END_SRC


  ,----
  | > gcc -g smash1.c
  | > valgrind ./a.out
  | ==8522== Memcheck, a memory error detector
  | ==8522== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
  | ==8522== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
  | ==8522== Command: ./a.out
  | ==8522== 
  | [0]: 2
  | [1]: 4
  | [2]: 6
  | [3]: 8
  |   *** stack smashing detected ***: terminated
  | ==8522== 
  | ==8522== Process terminating with default action of signal 6 (SIGABRT): dumping core
  | ==8522==    at 0x48B1615: raise (in /usr/lib/libc-2.32.so)
  | ==8522==    by 0x489A861: abort (in /usr/lib/libc-2.32.so)
  | ==8522==    by 0x48F35E7: __libc_message (in /usr/lib/libc-2.32.so)
  | ==8522==    by 0x4983A39: __fortify_fail (in /usr/lib/libc-2.32.so)
  | ==8522==    by 0x4983A03: __stack_chk_fail (in /usr/lib/libc-2.32.so)
  | ==8522==    by 0x1091B6: main (smash1.c:16)
  | ==8522== 
  | ==8522== HEAP SUMMARY:
  | ==8522==     in use at exit: 0 bytes in 0 blocks
  | ==8522==   total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
  | ==8522== 
  | ==8522== All heap blocks were freed -- no leaks are possible
  | ==8522== 
  | ==8522== For lists of detected and suppressed errors, rerun with: -s
  | ==8522== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
  | Aborted (core dumped)
  | 
  | 
  | > gcc -g smash2.c
  | > valgrind ./a.out
  | ==8540== Memcheck, a memory error detector
  | ==8540== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
  | ==8540== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
  | ==8540== Command: ./a.out
  | ==8540== 
  | ==8540== Invalid write of size 4
  | ==8540==    at 0x1091FA: fill_seq (smash2.c:21)
  | ==8540==    by 0x10917A: main (smash2.c:9)
  | ==8540==  Address 0x4a40050 is 0 bytes after a block of size 16 alloc'd
  | ==8540==    at 0x483A77F: malloc (vg_replace_malloc.c:307)
  | ==8540==    by 0x10916A: main (smash2.c:7)
  | ==8540== 
  | [0]: 2
  | [1]: 4
  | [2]: 6
  | [3]: 8
  | ==8540== 
  | ==8540== HEAP SUMMARY:
  | ==8540==     in use at exit: 0 bytes in 0 blocks
  | ==8540==   total heap usage: 2 allocs, 2 frees, 1,040 bytes allocated
  | ==8540== 
  | ==8540== All heap blocks were freed -- no leaks are possible
  | ==8540== 
  | ==8540== For lists of detected and suppressed errors, rerun with: -s
  | ==8540== ERROR SUMMARY: 4 errors from 1 contexts (suppressed: 0 from 0)
  | 
  `----

Author: Chris Kauffman (kauffman@umn.edu)
Date: 2023-03-13 Mon 12:16