CSCI 224 / ECE 317 -- Assembler and Debugger Tutorial

Introduction

This tutorial will introduce you to the GNU tools in the Linux programming environment, which we will be using while covering machine-level programming in CSCI 224 / ECE 317. It introduces the C compiler (gcc), the assembler (as), and the debugger (gdb). You will learn how to compile, assemble, execute, and debug C and assembly language programs.

Assembling and Executing

Go to and start up a Linux machine (an x86-based computer running the Linux operating system).

At SLU, the Linux lab is in Ritter Hall, room 121. You will be given an account and password on this machine so that you can use it for the assembly programming assignments. You can also log in remotely to turing.slu.edu -- see Remote Access for details.

If you have access to another Linux machine, you may elect to use that for the class instead. Be certain though, that it is an x86-based Linux machine (Intel or AMD processor), and that it can produce 32-bit x86 code (vs. 64-bit x86 code).

Log in and open a terminal window.

Log in using your account. (Note: Please talk with the instructor if you don't have an account or password for the Linux lab.)

After you are logged in, click on the program menu (usually in the bottom-left corner or top-left corner of the display). Select the Applications tab, and then select the System -> Terminal application to open a new terminal window.

The terminal window will default to the home directory for your account. If you haven't already, use the mkdir csci224 command to create a sub-directory for this course. Then switch to this directory using the cd csci224 command.

Download the files for the tutorial.

Download the files needed for the tutorial into the directory you just created by left-clicking with the mouse on the following link (if left-clicking doesn't work, right-click and select "Save Link Target As..."): csci224_tutorial.zip

Unzip the files using the Linux command unzip csci224_tutorial.zip. This will create a directory named tutorial in your directory. This directory contains the C and assembly files needed for the tutorial.

Compile/Assemble the tutorial programs.

After unzipping the files, use the cd tutorial command to switch to the tutorial/ directory. Then compile and assemble the programs using either the make or the make all command (the two are equivalent here, since typing make by itself defaults to make all).

In the process of "making" the programs, you will see a listing of the commands used to compile and assemble the C and assembly programs using gcc and as. It is recommended that you try executing each one of these commands on your own and use ls -lt to check what files are generated as a result of each command. Be sure to use the make clean command to delete all the executables and .o files before executing the "make" commands on your own.

You may also want to use the less Makefile command to see how the Makefile performs the compilation/assembly process.

Execute the C and assembly executables.

After "making" the programs, try running the C and assembly programs using the tut_cprog and tut_asmprog executables, respectively. Remember, to run an executable that resides in the current directory, precede the executable name with ./ (for example, to run tut_cprog, type ./tut_cprog).

Debug the C executable, tut_cprog.

To begin debugging the C executable using the gdb debugger, execute the command: gdb tut_cprog.

We will start off by introducing a few simple debugging commands in gdb.
- First of all, start running the program in gdb.
  
  To begin debugging, you must first begin executing the program within gdb. When gdb starts up, the program is merely loaded; it is not yet executing. Before running the program, use the break main command to set a breakpoint at the start of main(). Then type run (or r) to begin running the program. You will note that the program stops immediately at the entrance to main().
- Use the step (or s) command to trace into statements.
  
  The step (or s) command traces into each statement. Effectively, this means that it executes the next line of source code, which for a C program may correspond to several machine instructions. For procedure or interrupt calls, this means that it executes only the transfer of control to the procedure, causing it to jump into that procedure/interrupt (i.e. it steps 'into' the procedure/interrupt).
  
  Use the s command a couple times to watch the program's progress as you step through the code. When you get to the function call for get_number(), notice that you actually "step into" the procedure so that you can watch the execution of each individual statement in that procedure. If you ever want to re-start from the beginning of the program, simply use the run (r) command again, which will re-start execution of the program from the beginning, stopping immediately at the first breakpoint (at main()).
- Use the next (or n) command to trace over statements.
  
  The next (or n) command traces over each statement. Effectively, this means that it executes the next line of source code in the active procedure. For procedure or interrupt calls, this means that it executes the full procedure or interrupt (i.e. it steps 'over' the procedure/interrupt).
  
  Try stepping through the program again, but this time use the n command instead. Notice that this time, you don't jump into the get_number() function. Stepping "over" this function simply causes it to execute "behind the scenes", so that you don't have to watch all the gory details if you don't want/need to.
- Watching variables.
  
  Sometimes it is desirable to watch how the value of a variable changes as you trace through the program. It is possible to watch variables in this fashion using the display {variable name} command.
  
  For example, re-start execution from the beginning of the program again, but this time use the command display n to watch the value of variable "n" as you step through the beginning of the program. To see a full listing of all the variables that are currently being watched, use the info display command. Likewise, if you want to delete a variable from being watched, you can use the command delete display {variable number}.
  
  Note: gdb can only watch variables that haven't been optimized away by the compiler. For example, some simple local variables, such as a loop index (like i in tut_cprog.c), are optimized by the compiler such that they're only stored in registers and never updated in memory. Consequently, attempts to watch them will fail -- you can only view their current values by watching the register(s) in which they are stored.
- Display memory.
  
  Often, you will want to watch a whole block of memory as opposed to a small set of variables. There's no way to have a block of memory displayed automatically after each step like you can with variables, but you can still easily display the contents of memory with a single command after each step. The command x/{number of elements} {start address} will display the specified number of elements starting at the specified address (e.g. x/32 0x841a964 will show the 32 words of memory occupying addresses 0x841a964 through 0x841a9e3, inclusive). The default element and display mode is a 4-byte element displayed in hex, but you can specify the desired size and format using appropriate formatting codes. For example, x/32d will display the elements in decimal (d). When debugging C, gdb will attempt to infer the desired size and format from the variable stored at the starting address, but again, you can change this as desired -- see help x for more info.
  
  The primary problem in displaying memory is that you must know the address(es) of the memory locations you wish to view. Fortunately, when debugging in C, you can simply use the name of the variable, along with the ampersand (&), to denote the address of the variable. For example, to display the first 64 elements of an array myArray, simply use the command x/64 &myArray.
- Display registers.
  
  It is also often useful (more so with assembly debugging than C debugging) to view the values in the register file. Use the info registers (or i r) command to do this.
- View (local) program code.
  
  While the s and n commands display the next statement to execute, you may often want to see more complete sections of code. You can display the 10 lines of code around the next statement to execute in the debugger using the list (or l) command. Or, if you want to display the 10 lines of code around the start of a function, you can use the list {function name} command (e.g. l main).
- View assembly code.
  
  To view the assembly code directly (for example, when debugging in C and needing to see the equivalent assembly), use the disassemble {label/function name} (or disas {label/function name}) command (e.g. disas main).
- Other debugging functions:
  
  Exit the debugger using the quit (or q) command.
  
  Continue executing a program using the continue (or c) command. Execution will only stop after reaching a breakpoint, the end of the program, or a similar end-of-program procedure/interrupt call. This function is particularly useful in conjunction with breakpoints.
  
  Also feel free to experiment with other debugging functions. You can use the help (or h) command to find directions for using other debugging utilities.
Continue to trace through tut_cprog in gdb and examine what each instruction does. Be sure to use s and n as needed.

Debug the assembly executable, tut_asmprog.

To begin debugging the assembly executable using the gdb debugger, execute the command: gdb tut_asmprog.

Proceed to trace through the tut_asmprog executable just like you did with the C program: first set a breakpoint at the main function, then run the program, and then step through the code following the breakpoint.

Note 1: In assembly, you'll want to use si and ni as opposed to s and n. To see the difference, try using si and ni while debugging C... can you infer the difference?

Note 2: In assembly, any label in the source code (e.g. main:, i_loop:, exit_i_loop:, print_loop:, etc.) is treated as a function name. It therefore works best to use si when stepping through the code, while making sure to use ni on calls to external functions (e.g. get_number and printf).

Questions:
Answer the following questions:
1. What number series is produced by the C program? By the assembly program? Are the two programs equivalent?
2. What happens when the user enters a value not in the allowable range?
3. In tut_asmprog.s, what do the CMP and JAE instructions do? What does JMP do? How are they similar? How are they different?
4. In tut_asmprog.s, the i_loop: loop contains the instruction movl %edx, (%ebx, %esi, 4). Explain the purpose/importance of the 4 in the second operand.
5. Use either objdump -d or gcc's -S option to generate the assembly code corresponding to tut_cprog.c. Inside the first for loop, the compiler uses the lea (leal) instruction. What does it compute? Why isn't it using shll like the assembly program?
6. What are the #define preprocessing directives for in the C program? Is there an equivalent in the assembly program (and if so, what)?
7. You likely found the experiences of debugging in C versus debugging in assembly both somewhat similar and somewhat different. What debugging commands did you find more useful for debugging C? For debugging assembly?
8. How would you change the C code so that it printed out every 4th number, starting from 2 (i.e. 2, 6, 10, 14, etc.)? How would you similarly change the assembly code?
9. Challenge Question: How would you need to change the C code so that it requested two numbers, n and x, and then printed out n numbers consisting of every x-th number starting from 1 (i.e. if the user entered 7 for x, it would print out 1, 8, 15, 22, 29, ...)? How about in the assembly code? (Note: this is the tricky part...)

Linux software debugging with GDB by David Seager

Debugging with GDB, the official GDB manual

GDB and Assembly/Machine Code