CSCI 3500 - Operating Systems

CSCI 3500: Studio 3

Pointers

One tricky part of writing good C code is using pointers correctly. Understanding how pointers work, how to use them to index arrays, and how to reference and dereference data correctly is vital.

In this studio, you will:

Use pointers to index into an array
Write a simple makefile to make compilation easy

Please complete the required exercises below, as well as any optional enrichment exercises that you wish to complete.

As you work through these exercises, please record your answers in a text file. When finished, submit your work via the git repository.

Make sure that the name of each person who worked on these exercises is listed in the first answer, and make sure you number each of your responses so it is easy to match your responses with each exercise.

Required Exercises

As the answer to the first exercise, list the names of the people who worked together on this studio.
Download this file as a starting point (remember that you can use the wget command to download a file directly from a Linux terminal). Open up the file and look at linString and winString. These variables demonstrate two different ways that you can declare character strings in C. Convince yourself of this by printing both strings with printf(). The correct format specifier to print a string looks like this:

printf("%s\n", string_pointer);
Before we go any further, let's set up a Makefile in order to make compiling your program easy. This is a special file that specifies how to compile your project, so rather than having to invoke GCC every time you want to re-build your program, you can simply use the command make.

Create a new file called "Makefile". Inside, on the first line, create a new target by typing "all:". On the next line, first press TAB, and then type your compilation instructions (gcc -o pointers pointers.c). Save and quit your file.
Now type make at the Linux terminal. This command automatically looks for a Makefile in the current directory, and if it finds one, it runs the instructions under the first target in the file, which here is all:. The make program prints each command it executes so you can verify that it is working correctly. Makefiles can get much more complicated, but this simple method is suitable for small software projects.

Leave the answer to this exercise blank, but attach your Makefile when you submit this studio.
Now back to pointers. In C, a string is simply a consecutive sequence of characters in memory with an associated char pointer. (Notice that the definitions of both of our strings use char as their base type.)

We can access a string by dereferencing the string pointer. A pointer points to data in memory, and dereferencing that pointer gives us the value of the data. You've already done dereferencing through the use of the square bracket index notation. The code linString[0] gives you the first character of linString, the code linString[1] gives you the second character, etc.

Print out each character of linString using a loop with index notation. You can print a character like so: printf("%c\n", char_to_print); Your output should look like:
```
L
i
n
u
x
!
```
The dereference operator in C is the asterisk (*) and is fundamental to using pointers. We already know that a pointer stores the memory address of data (i.e. it points to data). Just like indexing a pointer, the dereference operator obtains the value of the data that is pointed to.

If the pointer winString is a pointer to a character, what character does it point to? In other words, what do you think is the value of the dereference operation on winString?
Check your answer to the last exercise by dereferencing winString and printing it out. The dereference operator is the asterisk when placed to the left of a pointer. You can print out a single character like so:

printf("%c\n", *pointer_to_string);

What was printed?
Another way to use pointers is with pointer arithmetic. Suppose we have a regular string pointer called ptr. As seen above, this points to the first character of the string. To access the next character, we could add one to this pointer and dereference the result:

*(ptr + 1) //same as saying ptr[1]

or we could access the fourth element of the string by adding three:

*(ptr + 3) //accesses fourth element, same as ptr[3]

The index notation you just used is essentially pointer arithmetic (in fact the C standard defines index notation in terms of pointer arithmetic).

What character is stored in the byte after the first character of winString? Try printing the value of the next few bytes of winString using pointer arithmetic. To do so, add one, two, or three to the pointer before dereferencing. For example: *(pointer_to_string + 1).
Use a loop to print the entire contents of winString using pointer arithmetic, one character at a time.
Recall that a properly formatted C-style string is always null-terminated (ends with the character '\0'). In fact, all that the function printf() does when you ask it to print a string is print characters one after another until it encounters that null terminator.

Copy your solution to the previous exercise and modify it to use the string format specifier ("%s"). Make sure you don't dereference the pointer with an asterisk, but you should do the same pointer arithmetic. What do you think the output is going to be? Run your program and compare your prediction with the output.
You can also use pointers to assign values rather than simply read values. This is done by assigning a value to a dereferenced pointer (using either index notation or pointer arithmetic). For example, the following two statements each turn the 'L' in "Linux" into an 'M':

linString[0] = 'M';

*(linString) = 'M';

Write the code to change the string "Linux!" into the string "Minix!". Then, print linString again. Copy and paste your program output as the answer to this exercise.
It's very important to note that the declarations of linString and winString are not functionally identical. The declaration of linString creates a static array just large enough to hold the string "Linux!". Static arrays are located in the program .data segment or on the stack, depending on whether they are global or local declarations. The important part is that somewhere the compiler allocates seven contiguous bytes to hold the values {'L', 'i', 'n', 'u', 'x', '!', '\0'}, and that these values exist in writeable memory.

To contrast, where linString is a static array, winString is just a pointer. The compiler places the string literal "Windows!" in read-only memory and then assigns the pointer winString to point to there. Try modifying winString like you did with linString. What happens?
Another thing to keep in mind is how the pointer type interacts with pointer arithmetic. First, let's figure out how big a given data type is, in bytes, using the special sizeof operator in C. The sizeof operator yields the number of bytes in a data type, for example:

sizeof( char ) //returns 1

Write a short segment of code to print out the size of the char data type and the int data type.
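Pointer arithmetic is actually based on these data type sizes. If you have a pointer to an integer, then the associated size is typically four bytes (this is what sizeof(int) reports on most current platforms). So the expression pointer[2] actually reads from the address computed as:

address in pointer + 2*sizeof(int) bytes = address in pointer + 2*4 bytes = address in pointer + 8 bytes

The compiler applies this scaling for you, so in your code you still write pointer + 2 or pointer[2]. We can check this for ourselves directly. Declare pointers to the first and third elements of the numbers array, like so: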
```
int *first = numbers;
int *third = &(numbers[2]);
```
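Now, print out these two pointers with the %p format specifier (strictly speaking, %p expects a void pointer, so cast each pointer to void * when passing it to printf()). What is the difference between them? Note that the values are displayed in hexadecimal.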
Lastly, explain the declaration of the pointer called third above. What does the ampersand operator do?

Optional Enrichment Exercises

As has been said a few times already, pointers simply point to data that resides in memory. Here's an experiment to convince you of this, if you don't believe me already.

Open up your compiled executable file in a text editor. You will see a lot of gibberish but you'll also see a few recognizable things. Use the search function of your editor to search for the strings "Linux!" and "Windows!". Can you find them?

Now, being slightly careful, use your text editor to change Windows! to Solaris! and save the file. What happens when you re-run the program?

Note: This is not the advisable way to modify a program!