CSCI 3500: Lab 1

File Encryption


System calls are the fundamental API provided by an operating system to application programs. For example, the read() and write() system calls are the basic mechanisms for doing file input and output in Linux. All other file I/O routines, such as C++'s stream operators (<< and >>) or Python's file methods (file.read() and file.write()), are built upon these C-interface system calls.

To make sure we never take these high-level interfaces for granted, we're going to work with Linux's system calls directly! We will use Linux's file manipulation system calls to write a program that encrypts and decrypts files.

In this lab, you will:

  1. Use the open(), close(), read(), and write() system calls to do file I/O
  2. Use the malloc() and free() standard library functions to allocate raw memory, and use pointers to access that memory
  3. Use the GNU C library function ecb_crypt(), which implements a type of encryption called DES, to encrypt and decrypt data
  4. Perform proper Linux-style error checking on all functions that may return an error
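
As a rough illustration of the error-checking style in goal 4, the sketch below wraps open() and malloc() so that every failure is reported on standard error. The helper names open_checked() and alloc_buffer() are hypothetical and are not part of the lab; treat this as one possible pattern rather than the required one.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open a file and report any failure on stderr.
 * Returns the file descriptor, or -1 if open() failed. */
int open_checked(const char *path, int flags, mode_t mode)
{
    assert(path != NULL);
    int fd = open(path, flags, mode);
    if (fd == -1) {
        /* open() sets errno on failure; strerror() makes it readable. */
        fprintf(stderr, "open(%s): %s\n", path, strerror(errno));
    }
    return fd;
}

/* Hypothetical helper: allocate a raw byte buffer.
 * Returns NULL (after printing a message) if malloc() failed. */
unsigned char *alloc_buffer(size_t size)
{
    unsigned char *buf = malloc(size);
    if (buf == NULL) {
        perror("malloc");
    }
    return buf;
}
```

A caller would check each return value before continuing, and would pair every successful open_checked() with close() and every successful alloc_buffer() with free().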

Program name

encrypt - encrypt or decrypt a file using DES encryption

Usage

encrypt <key> <input file> <output file> <mode>

key: an 8 character string used as the DES key to encrypt/decrypt

input file: the file to encrypt/decrypt

output file: the result of the operation

mode: specifies whether to encrypt or decrypt: if mode=0, encrypt the input file; if mode=1, decrypt the input file
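
One way to check the four arguments described above is sketched below. The helpers parse_mode() and validate_args() are hypothetical names chosen for illustration, and the exact wording of the messages is an assumption; the lab only requires that the message be useful.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 0 for "0" (encrypt), 1 for "1" (decrypt),
 * and -1 for any other string. */
int parse_mode(const char *arg)
{
    if (arg == NULL) return -1;
    if (strcmp(arg, "0") == 0) return 0;
    if (strcmp(arg, "1") == 0) return 1;
    return -1;
}

/* Hypothetical helper: returns 0 if the command line is well formed,
 * or -1 after printing a message to stderr. */
int validate_args(int argc, char *argv[])
{
    assert(argv != NULL);
    if (argc != 5) {
        fprintf(stderr,
                "usage: encrypt <key> <input file> <output file> <mode>\n");
        return -1;
    }
    if (strlen(argv[1]) != 8) {
        fprintf(stderr, "encrypt: key must be exactly 8 characters\n");
        return -1;
    }
    if (parse_mode(argv[4]) == -1) {
        fprintf(stderr, "encrypt: mode must be 0 (encrypt) or 1 (decrypt)\n");
        return -1;
    }
    return 0;
}
```

Messages go to standard error, so a successful run still produces nothing on standard output.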

Description

encrypt will encrypt and decrypt files using the GNU C library function ecb_crypt(). You must use the read() and write() system calls (documented at man 2 read and man 2 write, respectively) to read the input file and write to the output file.
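
The read()/write() pattern can be sketched as a plain copy loop with no encryption step. The function name copy_fd() and its buf_size parameter are hypothetical. The sketch assumes that a short write should be retried until the whole chunk is written, and it does not treat interrupted calls (EINTR) specially.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: copy everything from in_fd to out_fd using a
 * heap buffer of buf_size bytes. Returns 0 on success, -1 on error. */
int copy_fd(int in_fd, int out_fd, size_t buf_size)
{
    assert(buf_size > 0);
    unsigned char *buf = malloc(buf_size);
    if (buf == NULL) {
        perror("malloc");
        return -1;
    }

    int status = 0;
    while (status == 0) {
        ssize_t n = read(in_fd, buf, buf_size);
        if (n == 0) {
            break;                      /* end of file */
        }
        if (n == -1) {
            perror("read");
            status = -1;
            break;
        }
        /* write() may accept fewer bytes than requested, so loop. */
        ssize_t done = 0;
        while (done < n) {
            ssize_t w = write(out_fd, buf + done, (size_t)(n - done));
            if (w == -1) {
                perror("write");
                status = -1;
                break;
            }
            done += w;
        }
    }

    free(buf);
    return status;
}
```

In a program shaped like this, the encryption or decryption step would sit between the read() and the inner write loop, operating on the n bytes just read.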

encrypt detects the following errors and quits gracefully:

Upon encountering any error, print a useful message and exit() with a negative status code.

If no error is encountered then the program should not produce any output to standard output.

Test Input Files

You can download these files to your local machine with the wget program from the Linux command line. See man wget for details.

Hints

  1. Worry about the encryption last! First construct a working program that just reads the input file and copies it to the output file. Once that works you can add the encryption/decryption step.
  2. Here's a demo file showing how to use ecb_crypt()
  3. Three test files are provided for you to experiment with. Each of them tests a different aspect of your program, so make sure your program works with all of them!
  4. Look at the man pages for all the functions you use. All of them will give the possible return values as well as how errors are specified.
  5. Use the diff program to compare files and highlight any differences. This is an easy way to detect whether or not a decrypted file is identical to the original source, especially when the files are too large to inspect visually!
  6. Use the wc program to count how many characters are in a file.
  7. The read() system call returns how many bytes it has read. Both ecb_crypt() and write() need this value. Keep reading the input until read() returns 0 (end of file) or -1 (error).
  8. You need to call the des_setparity() function on your key before using the encryption function. This is due to the mathematics of the DES encryption algorithm implemented by ecb_crypt().
  9. The ecb_crypt() function works on arbitrarily long arrays of data, but the total size must be evenly divisible by 8. If the length of your message is not divisible by 8, then you will need to pad it with space (' ') characters.
  10. If the number of characters in the input file is not divisible by 8, you don't need to remove the extra padding when you decrypt the file. In that case, comparing the original and decrypted files with diff will show a difference on the last line. You can suppress this by using diff -Z.
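
The padding rule in hint 9 comes down to a little modular arithmetic. A minimal sketch follows; the names DES_BLOCK, padded_length(), and pad_with_spaces() are hypothetical, and the sketch assumes the caller's buffer has room for up to 7 extra bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DES_BLOCK 8  /* ecb_crypt() needs lengths divisible by 8 */

/* Hypothetical helper: smallest multiple of 8 that is >= len. */
size_t padded_length(size_t len)
{
    size_t rem = len % DES_BLOCK;
    return rem == 0 ? len : len + (DES_BLOCK - rem);
}

/* Hypothetical helper: fill buf[len .. padded-1] with spaces and
 * return the padded length. The buffer must hold at least that many
 * bytes; capacity is the total size of buf. */
size_t pad_with_spaces(unsigned char *buf, size_t len, size_t capacity)
{
    size_t padded = padded_length(len);
    assert(padded <= capacity);
    memset(buf + len, ' ', padded - len);
    return padded;
}
```

For example, a 13-byte chunk would be padded to 16 bytes, while an 8-byte chunk is left unchanged. If the read buffer size is itself a multiple of 8, padding is typically only needed for the final chunk of a regular file.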

Documentation

The following man pages will be useful:

Questions

  1. As the answer to the first exercise, list the name(s) of the people who worked together on this lab.

  2. Linux includes the nifty command time, which records how long a command runs. Record how long it takes to encrypt the file test3. The syntax in this case is "time ./encrypt key test3 outfile 0". This will report three measurements: real, user, and sys. How long does your program take to run in real time?

  3. The third parameter to the read() system call controls how many characters can be read at a time. Try modifying this parameter to use a few different values: 8, 80, 800, and 4096. Record how long it takes to encrypt test3 with each setting.

  4. Is it faster to make many system calls that read a few characters, or to make a few system calls that read many characters? Why do you think this is?

  5. Indicate which, if any, extra credit exercises you have attempted.

Optional Enrichment Exercises


Submission

Create a .tgz archive of your lab directory and email it to dferry@slu.edu. Your submission must include a makefile that will compile your program by simply issuing the command make. You must also include a text file with your answers to the required exercises. Please include your name and the names of any partners in the body of your email.

The simple syntax for creating a .tgz archive is as follows:

tar -zvcf new_archive.tgz lab_directory

The syntax for unpacking a .tgz archive is:

tar -zvxf archive.tgz

Note that your archive must not include any binary executable files, meaning any compiled programs or intermediate build objects (.o files, for example). Including such files will cause your email to be rejected by most services.