File Input and Output

Reading: Ch. 4.3-4.4


Code from class: fileIO.m

Overview of File I/O

When reading or writing large amounts of data, it is often much more convenient to work with files than to enter or display the data interactively. There are two basic approaches to working with files. MATLAB has built-in support for reading or writing many common high-level file formats (e.g., Excel files, comma-separated values, XML, WAV, AVI). It also provides low-level support for reading from or writing to files one character at a time.

Low-Level File I/O

Opening a File

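The first step in reading or writing a file is to formally open the file from within MATLAB. This is done using a call of the form fopen(filename, permission), where filename is a string naming the file and permission is a string indicating how the file will be used, such as 'r' for reading (the default), 'w' for writing, or 'a' for appending.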

File Identifiers

Since a program might open several files simultaneously for different purposes, there must be a way to keep track of which file is which for further commands. MATLAB does this by using a unique integer known as a file identifier. The call to fopen sends back this ID as a return value. So a typical syntax for opening a file would be

fid = fopen('sample.txt');           % opens the file for reading
We can later use that ID value with other file operations (e.g., fprintf, fscanf, fclose).
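As a minimal sketch of how the permission argument changes what fopen does, the script below opens the same file three times, using the standard permission strings 'w', 'a', and 'r'. The scratch file name built with tempname and the text written to it are assumptions made only for illustration.

```matlab
% Hypothetical scratch file in the temporary folder, so nothing important is overwritten.
filename = [tempname '.txt'];

fid = fopen(filename, 'w');        % 'w': create the file (or discard its contents) for writing
fprintf(fid, 'first line\n');
fclose(fid);

fid = fopen(filename, 'a');        % 'a': open for writing, adding to the end of the file
fprintf(fid, 'second line\n');
fclose(fid);

fid = fopen(filename, 'r');        % 'r': open for reading (also the default)
contents = fread(fid, '*char')';   % read everything back as a row of characters
fclose(fid);

disp(contents)
delete(filename);                  % remove the scratch file
```

Because the second call uses 'a' rather than 'w', the first line survives and both lines are read back.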

It is possible that the call to fopen fails to properly open the file, either because it cannot be found or because the user does not have the necessary permissions for accessing the file in that way. In this case, fopen returns the value -1 as the file identifier, and it also provides an error message as a second return value. The program can then check for -1 as a signal of a problem. For example, here is a loop from the MATLAB documentation that prompts a user for the name of a file to open, repeating until successful.

fid = 0;                                   % not yet a valid identifier
while fid < 1
   filename = input('Open file: ', 's');   % read the name as a string
   [fid, message] = fopen(filename, 'r');
   if fid == -1
     disp(message)                         % report why the open failed
   end
end

MATLAB also uses certain IDs for some standard file-like objects:

fid   purpose
0     standard input
1     standard output
2     standard error
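The short sketch below shows the standard identifiers in use with fprintf, and checks that an identifier returned by fopen for an ordinary file does not collide with them. The scratch file name and the variable isOrdinaryFile are made up for this illustration.

```matlab
fprintf(1, 'This goes to standard output.\n');   % same effect as omitting the identifier
fprintf(2, 'This goes to standard error.\n');    % usually highlighted in the Command Window

% Identifiers for ordinary files are 3 or greater, leaving 0, 1, and 2 reserved.
filename = [tempname '.txt'];    % hypothetical scratch file
fid = fopen(filename, 'w');
isOrdinaryFile = (fid >= 3);
fclose(fid);
delete(filename);
```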

Closing a File

When you are finished accessing an open file, you should close it by calling fclose(fid) using the appropriate file identifier. This ensures that the file is properly saved, and it frees up the numeric identifier for reuse. You may also use the syntax fclose('all') to close all open files (other than the "standard" file-like objects).
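A small sketch of what closing looks like in practice: fclose returns a status of 0 when it succeeds, and once a file is closed its identifier can no longer be used. The scratch file and the variable names status and writeFailed are assumptions for this example.

```matlab
filename = [tempname '.txt'];    % hypothetical scratch file
fid = fopen(filename, 'w');
fprintf(fid, 'some text\n');

status = fclose(fid);            % 0 on success, -1 on failure

% After closing, the old identifier is invalid, so further writes raise an error.
writeFailed = false;
try
    fprintf(fid, 'more text\n');
catch
    writeFailed = true;
end

delete(filename);
```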

Writing to a File

Once a file has been opened for writing, we can use fprintf to write text to it. By default, a command such as fprintf('Hello\n') sends the characters to the standard output. But we can send those characters to an open file by giving the file identifier as a first parameter. As an example, here is a small script that creates a new text file.

fid = fopen('sample.txt', 'w');      % opens file sample.txt for writing
fprintf(fid, 'Hello\n');             % writes to that file, based on identifier fid
fclose(fid);                         % closes and saves the file
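The same approach works for numeric data. As a sketch, the script below writes a small matrix with a format string containing two %d fields; fprintf reuses the format as many times as needed, taking the values in column order. The matrix values and the scratch file name are made up for illustration.

```matlab
filename = [tempname '.txt'];    % hypothetical scratch file
values = [1 2 3; 4 5 6];         % example data: each column holds one (x, y) pair

fid = fopen(filename, 'w');
fprintf(fid, 'x = %d, y = %d\n', values);   % format is repeated once per column
fclose(fid);

written = fileread(filename);    % read the whole file back as text to inspect it
disp(written)
delete(filename);
```

Since values is traversed column by column, the three lines written are "x = 1, y = 4", "x = 2, y = 5", and "x = 3, y = 6".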

Reading from a File

There are several functions for reading from a file.

data = fread(fid)
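This reads the entire remainder of the open file, returning the bytes as a column vector of numeric values. To treat the result as text, convert it with char(data').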

data = fread(fid, n)
This reads a maximum of the next n bytes of the open file.

line = fgetl(fid)
Reads the next line from the file and returns that line (but without the ending newline character) as a result. Once the end of the file has been reached, it returns the number -1 instead of a string.

line = fgets(fid)
Almost identical to fgetl, but the returned string for this form includes the ending newline character (if any).
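As a sketch of how these functions are commonly combined, the script below first creates a three-line file, then reads it back line by line with fgetl, stopping when fgetl returns -1 rather than a string. It then reopens the file and reads the first five bytes with fread. The file contents and variable names are assumptions for the example.

```matlab
filename = [tempname '.txt'];    % hypothetical scratch file
fid = fopen(filename, 'w');
fprintf(fid, 'alpha\nbeta\ngamma\n');
fclose(fid);

% Read every line into a cell array.
fid = fopen(filename, 'r');
lines = {};
textLine = fgetl(fid);
while ischar(textLine)           % fgetl returns the number -1 at end of file
    lines{end+1} = textLine;     % stored without the trailing newline
    textLine = fgetl(fid);
end
fclose(fid);

% Read raw bytes and convert them to characters.
fid = fopen(filename, 'r');
bytes = fread(fid, 5);           % column vector of five numeric byte values
fclose(fid);
firstWord = char(bytes');        % transpose to a row, then convert to text

delete(filename);
```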

fscanf is the most versatile function for reading formatted data from an open file. It uses formatting strings in a way reminiscent of the way fprintf is used to write to a file. Most notably, it uses %s to read a string (delimited by whitespace), %d to read an integer, and %g to read a floating-point value.

Furthermore, fscanf is vectorized in the sense that the call fscanf(fid, '%d') will not just read a single integer, but will attempt to read as many integers as it can until reaching a portion of the file that does not match an integer format. Those integers will be returned as a vector.

If you want to limit the scan to a maximum number of elements, you can specify that limit as a third argument. For example, fscanf(fid, '%d', 1) will only attempt to read the next integer.
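To illustrate both forms, here is a sketch that reads a file whose first number is a count and whose remaining numbers are data. The file layout and variable names are invented for this example.

```matlab
filename = [tempname '.txt'];    % hypothetical scratch file
fid = fopen(filename, 'w');
fprintf(fid, '3\n10 20 30\n');
fclose(fid);

fid = fopen(filename, 'r');
count = fscanf(fid, '%d', 1);    % reads only the first integer
rest  = fscanf(fid, '%d');       % reads all remaining integers as a column vector
fclose(fid);

delete(filename);
```

The second call picks up where the first one stopped, so count is 3 and rest holds 10, 20, and 30.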

Position within a file

When a file is open, MATLAB internally maintains the position at which it is currently reading or writing. This is essentially like the cursor in a word processor. When a file is first opened for reading or writing, the position is at the beginning of the file; when it is opened for appending, the position is at the end of the file.

We can query the current position using the syntax

position = ftell(fid)
where position measures the number of bytes from the beginning of the file (with -1 returned when the file is not properly opened).
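As a quick sketch, the script below checks the position right after opening a file for reading and again after reading one line. The file contents and variable names are assumptions for the example.

```matlab
filename = [tempname '.txt'];    % hypothetical scratch file
fid = fopen(filename, 'w');
fprintf(fid, 'Hello\nWorld\n');
fclose(fid);

fid = fopen(filename, 'r');
startPos = ftell(fid);           % position immediately after opening
fgetl(fid);                      % consume the first line, including its newline
afterLine = ftell(fid);          % position after 'Hello' plus one newline byte
fclose(fid);

delete(filename);
```

Here startPos is 0 and afterLine is 6, since the first line occupies five characters plus the newline.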

It is also possible to reposition the current cursor using the syntax

fseek(fid, offset, origin)
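where offset is a number of bytes that can be positive or negative, and origin is a string designating the frame of reference for the seek: 'bof' for the beginning of the file, 'cof' for the current position, or 'eof' for the end of the file.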
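The sketch below moves around a ten-character file using MATLAB's three standard origin strings: 'bof' (beginning of file), 'cof' (current position), and 'eof' (end of file). The file contents and variable names are made up for illustration.

```matlab
filename = [tempname '.txt'];    % hypothetical scratch file
fid = fopen(filename, 'w');
fprintf(fid, 'abcdefghij');
fclose(fid);

fid = fopen(filename, 'r');

status = fseek(fid, 3, 'bof');   % 3 bytes past the beginning; returns 0 on success
c1 = fread(fid, 1, '*char');     % reads 'd', leaving the position at 4

fseek(fid, 2, 'cof');            % skip 2 bytes forward from the current position
c2 = fread(fid, 1, '*char');     % reads 'g'

fseek(fid, -1, 'eof');           % 1 byte before the end of the file
c3 = fread(fid, 1, '*char');     % reads 'j'

fclose(fid);
delete(filename);
```

Each fread advances the position by one byte, which is why the relative seek from 'cof' lands on 'g' rather than 'f'.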

High-Level File I/O


Originally by
Michael Goldwasser