Input and output

Input and output. Talking to the user. Why your printer is a file.

In order for a program to do anything useful, it usually must do some kind of input and output, whether input from the keyboard and output to the screen, or input from and output to the computer's hard disk. While the C language itself does not provide much in the way of input and output functions, the GNU C Library contains so many facilities for input and output that a whole book could be written about them. In this chapter, we will focus on the basics. For more information on the functions described in this chapter, and many more, we urge you to consult the GNU C Library reference manual.

Most objects from which you can receive input and to which you can send output on a GNU system are considered to be files -- not only are files on your hard disk (such as object code files, C source code files, and ordinary ASCII text files) considered to be files, but also such peripherals as your printer, your keyboard, and your computer monitor. When you write a C program that prompts the user for input from the keyboard, your program is reading from, or accepting input from, the keyboard, in much the same way that it would read a text string from a text file. Similarly, when your C program displays a text string on the user's monitor, it is writing to, or sending output to, the terminal, just as though it were writing a text string to a text file. In fact, in many cases you'll be using the very same functions to read text from the keyboard and from text files, and to write text to the terminal and to text files.

This curious fact will be explored later in the chapter. For now it is sufficient to say that when a GNU system treats your computer's peripherals as files, they are known as devices, and each one has its own name, called a device name or pseudo-device name. On a GNU system, the printer might be called /dev/lp0 (for "device line printer zero") and the first floppy drive might be called /dev/fd0 (for "device floppy drive zero"). (Why zero in both cases? Most objects in the GNU environment are counted by starting with zero, rather than one -- just as arrays in C are zero-based.)
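As a rough sketch of what "a device is a file" means in practice, the function below sends text to a device purely by its name. The function name send_to_device is hypothetical, and the sketch assumes you would try it with /dev/null (a device that discards whatever is written to it) rather than /dev/lp0, since a printer may not be attached to your machine.

```c
#include <stdio.h>

/* Send `text` to the device (or ordinary file) called `device_name`.
   Returns 0 on success and -1 on failure.  Nothing in this function
   knows or cares whether the name refers to a device or a disk file.  */
int
send_to_device (const char *device_name, const char *text)
{
  FILE *device = fopen (device_name, "w");
  int status = 0;

  if (device == NULL)
    return -1;                  /* no such device, or no permission */

  if (fputs (text, device) == EOF)
    status = -1;
  if (fclose (device) != 0)
    status = -1;
  return status;
}
```

Calling send_to_device ("/dev/null", "hello\n") should succeed on a typical GNU system; the same call with the name of an ordinary file would write the text to that file instead.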

The advantage of treating devices as files is that it is often not necessary to know how a particular device works, only that it is connected to the computer and can be written to or read from. For example, C programs often get their input from the keyboard, which C refers to as stdin (for "standard input"), and C programs often send their output to the monitor's text display, referred to as stdout (for "standard output"). In some cases, stdin and stdout may refer to things other than the keyboard and monitor; for example, the user may be redirecting the output from your program to a text file with the shell's > redirection operator in GNU/Linux. The beauty of the way the standard input/output library handles things is that your program will work just the same.
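The following sketch illustrates why redirection does not disturb a program. The function copy_stream (a hypothetical name, not a library function) copies characters from one place to another without knowing what is at either end.

```c
#include <stdio.h>

/* Copy every character from `in` to `out`.  Returns the number of
   characters copied, or -1 if a read or write error occurred.  */
long
copy_stream (FILE *in, FILE *out)
{
  long count = 0;
  int c;

  while ((c = getc (in)) != EOF)
    {
      if (putc (c, out) == EOF)
        return -1;
      count++;
    }
  return ferror (in) ? -1 : count;
}
```

A main function that simply calls copy_stream (stdin, stdout) would echo the keyboard to the screen. If the user redirected its output to a text file, the same code would fill that file instead, with no change to the program.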

Before you can read from or write to a file, you must first connect to it, or open it, usually with either the fopen function, which returns a stream, or the open function, which returns a file descriptor. You can open a file for reading, writing, or both. You can also open a file for appending, that is, writing data after the current end of the file.
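The difference between writing and appending can be seen in a small sketch. The helper names write_text and count_chars are hypothetical; the mode strings "w", "a", and "r" are the standard ones accepted by fopen.

```c
#include <stdio.h>

/* Open `filename` in `mode` ("w" to start the file afresh, "a" to
   append to it), write `text`, and close the file again.
   Returns 0 on success and -1 on failure.  */
int
write_text (const char *filename, const char *mode, const char *text)
{
  FILE *stream = fopen (filename, mode);
  int status = 0;

  if (stream == NULL)
    return -1;
  if (fputs (text, stream) == EOF)
    status = -1;
  if (fclose (stream) != 0)
    status = -1;
  return status;
}

/* Open `filename` for reading and count the characters in it.
   Returns the count, or -1 if the file could not be opened.  */
long
count_chars (const char *filename)
{
  FILE *stream = fopen (filename, "r");
  long count = 0;

  if (stream == NULL)
    return -1;
  while (getc (stream) != EOF)
    count++;
  fclose (stream);
  return count;
}
```

Writing with mode "w" discards whatever the file held before, while mode "a" leaves the old contents in place and adds the new text after them, so count_chars reports a larger number after an append.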

Files are made known to functions not by their file names, except in a few cases, but by identifiers called "streams" or "file descriptors". For example, fprintf takes a stream as an identifier, not the name of the file. So does fclose:

fprintf (my_stream, "Just a little hello from fprintf.\n");
close_error = fclose (my_stream);

On the other hand, fopen takes a name, and returns a stream:

my_stream = fopen (my_filename, "w");

This is how you map from names to streams or file descriptors: you open the file (for reading, writing, or both, or for appending), and the value returned from the open or fopen function is the appropriate file descriptor or stream.
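Put together, the fragments above might look like the following sketch. The function name say_hello is hypothetical, and the error checks are an addition not shown in the fragments.

```c
#include <stdio.h>

/* Map the name `my_filename` to a stream, write one line through the
   stream, and close it.  Returns 0 on success, nonzero on failure.  */
int
say_hello (const char *my_filename)
{
  FILE *my_stream;
  int close_error;

  /* fopen takes a name and returns a stream (or NULL on failure).  */
  my_stream = fopen (my_filename, "w");
  if (my_stream == NULL)
    {
      perror (my_filename);
      return 1;
    }

  /* From here on, only the stream is used, never the name.  */
  fprintf (my_stream, "Just a little hello from fprintf.\n");
  close_error = fclose (my_stream);
  return close_error;
}
```

Note that the file name appears exactly once, in the call to fopen; every later call identifies the file by its stream.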

You can operate on a file either at a high level or at a low level. Operating on a file at a high level means that you are using the file at a high level of abstraction. (See the Introduction to refresh your memory about the distinction between high and low levels of abstraction.) Using high-level functions is usually safer and more convenient than using low-level functions, so we will mostly concern ourselves with high-level functions in this chapter, although we will touch on some low-level functions toward the end.

A high-level connection opened to a file is called a stream. A low-level connection to a file is called a file descriptor. Streams and file descriptors have different data types, as we shall see. You must pass either a stream or a file descriptor to most input/output functions, to tell them which file they are operating on. Certain functions (usually high-level ones) expect to be passed streams, while others (usually low-level ones) expect file descriptors. A few functions will accept a simple filename instead of a stream or file descriptor, but generally these are only the functions that initialize streams or file descriptors in the first place.
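For contrast with the stream-based calls shown earlier, here is a minimal sketch of the low-level route. A stream has the type FILE *, while a file descriptor is a plain int. The function name write_with_descriptor and the permission bits 0644 are choices made for this illustration; open, write, and close are the POSIX functions declared in fcntl.h and unistd.h.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Write `text` to `filename` using a file descriptor rather than a
   stream.  Returns 0 on success and -1 on failure.  */
int
write_with_descriptor (const char *filename, const char *text)
{
  size_t length = strlen (text);
  ssize_t written;

  /* open takes a name and returns a file descriptor: a small
     non-negative int, or -1 on failure.  */
  int fd = open (filename, O_WRONLY | O_CREAT | O_TRUNC, 0644);
  if (fd == -1)
    return -1;

  /* Low-level functions such as write and close expect the
     descriptor, not a FILE * and not the file name.  */
  written = write (fd, text, length);
  if (close (fd) == -1)
    return -1;
  return (written == (ssize_t) length) ? 0 : -1;
}
```

Passing the descriptor to a stream function such as fprintf, or a stream to write, is a type error: the two kinds of identifier are not interchangeable.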

You may consider it a nuisance to have to use a stream or a file descriptor to access your file when a simple file name would seem to suffice, but these two mechanisms allow a level of abstraction to exist between your code and your files. Remember the "black box" analogy we explored at the beginning of the book. By using the data in files only through streams or file descriptors, we are guaranteed the ability to write a rich variety of functions that can exploit the behavior of these two "black box" abstractions.

Interestingly enough, although streams are considered to be for "high-level" input/output, and file descriptors for "low-level" I/O, and GNU systems support both, streams are the more portable of the two. You can expect any system with an ISO C implementation to support streams, but non-GNU systems may not support file descriptors at all, or may only implement a subset of the GNU functions that operate on file descriptors. Most of the file descriptor functions in the GNU library are included in the POSIX.1 standard, however.

Once you have finished your input and output operations on the file, you must terminate your connection to it. This is called closing the file. Once you have closed a file, you cannot read from or write to it anymore until you open it again.

In summary, to use a file, a program must go through the following routine: