


String overflows with scanf

If you use the %s and %[ conversions improperly, then the number of characters read is limited only by where the next whitespace character appears (or, for %[, the next character outside the specified set). This almost certainly means that invalid input could make your program crash, because input that is too long would overflow whatever buffer you have provided for it. No matter how long your buffer is, a user could always supply input that is longer. A well-written program reports invalid input with a comprehensible error message, not with a crash.

Fortunately, it is possible to avoid scanf buffer overflow by either specifying a field width or using the a flag.

When you specify a field width, you need to provide a buffer (using malloc or a similar function) of type char *. (See Memory allocation, for more information on malloc.) You need to make sure that the field width you specify is at least one less than the number of bytes allocated to your buffer, because scanf stores a terminating null character after the characters it reads, and the field width does not count that character.
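The relationship between buffer size and field width can be sketched as follows. This is only an illustration: read_word is a hypothetical helper, not a library function, and it uses sscanf on a string (which accepts the same conversions as scanf) so that it can be tried without typing any input. It builds the template at run time so that the width always stays one below the buffer size.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the first whitespace-delimited word of
   INPUT into BUF, which holds SIZE bytes.  Returns 1 if a word was
   stored, 0 otherwise. */
int
read_word (const char *input, char *buf, size_t size)
{
  char format[32];

  assert (size >= 2);   /* room for one character plus the null */

  /* Build a template such as "%19s" for a 20-byte buffer.  The field
     width is SIZE - 1 because the terminating null character is not
     counted in the width. */
  snprintf (format, sizeof format, "%%%zus", size - 1);

  return sscanf (input, format, buf) == 1;
}
```

With an 8-byte buffer, for instance, the helper stores at most the first 7 characters of a longer word and leaves the rest unread, so the buffer cannot overflow.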

On the other hand, you do not need to allocate a buffer if you specify the a flag character -- scanf will do it for you. Simply pass scanf a pointer to an unallocated variable of type char *, and scanf will allocate however large a buffer the string requires, and return the result in your argument. When you are finished with the string, release the buffer with free. This is a GNU-only extension to scanf functionality.
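On current systems the a flag can collide with the C99 %a floating-point conversion, and the same allocating behavior is available through the m modifier specified in POSIX.1-2008. The following is a minimal sketch that assumes a C library supporting %ms (recent versions of the GNU C library do); alloc_word is a hypothetical helper name, and sscanf is used in place of scanf only so that the sketch can run without typed input.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: return a newly allocated copy of the first
   whitespace-delimited word in INPUT, or NULL if there is none.
   The caller must free the result.
   Assumption: the C library supports the POSIX.1-2008 m modifier. */
char *
alloc_word (const char *input)
{
  char *word = NULL;

  /* As with the a flag, pass the ADDRESS of the char * variable so
     that the library can store the new buffer's address in it. */
  if (sscanf (input, "%ms", &word) != 1)
    return NULL;

  return word;
}
```

A caller might write char *w = alloc_word ("  hello world"); and, if w is not NULL, use the string "hello" and then call free (w).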

Here is a code example that shows first how to safely read a string of fixed maximum length by allocating a buffer and specifying a field width, then how to safely read a string of any length by using the a flag.

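#include <stdio.h>
#include <stdlib.h>

int main (void)
{
  char *string1, *string2;

  /* 25 bytes is more than enough for 20 characters plus the
     terminating null character. */
  string1 = (char *) malloc (25);
  if (string1 == NULL)
    return 1;

  puts ("Please enter a string of 20 characters or fewer.");
  if (scanf ("%20s", string1) != 1)
    return 1;
  printf ("\nYou typed the following string:\n%s\n\n", string1);

  puts ("Now enter a string of any length.");
  /* The a flag (a GNU extension) makes scanf allocate string2. */
  if (scanf ("%as", &string2) != 1)
    return 1;
  printf ("\nYou typed the following string:\n%s\n", string2);

  free (string1);
  free (string2);
  return 0;
}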

There are a couple of things to notice about this example program. First, notice that the second argument passed to the first scanf call is string1, not &string1. The scanf function requires pointers as the arguments corresponding to its conversions, but a string variable is already a pointer (of type char *), so you do not need the extra layer of indirection here. However, you do need it for the second call to scanf. We passed it an argument of &string2 rather than string2, because we are using the a flag: scanf allocates a buffer big enough to contain the characters it read, and it needs the address of string2 in order to store a pointer to that buffer there.

The second thing to notice is what happens if you type a string of more than 20 characters at the first prompt. The first scanf call will only read the first 20 characters, then the second scanf call will gobble up all the remaining characters without even waiting for a response to the second prompt. This is because scanf does not read a line at a time, the way the getline function does. Instead, it immediately attempts to match its template string to whatever characters are in the stdin stream. The second scanf call matches all remaining characters from the overly-long string, stopping at the first whitespace character (here, the newline that ended your input). Thus, if you type 12345678901234567890xxxxx in response to the first prompt, the program will immediately print the following text without pausing:

You typed the following string:
12345678901234567890

Now enter a string of any length.

You typed the following string:
xxxxx
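One possible way to keep leftover characters from being consumed by the next conversion is to discard the remainder of the current line before prompting again. The following is only a sketch of that idea; skip_rest_of_line is a hypothetical helper, not a library function.

```c
#include <stdio.h>

/* Hypothetical helper: read and discard characters from IN up to and
   including the next newline (or until end of file).  Returns the
   number of characters discarded, not counting the newline. */
int
skip_rest_of_line (FILE *in)
{
  int c;
  int discarded = 0;

  while ((c = getc (in)) != EOF && c != '\n')
    discarded++;

  return discarded;
}
```

Calling skip_rest_of_line (stdin) after the first scanf in the example above would throw away the extra xxxxx, so the second scanf would wait for fresh input. A nonzero return value could also be used to warn the user that the input was too long.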

(See sscanf, for a better example of how to parse input from the user.)