Part 5. Arrays

Arrays in C are closely tied to pointers. The topic of pointers can be difficult to grasp at first, but once you get the hang of them, you'll find them to be a powerful tool of the C language.

You can think of a pointer as an indicator to say, "That's me, but over there." The pointer doesn't store data per se, but points to where memory has been set aside to store data.

Fixed-size arrays

Let's start with a simple example. You can declare a fixed-size array when you write your program. You define a fixed-size array by indicating a variable type, such as int, then the variable name followed by square brackets.

The square brackets indicate to the compiler that this is an array. The compiler will make sure the program has all the "behind the scenes" code when it starts up to reserve ("allocate") enough memory for the array.

You can define arrays in two ways using this method:

  1. You can use a number inside the brackets, to indicate how many entries should be in the array. For example, int numlist[10]; will create an integer array that holds 10 numbers.
  2. You can use empty brackets, and initialize the array with a list. Just like you can initialize a single variable with something like int count = 0; you can initialize a list by assigning a value to it when you declare the array. Put the list of values inside curly braces, such as int tenlist[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };.

Once you've declared the array, you can reference values or members using square brackets. References start from zero; the first member of the array tenlist is actually tenlist[0]. The tenth member of the array is tenlist[9].

Let's use this in a sample program to see how to use a fixed-size array:

#include <stdio.h>

int
main()
{
   int numlist[10];
   int tenlist[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
   int i;

   /* print the values from tenlist .. we know there are 10 entries */

   /* arrays count from zero, so tenlist has entries as 0-9 */

   for (i = 0; i <= 9; i = i + 1) {
      printf("tenlist[%d] is %d\n", i, tenlist[i]);
   }

   /* copy the values from tenlist to numlist */

   /* we already know both arrays are the same size (10 entries) */

   for (i = 0; i <= 9; i = i + 1) {
      numlist[i] = tenlist[i];
      printf("numlist[%d] is now %d\n", i, numlist[i]);
   }

   return 0;
}

In that example, we can copy values from one array to the other, using the same method that we would use to copy two variables of the same type. But because this is an array, we need the square brackets to indicate the specific member of the array. To copy the ith array element from one array into the other, we used numlist[i] = tenlist[i];

Strings

In part 2, we learned about basic data types in C. The char data type stores a single character. But how do you store a string of characters, to form a word or sentence or phrase? You create strings through arrays!

There's one key difference between an array of chars and a string. A string always has a null value (\0) at the end. So if you declared a variable array called string and initialized it using char string[] = "hi"; this string (char array) is actually three chars long. The first two members are h and i, followed by a \0 null value.

C uses these null values to recognize the end of the string. While this might seem a little odd, it actually makes the language a lot easier to use. Without these null-terminated strings, you would need to also know the length of the string every time you used it. The null-terminated string means you can just use the string, and C will take care of knowing where the string ends.

Here's an example of using strings in a C program:

#include <stdio.h>

int
main()
{
   char hello[] = "Hi there!";

   /* how long is the string in 'hello'? */

   printf("<%s>\n", hello);
   puts("<123456789> <-- counting characters");
   puts("<012345678> <-- positions in the array");

   return 0;
}

If you compile and run this program, you'll see this output:

<Hi there!>
<123456789> <-- counting characters
<012345678> <-- positions in the array

The string (char array) is actually 10 characters long. You just don't see the 10th character (hello[9], the last member of the array) because the null marks the end of the string, so C doesn't print it.

Programmers often need to figure out the length of a string, so let's write our own function string_length() that figures out the length of any string you give it. The string_length() function walks through the string until it finds the null, then returns the length.

#include <stdio.h>

int
string_length(char string[])
{
   int len = 0;

   /* walk through the string until you find a '\0' */

   /* the '\0' (Null) ends the string */

   while (string[len] != '\0') {
      len = len + 1;
   }

   return len;
}

int
main()
{
   char hello[] = "Hi there!";
   int hello_len;
   int i;

   /* how long is the string in 'hello'? */

   puts("using string_length:");
   hello_len = string_length(hello);
   printf("<%s> is %d characters long\n", hello, hello_len);

   puts("<123456789> - counting characters");
   puts("<012345678> - positions in the array");

   /* print the characters from 'hello' */

   for (i = 0; i <= hello_len; i = i + 1) {
      printf("hello[%d] is <%c>\n", i, hello[i]);
   }

   return 0;
}

Such a function is so useful that C includes a version of it in the standard C library of functions. The strlen() function is defined in the string.h header file, and is used the same way we used string_length(). Let's update the sample program to use the standard strlen() function:

#include <stdio.h>
#include <string.h>                    /* you need this for strlen */

int
main()
{
   char hello[] = "Hi there!";
   int hello_len;
   int i;

   /* how long is the string in 'hello'? */

   puts("using strlen:");
   hello_len = strlen(hello);
   printf("<%s> is %d characters long\n", hello, hello_len);

   puts("<123456789> - counting characters");
   puts("<012345678> - positions in the array");

   /* print the characters from 'hello' */

   for (i = 0; i <= hello_len; i = i + 1) {
      printf("hello[%d] is <%c>\n", i, hello[i]);
   }

   return 0;
}

If you compile and run this sample program, you can see how C treats strings as just arrays of char values. That means you can reference individual members of the string in the same way you would reference individual members of any array, using the square brackets.

using strlen:
<Hi there!> is 9 characters long
<123456789> - counting characters
<012345678> - positions in the array
hello[0] is <H>
hello[1] is <i>
hello[2] is < >
hello[3] is <t>
hello[4] is <h>
hello[5] is <e>
hello[6] is <r>
hello[7] is <e>
hello[8] is <!>
hello[9] is <>

Pointers

Hiding in the technical details of arrays is a concept called pointers. C actually uses pointers to indicate arrays. Why? Because when the C programming language was invented, computers didn't have much memory, and you can get the most from your system if you reference arrays through pointers.

When we defined an array of ten numbers using int numlist[10]; the compiler set aside enough memory for ten int values, and the name numlist works like a pointer to that memory. (Technically, this kind of fixed-size array inside a function is stored on the stack rather than in heap memory, but we don't need to get into that distinction here.) The steps that happened behind the scenes were similar to:

  1. Declare the pointer to the array
  2. Allocate memory, enough to store all members of the array
  3. Set the pointer to indicate the allocated memory

The above example of int numlist[10]; is a simple case because we knew the size of the array when we wrote the program. But what if you don't know how big the array should be until you run the program? For that, you need to do the three steps yourself: declare the pointer, allocate memory, and set the pointer.

Declaring a pointer is similar to declaring any other variable. Except rather than int to declare an array of integers, we need to use int *. For example:

int *numlist;

The second step is to allocate memory, which is done through the standard C function, malloc(). To use malloc(), you need to know how much memory to ask for. To do that, you'll need to do a little math: how many things do you want to store, and how big is each thing? You might know "how many" by the value in an int variable, such as numlist_len. And your program can figure out the "how big" by using sizeof(). Even though sizeof() looks like a function, it is actually an operator, but you can treat it like a function.

So if we wanted to allocate enough memory for ten int values, we would use malloc() this way:

malloc(10 * sizeof(int));

The malloc() function returns a pointer to a section of memory that is big enough to store the array. Assign this value to the pointer you declared earlier. It's a good idea to do what's called a cast to make sure that the return value of malloc() matches the variable type of your array pointer. For example, if you declared a pointer with int *numlist; then you should cast the return value of malloc() to an int * this way:

numlist = (int *) malloc(numlist_len * sizeof(int));

If malloc() was not able to allocate enough memory, it will instead return a value of NULL to indicate the error. You should always test to make sure you were allocated the memory you asked for.

Once you've done the three steps (declare the pointer, allocate memory, and set the pointer) you can use the array just as we did in the earlier example. The sample program below allocates 10 elements of an integer array, assigns values to each element, and prints them.

An important thing to remember is that if you allocate memory for an array, you should always tell the system when you are done using it. This is usually not a problem if your program is short enough, and allocates memory, uses it, then the program ends. In those simple cases, the allocated memory will be released anyway when the program exits back to the operating system. But in most programs, if you don't remember to release the memory that you've allocated, you'll end up allocating more and more memory until you run out. This is called a memory leak.

To release the memory that you've asked for, you need to use the free() function. I've included that in my sample program:

#include <stdio.h>
#include <stdlib.h>                    /* you need this for malloc */

int
main()
{
   int *numlist;
   int numlist_len = 10;
   int i;

   /* allocate memory to numlist */

   /* cast the result to a pointer type using (int *) */

   numlist = (int *) malloc(numlist_len * sizeof(int));

   if (numlist != NULL) {
      puts("numlist has been allocated successfully");
   }
   else {
      puts("something went wrong! numlist was not allocated");
      return 1;
   }

   /* copy values into numlist */

   for (i = 0; i < numlist_len; i = i + 1) {
      numlist[i] = i * 10;
      printf("numlist[%d] is now %d\n", i, numlist[i]);
   }

   /* free the memory when you're done */

   free(numlist);
   puts("numlist has been freed");

   return 0;
}

And remember that strings are just arrays of char values. So you can use malloc() to allocate memory for strings in the same way you allocate memory for other kinds of arrays. Let's update the previous string example using malloc() to reserve the memory.

As we explore this example, remember that you can't just set the contents of an array at once when you use this method. You can initialize an array when you declare the array, but later in the program you cannot just "set" the contents of an array.

And since strings are just arrays of char values, you can't just "set" a string (char array) using the = assignment. Instead, you need to use some function that will copy the individual elements of the character array. In this example, I've called this function string_copy(). My function copies the char values from one string to another, and returns the number of characters that were copied. This should be the same as what strlen() would give you.

How many characters should string_copy() copy? In theory, the source string will be a null-terminated string. But we don't know for sure. To make sure we don't copy more characters than will fit into the destination string, the string_copy() function also takes a maxsize value that limits how many characters should be copied:

#include <stdio.h>
#include <stdlib.h>                    /* you need this for malloc */
#include <string.h>                    /* you need this for strlen */

int
string_copy(char *destination, char *source, int maxsize)
{
   int i;

   /* fill destination with Nulls first */

   for (i = 0; i < maxsize; i = i + 1) {
      destination[i] = '\0';
   }

   /* copy characters from source into destination, but only up to
      maxsize number of characters from source */

   /* WARNING! if source is longer than maxsize, then we won't copy
      everything from source .. and destination will NOT be terminated
      with a Null */

   i = 0;
   while ((i < maxsize) && (source[i] != '\0')) {
      destination[i] = source[i];
      i = i + 1;
   }

   /* return the number of characters copied */

   return i;
}

int
main()
{
   char *hello;
   int hello_size = 80;
   int hello_len;
   int i;

   /* allocate memory to hello */

   hello = (char *) malloc(hello_size * sizeof(char));

   if (hello != NULL) {
      puts("hello has been allocated successfully");
   }
   else {
      puts("something went wrong! hello was not allocated");
      return 1;
   }

   /* copy values into hello */

   string_copy(hello, "Hello world!", hello_size);

   /* print the value of hello */

   hello_len = strlen(hello);
   printf("<%s> is length %d\n", hello, hello_len);

   for (i = 0; i < hello_len; i = i + 1) {
      printf("hello[%d] is <%c>\n", i, hello[i]);
   }

   /* free the memory when you're done */

   free(hello);
   puts("hello has been freed");

   return 0;
}

Many programs need to assign values to strings, so the function to copy a string is included in the standard C library. Let's update the sample program using the strcpy() function. The strcpy() function is basically the same as my string_copy() function, except strcpy() returns a pointer to the new string.

But strcpy() doesn't know how to stop copying characters if the source string isn't null-terminated, and it doesn't check whether the destination is big enough. So I don't recommend you use strcpy(); instead, use the similar function strncpy(), which has the same usage as my string_copy() function:

#include <stdio.h>
#include <stdlib.h>                    /* you need this for malloc */
#include <string.h>                    /* you need this for strlen and strncpy */

int
main()
{
   char *hello;
   int hello_size = 80;
   int hello_len;
   int i;

   /* allocate memory to hello */

   hello = (char *) malloc(hello_size * sizeof(char));

   if (hello != NULL) {
      puts("hello has been allocated successfully");
   }
   else {
      puts("something went wrong! hello was not allocated");
      return 1;
   }

   /* copy values into hello */

   /* strcpy(hello, "Hello world!"); */
   strncpy(hello, "Hello world!", hello_size);

   /* print the value of hello */

   hello_len = strlen(hello);
   printf("<%s> is length %d\n", hello, hello_len);

   for (i = 0; i < hello_len; i = i + 1) {
      printf("hello[%d] is <%c>\n", i, hello[i]);
   }

   /* free the memory when you're done */

   free(hello);
   puts("hello has been freed");

   return 0;
}

Combining arrays

Arrays are very useful to keep your data organized. You will find lots of uses for arrays. Whenever you need to store a list of something, an array makes it easy.

And you can combine arrays in different ways. Do you need to create a two-dimensional array of integers? That's basically the same as defining a list of pointers for each row, and each row (pointer) is a list of columns.

You create a single list of integers with int *. To create a list of lists, you add another star: int **. Just remember how you reference the array: if you define a list of pointers, one for each row, and each row is a list of columns, then you need to reference individual elements as array[y][x], so the row (y) goes first, then the column (x).

Do you remember the first way we learned to declare an array, using square brackets? When you declare the arguments to a function, char **array is the same as writing char *array[], because the empty square brackets there just define a pointer, so you still end up with a list of pointers.

#include <stdio.h>
#include <stdlib.h>

int
main()
{
   int **array;
   int rows = 3;
   int cols = 4;
   int y, x;

   /* allocate the rows */

   array = (int **) malloc(rows * sizeof(int *));

   if (array == NULL) {
      puts("something went wrong! array was not allocated");
      return 1;
   }

   /* allocate the columns */

   for (y = 0; y < rows; y = y + 1) {
      array[y] = (int *) malloc(cols * sizeof(int));

      if (array[y] == NULL) {
         printf("oh no! could not allocate row %d\n", y);

         /* this would be a big problem, so we would want to abort the
            program. free the memory allocated so far, then quit. */

         /* since this is a demo program, we'll just quit */

         return 1;
      }
   }

   /* store data in each cell */

   for (y = 0; y < rows; y = y + 1) {
      for (x = 0; x < cols; x = x + 1) {
         array[y][x] = 100 + (y * 10) + x;

         printf("array[%d][%d] is now %d\n", y, x, array[y][x]);
      }
   }

   /* free the memory */

   for (y = 0; y < rows; y = y + 1) {
      free(array[y]);
   }

   free(array);

   /* done */

   return 0;
}

If you compile and run this program, you'll see that each cell is assigned a value to represent the y and x values. I've added 100 so you can see the values in row 0 clearly; otherwise, the first row would print 0, 1, 2, 3.

array[0][0] is now 100
array[0][1] is now 101
array[0][2] is now 102
array[0][3] is now 103
array[1][0] is now 110
array[1][1] is now 111
array[1][2] is now 112
array[1][3] is now 113
array[2][0] is now 120
array[2][1] is now 121
array[2][2] is now 122
array[2][3] is now 123

Parsing the command line

Do you need to create a list of strings? That's basically a list of pointers, and each pointer is a string.

To create a single list of characters (a string) you first declared a char * pointer. To create a list of strings, you also add another star: char **.

And that brings us to our last example for this section: parsing the command line. When you run a program from the command line, you usually type the name of the program and a list of arguments. Programs written in C will share this list in the argument list for the main() function. To keep things consistent, C programs always set this argument list as:

int
main(int argument_count, char *argument_list[])
{
⋮
}

More commonly, you'll see these referenced as argc for the argument count, and argv for the argument vector. In technical terms, a vector is another way to refer to a pointer or an array.

int
main(int argc, char **argv)
{
⋮
}

Using argc and argv makes it really easy to parse the command line, because you really only need to parse a list of strings. In the simple case, we can print the argument list this way:

#include <stdio.h>

int
main(int argc, char **argv)
{
   int i;

   /* print the list of arguments */

   for (i = 0; i < argc; i = i + 1) {
      printf("argv[%d] is %s\n", i, argv[i]);
   }

   return 0;
}

If you compile and run this sample program, you'll see that the first element of the argument vector (argv[0]) is the name of the program itself. To read only the arguments that were passed on the command line, we need to count from 1. That's all we need to write a version of the FreeDOS ECHO command. ECHO simply prints back what it was given, but on a single line:

#include <stdio.h>

int
main(int argc, char **argv)
{
   int i;

   /* print the list of arguments, on one line */

   for (i = 1; i < argc; i = i + 1) {
      printf("%s ", argv[i]);
   }

   /* print the newline */

   putchar('\n');

   return 0;
}

Since the system created the argv array, you don't need to free it when you exit your program.

Standard C library of functions

Now that you've learned about arrays and strings, you might want to examine the standard C library of functions. A few useful functions defined in string.h include the following. (The declarations are simplified here; the real ones in string.h also use const and size_t.)

char *strcat(char *dest, char *src)
char *strncat(char *dest, char *src, int n)
  Appends (the first n characters of) the string src to the end of dest

char *strchr(char *str, int c)
  Searches for the first occurrence of c in the string str. Returns a pointer to the first position of c, or NULL if c is not found

char *strrchr(char *str, int c)
  Searches for the last occurrence of c in the string str. Returns a pointer to the last position of c, or NULL if c is not found

int strcmp(char *str1, char *str2)
int strncmp(char *str1, char *str2, int n)
  Compares (the first n characters of) two strings and returns a negative number if str1 should be sorted alphabetically before str2, zero if they are the same string, or a positive number if str1 should be sorted after str2

char *strcpy(char *dest, char *src)
char *strncpy(char *dest, char *src, int n)
  Copies (the first n characters of) the string src into the string dest

int strlen(char *str)
  The length of the string str

char *strstr(char *bigstr, char *str)
  Finds the first occurrence of the string str in the string bigstr (assumes bigstr is at least the same length, or longer than, the string str)

REVIEW

To define and use arrays:

  1. Declare the pointer to the array:
    int *array;
  2. Allocate memory, enough to store all members of the array:
    malloc(size * sizeof(int))
  3. Set the pointer to indicate the allocated memory:
    array = malloc(…);
  4. Free the memory when you're done:
    free(array);

Remember that strings are just arrays of characters:

char *string;

All strings should end with a null character (\0). This should happen automatically, but be careful when using functions like strncpy().

PRACTICE

Practice program 1.

In this unit, we wrote our own versions of the strlen() and strncpy() functions, as int string_length(char *string) and int string_copy(char *destination, char *source, int maxsize). Now write your own version of the strchr() function, as int is_charstring(char *bigstring, char c). Return a true value if the character c exists in the string bigstring, or a false value if not.

Practice program 2.

Write a function char *ltrim_string(char *string) that "trims off" any whitespace characters (space, tab, or newline) from the beginning of a string. Also write the program to test this function.

Practice program 3.

Write a function int is_number(char *string) that tests if a string contains only digits (0 through 9). This is a test for integers.

Practice program 4.

Write a function char *uppercase(char *string) that converts any letters in a string to uppercase letters. (It is helpful to remember that C tracks characters as their value from the character encoding. Virtually all systems today are based on ASCII, which conveniently puts letters together, in order. So 'a' through 'z' and 'A' through 'Z' are always in sequence.)

Need help? Check out the sample solutions.