Book 9: Custom Data Types

Preparing for our second useful program

It’s been a while since we did our last useful program, so before y’all doze off, let’s create a little database for our CD collection. I guess you can already see the basic structure our program will have: We can do lists by faking an array with malloc(). But right now, our lists are very limited, as an array can only hold one type of data. So we’d have to create several arrays, one for each type of data, and we’d have to resize each of these arrays to add a new CD to our database. Lots of work. Didn’t I claim programmers were lazy?

They are, and that’s why they invented data structures. A data structure is a type you define yourself that is made up of existing types. Sound complicated? It really isn’t. Think of what each entry for a CD will look like in our little database:

  • Artist
  • Composer
  • Name of Album
  • Number of Tracks
  • Is a sampler?

So, how would you translate this into C?

struct CDDatabaseEntry
{
    char    artist[40];
    char    composer[40];
    char    albumName[40];
    int     trackCount;
    bool    isSampler;
};

Not hard to read, is it? A data structure is defined using the keyword struct. What follows is a name you give it (which may not contain spaces or other odd characters, just like a variable name). Then, between curly brackets, you list all the variables this new type is supposed to group together. You end the whole thing with a semicolon.

That’s how you make up your own data type in C. To use it, write something like:

#include <stdio.h>
#include <stdbool.h>

struct CDDatabaseEntry
{
    char    artist[40];
    char    composer[40];
    char    albumName[40];
    int     trackCount;
    bool    isSampler;
};

int main()
{
    struct CDDatabaseEntry    myEntries[10];

    myEntries[0].isSampler = false;

    return 0;
}

You simply write struct structName instead of another data type like int. You can even have arrays of them. And to change the sub-variables, or structure fields as they are called, you use the “.”-operator. So, the line

    myEntries[0].isSampler = false;

above means:

  1. Take element 0 of the array myEntries
  2. Take the isSampler field of this first entry
  3. Assign the value false to it.

Similarly, you can work with the other fields, whatever their data type.

So, let’s see what we need to do: Like every program, we’ll need a main event loop, i.e. a while loop that keeps asking the user what to do. And in addition, we need to be able to create a new entry for the database and show us the entries in our database. Let’s start with our main event loop first:

#include <stdio.h>
#include <stdbool.h>
#include <string.h>

struct CDDatabaseEntry
{
    char    artist[40];
    char    composer[40];
    char    albumName[40];
    int     trackCount;
    bool    isSampler;
};

int main()
{
    bool keepRunning = true;
    char userInput[11];

    while( keepRunning == true )
    {
        printf( "Type NEW, LIST, or QUIT:\n> " );
        scanf( "%10s", userInput );
        fpurge( stdin );    // Throw away leftover input. fpurge is a BSD/macOS extension, not standard C.

        if( strcmp( userInput, "NEW" ) == 0 )
            printf( "I'll write the code for NEW later...\n\n" );
        else if( strcmp( userInput, "LIST" ) == 0 )
            printf( "I'll write the code for LIST later...\n\n" );
        else if( strcmp( userInput, "QUIT" ) == 0 )
            keepRunning = false; // We're finished.
        else
            printf( "ERROR: Unknown command \"%s\"!\n\n", userInput );
    }

    return 0;
}

Several things I’ll have to point out here:

userInput is an array to hold some text characters. Programmers typically call this a string. When you write text in double quotes in C, it also gives you a string. A string is simply an array of characters. After the last character, C always puts the Zero-Character (i.e. the character whose ASCII number is 0; ‘A’, for example, has the ASCII number 65). That way C knows where the string ends, even if, like in our case, the variable in which scanf places the text has room for more characters than the user typed. So, “NEW” is almost the same as writing:

char  myNewString[4];
myNewString[0] = 'N';
myNewString[1] = 'E';
myNewString[2] = 'W';
myNewString[3] = 0;

The %10s-part in our call to scanf works just like %s. However, the number 10 between the % and the s limits the number of characters it will write to userInput. Since userInput is 11 characters large and (as mentioned above) there has to be a 0-character at the end of the text string, we only let it give us 10 characters (10 + 1 zero-char = 11) so our call to scanf doesn’t run off the end of userInput. And again, we don’t specify an &-operator before userInput on the scanf-line because userInput is an array. As you know, when you use an array’s name like this, C hands over a pointer to its first element. So, scanf already gets a pointer.

Since both a constant string (e.g. “NEW”) and a string variable (i.e. userInput) are handled as pointers, it doesn’t make much sense to compare them using the == operator. After all, the ==-operator would simply compare the two pointer addresses, and userInput has a completely different memory address than the string constant “NEW”, even if they contain the same characters.

You also can’t dereference two strings using the *-operator to compare their characters, because a string is simply a pointer to data of type char, so dereferencing it only gives you the first char; C doesn’t keep track of how many chars that pointer points to. In addition, there is a limit on the == operator that essentially means you can’t use it to compare anything but the “short” atomic data types like char, int and bool (as well as pointers, which are just memory addresses, i.e. numbers). So, you can’t compare two structs, for example.

So, to compare two strings, we’d have to use a while loop to go over each character in the two arrays of characters and compare them, until we either encounter the 0-character in both arrays simultaneously (the strings are equal) or encounter two characters that aren’t the same (not equal). Luckily, the file string.h contains a handy function:

int strcmp( const char* strA, const char* strB );

which compares the two strings strA and strB and returns 0 if they are the same, and something else otherwise. So we’ll just use that. Note that this function compares case-sensitively, i.e. our program will not accept “New” or “new” as equivalent to “NEW”.

Compile this program and play with it a little. In the next chapter, we’ll put in the actual functionality.

strcmp() is a bit of a multi-purpose function. It is there for checking if two strings are identical, but it is also intended to allow you to compare two strings so you know how to sort them when you want to display a list in alphabetical order. strcmp returns 0 when both strings match, a negative number if strA would be sorted before strB, and a positive number if strB would be sorted before strA. The exact numbers may differ between systems, so only check whether the result is below, equal to, or above 0.

We’re not really using that capability here, but it could come in handy one day.

5 Responses to Book 9: Custom Data Types

  1. Bill Polhemus says:

    Here’s the part I’m having trouble with:

    Since both a constant string (e.g. “NEW”) and a string variable (i.e. userInput) are pointers, It doesn’t make much sense to compare them using the == operator.

    Where in the code shown here are either the “constant string (e.g. “NEW”)” and/or the string variable userInput referenced as pointers? From what I can see userInput is being assigned a string value [“NEW”, “LIST”, or “QUIT”] from stdin, and the input value in userInput is compared to the list of possible choices, using an if() statement and calling the function strcmp() from the standard library string.h.

    What am I missing?

    • Uli Kusterer says:

      Arrays (and therefore strings) are *always* pointers, implicitly. I noticed that I could have made this clearer, so I updated Chapter 8: Lists of Stuff to explain that better. Scroll down to “Now one last thing” and you’ll have the newly-added explanations.

  2. kareem says:

    could you explain this part of the article?

    “which compares the two strings strA and strB and returns 0 if they are the same, a negative number if strA would be sorted before strB, and a positive number if strB would be sorted before strA. So we’ll just use that. Note that this function compares case-sensitively, i.e. our program will not accept “New” or “new” as equivalent to “NEW”.”

    i don’t understand the “strA sorted before strB”, etc

    • Bill Polhemus says:

      kareem:

      He means that if for example strA is assigned the value “boo” and strB is assigned the value “hoo,” then (FIRST) they aren’t the same string, and (SECOND) “boo” would come BEFORE “hoo” if they were sorted alphabetically.

      Therefore, since “boo” < “hoo” using alphabetic sorting, the strcmp() function would return a negative value.

      If both strA AND strB were assigned the value of “boo”, then the two strings would match and strcmp would return 0.

      Finally, if strA == “hoo” and strB == “boo,” then strB would come first in the alphabetic sort and strcmp() would return a positive value.

    • Uli Kusterer says:

      Bill’s explanation is right on the point. To reduce confusion, I’ve moved the details of strcmp into a “further reading” block.
