Book 10: Stupid CD Database

Doing the work

Now that we have the framework for our database program, let’s fill it with meaning. Instead of the printf()s, we’ll need to actually do some work. Since it would get kind of unreadable if we stuffed all of this into main(), we’ll write our own functions for that: DoNewCommand(), DoListCommand() and DoCleanUp(). I already wrote about the benefits of extracting code that is used repeatedly into a function of its own, but it’s also useful for just making the program easier to read. You’ll spend a lot of time reading your source code, definitely more time than writing it, so do what you can to make your code more readable. It’ll pay off a thousandfold. But now, let’s get back to the code:

#include <stdio.h>      // declares printf(), scanf() and fpurge().
#include <stdbool.h>    // declares bool.
#include <string.h>     // declares strcmp().
#include <stdlib.h>     // We'll need that later for malloc() and realloc().

// Data structure:
struct CDDatabaseEntry
{
    char    artist[40];
    char    composer[40];
    char    albumName[40];
    int     trackCount;
    bool    isSampler;
};

// Global variables:
int                        gNumDatabaseEntries = 0;
struct CDDatabaseEntry*    gDatabase = NULL;

// Main event loop:
// Fetches user input and calls our DoXXX functions to do the work.
int main()
{
    bool keepRunning = true;
    char userInput[11];

    while( keepRunning == true )
    {
        printf( "Type NEW, LIST, or QUIT:\n> " );
        scanf( "%10s", userInput );
        fpurge( stdin );

        if( strcmp( userInput, "NEW" ) == 0 )
            DoNewCommand();
        else if( strcmp( userInput, "LIST" ) == 0 )
            DoListCommand();
        else if( strcmp( userInput, "QUIT" ) == 0 )
            keepRunning = false; // We're finished.
        else
            printf( "ERROR: Unknown command \"%s\"!\n\n", userInput );
    }

    DoCleanUp();    // Give back the memory we allocated for the database.

    return 0;
}

One thing we’ll need to be able to write these functions, though, is a place to keep our database. That’s what gDatabase above is. It’s a pointer-to-CDDatabaseEntry, which we’ll be abusing as a dynamic array. You may be surprised: so far, we only defined variables at the top of our functions, but gDatabase is outside our functions. What gives? Well, when you define a variable at the top of a function, it’s only usable inside the curly brackets for that function. When you define a variable outside any functions, it’s usable by all code below that. That’s what we call a global variable.

Global variables are good when you have one piece of data that needs to be accessed from lots of places. When you have a variable inside a function (called a local variable), only this function can mess with it (unless you hand a pointer to that variable to another function, of course). So, why don’t we always use global variables? Well, we’ll later see additional advantages of local variables, but the main point is that a local variable’s name only has to be unique inside the function you declare it in, while a global variable’s name has to be unique across the whole program, because the whole program can see and use it! So, if you used a global for every x you use, you’d eventually end up with names like x123466.

In addition, if a global can be changed from any function, you’d have to be very careful when calling another function. I’m currently using x27, what if the author of that other function also uses and changes x27 and overwrites my number? Admittedly, with a tiny project like this, you can just scroll up two lines and check, but as soon as you do a project that’s actually useful, or you work on a project together with someone else, you’ll love the peace of mind you get from knowing nobody can screw up your local variables.

Note the NULL that we assign to gDatabase. NULL is simply a fancy way of saying “0 as a pointer”. On early computers, the operating system was usually loaded somewhere right at the start of memory, so the address 0 belonged to the operating system and couldn’t be used by your program for anything. Modern systems typically still leave address 0 unusable on purpose, so a program that tries to use it by accident gets stopped right away. So, as a convention, programmers use the memory address 0 to mean “I’m not using that memory yet”.

Okay, so now we have to write those three functions. DoNewCommand() is supposed to create a new array element in gDatabase and fill it out with info it gets from the user, DoListCommand() is supposed to use a while() loop to print the info for each array element, and DoCleanUp() is supposed to get rid of the memory we malloced for our array. Now, three hints:

  1. There’s no way to find out the size of a memory block created using malloc() from the pointer alone, so we’ll also have to keep track of the number of items so we know where our array ends.
  2. Make sure you put the three functions in the source file above main(). The C compiler reads your source file from top to bottom, so you’ll get odd error messages if you try to call a function in main() that the compiler hasn’t seen yet because it’s defined below main().
  3. If the user starts up our application and immediately types in QUIT, we will never have malloc()ed the memory to go in gDatabase (because there’s no point in malloc()ing a memory block of size 0, and we never created a database entry). So, be sure that DoCleanUp() can cope with this situation.

Want to give it a try yourself?

Below, I’ll provide my versions of the three functions we need.

scanf() will only read the first word of what you type in. So, you have two options: You can just not write any spaces in the names (e.g. write “Simon_and_Garfunkel”), or you could use the getchar() function in a loop to get the whole line out, until you encounter the '\n' character. I’m leaving that as an exercise to the reader.

void DoNewCommand();

void DoNewCommand()
{
    char yesOrNo;

    // First, create a new array element (or a new array if we don't have one yet):
    if( gDatabase == NULL )
    {
        gDatabase = malloc( sizeof(struct CDDatabaseEntry) ); // size of 1 element.
        if( gDatabase == NULL )    // Still NULL? malloc() must have returned NULL due to error.
        {
            printf( "ERROR: Couldn't create a new entry!\n" );
            return;
        }
    }
    else
    {
        struct CDDatabaseEntry* newPtr = NULL;
        newPtr = realloc( gDatabase, (gNumDatabaseEntries +1) *sizeof(struct CDDatabaseEntry) );
        if( newPtr == NULL )    // Error! Out of memory?
        {
            // We just keep the old pointer in gDatabase.
            printf( "ERROR: Couldn't create a new entry!\n" );
            return;
        }
        // newPtr is our new ptr, gDatabase is no longer valid!
        gDatabase = newPtr;    // Remember newPtr in gDatabase.
    }

    // Make sure we remember we have one more entry:
    gNumDatabaseEntries += 1;

    // Now replace the garbage data in the new, last entry with data the user entered:
    printf( "Artist Name: " );
    scanf( "%39s", gDatabase[ gNumDatabaseEntries -1 ].artist );
    fpurge( stdin );

    printf( "Composer: " );
    scanf( "%39s", gDatabase[ gNumDatabaseEntries -1 ].composer );
    fpurge( stdin );

    printf( "Album Name: " );
    scanf( "%39s", gDatabase[ gNumDatabaseEntries -1 ].albumName );
    fpurge( stdin );

    printf( "No. of Tracks: " );
    scanf( "%d", &gDatabase[ gNumDatabaseEntries -1 ].trackCount );
    fpurge( stdin );

    printf( "Sampler? (y/n): " );
    scanf( "%c", &yesOrNo );
    fpurge( stdin );

    gDatabase[ gNumDatabaseEntries -1 ].isSampler = (yesOrNo == 'y' || yesOrNo == 'Y');
}

Not much special in this function. We’re pretty much just applying what we learned in earlier chapters. Only two things to point out, and they’re both in the lines that mess with gDatabase:

scanf( "%d", &gDatabase[ gNumDatabaseEntries -1 ].trackCount );

The easy one here is that we need to say gNumDatabaseEntries -1 because the number of entries is always 1 bigger than our highest index (our indices start at 0, while a 0 count means no items). And here, we want the number of our newest, last element, which always has the index gNumDatabaseEntries -1.

The other thing to watch out for is called precedence. When you use several operators in a row, there’s a certain order they are evaluated in. Just like

5 + 6 * 4

is evaluated as

5 + (6 * 4)

(because the * and / operators have precedence over the + and - operators), the other operators have an order. In the line above, the critical ones are the &, [] and . operators. The way the compiler will read the above is:

&((gDatabase[gNumDatabaseEntries -1]).trackCount)

I.e. it will first get our last entry, then it will get the trackCount field from that, and only then will it get the address. It will not get the address of gDatabase and then try to use that as an array, and it will not get the last element’s address and try to get a field from that pointer. Obviously, neither would make sense, but C wouldn’t know that. If you’re in doubt about which operator has precedence, you’ll want to either get a good C reference book where you can look it up, or use brackets to make sure C uses the right order. Don’t worry about “unnecessarily” using brackets. Brackets don’t generate any additional code, they simply control the order in which things are evaluated. And they make things more readable, and you know that that’s a Good Thing(tm).

On to our listing function:

void DoListCommand();

void DoListCommand()
{
    int    x = 0;

    if( gDatabase == NULL )
        printf("There are no CDs in the database.\n");

    while( x < gNumDatabaseEntries )
    {
        printf( "Artist Name: %s\n", gDatabase[ x ].artist );
        printf( "Composer: %s\n", gDatabase[ x ].composer );
        printf( "Album Name: %s\n", gDatabase[ x ].albumName );
        printf( "No. of Tracks: %d\n", gDatabase[ x ].trackCount );
        if( gDatabase[ x ].isSampler )
            printf( "\tThis CD is a sampler.\n" );
        printf( "\n" );    // Add an empty line for space to the next CD.

        x += 1;
    }
}

This is a pretty common thing, and you’ll probably write lots of loops like this. You’ll always have some sort of counter variable with an initial value (here x = 0), a termination condition that controls when the loop will end (when x < gNumDatabaseEntries is no longer true), and a statement that adds one to the counter (x += 1, a shorter form of writing x = x +1).

Loops like this are actually so common, that C has added a few things to save you some time typing: the for loop and the ++ increment operator. Usually, you’ll be using them in cases like:

    int    x;

    for( x = 0; x < gNumDatabaseEntries; ++x )
    {
        // actual code goes here.
    }

When I started out, this was unreadable gibberish to me. Not only was it pretty much unlabeled, so I had no idea what goes where, it’s also one command that contains semicolons, so it looked like three commands on one line. And strictly speaking, that’s what it is. If you feel more comfortable using while(), feel free to stick to that. I introduced you to while() first because it can do everything you can do with any of C’s other loop constructs. Everything else is just syntactic sugar. The advantage of for() is that you write the looping stuff in one go: the start value, the end condition, the step. When I originally wrote the while() loop above, I forgot to add the x += 1; line, and when I tested my program it got stuck in an endless loop and I had to abort it.

A few more words about the ++ prefix increment operator: If you want to, you can replace ++x above with x = x +1 or with x += 1. That’s fine. You can have loops that take bigger steps than 1 that way. You can also use ++x anywhere you use the others, it will work exactly the same. There’s just one thing to watch out for: Don’t write x++ (i.e. put the ++ after the variable instead of before it) unless you know what you’re doing. You see, every operation in C produces a result value. Yes, even = and +=. Usually, it’s the same as the result of the operation. So, if you write

foo = bar +=1;

This will add 1 to bar, and then assign that value to foo. The same will happen if you write:

foo = bar = bar +1;
foo = ++bar;

But when you write

foo = bar++;

It will first remember bar’s current value, then add 1 to bar, then use bar’s old value as the result of the operation and assign that to foo. Confused? Let’s say bar was 20. The three statements above will result in both foo and bar containing 21. The line above, with the postfix increment operator on the other hand will result in foo containing 20, and bar 21. So, whenever you use the ++ operator, be mindful of this difference.

And yes, there’s also a -- operator in both prefix and postfix varieties that you can use to subtract 1 from a variable, too.

Now, let’s quickly cover our clean-up function:

void    DoCleanUp( void );

void    DoCleanUp( void )
{
    if( gDatabase != NULL )    // We have allocated memory?
    {
        free( gDatabase );
        gDatabase = NULL;                // Not really necessary, but good style.
        gNumDatabaseEntries = 0;
    }
}

Not much happening here. gDatabase starts out being NULL if we never created an item, so to cover the instant-quit situation, we check for that and do nothing in that case (if we don’t malloc(), we don’t need to free() anything). Otherwise, we free the database and, just to be nice, we set gDatabase back to NULL and gNumDatabaseEntries to 0. In this program that’s pretty unnecessary, but if we were in a bigger program, someone could call DoCleanUp() at some other time to empty the array. This way, we make sure that the rest of the code can still work and won’t crash trying to talk to a pointer that has already been freed (and maybe reused).

10 Responses to Book 10: Stupid CD Database

  1. Bill Polhemus says:

    These are old comments and I see that Uli never answered – I hope he’s okay!

    Anyway, I think the problem originally was that Uli didn’t mention that the functions must be declared BEFORE main() starts. For instance, I got the program to run just fine – and I insist on putting the function calls BELOW where main() ends due to my old-timey FORTRAN ways – because I declared the functions as part of the “header” to the main.c file.

    So the first several lines of my code looks like this:

    #include <stdio.h> // declares printf(), scanf() and fpurge().
    #include <stdbool.h> // declares bool.
    #include <string.h> // declares strcmp().
    #include <stdlib.h> // We'll need that later for malloc() and realloc().

    // Data structure:
    struct CDDatabaseEntry
    {
    char artist[40];
    char composer[40];
    char albumName[40];
    int trackCount;
    bool isSampler;
    };

    // Global variables:
    int gNumDatabaseEntries = 0;
    struct CDDatabaseEntry* gDatabase = NULL;

    void DoNewCommand();
    void DoListCommand();
    void DoCleanUp( void );

    THEN comes main() and the rest of the code.

    • Uli Kusterer says:

      I’m fine, thanks for asking, just got swamped with work stuff 🙂

      Eero, Reid, Daniel, what Bill mentions won’t be covered until Book 11: Organizing your Code. Either skip ahead to that chapter and then come back, or do like I mention in earlier chapters and make sure that you put each function *above* any functions that need to use it.

  2. Reid says:

    I’m getting the same problem!

  3. Tim says:

    Hi Uli,

    Thanks for the wonderful tutorial!

    One question I have on this chapter is:
    scanf( "%d", &gDatabase[ gNumDatabaseEntries -1 ].trackCount );
    Since gDatabase is a pointer to a struct, shouldn’t the code above be like below?
    scanf( "%d", &gDatabase[ gNumDatabaseEntries -1 ]->trackCount );

    • B Polhemus says:

      From the Bible ( K&R 2d Edition that is)

      A member of a particular structure is referred to in an expression by a construction of the form
      structure-name . member

    • Uli Kusterer says:

      Good question! I’ve updated Book 8: Lists of Stuff to explain the [] operator a bit better. Look for “Now one last thing” near the bottom.

      Basically, when you do foo[1] or whatever, the result already is de-referenced. You don’t need to do it again.

  4. Daniel says:

    I also got the error messages saying the same thing Eero mentioned.

  5. Anurag Pandey says:

    it is giving me an apple match error?

    • Uli Kusterer says:

      Can you give the exact error message and where in Xcode this is shown? “Apple Match error” sounds like nothing that should come out of the C compiler, so maybe this is unrelated.

  6. Eero says:

    i get error message that says implicit declaration of fuction ‘DoNewCommand’ is invalid in C99 same thing for DoListCommand and DoCleanUp
