[cs631apue] Using fts sort functions

Jan Schaumann jschauma at stevens.edu
Sat Oct 12 17:21:17 EDT 2013


Paul-Anthony Dudzinski <pdudzins at stevens.edu> wrote:
 
> The first thing I noticed is that there are two levels to the sort
> functions: (these are the definitions at the top)
> 
> static int mastercmp(const FTSENT * const *, const FTSENT * const *);
> static int (*sortfcn)(const FTSENT *, const FTSENT *);

These are examples from FreeBSD:
http://svnweb.freebsd.org/base/stable/9/bin/ls/ls.c?revision=244075&view=markup

NetBSD, for example, has a slightly different definition:
http://cvsweb.netbsd.org/bsdweb.cgi/src/bin/ls/ls.c?rev=HEAD&content-type=text/x-cvsweb-markup&only_with_tag=MAIN

static int mastercmp(const FTSENT **, const FTSENT **);
static int (*sortfcn)(const FTSENT *, const FTSENT *);

> So formost:
> 1. Why the aliasing of the sort function and why do the types not match the
> man page for what fts is expecting?

The types do match the manual pages, but you have to compare the manual
pages of the given OS.  For FreeBSD that would be:
http://www.freebsd.org/cgi/man.cgi?query=fts&sektion=3

For NetBSD, that's:
http://netbsd.gw.com/cgi-bin/man-cgi?fts+3

> 2. Logically what exactly is this type : const FTSENT * const * and why was
> it set up like this?

On FreeBSD, the function prototype declares each of its arguments to
be

const FTSENT * const *;

Let's first look at what

FTSENT **a

would be:

Think of 'char **argv': it is a pointer to a list of pointers to a char.
A pointer to a char is commonly called a "string" (ie 'char *foo'), and
a pointer to a list of those things means that you have an array of
char pointers (or strings).

Now what's an 'FTSENT **a'?  Pretty much the same thing: a pointer to a
list of pointers to an FTSENT (ie an abstract datatype).  Or, in other
words, an array of FTSENT pointers.
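
To make the comparison with 'argv' concrete, here is a minimal sketch
that walks a NULL-terminated array of pointers.  'ENTRY' and
'count_entries' are made-up names for illustration; ENTRY is only a
stand-in for FTSENT, not the real fts(3) type.

```c
#include <assert.h>
#include <stddef.h>

/* ENTRY is a hypothetical stand-in for FTSENT, not the real type. */
typedef struct {
	const char *name;
} ENTRY;

/*
 * Count the entries in a NULL-terminated array of ENTRY pointers,
 * the same way you would walk 'char **argv'.
 */
size_t
count_entries(ENTRY **a)
{
	size_t n = 0;

	assert(a != NULL);
	while (a[n] != NULL)	/* a[n] is an 'ENTRY *' */
		n++;
	return n;
}
```

Given 'ENTRY *list[] = { &x, &y, NULL };', calling
'count_entries(list)' visits list[0] and list[1] and stops at the NULL.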

The use of the keyword 'const' declares that something cannot be
changed.  So if you have a function

func(const FOO *a, BAR *b)

then you know that 'func' cannot modify what the first argument points
to.  The first argument is a pointer to a const FOO, so the FOO cannot
be changed through this pointer.  (The pointer 'a' itself is only
func's local copy; to protect a pointer rather than what it points to,
the 'const' goes after the '*', as we'll see below.)  On the other
hand, the BAR that the second argument points to may be modified by
'func' (though it may choose not to do so as well).
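
As a minimal sketch of the 'func' prototype above: FOO and BAR are
hypothetical types invented for this example, each holding a single
int.  The commented-out line shows the kind of assignment the compiler
would reject.

```c
#include <assert.h>
#include <stddef.h>

/* FOO and BAR are hypothetical types, made up for this sketch. */
typedef struct { int value; } FOO;
typedef struct { int value; } BAR;

/*
 * 'a' points to a FOO we may only read; 'b' points to a BAR we may
 * also write.
 */
void
func(const FOO *a, BAR *b)
{
	assert(a != NULL && b != NULL);

	b->value = a->value;	/* fine: read via 'a', write via 'b' */
	/* a->value = 0; */	/* would not compile: '*a' is read-only */
}
```

After 'func(&f, &b)', the BAR holds a copy of the FOO's value and the
FOO itself is guaranteed to be untouched.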

So far so good, but when you have a pointer to a pointer (ie a '**'
construct), you have multiple memory locations that may or may not be
modified depending on how they're protected.

Consider the following program (available on the course website at
http://www.cs.stevens.edu/~jschauma/631/const.c):

     1	/*
     2	 * show the difference between placements of the const keyword
     3	 */
     4	
     5	
     6	#include <stdio.h>
     7	
     8	char * const foo[] = {
     9		"one",
    10		"two",
    11		"three",
    12		NULL
    13	};
    14	
    15	const char *bar[] = {
    16		"one",
    17		"two",
    18		"three",
    19		NULL
    20	};
    21	
    22	int main(int argc, char **argv) {
    23	
    24		foo[0][1] = 'w';
    25		foo[0] = "blah blah blah";
    26	
    27		bar[0][1] = 'w';
    28		bar[0] = "blah blah blah";
    29	
    30		return 1;
    31	}

When you compile this program, you will get different errors for 'foo'
and 'bar':

$ cc -Wall const.c
const.c: In function 'main':
const.c:25:2: error: assignment of read-only location 'foo[0]'
const.c:27:2: error: assignment of read-only location '*(bar[0] + 1u)'


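'foo[]' declares an array rather than a pointer, which lets me assign
values to it right in the declaration.  In most expressions an array
name turns into a pointer to its first element, and as a function
parameter 'char *argv[]' means exactly the same as 'char **argv', which
is why you also sometimes see 'int main(int argc, char *argv[])'.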

In the first case, we have 'char * const foo[]'.  In other words, we
have declared that 'foo' is an array of pointers to strings ('char
*'s), and that those pointers cannot be changed.  Hence, the compiler
detects an error when we attempt to make the first element of 'foo'
point to "blah blah blah".  But note that it lets us change the second
character of the first string, since the first pointer in the array
'foo' still points to the same memory location -- only the contents at
that location would be changed.  (The compiler accepts this, but
actually modifying a string literal at runtime is undefined behavior;
the point here is only what the types allow.)

In the second case, we have a 'const char *bar[]'.  In other words, we
have declared that 'bar' is an array of pointers to strings, but that
the characters of each one of _those_ strings are constant.  Hence the
compiler errors when we attempt to change the second character of the
string "one", since its characters were declared to be constant.  We
are allowed, however, to simply have the first element of 'bar' point
to a different memory location, since those pointers are not const.


Now finally, you can also declare _both_ to be constant.  Try this out
by adding a third array of strings to this program declared as

const char * const baz[] = { "one", "two", "three", NULL };

Then try to manipulate that array as above.  You should find that the
compiler will error on _both_ manipulation attempts, as now _both_ the
pointers in the array and the characters they point to are protected.
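
Reading through such a doubly-const array is still fine, of course.
Here is a minimal sketch; 'total_length' is a made-up helper, and the
two assignments in the comment are the ones the compiler would refuse.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

const char * const baz[] = { "one", "two", "three", NULL };

/*
 * Add up the lengths of all strings in a NULL-terminated list.
 * With both consts in place, neither of these would compile here:
 *
 *	list[0][1] = 'w';		the characters are const
 *	list[0] = "blah blah blah";	the pointers are const
 */
size_t
total_length(const char * const *list)
{
	size_t len = 0;

	assert(list != NULL);
	/* 'list' is our own local copy, so advancing it is allowed. */
	for (; *list != NULL; list++)
		len += strlen(*list);
	return len;
}
```

Note that 'list' itself is not const, only the two levels it leads to,
which is why the loop may step through the array.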

The same thing occurs in the 'mastercmp' function prototype used by
FreeBSD:  they simply went ahead and made a promise to whoever invokes
that function that neither the pointers in the array nor the FTSENTs
each one points to can be manipulated by 'mastercmp'.


Related to all of this is the more general concept of how C handles
variables:
http://denniskubes.com/2013/04/23/how-to-think-about-variables-in-c/


> 3. Why are the actual sort functions prototyped like (*sortfcn) and named
> differently? Is this some sort of wild card function name replacement?

'sortfcn' is a function pointer, which is assigned the correct sorting
function during getopt processing.  This allows the 'mastercmp' function
to simply fall through to calling 'sortfcn' without having to retain the
logic of 'if option X was given, call sortX' etc.

Each function to which 'sortfcn' can point takes as arguments two
single const FTSENT pointers, which 'mastercmp' passes to it;
'mastercmp' itself follows the function prototype required by fts(3).
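
The two-level setup can be sketched roughly as follows.  This is not
the code from ls.c: to keep it compilable anywhere, it uses qsort(3)
instead of fts_open(3), so 'mastercmp' here takes 'const void *'
arguments, and ENTRY, 'by_name', 'by_size', 'select_sort' and
'sort_entries' are all made-up names.  Only the idea is the same: one
fixed comparison function that unwraps the double pointers and hands
them to whatever 'sortfcn' currently points to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* ENTRY is a hypothetical stand-in for FTSENT with just two fields. */
typedef struct {
	const char *name;
	long size;
} ENTRY;

static int
by_name(const ENTRY *a, const ENTRY *b)
{
	return strcmp(a->name, b->name);
}

static int
by_size(const ENTRY *a, const ENTRY *b)
{
	return (a->size > b->size) - (a->size < b->size);
}

/* Assigned once during option processing; by_name is the default. */
static int (*sortfcn)(const ENTRY *, const ENTRY *) = by_name;

/* Hypothetical option handler: 'S' selects sorting by size. */
void
select_sort(int opt)
{
	sortfcn = (opt == 'S') ? by_size : by_name;
}

/*
 * qsort(3) passes pointers to the array elements, that is, pointers
 * to 'ENTRY *'.  Unwrap one level and defer to sortfcn.
 */
static int
mastercmp(const void *a, const void *b)
{
	const ENTRY * const *ea = a;
	const ENTRY * const *eb = b;

	return sortfcn(*ea, *eb);
}

void
sort_entries(ENTRY **list, size_t n)
{
	assert(list != NULL);
	qsort(list, n, sizeof(*list), mastercmp);
}
```

Calling 'select_sort('S')' before 'sort_entries(list, n)' changes the
ordering without 'mastercmp' ever needing to know which option was
given.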


I hope this all makes some sense.  If not, try to revisit this after
you've implemented your solution and see if this fits with the above
explanation.  If that is still confusing, let me know and I can go over
it again in class in one of our future lectures.

-Jan

