[cs631apue] Openmax.c fails earlier than expected

Jan Schaumann jschauma at stevens.edu
Sun Sep 20 14:41:54 EDT 2020


Charles Magyar IV <cmagyar at stevens.edu> wrote:
> 
> pstat -f shows 164 / 3404 open files.
> 
> 3404 - 164 = 3240
> 
> File # 3248 fails with errno 23.
> 
> On a later test
> pstat shows 166/3404 open files
> 3404 -166 = 3238
> 
> File # 3246 fails with errno 23.
> 
> Both times these numbers are off by 8.

This looks correct.  (errno 23 is ENFILE: the
system-wide file table is full.)  Let's figure out why
we're off by 8:

pstat(8) reports 166 files to be open.  Our program
manages to open 3246 files (we start counting at #0
and the last successful open is #3245, ergo 3246 files
open).  3404 - 3246 = 158.

This suggests that when our program is running, there
are 158 other files open on the system.

But pstat(8) reports 166 files to be open.
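
As a quick sanity check, here is the same arithmetic as
a small shell sketch.  The function name and argument
order are made up for illustration; the numbers are the
ones from the two test runs quoted above.

```sh
# gap LIMIT PSTAT_OPEN FIRST_FAILED
# Print how many more open files pstat(8) saw than were
# actually open while our program was running.
gap() {
    limit=$1
    pstat_open=$2
    first_failed=$3
    # Files #0 .. #(first_failed - 1) opened fine, so the
    # program held first_failed files when it hit the limit.
    others=$((limit - first_failed))
    echo $((pstat_open - others))
}

gap 3404 164 3248   # first run
gap 3404 166 3246   # second run
```

Both calls print 8.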

What's different between when our program is running
and when pstat(8) is running?

Well, when our program is running, pstat(8) is _not_
running, and when pstat(8) is running, our program is
_not_ running.

How does pstat(8) figure out which files are open?
Per the manual page, it looks at /dev/kmem.  So it
must open that file in the process.  Maybe it's
opening other files?

Let's trace it:

Since /dev/kmem is protected, and pstat(8) is setgid,
we'll need to trace it as the superuser:

$ sudo ktrace -i pstat -f

The ktrace(1) utility allows you to inspect which
system calls a process makes.  (On Linux, a similar
program exists: strace(1).)

It writes the trace data to the file 'ktrace.out',
and you can turn it into human-readable format via
kdump(1).

$ kdump > dump

Now let's grep(1) out all open(2) calls:

$ grep -A2 "open(" dump

You'll find that it first opens a bunch of libraries
and closes them again (the file descriptor returned by
the next open(2) call is the same as the one before).
Then it opens a few files and keeps them open (the
file descriptors increase).  The files in question
are:

/dev/ksyms (on fd3)
/dev/mem (on fd4)
/dev/kmem (on fd5)
/dev/drum (on fd6)
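
If the grep(1) output is too noisy, the open(2) records
can be condensed to one line per successful call.  This
is only a sketch: the function name is made up, and it
assumes that kdump(1) prints each open as a CALL line,
a NAMI line with the quoted path, and a RET line with
the returned descriptor, which may differ between
versions.  Paths containing spaces are not handled.

```sh
# opens_by_fd [FILE...]
# Print "fd path" for every successful open(2) found in
# kdump(1)-style output (read from FILE or stdin).
opens_by_fd() {
    awk '
        {
            # Find the record type; its column depends on
            # which kdump flags were used.
            type = ""; arg = ""; rest = ""
            for (i = 1; i <= NF; i++)
                if ($i == "CALL" || $i == "NAMI" || $i == "RET") {
                    type = $i; arg = $(i + 1); rest = $(i + 2)
                    break
                }
        }
        # Remember whether the most recent system call was open().
        type == "CALL" { pending = (arg ~ /^open\(/); next }
        # The NAMI record after the call carries the path.
        pending && type == "NAMI" { path = arg; gsub(/"/, "", path); next }
        # The RET record carries the descriptor (or -1 on failure).
        pending && type == "RET" && arg == "open" {
            if (rest + 0 >= 0) print rest, path
            pending = 0
        }
    ' "$@"
}
```

Run it as "opens_by_fd dump".  A descriptor number that
shows up more than once was closed and reused in
between; numbers that keep increasing belong to files
that stayed open.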

In addition, the program will have stdin, stdout, and
stderr (fds 0, 1, and 2) open, so that's a total of 7
files, meaning we're missing one.

I'm going to guess that you didn't just run "pstat
-f", but perhaps "pstat -f | wc -l" or "pstat -f |
head" or another pipeline.  In that case, you get an
extra file handle for the pipe, meaning you will see 8
open files when running "pstat -f | wc -l" that you
won't see when running "a.out".

In other words, we are seeing the observer effect
(popularly, if loosely, attributed to Heisenberg) in
action: our observing the system changes the
system. :-)

> My hunch about fstat vs pstat difference (3):
> fstat VS pstat has something to do with
>  stderr, stdout, and stdin

Indeed.  In our case, we ran

$ fstat -A | sort -k1,1 | grep pts/0
ffffcf1371e74040 jschauma sh          1995    0 /dev/pts        3 crw--w----   pts/0 rw
ffffcf1371e74040 jschauma sh          1995    1 /dev/pts        3 crw--w----   pts/0 rw
ffffcf1371e74040 jschauma sh          1995    2 /dev/pts        3 crw--w----   pts/0 rw
...
$ fstat -A | sort -k1,1 | grep -c pts/0
17
$ pstat -f | grep ffffcf1371e74040
ffffcf1371e74040 file        RW   14    0 ffffcf136cc22970   0  372401

But we didn't _just_ run "fstat", we ran "fstat | sort
| grep".

Every command in a pipeline adds three file
descriptors, initially connected to the terminal when
the process starts up, so each extra stage raises the
count by three:

$ fstat -A | sort -k1,1 | grep -c pts/0
11
$ fstat -A | sort -k1,1 | grep pts/0 | wc -l
      14
$ fstat -A | sort -k1,1 | grep pts/0 | cat | wc -l
      17

This explains why you see three more file descriptors
in use when running "fstat | sort | grep" compared to
"pstat | grep", and you can check the other way around
as well:

$ pstat -f | grep ffffcf1371e74040
ffffcf1371e74040 file        RW   14    0 ffffcf136cc22970   0  372401
$ pstat -f | grep ffffcf1371e74040 | cat
ffffcf1371e74040 file        RW   17    0 ffffcf136cc22970   0  372401
$ pstat -f | grep ffffcf1371e74040 | cat | cat
ffffcf1371e74040 file        RW   20    0 ffffcf136cc22970   0  372401
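
To repeat this for longer pipelines without typing them
out, a small helper can build the command line.  This
is a sketch; append_cats is a made-up name, and the
string it prints is meant to be handed to eval in the
same interactive shell.

```sh
# append_cats N 'CMD'
# Print CMD followed by N "| cat" stages.
append_cats() {
    n=$1
    pipeline=$2
    while [ "$n" -gt 0 ]; do
        pipeline="$pipeline | cat"
        n=$((n - 1))
    done
    printf '%s\n' "$pipeline"
}

# Show the command line for two extra stages.
append_cats 2 'pstat -f | grep ffffcf1371e74040'

# To actually run the comparison (not done here):
# for i in 0 1 2; do
#     eval "$(append_cats "$i" 'pstat -f | grep ffffcf1371e74040')"
# done
```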


There's still a bit more going on here, if you like
another challenge:

(I said that every command in the pipeline adds three
file descriptors connected to the terminal, but in a
pipeline like "cmd | cmd1 | cmd2", aren't stdin and
stdout of "cmd1" connected to the pipe, and not the
terminal?  If so, can you make sense of the numbers
shown above?)

-Jan

