Error Reporting
===============

Liam R. Quin
------------


What's wrong with the way things are?

Well, they're probably OK most of the time.
But that doesn't mean that things can't be improved.

What I want from an error message is to understand what I need to do to
correct the problem.  In other words, I want to know the "why" as well as
the "what".
Whilst terseness can be excellent, it is really succinctness that I value
more.  I don't want my car to say ``?'', I want it to say it's run out of
petrol.

Really I ought to be able to have an on-going dialogue with the computer,
in which case I might be able to interrogate it for more detail.
But current Unix systems don't keep enough context around for a shell
to know why (say) a `cp' or `ls' failed, so I can't have that right now.

The best improvement I could make was to try to ensure that as many as
possible of the programs I used on a daily basis gave good error messages
whenever possible.

So I needed a scheme that could be more or less retrofitted, as well as
providing significant benefits for new programs.

The advantages of an error library are:
* it's so easy to give consistent error messages that there's no excuse
  not to do so!
* the code can be centrally maintained, and messages relating to new
  facilities can be added easily
* the code is easier to port to new or different operating systems,
  because the individual programs don't need to know much detail about the
  underlying operating system

The main features of this particular library are:
* errors are generally reported in as neutral a tone as possible -- the
  idea is not to scold users for daring to make mistakes!  The single
  most common cause of error situations is bad program design leading
  either to bugs or to difficult-to-use interfaces.
* you can start using it right away just by making a 1-line source change
  and recompiling
* messages are of the form

    program: what-went-wrong -- why-it-failed

  prefixed by $CMDNAME if it's set.
  The consistent form is intended to help -- let me know if it does!
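
A minimal sketch of how a message in that form could be put together.
This is an illustration only, not the library's actual interface: the
function name format_error() and its arguments are made up for the
example.  The only things taken from the description above are the
"program: what-went-wrong -- why-it-failed" layout and the $CMDNAME prefix.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of the library): build
 *     [$CMDNAME: ]program: what-went-wrong -- why-it-failed
 * in buf.  Returns whatever snprintf() returns, i.e. the length the
 * full message needs, which may exceed size if buf was too small.
 */
int format_error(char *buf, size_t size, const char *program,
                 const char *what, const char *why)
{
    const char *cmdname = getenv("CMDNAME");

    /* Only add the prefix when CMDNAME is set and non-empty. */
    if (cmdname != NULL && *cmdname != '\0')
        return snprintf(buf, size, "%s: %s: %s -- %s",
                        cmdname, program, what, why);
    return snprintf(buf, size, "%s: %s -- %s", program, what, why);
}
```

A caller would fill a buffer and write it to stderr.  Keeping the
formatting in one place is what makes the messages consistent across
programs.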

Motivating Examples:

From System V Release 2:
    $ mkdir /bin/hello
    mkdir: can't access /bin
    $
Clearly it means
    You don't have permission to create new entries in "/bin".

Now consider the case where you wrote a shell script, "alf" say,
that contains the lines
    mkdir $dir
    cd $dir

The user might not know you're doing this.

Some Unix systems let you say
    CMDNAME=alf
    export CMDNAME
so the user might see (at best, if the above error message was fixed)
    alf: mkdir: "/bin/alfA2134": permission denied

But wouldn't it be easy for the programmer who wrote mkdir(1) to have
done some checking?  How about
    alf: mkdir: can't make directory "/bin/alfA2134" --
		no write permission in "/bin"

Well, it wouldn't be very easy.  There are an awful lot of possible errors,
and the number grows with each new system facility.
There is an array of strings (sys_errlist[errno]) that give some
descriptions, but these are too general to be really useful.
So a library routine can help.
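
To give a feel for the kind of checking involved, here is a rough sketch
of working out a "why" after mkdir(2) has failed.  It is not the
library's code: why_mkdir_failed() and parent_of() are names invented
for this example, and only two errno values get special treatment.
Everything else falls back to strerror(), the portable way of reading
the sys_errlist text.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: copy the parent directory of path into parent.
 * Kept deliberately naive; a trailing '/' in path is not handled. */
static void parent_of(const char *path, char *parent, size_t size)
{
    const char *slash = strrchr(path, '/');

    if (slash == NULL)
        snprintf(parent, size, ".");        /* relative name: parent is cwd */
    else if (slash == path)
        snprintf(parent, size, "/");        /* directly under the root */
    else
        snprintf(parent, size, "%.*s", (int)(slash - path), path);
}

/*
 * Hypothetical helper: err is the errno left by a failed mkdir(path).
 * Writes a "why it failed" string into why.
 */
void why_mkdir_failed(const char *path, int err, char *why, size_t size)
{
    char parent[1024];

    parent_of(path, parent, sizeof parent);

    /* access() tests the real uid, and also fails when a directory
     * higher up is not searchable, so this is a guess at the cause
     * rather than a proof. */
    if (err == EACCES && access(parent, W_OK) != 0)
        snprintf(why, size, "no write permission in \"%s\"", parent);
    else if (err == ENOENT)
        snprintf(why, size, "no such directory \"%s\"", parent);
    else
        snprintf(why, size, "%s", strerror(err));   /* general fallback */
}
```

Even this small sketch has to guess, and it covers one system call.
Multiply that by every call and every errno value and the case for
doing it once, in a library, is clear.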

See the documentation for more details.

The rest of this file is for people altering or porting the code...

* If you add stuff, put it in ifdefs wherever possible.

For example,
#ifdef SIGSTOP
	sighold(SIGSTOP);
#endif
is better than
#ifdef BSD
	sighold(SIGSTOP);
#endif
because then non-BSD systems (like Unix V.4 or CCI Sys V) get your new
code automatically where appropriate...

* Try to keep the split between a "what went wrong" and a "why it failed"
  consistent.

* Always give at least as much information as perror() would, so people
  can never moan that the library is only good for beginners!  Actually it
  seems to be just as big a help to programmers!

* Please don't implement a -ken option to print nothing but `?' :-) :-) :-)
  I am not as good at diagnosing problems as he is!
  Really, this is a plea against creeping featurism.  The programmer using
  the library can use her own error printing function... there's no need
  for complexity.  In particular, if the interface is noticeably more
  complex than declaring errno and then using perror(3), it won't get
  used at all, so it is *much* better to be simple but not as good.
  LRQELIB is ALREADY TOO COMPLEX.  AND IT'S NOT EVEN FINISHED YET!

  Lee
