Status Codes (and OS/2 error codes)

Thu Jul 11 19:17:20 GMT 2002

OS/2 had 16 bit error codes.  The ERRdos range (an SMB error class)
consists mostly of error codes introduced during OS/2 development, and
could just as easily have been named "ERRos2".

Summary:
A) DOS error class - the most important one.  Most programs had to
understand this range of errors.  It had at least four specific ranges:
      1 - 0x00 to 0x12   reserved for DOS 2.x errors, i.e. the most basic,
                         common situations
      2 - 0x13 to 0x1F   reserved for remapping of "critical errors"
                         (now ERRhrd)
      3 - 0x20 to 0x58   reserved for "extended DOS errors" (DOS 3.x errors)
      > 0x58 to 2100     miscellaneous "system" errors added in OS/2 and
                         perhaps in later versions of DOS
      > 2100             "net" errors (error codes introduced in OS/2
                         relating to security, user management, file and
                         print sharing)
      > 5300             "NCB" errors (relating to NetBIOS applications)
There were various other reserved ranges that were less interesting.
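
As a rough illustration of the ranges listed above, here is a minimal Python sketch that sorts an ERRdos-class code into one of them.  The function name and the labels are made up for this example, and the handling of the exact boundary values (2100 and 5300) is an assumption, since the list above only gives them approximately.

```python
def classify_errdos(code: int) -> str:
    """Return the ERRdos range a 16 bit error code falls in.

    Range boundaries follow the summary list; treating 2100 and 5300
    themselves as the top of the lower range is an assumption.
    """
    if not 0 <= code <= 0xFFFF:
        raise ValueError("ERRdos codes are 16 bit values")
    if code <= 0x12:
        return "DOS 2.x basic error"
    if code <= 0x1F:
        return "remapped critical error (ERRhrd)"
    if code <= 0x58:
        return "extended DOS error (DOS 3.x)"
    if code <= 2100:
        return "system error (OS/2 or later DOS)"
    if code <= 5300:
        return "net error"
    return "NCB error (NetBIOS)"


if __name__ == "__main__":
    for sample in (14, 0x15, 0x20, 2223, 5301):
        print(sample, "->", classify_errdos(sample))
```

The "other reserved ranges" mentioned above are not modelled here, so anything above 5300 is simply reported as an NCB error.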

B) SRV error class - these are meant to be understood only by the
redirector and not passed back to the client application.  Think of them
as a private error range specific to "smb server as an application", not
a general range that most programmers needed to be aware of.

C) HRD error class - these are "critical errors" (think of "hardware
problems" that could cause "abort, retry, ignore" ...) and are actually
just a reflection of the reserved range in DOS for remapping of "critical
errors" to 0x13 to 0x1F

(Looking back on it, these hacks seem weird ... fortunately I didn't come
up with these schemes)

Here is some more history:
1) the oldest versions of DOS had a tiny range of error codes (0 to 0x12
and a few "critical errors"),
2) DOS 3.0 and later had one byte error codes which they called "extended
error codes" (they were returned in the lower half of the AX register, and
the carry flag was set to tell the calling program to look there),
3) OS/2 went to 16 bit error codes with reserved ranges for key
subsystems, then later some local components started using 32 bit error
codes,
4) NT eventually switched from 16 bit error codes to 32 bit status codes.
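
The DOS 3.x convention in item 2 can be sketched in a few lines of Python.  This only models the rule as described above (carry flag set means "look at the lower half of AX"); the function name and the idea of passing register values in as plain integers are assumptions made for the illustration.

```python
def dos_call_error(ax: int, carry_flag: bool):
    """Return the one byte DOS error code, or None if the call succeeded.

    ax is the 16 bit AX register value after the call; carry_flag is the
    state of the CPU carry flag.
    """
    if not carry_flag:
        return None          # carry clear: no error to report
    return ax & 0xFF         # lower half of AX holds the error code


if __name__ == "__main__":
    print(dos_call_error(0x0005, True))
    print(dos_call_error(0x1234, False))
```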

OS/2 did not send 32 bit status codes via SMB, although we did have
applications (such as PM and the Workplace Shell) which passed 32 bit error
codes among local components, and some applications probably returned longer
error codes via DCE/RPC on OS/2 - but IBM (and Microsoft) did not send 32
bit status codes via SMB in the OS/2 era.  To understand what OS/2 did
(and why SMB/CIFS is where it is) requires a little history, so here goes
...  The SMB "ERRdos" codes are really the OS/2 error set (which NT
obviously knew about too), but these were an expansion of the relatively
small set of DOS 3.0 errors (which were more similar in scope to the POSIX
set of errors).
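
For readers who want to see what the 16 bit form looks like on the wire, here is a small Python sketch that decodes the four status bytes of an SMB header as an error class plus a 16 bit code.  The field layout (one byte class, one reserved byte, little-endian 16 bit code) and the class numbers come from the published SMB/CIFS header definition rather than from this note, and the function name is made up for the example.

```python
import struct

# Error class values as defined for the SMB header status field.
ERROR_CLASSES = {0x00: "SUCCESS", 0x01: "ERRDOS", 0x02: "ERRSRV", 0x03: "ERRHRD"}


def decode_dos_status(status: bytes):
    """Split the 4 byte SMB status field into (class name, 16 bit code)."""
    if len(status) != 4:
        raise ValueError("the SMB status field is exactly 4 bytes")
    error_class, _reserved, code = struct.unpack("<BBH", status)
    name = ERROR_CLASSES.get(error_class, "class 0x%02X" % error_class)
    return name, code


if __name__ == "__main__":
    # ERRDOS class with error code 2223 (0x08AF, little-endian on the wire)
    print(decode_dos_status(b"\x01\x00\xaf\x08"))
```

When 32 bit status codes are in use the same four bytes are read as one little-endian 32 bit value instead, which this sketch does not attempt to handle.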

Early DOS (pre-DOS 3.0) had very strange error handling, but back in the DOS
days a function 59H of INT 21H was created that returned "extended error"
info - basically an "error class" (e.g. "hardware failure"), a "recommended
action" (e.g. retry) and an "error locus" (e.g. "network related") when
errors occurred.  This was not particularly extensible or workable (although
remnants of this error/action/component way of thinking about errors
persisted for more than ten years), so a standard set of generic DOS errors
was developed - which you can still see in the first 80 or so error codes
in the SMB "ERRdos" error class.  The older DOS errors were still
important for "critical errors" or hard[ware] errors (such as trying to
read from a floppy disk when nothing is in the drive).

The Win32 errors mostly overlap with the OS/2 errors in the ERRdos range
(>2100 networking errors, print errors etc.), but the error codes for Win32
print and the Windows desktop eventually diverged from those used by OS/2
print (PMSPOOL) and desktop (Presentation Manager and Workplace Shell).

The reason OS/2 had so many (10 or 20) reserved error ranges was to allow
interesting generic errors to be passed back transparently through layers
of the operating system to applications without constant remapping, without
losing information, and without having to live with a tiny set of generic
errors (as Unix/POSIX apps often do).
So on OS/2 you could know that small error codes were "system" errors, and a
program could do the equivalent of what users could do by typing "HELP
SYS14" and get information from a system "message" file and system "help"
file (something along the lines of "Not enough storage."  "Free some space
and try again.").  Similarly "HELP NET2223" would look in the files
net.msg and neth.msg and retrieve network related information (in this case
something like "you tried creating a group with a name that already
exists....  try using a different group name").  You can see the same thing
on NT by typing "NET HELPMSG xxx" where xxx is the OS/2 or SMB ("ERRdos"
class) error number (NT assumes that these error ranges do not overlap,
which is mostly true).

Similarly on OS/2 if you added an application named "foo" you could request
a reserved error range, add foo.msg and fooh.msg files to the local system,
and various system utilities and error message management APIs would
automagically be able to resolve requests for more information about an
error, e.g. "help foo754".  Perhaps more importantly, it was easy to handle
translations of errors into multiple languages.

In any case the assumption is always that both local and networked
applications need to handle the common errors (in some cases the actual
expected errors were noted carefully), and it does avoid some of the need
for remapping of errors in every layer of the operating system, kernel,
filesystem, runtime libraries etc.  The disadvantage of the OS/2 (ERRdos)
approach vs. the POSIX errors is (among other reasons) the sheer size -
there are lots of error codes that could conceivably be passed back to your
application on any particular call, indirectly passed back from code way
below the library function that you are invoking.
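
To make the "HELP NET2223" style of lookup a bit more concrete, here is a minimal Python sketch that splits a message id into its component prefix and error number, and derives the message and help file names.  The file naming rule is extrapolated from the two examples given above (net.msg/neth.msg and foo.msg/fooh.msg) and should not be assumed to hold for every component; the function names are invented for the illustration.

```python
import re


def parse_message_id(message_id: str):
    """Split an id such as "NET2223" into ("NET", 2223)."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", message_id.strip())
    if match is None:
        raise ValueError("expected letters followed by digits: %r" % message_id)
    return match.group(1).upper(), int(match.group(2))


def message_files(component: str):
    """Guess the (message file, help file) pair for a component prefix.

    Assumption: the names follow the net.msg / neth.msg pattern.
    """
    base = component.lower()
    return base + ".msg", base + "h.msg"


if __name__ == "__main__":
    component, number = parse_message_id("NET2223")
    print(component, number, message_files(component))
```

A real lookup would then open those files and search for the numbered entry, which is left out here because the message file format is not described in this note.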

When Microsoft changed their error handling in NT, they added some useful
features (like being able to register a DLL to handle certain error
lookups) but it is more confusing due to the sheer number of status codes
(it makes me wish for the error handling and exception handling flexibility
of Java and RMI).

Date: Thu, 11 Jul 2002 00:03:19 -0500
From: "Christopher R. Hertel" <crh at ubiqx.mn.org>
To: Tim Potter <tpot at samba.org>
Cc: Andrew Bartlett <abartlet at samba.org>, samba-technical at samba.org,
 jcifs at samba.org
Subject: Re: Status Codes.

There is obviously a lot more to know about the Status codes than I was
aware.  I had heard it was a royal pain...

Does the client negotiate the set of codes it will use?  If it wants DOS
codes will it only get DOS codes, etc., or is there some overlap?  How
does a client know/control which codeset it's getting?

I know about CAP_STATUS32, but all the docs I have say is that
it means 32-bit status codes will be returned.  How do you know which set?

(...and does anybody--Steve? You there?--know anything about the OS/2
Status codes?)

Thanks!

Chris -)-----

Steve French
Senior Software Engineer
Linux Technology Center - IBM Austin
phone: 512-838-2294
email: sfrench at us.ibm.com