[distcc] Network protocol

Martin Pool mbp at sourcefrog.net
Wed Feb 5 11:30:48 GMT 2003


On  5 Feb 2003, Brad Hards <bhards at bigpond.net.au> wrote:

> I'm working on an ethereal dissector for distcc, and it is starting to come 
> together.

Why?  Just for completeness in ethereal, or something else?  Not that
I mind.

> However I have some queries. I might have some of it wrong too. Any 
> corrections appreciated.
> 
> The protocol starts off with the client (distcc application) sending DIST 
> (literally), then the version number packed into 8 ascii digits.

If the version changes, then the sequence of following packets will
change too.  (For example, we might introduce a compression control
packet.)

Also, in the future, the client may be able to send multiple requests
over a single socket.  But this may not happen, so don't worry too much.

> The client then sends ARGC (literally) followed by the number of arguments, 
> again packed into 8 ascii digits. Then come the various arguments as ascii 
> text, where each argument is preceded by ARGV and the length as an ascii 
> encoded hex string.
> 
> Next is the literal DOTI, followed by the length of the preprocessed C text, 
> followed by the preprocessed C text itself. This normally spans more than one 
> packet - the following packets just have the TCP/IP header, and the text 
> continues.
> Question: What does DOTI mean?

The canonical extension for preprocessed C source is ".i" -> therefore
"dot i".

Of course the protocol is defined at the level of the TCP stream.
Since it uses TCP corks to try to force a small number of packets, it
may be hard to decode unless Ethereal can work on the reassembled
stream.  My understanding is that at the moment it cannot.

Every packet follows this common format:

  TOKEN PARAMETER DATA

Whether the data is present or not depends on the token.  If it is,
its length is always given by the parameter.
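
To make the framing concrete, here is a minimal Python sketch of a
reader for this format, working on a reassembled byte stream.  It is
an illustration only, not distcc's own code.  It assumes every token
is 4 ASCII characters and every parameter is 8 ASCII hex digits, and
it takes the set of data-carrying tokens from the ones discussed in
this thread.  The names read_packets, read_exact and TOKENS_WITH_DATA
are made up for the example.

```python
import io

# Assumption: these are the tokens whose parameter is a byte length
# followed by that many bytes of data (taken from this thread only).
TOKENS_WITH_DATA = {b"ARGV", b"DOTI", b"SERR", b"SOUT", b"DOTO"}


def read_exact(stream, n):
    """Read exactly n bytes or raise EOFError on a short read."""
    buf = stream.read(n)
    if len(buf) != n:
        raise EOFError("stream ended in the middle of a packet")
    return buf


def read_packets(stream):
    """Yield (token, parameter, data) tuples until the stream ends."""
    while True:
        token = stream.read(4)
        if not token:
            return  # clean end of stream
        if len(token) != 4:
            raise EOFError("truncated token")
        # Assumption: the parameter is always 8 ASCII hex digits.
        param = int(read_exact(stream, 8).decode("ascii"), 16)
        data = read_exact(stream, param) if token in TOKENS_WITH_DATA else b""
        yield token, param, data


def encode(token, param, data=b""):
    """Build one packet; used here only to make sample input."""
    return token + b"%08x" % param + data


source = b"int x = 1;\n"
sample = (encode(b"DIST", 1) + encode(b"ARGC", 2)
          + encode(b"ARGV", 3, b"gcc") + encode(b"ARGV", 2, b"-c")
          + encode(b"DOTI", len(source), source))
packets = list(read_packets(io.BytesIO(sample)))
```

Because the reader consumes a byte stream rather than individual TCP
segments, it does not matter how the data was split into packets on
the wire.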

> Then the volunteer (running distccd) responds.
> It starts with DONE (literally) then the version number.
> Then it responds with STAT (literally), and some error code (normally zero).
> Question: What do the various codes mean?

It's a Unix waitstatus.  The low 7 bits give the signal that
terminated the program (bit 7 is the core-dump flag); the second-lowest
byte gives the exit code.  For success, both are 0.
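
As a rough sketch, the STAT value could be split up like this in
Python.  It assumes the traditional waitstatus layout (signal number
in the low 7 bits, core-dump flag in bit 7, exit code in the next
byte) and decodes by hand rather than with os.WIFEXITED and friends,
since the status comes from the remote machine.  The function name
decode_status is made up for the example.

```python
def decode_status(status):
    """Split a waitstatus into (signal, core_dumped, exit_code).

    Assumes the traditional Unix layout; other platforms may differ.
    """
    signal = status & 0x7F            # low 7 bits: terminating signal
    core_dumped = bool(status & 0x80)  # bit 7: core-dump flag
    exit_code = (status >> 8) & 0xFF   # second-lowest byte: exit code
    return signal, core_dumped, exit_code


success = decode_status(0x0000)     # both signal and exit code are 0
failed = decode_status(0x0100)      # exited with code 1
segfault = decode_status(0x008B)    # signal 11, core dumped
```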

> Then it sends the literal SERR (which I guess indicates the output to standard 
> error), and then the length of the string (again, ascii encoded hex string), 
> and the standard error message. I have seen this span multiple packets, 
> although I don't currently handle this.
> 
> Then it sends the literal SOUT (which I guess indicates the output to the 
> standard output stream), and the length of the string, and then the string. I 
> guess that this can span multiple packets, but I don't handle it, nor have I 
> seen it.

Yes.  Compilers should never (?) write to stdout, so the SOUT length
will almost always be 0.

> Then you get the literal DOTO, followed by the length of the compiled output, 
> and the compiled output. This normally spans multiple packets too.
> Question: What does DOTO mean?

".o" -> "dot o".

foo.c->foo.i->foo.s->foo.o

Kind of a pun.

-- 
Martin


