[clug] DGSH - directed graph shell. adding parallelism to shell & pipes

Brenton Ross rossb at fwi.net.au
Tue Jul 25 05:29:18 UTC 2017


Luke,

Thanks for your comments.

It would appear that "dgsh-wrap" is for the purpose of interfacing
normal stdin/stdout programs to dgsh, as you suggested.

Good point about not working if the program does a seek. I will need to
add something to the user guide to warn users if this ever gets
implemented. It would be hard for users to know if a program was going
to seek on one of its files.
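
For illustration, here is a minimal Python sketch of the failure mode:
lseek() on a pipe fails with ESPIPE, so a program that seeks on a file
it has been handed will break when that "file" is really a pipe or
FIFO. The helper name is_seekable is hypothetical and is not part of
VICI or dgsh. It only tests a descriptor; it cannot predict whether a
given program will try to seek.

```python
import errno
import os


def is_seekable(fd: int) -> bool:
    """Return True if lseek() works on fd, False for a pipe, FIFO or socket."""
    try:
        # A zero-offset seek from the current position changes nothing,
        # but still fails with ESPIPE on descriptors that cannot seek.
        os.lseek(fd, 0, os.SEEK_CUR)
        return True
    except OSError as exc:
        if exc.errno == errno.ESPIPE:
            return False
        raise


if __name__ == "__main__":
    read_end, write_end = os.pipe()
    print("pipe seekable:", is_seekable(read_end))
    os.close(read_end)
    os.close(write_end)
```

A check like this could be run on a descriptor before passing it to a
tool as /dev/fd/N, but the warning in the user guide would still be
needed because the tool itself decides whether to seek.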

The manifold should be relatively straightforward to implement with the
existing internals of VICI - it's mostly just more threads and pipes, which
make up the bulk of the runtime anyway. I suspect creating an icon for
it will be the most time-consuming part.
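
As a rough idea of the shape, here is a small Python sketch of a
manifold with one writer thread per output stream. It is not VICI code:
the names manifold and _drain, the mode strings, and the use of
in-memory queues are all assumptions made for the example. It reads its
inputs sequentially and supports only the "copy" and "distribute" output
modes; the merge and parallel modes are not covered.

```python
import queue
import threading
from typing import Iterable, List, TextIO


def _drain(q: queue.Queue, out: TextIO) -> None:
    """Writer thread: copy queued lines to one output until the None sentinel."""
    while True:
        line = q.get()
        if line is None:
            break
        out.write(line)
    out.flush()


def manifold(inputs: Iterable[TextIO], outputs: List[TextIO],
             mode: str = "copy") -> None:
    """Read each input in turn and feed lines to the outputs.

    mode "copy":       every line goes to every output.
    mode "distribute": lines go to the outputs in round-robin order.
    """
    if mode not in ("copy", "distribute"):
        raise ValueError("unknown mode: " + mode)
    if not outputs:
        raise ValueError("at least one output stream is required")

    # One unbounded queue and one thread per output, so a slow reader on
    # one output does not block delivery to the others.
    queues = [queue.Queue() for _ in outputs]
    writers = [threading.Thread(target=_drain, args=(q, out))
               for q, out in zip(queues, outputs)]
    for w in writers:
        w.start()

    count = 0
    try:
        for stream in inputs:          # exhaust each input before the next
            for line in stream:
                if mode == "copy":
                    for q in queues:
                        q.put(line)
                else:                  # distribute
                    queues[count % len(queues)].put(line)
                    count += 1
    finally:
        for q in queues:               # tell every writer to finish
            q.put(None)
        for w in writers:
            w.join()
```

The streams can be anything that iterates by line and has write(), so
the same function works with files opened on named pipes. A real
implementation would probably want bounded queues to apply back
pressure.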

Brenton

On Tue, 2017-07-25 at 13:41 +1000, Luke Mewburn wrote:

> On Fri, Jul 14, 2017 at 01:58:33PM +1000, Brenton Ross via linux wrote:
>   | I've had a preliminary look at dgsh, and I'm not overly taken with the
>   | approach they took.
>   | They have replaced the normal Unix pipe interface for stdin and stdout
>   | with sockets, which means that the core utilities (and anything else you
>   | want to use via pipes) have to be the modified versions for dgsh.
> 
> Does the "dgsh-wrap" tool they provide assist with interfacing with
> existing stdin/stdout tools?
> 	https://www.spinellis.gr/sw/dgsh/dgsh-wrap.html
> 
> Could you just (ab)use socat to interface between stdin/stdout and
> the dgsh sockets? I've used that technique elsewhere; socat is awesome,
> (if complex to use):
> 	http://www.dest-unreach.org/socat/
> 
> 
>   | However, it got me wondering if there was another way, one that did not
>   | require modifying the programs.
>   | 
>   | I think I could add a couple of extensions to VICI that would cover a
>   | lot of dgsh's capabilities, and have some further advantages.
>   | 
>   | The first change would be to introduce named streams - the data flows
>   | could be given a label. If a program connected to a named stream used
>   | the name as a filename parameter, then VICI would substitute the label
>   | with the path to a Unix named pipe. This would allow programs to connect
>   | to multiple pipes. Of course it would not help for the cases where dgsh
>   | has modified the actual interface to the program, such as grep having
>   | multiple inputs and outputs, but you could create a modified grep with
>   | that capability that would still be compatible with bash etc.
> 
> If your platform provides /dev/fd/* (which Linux does), creative
> use of shell redirection to fds in the invocation of the command,
> and providing /dev/fd/.. as filenames may just work.
> (This can fail when tools assume that a file is seekable.)
> 
> 
>   | The second change is to introduce what I call a "manifold". This object
>   | can have any number of stdin and stdout streams. It would have several
>   | modes of operation:
>   | 
>   |      1. Sequential, where it reads from its first stream until it's
>   |         exhausted (closed), then reads from the second until that is
>   |         finished, etc
>   |      2. Merge, where any input is sent immediately to the output (line
>   |         by line)
>   |      3. Parallel, where reading blocks until something is ready on all
>   |         the input streams. This would help to synchronise processing.
>   |      4. Copy, where each input is sent to all the output streams
>   |      5. Distribute, where the input lines are sent to the output streams
>   |         in round-robin fashion.
>   | 
>   | The manifold would start a new thread for each of its output streams to
>   | achieve the multiprocessing capability of dgsh.
>   | 
>   | Hence, I think it would have been possible to create dgsh without having
>   | to fork the core utility programs to create a new set of incompatible
>   | programs.
> 
> That manifold idea is interesting.
> 
> 
> As an implementation detail, personally I would probably experiment /
> prototype that tool in python using an async I/O mechanism and some
> generator trickery, rather than using a thread per stream. 
> 
> (Or just write it in C++ and play with boost::asio; only using
> threads as a thread pool behind the boost::asio io_service runner.
> I digress :)
> 
> That's just a personal choice - YMMV.
> 
> 
> 
> cheers,
> Luke.



