[clug] Shell scripting problem using 'process substitution' [ >( pipeline ) ]

Bob Edwards bob at cs.anu.edu.au
Thu Nov 12 01:17:15 UTC 2015


On 12/11/15 09:12, steve jenkin wrote:
> Following a line problem, I have a monitoring script running on a low-power Linux box that uses wget to get the stats page from my (ADSL) Netcomm router/firewall.
> The script runs continuously and I rsync the file back to another machine.
> "Just because", the monitoring script only ever appends to the file, so it gets large.
>

Hi Steve,

Not really addressing your question, but I wonder why, "Just
because", you are appending to one file. This won't scale very
well and will become harder to manage as time goes on.

In bash, it is trivial to create a per-day file with the date encoded
into the file name. I would recommend something like RFC 3339 format
(YYYY-MM-DD) to keep lexical ordering of the files consistent.
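
Something like this, for example (an untested sketch; the
ADSL_WGET_ prefix and $STATS_URL are just placeholders, not your
actual names):

  # recompute the name on every pass of the monitoring loop, so the
  # output rolls over to a new file at midnight without a restart;
  # date +%F prints YYYY-MM-DD
  outfile="ADSL_WGET_$(date +%F)"
  wget -q -O - "$STATS_URL" >> "$outfile"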

rsync will have no problem copying these to your destination box.

Removing old data is as easy as: rm ADSL_WGET_2012-* etc.

And your low-power Linux box won't break when the monolithic file
fills the file-system and leaves you no wiggle room to edit it,
etc.

grep'ing over the data multiple times will then be a non-issue.
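
e.g. (assuming the per-day names above, and the same two counts as
your fragment):

  wc -l < ADSL_WGET_2015-11-10          # total samples for the day
  grep -c '0$' ADSL_WGET_2015-11-10     # drops: lines ending in 0

No tee or process substitution needed once each day is in its own
file.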

cheers,

Bob Edwards.

> Looking at the data, I could break the file into “per day” files, and then analyse them.
> It’s easier admin to leave everything in the one file and just select the day/s I want to process.
> [Manually stop / start the monitor & break-out the days into their own files. Can’t do this solely on the destination m/c because of ‘rsync’.]
>
> I found myself running the same grep (for the day) twice over the long file & counting different things, and wondered if there was a way to use a pipeline. There is no performance reason to do this - doesn’t take much time, it’s “just because” :)
> [A problem I had encountered for work a few times and never came up with a solution I liked.]
>
> I’ve got two variants below that work, but I’m not happy with the result…
> To see something useful, the ">( process )" substitutions have to write to STDERR (or /dev/tty).
>
> If I pipe the output of the ‘inner’ count to STDOUT, then it gets sucked up by the next step in the main pipeline and I won’t see it.
>
> I _could_ play with file descriptors (clone STDOUT to FD-3 for ‘inner processes' and STDOUT of last process to /dev/null at the end), but that seems a bit clumsy.
>
> The fragment as it is now can’t be ‘just used’ in a pipeline because it doesn’t output to STDOUT [throws it away], but STDERR.
>
> Anyone do anything like this?
>
> Any suggestions?
>
> Thanks in Advance
>
> steve
>
>
> Using two outputs to STDERR
>> for i in {10..12}
>> do d=15-11-${i}; echo $d
>>     grep "$d" netcomm-link-SNR | tee >( (echo 'tot: ' `wc -l`) >&2 ) >( (echo 'drops: ' `grep '0$'|wc -l`) >&2 )|cat >/dev/null
>> done
>
> variant: one output to STDERR
>> grep "$d" netcomm-link-SNR|tee >( (echo 'tot: ' `wc -l`) >&2 ) |(echo 'drops: ' `grep '0$'|wc -l`)
>
>
> File Descriptor Fiddling. Haven’t tried this properly… Has the problem of connecting to a later process with a pipe.
>   The last /dev/null creates a problem.
>> exec 3>&1
>
>> grep "$d" netcomm-link-SNR | tee >( (echo 'tot: ' `wc -l`) >&3 ) >( (echo 'drops: ' `grep '0$'|wc -l`) >&3 )|cat >/dev/null
>
>
>
> --
> Steve Jenkin, IT Systems and Design
> 0412 786 915 (+61 412 786 915)
> PO Box 48, Kippax ACT 2615, AUSTRALIA
>
> mailto:sjenkin at canb.auug.org.au http://members.tip.net.au/~sjenkin
>
>



