[clug] Splitting a file using bash
Hal Ashburner
hal at ashburner.info
Sun Sep 14 19:01:54 MDT 2014
I hadn't encountered csplit before now.
Looks like this is exactly it, and exactly what I'd hoped for when asking CLUG.
Thanks Andrew!
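
For the archive, here is a minimal sketch of how csplit might be applied to the problem quoted below. It is an illustration, not a tested replacement for the original function. The function name split_on_marker, the output prefix and the marker are hypothetical. csplit starts each new piece at the matching line, so the sketch strips that first line from every piece after the first using GNU sed -i. Newer GNU coreutils also offer csplit --suppress-matched, which may not be available on older systems. The sketch assumes fewer than 100 pieces and a marker that contains no "/".

```shell
#!/bin/bash
# Sketch: split stdin on marker lines with csplit, then drop the marker
# line that csplit leaves at the top of every piece after the first.
# split_on_marker and its arguments are illustrative names.
split_on_marker()
{
    local prefix="$1"    # output files become ${prefix}00, ${prefix}01, ...
    local marker="$2"    # basic regular expression matching the marker line
    local piece=""

    # -s: do not print byte counts; -f: output file prefix;
    # '{*}': repeat the pattern as often as the input allows;
    # "-": read from standard input.
    csplit -s -f "${prefix}" - "/${marker}/" '{*}' || return 1

    for piece in "${prefix}"[0-9][0-9]; do
        # The first piece ends just before the first marker, so it is clean.
        [[ ${piece} == "${prefix}00" ]] && continue
        # Every later piece begins with the marker line; delete it.
        sed -i '1d' "${piece}"
    done
}
```

Possible usage, where some_command stands for whatever produces the output: some_command | split_on_marker report. this_is_a_marker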
On 15 September 2014 10:57, Andrew Janke <a.janke at gmail.com> wrote:
> Isn't this what csplit is made for?
>
>
> a
>
> On 15 September 2014 10:54, Hal Ashburner <hal at ashburner.info> wrote:
>> I want to turn stdout into multiple files. I have markers where I
>> would like this to happen.
>>
>> stdout:
>> data
>> data
>> data
>> this_is_a_marker
>> data2
>> data2
>> data2
>> data2
>>
>>
>> only it's very large
>>
>> I have this function which works, but is slow.
>> Better ideas would include:
>> 1) re-write everything in another language, e.g. Python
>> 2) re-write split_reports in C
>> 3) ask CLUG if anyone has a faster way of doing this using standard
>> bash 4.1.2 or older on a redhat enterprise/centos system.
>>
>> Yeah I just asked about optimising a shell script, I already feel bad
>> and you don't have to point out that I should. ;-)
>>
>> function split_reports()
>> {
>>     local input_file="$1"
>>     local first_report="$2"
>>     local second_report="$3"
>>     # generalise the above using $@ if more than 2 needed
>>     local breaks_seen=0
>>     local line=""
>>     # clobber the first output too, so a stale .0 file is not appended to
>>     : > "${input_file}.0"
>>     # IFS= and -r keep leading whitespace and backslashes intact
>>     while IFS= read -r line
>>     do
>>         if [[ $line =~ start_report ]]; then
>>             breaks_seen=$((breaks_seen + 1))
>>             # clobber it before using it
>>             # don't write out the marker
>>             : > "${input_file}.${breaks_seen}"
>>         else
>>             case $breaks_seen in
>>                 [01]) printf '%s\n' "${line}" >> "${input_file}.${breaks_seen}" ;;
>>                 *) echo_stderr "error breaks_seen is ${breaks_seen} - should be 0-1" ;;
>>             esac
>>         fi
>>     done < "${input_file}"
>>
>>     mv "${input_file}" "${input_file}.orig"
>>     mv "${input_file}.0" "${first_report}"
>>     mv "${input_file}.1" "${second_report}"
>> }
>> --
>> linux mailing list
>> linux at lists.samba.org
>> https://lists.samba.org/mailman/listinfo/linux
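
A single awk pass is another standard-tools option for the function quoted above. It likely avoids much of the per-line cost of the shell loop, since the shell version reopens the output file for every line it appends. This is a sketch under assumptions, not the original author's method: the name split_reports_awk is hypothetical, the marker defaults to the start_report pattern used in the quoted function, and the output names follow the same ${input_file}.N convention.

```shell
#!/bin/bash
# Sketch: write each section of input_file to input_file.0, input_file.1, ...
# switching to the next file at every line that matches the marker.
# split_reports_awk is an illustrative name.
split_reports_awk()
{
    local input_file="$1"
    local marker="${2:-start_report}"   # regular expression for the marker line

    awk -v prefix="${input_file}" -v marker="${marker}" '
        # Create (and truncate) the first output before any data arrives.
        BEGIN { out = prefix ".0"; printf "" > out }
        # On a marker: finish the current file, start the next, skip the marker.
        $0 ~ marker { close(out); n++; out = prefix "." n; printf "" > out; next }
        # Any other line goes to the current output file.
        { print > out }
    ' "${input_file}"
}
```

The renaming step from the original function (the three mv calls) could follow this unchanged.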