[clug] Splitting a file using bash

Hal Ashburner hal at ashburner.info
Sun Sep 14 19:01:54 MDT 2014


I hadn't encountered csplit before now.
It looks like exactly the right tool, and exactly the kind of answer I'd hoped for when asking CLUG.
Thanks Andrew!
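
For the archive, here is a minimal sketch of how csplit might be applied to this problem. It assumes GNU coreutils csplit and GNU sed, and it reuses the marker text start_report from the function quoted below. The function name split_reports_csplit and the output prefix argument are hypothetical names chosen for illustration. The marker is stripped with sed rather than with csplit's --suppress-matched option, since older coreutils releases may not have that option.

```shell
#!/bin/bash
# Sketch only: split a file at marker lines using csplit (GNU coreutils).
# split_reports_csplit and its arguments are hypothetical names.
split_reports_csplit()
{
    local input_file="$1"
    local prefix="$2"
    local piece
    # -s: do not print the size of each piece; -f: prefix for output names.
    # '/start_report/' ends a piece just before each matching line, and
    # '{*}' repeats that pattern for as long as the input allows.
    csplit -s -f "${prefix}" "${input_file}" '/start_report/' '{*}' || return 1
    # Pieces are named PREFIX00, PREFIX01, ... Every piece after the first
    # starts with the marker line, so delete that first line.
    for piece in "${prefix}"[0-9][0-9]; do
        [[ -e ${piece} ]] || continue
        [[ ${piece} == "${prefix}00" ]] && continue
        sed -i '1d' "${piece}"
    done
}
```

To split a stream rather than a file, csplit accepts - as the input name, for example: some_command | csplit -s -f report. - '/start_report/' '{*}' (some_command is a placeholder).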

On 15 September 2014 10:57, Andrew Janke <a.janke at gmail.com> wrote:
> Isn't this what csplit is made for?
>
>
> a
>
> On 15 September 2014 10:54, Hal Ashburner <hal at ashburner.info> wrote:
>> I want to turn stdout into multiple files. I have markers where I
>> would like this to happen.
>>
>> stdout:
>> data
>> data
>> data
>> this_is_a_marker
>> data2
>> data2
>> data2
>> data2
>>
>>
>> only it's very large
>>
>> I have this function which works, but is slow.
>> Better ideas would include:
>> 1) re-write everything in another language, e.g. Python
>> 2) re-write split_reports in C
>> 3) ask CLUG if anyone has a faster way of doing this using standard
>> bash 4.1.2 or older on a redhat enterprise/centos system.
>>
>> Yeah I just asked about optimising a shell script, I already feel bad
>> and you don't have to point out that I should. ;-)
>>
>> function split_reports()
>> {
>>     local input_file="$1"
>>     local first_report="$2"
>>     local second_report="$3"
>>     # generalise the above using $@ if more than 2 needed
>>     local breaks_seen=0
>>     local line=""
>>     while IFS= read -r line
>>     do
>>         if [[ $line =~ start_report ]]; then
>>             breaks_seen=$((breaks_seen + 1))
>>             # clobber it before using it
>>             # don't write out the marker
>>             : > "${input_file}.${breaks_seen}"
>>         else
>>             case $breaks_seen in
>>                 [0-9]) echo "${line}" >> "${input_file}.${breaks_seen}" ;;
>>                 *) echo_stderr "error breaks_seen is ${breaks_seen} - should be 0-1";;
>>             esac
>>         fi
>>     done < "${input_file}"
>>
>>     mv "${input_file}" "${input_file}.orig"
>>     mv "${input_file}.0" "${first_report}"
>>     mv "${input_file}.1" "${second_report}"
>> }
>> --
>> linux mailing list
>> linux at lists.samba.org
>> https://lists.samba.org/mailman/listinfo/linux
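
If the loop has to stay in plain bash, one likely cost in the function quoted above is that every echo with >> reopens the output file for a single line. The sketch below keeps the same logic but holds the current output file open on file descriptor 3 and only reopens it when a marker is seen. The name split_reports_fd is hypothetical, and whether this is fast enough for very large input is an assumption that would need measuring.

```shell
#!/bin/bash
# Sketch only: same splitting logic as the quoted function, but the output
# file is opened once per report instead of once per line.
# split_reports_fd is a hypothetical name.
split_reports_fd()
{
    local input_file="$1"
    local breaks_seen=0
    local line=""
    # Open (and truncate) the first output file on descriptor 3.
    exec 3> "${input_file}.0"
    # The || test keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [[ -n ${line} ]]
    do
        if [[ ${line} =~ start_report ]]; then
            breaks_seen=$((breaks_seen + 1))
            # Point descriptor 3 at the next file; the marker is not written.
            exec 3> "${input_file}.${breaks_seen}"
        else
            printf '%s\n' "${line}" >&3
        fi
    done < "${input_file}"
    # Close descriptor 3 when done.
    exec 3>&-
}
```

This writes input_file.0, input_file.1 and so on, which can then be renamed as in the original function.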

