[clug] Splitting a file using bash
Andrew Janke
a.janke at gmail.com
Sun Sep 14 18:57:41 MDT 2014
Isn't this what csplit is made for?
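
Something along these lines might do it. This is only a rough sketch: the function name split_at_marker and its arguments are made up for illustration, and it assumes the marker is a plain regex containing no "/" and that there is exactly one marker in the input. It avoids --suppress-matched, since older coreutils may not have that option, and drops the marker line with tail instead.

```shell
#!/bin/bash
# Sketch: split a file at the first line matching a marker, using csplit.
# split_at_marker and its argument names are illustrative, not from the
# original script.
split_at_marker() {
    local input_file="$1" marker="$2" first_report="$3" second_report="$4"

    # -s: do not print piece sizes; -f: output prefix; -n 1: one-digit suffix.
    # Produces "${input_file}.0" (the lines before the marker) and
    # "${input_file}.1" (the marker line and everything after it).
    csplit -s -f "${input_file}." -n 1 "$input_file" "/${marker}/" || return 1

    mv "${input_file}.0" "$first_report"
    # csplit leaves the marker line at the top of the second piece; drop it.
    tail -n +2 "${input_file}.1" > "$second_report"
    rm -f "${input_file}.1"
}
```

Called as: split_at_marker report.txt 'this_is_a_marker' first.txt second.txt. Unlike the quoted function, this leaves the input file where it is. csplit also accepts "-" as the file name to read standard input, so the intermediate file could probably be skipped.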
a
On 15 September 2014 10:54, Hal Ashburner <hal at ashburner.info> wrote:
> I want to turn stdout into multiple files. I have markers where I
> would like this to happen.
>
> stdout:
> data
> data
> data
> this_is_a_marker
> data2
> data2
> data2
> data2
>
>
> only it's very large
>
> I have this function which works, but is slow.
> Better ideas would include:
> 1) re-write everything in another language, e.g. Python
> 2) re-write split_reports in C
> 3) ask CLUG if anyone has a faster way of doing this using standard
> bash 4.1.2 or older on a redhat enterprise/centos system.
>
> Yeah I just asked about optimising a shell script, I already feel bad
> and you don't have to point out that I should. ;-)
>
> function split_reports()
> {
>     local input_file="$1"
>     local first_report="$2"
>     local second_report="$3"
>     # generalise the above using $@ if more than 2 needed
>     local breaks_seen=0
>     local line=""
>     # IFS= and -r keep leading whitespace and backslashes intact
>     while IFS= read -r line
>     do
>         if [[ $line =~ start_report ]]; then
>             breaks_seen=$((breaks_seen + 1))
>             # clobber it before using it
>             # don't write out the marker
>             : > "${input_file}.${breaks_seen}"
>         else
>             case $breaks_seen in
>                 [0-9]) printf '%s\n' "${line}" >> "${input_file}.${breaks_seen}" ;;
>                 *) echo_stderr "error breaks_seen is ${breaks_seen} - should be 0-1" ;;
>             esac
>         fi
>     done < "${input_file}"
>
>     mv "${input_file}" "${input_file}.orig"
>     mv "${input_file}.0" "${first_report}"
>     mv "${input_file}.1" "${second_report}"
> }
> --
> linux mailing list
> linux at lists.samba.org
> https://lists.samba.org/mailman/listinfo/linux