[clug] Debian/GNU 'find . -ls' oddity: outputs UTF-8 chars as quasi-octal strings - \314\201, not \0314\0201
David Deaves
David.Deaves at dd.id.au
Thu Sep 29 00:58:16 UTC 2016
> I have a work around using 'xargs ls -dlis', but I’d prefer to be able to use the supplied ‘find’ argument.
> Right now I have a 1.5M line file sprinkled with ‘octal’ strings that I need to convert back.
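I would expect that this is a job for sed, something like:
sed -e 's/\\\\/\n/g' -e 's/\\\([0-3][0-7][0-7]\)/\\0\1/g' -e 's/\n/\\\\/g'
should handle the octal fix. The first and last expressions park any escaped
backslash (\\) on a newline placeholder (GNU sed), so only real \NNN escapes
get the extra 0, including back-to-back ones like \314\201. Extend it for
\r, \t, \b ...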
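To make the conversion concrete, here is a minimal sketch, assuming GNU sed (for the \n placeholder) and a printf whose %b understands \0NNN octal escapes. The function name fix_octal and the sample file name are made up for illustration. An escaped backslash (\\) is parked on a newline first so that only real \NNN escapes gain the extra 0:

```shell
# fix_octal (hypothetical helper): rewrite \NNN escapes on stdin as \0NNN.
fix_octal() {
    sed -e 's/\\\\/\n/g' \
        -e 's/\\\([0-3][0-7][0-7]\)/\\0\1/g' \
        -e 's/\n/\\\\/g'
}

# Sample name as 'find -ls' would print "e" followed by U+0301 (bytes 0xCC 0x81).
line='cafe\314\201.txt'

fixed=$(printf '%s\n' "$line" | fix_octal)
printf '%s\n' "$fixed"    # cafe\0314\0201.txt

# %b expands the \0NNN escapes, restoring the original UTF-8 bytes.
printf '%b\n' "$fixed"
```

Once the escapes are in \0NNN form, printf '%b' turns them back into the original bytes, and a literal backslash that find printed as \\ comes out as a single backslash again.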
I have always seen the '-ls' flag of GNU find as being for human consumption,
which matches the motivation for quoting given in the man page (special
characters in file names may frob your terminal).
If, going forward, your goal is a script that needs a file list with
extra info (like that provided by the -ls flag), then I would recommend using
the -printf flag to select the extra fields that you want with the file name,
e.g.:
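find . -type f -printf '%T@ %k %p\n' |
sort -n |
while read -r mtime_in_sec_since_epoch size_in_k file
do
    ## Process list of files in time order knowing their size in kilobytes
    ## (read -r keeps backslashes in names intact; ':' is a placeholder body)
    : "$mtime_in_sec_since_epoch" "$size_in_k" "$file"
done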
Note: this one won't handle a '\n' in the file name, but in that particular
use case I actually had control of the file names; this is more about showing
the power of the -printf flag. Before GNU find, I used to use my own binary
'lss' to get seconds_since_epoch on traditional Unix platforms.
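For names that may contain a '\n', one possible variation is to end each record with a NUL instead of a newline. This is only a sketch, assuming GNU find, sort and xargs; the function name list_by_mtime and the output format are invented for illustration:

```shell
# list_by_mtime DIR (hypothetical helper): report files under DIR, oldest first.
# Records are NUL-terminated, so a newline inside a file name cannot split one.
list_by_mtime() {
    find "$1" -type f -printf '%T@ %k %p\0' |
    LC_ALL=C sort -z -n |
    xargs -0 -r -n 1 sh -c '
        rec=$1
        mtime=${rec%% *}      # text before the first space
        rest=${rec#* }        # everything after the first space
        size_k=${rest%% *}    # size in 1K blocks
        file=${rest#* }       # the remainder is the name, spaces and all
        printf "%s KiB, mtime %s: %s\n" "$size_k" "${mtime%.*}" "$file"
    ' sh
}

list_by_mtime .
```

Spawning one sh per file is slow on big trees. In bash, a 'while IFS= read -r -d "" rec' loop reading the same NUL-terminated stream would avoid that, at the cost of portability.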
Dave !