[clug] Debian/GNU 'find . -ls' oddity: outputs UTF-8 chars as quasi-octal strings - \314\201, not \0314\0201

David Deaves David.Deaves at dd.id.au
Thu Sep 29 00:58:16 UTC 2016


> I have a workaround using 'xargs ls -dlis', but I’d prefer to be able to use the supplied ‘find’ argument.
> Right now I have a 1.5M line file sprinkled with ‘octal’ strings that I need to convert back.

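I would expect that this is a job for sed, something like:
  sed 's/\\\([1-3][0-7][0-7]\)/\\0\1/g'
should handle the octal fix.  Note that it will also rewrite an escaped
backslash that happens to be followed by three such digits (\\123).
Extend it for \r, \t, \b ...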
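
As a rough sketch of the whole round trip (assuming the point of the \0NNN
form is to feed it to printf '%b', which decodes \0NNN octal escapes), a
substitution along the lines of the one above can be piped into a small loop.
The function name decode_find_ls is made up for illustration, and the sample
name is just "e" followed by the two bytes from the subject line:

```shell
# Hypothetical helper: rewrite \NNN as \0NNN, then let printf '%b' turn
# each \0NNN escape back into the raw byte.
decode_find_ls() {
  sed 's/\\\([1-3][0-7][0-7]\)/\\0\1/g' |
    while IFS= read -r line
    do
      printf '%b\n' "$line"
    done
}

# A name as 'find -ls' would show it: "e" + U+0301 (combining acute accent)
printf '%s\n' './cafe\314\201.txt' | decode_find_ls
```

Be aware that '%b' interprets every backslash escape on the line, not only
the octal ones, so it is worth checking a sample of the real file first.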

I have always seen the '-ls' flag of GNU find as being for human consumption,
which matches the motivation for quoting listed in the man page (special
characters in file names may frob your terminal).

If going forward your goal is to have a script that wants a file list with 
extra info (like provided by the -ls flag) then I would recommend using the 
-printf flag to select the extra fields that you want with the file name:

eg:

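find . -type f -printf '%T@ %k %p\n' |
  sort -n |
  while read -r mtime_in_sec_since_epoch size_in_k file
  do
    ##  Process list of files in time order knowing their size in kilobytes
    ##  (placeholder body: just print the three fields)
    printf '%s\t%s\t%s\n' "$mtime_in_sec_since_epoch" "$size_in_k" "$file"
  done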

Note: this one won't handle a '\n' in the file name, but in that particular
use case I actually had control of the file names; this is more about showing
the power of the -printf flag.  Before GNU find, I used to use my own binary
'lss' to get seconds_since_epoch on traditional Unix platforms.
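
For the original problem, a -printf format can approximate the columns of
-ls.  The sketch below assumes GNU find; the wrapper name ls_like and the
choice and order of columns are mine, not an exact copy of what -ls prints.
As far as I know, -printf writes file names as-is when the output is not a
terminal, so UTF-8 names should come through a pipe or redirect unescaped:

```shell
# Hypothetical wrapper: roughly the columns of 'find -ls', via -printf.
#   %i inode          %k size in 1K blocks   %M permissions (as in ls -l)
#   %n hard links     %u owner               %g group
#   %s size in bytes  %TY-%Tm-%Td %TH:%TM    modification time
#   %p file name
ls_like() {
  find "$@" -printf '%i %k %M %n %u %g %s %TY-%Tm-%Td %TH:%TM %p\n'
}

# Same starting points and tests as you would give find itself
ls_like . -type f
```

The newline caveat still applies here, since each record ends in '\n'.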


Dave !



