Large text files have extra newlines inserted when read by COPY, cp, diff, etc.

BG - Ben Armstrong BArmstrong at dymaxion.ca
Thu Jul 29 18:28:23 GMT 2004


I am finding that large text files (type "variable", not "stream") have
extra newlines inserted into them when they are copied from an OpenVMS
Samba share to Linux using cp, or from the same share to Windows using
COPY.  I can discern no pattern in where the extra newlines are
introduced.  Here is a Linux diff illustrating the problem.  When diff
reads the file directly from the share, it sees the same corruption as
cp produces.  However, loading the file with vim and then saving it
locally gives a pristine copy, so the diff below compares that pristine
local copy against the corrupted data read directly by diff:

655c655,656
<      but up to this point C++ has been compiled manually. Should someone
---
>      but up to this point C++ has been compiled manually. Should someone
> 
2358c2359,2360
<      - Remove from (29,0)LIBRARY.LIB and (29,9)FILES.INF.
---
>      - Remove from (29,0)LIBRARY.LIB and (29,9)FILES.INF.
> 
15662c15664,15665
<        file present.
---
>        file present.
> 
19563c19566,19567
<        Enter, New order, Turn confirmations off, Modify or Clear
---
>        Enter, New order, Turn confirmations off, Modify or Clear
> 
21027c21031,21032
<        title to view from the brief display of hits; however, now
---
>        title to view from the brief display of hits; however, now
> 
28103c28108,28109
<        .OBS's:
---
>        .OBS's:
> 
32851c32857,32858
<        $!.BREAK /CMD/SWITCHES/ NEWLOG
---
>        $!.BREAK /CMD/SWITCHES/ NEWLOG
> 
37809c37816,37817
< ** 27-Sep-01 23:15 UPDATE JO -> PM&CS&SJ&JT&JO&DV
---
> ** 27-Sep-01 23:15 UPDATE JO -> PM&CS&SJ&JT&JO&DV
> 
38530c38538,38539
<      untranslated text is expected again, contact customer support to
---
>      untranslated text is expected again, contact customer support to
> 
46203c46212,46213
<    - Prevent users from reaching the Extend Booking Page from Re-Book
---
>    - Prevent users from reaching the Extend Booking Page from Re-Book
> 
53514c53524,53525
<    * WEBBIN:extord.cgi     DV:SH    wt:th    QA:KB
---
>    * WEBBIN:extord.cgi     DV:SH    wt:th    QA:KB
> 
58215c58226,58227
<  ej>added subreq for film/client and medium restrictions
---
>  ej>added subreq for film/client and medium restrictions
> 
62427c62439,62440
<  - pass wts. will prepare qa with 4=16129 (idxlku removing usage of
---
>  - pass wts. will prepare qa with 4=16129 (idxlku removing usage of
> 
67155,67156c67168
< ** 27-Jul-04 13:47 MOVE HD -> WM
< 
---
> ** 27-Jul-04 13:47 MOVE
\ No newline at end of file

At first we thought there was something odd about the original file,
and that the corruption was caused by an unexpected extra character at
the ends of the affected lines, but close examination of a binary dump
of the original on OpenVMS before copying shows no such extra
characters.
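
For anyone who wants to look for a pattern, here is a minimal Python
sketch that lists the byte offsets (within the pristine copy) at which
the corrupted copy gains a blank line.  The function name and the file
arguments are hypothetical, and printing each offset modulo 512 (the
OpenVMS disk block size) is only a guess at where a pattern might show
up, not a diagnosis.

```python
import sys


def inserted_newline_offsets(pristine: bytes, corrupted: bytes) -> list:
    """Return byte offsets in `pristine` where `corrupted` has an extra blank line."""
    good = pristine.splitlines(keepends=True)
    bad = corrupted.splitlines(keepends=True)
    offsets = []
    i = 0    # index of the next unmatched line in `good`
    pos = 0  # byte offset of good[i] within `pristine`
    for line in bad:
        if i < len(good) and line == good[i]:
            # Lines agree: advance through the pristine copy.
            pos += len(good[i])
            i += 1
        elif line == b"\n":
            # A blank line that the pristine copy does not have here.
            offsets.append(pos)
        else:
            # Some other difference (for example the truncated tail): stop.
            break
    return offsets


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: script.py pristine.txt corrupted.txt
    with open(sys.argv[1], "rb") as f:
        pristine_data = f.read()
    with open(sys.argv[2], "rb") as f:
        corrupted_data = f.read()
    for off in inserted_newline_offsets(pristine_data, corrupted_data):
        # Offset modulo 512 is an assumption about where a pattern might appear.
        print(off, off % 512)
```

If the offsets cluster at multiples of some buffer or block size, that
would point at a boundary-handling problem rather than at the file
contents.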

It is interesting to note that characters are also missing from the end
of the file in the above diff, probably because the reported length of
the file is shorter than its length after the extra newlines have been
inserted, so the tail is cut off.  This is along the same lines as the
earlier bug I reported.
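
The length explanation can be checked with simple arithmetic.  The
sketch below is only an illustration under that assumption: the
function name is hypothetical, and the count of inserted bytes has to
be supplied by hand (the diff above shows 13 inserted blank lines).

```python
def tail_loss(pristine: bytes, corrupted: bytes, inserted: int) -> int:
    """Return how many bytes of `pristine` are missing from the end of `corrupted`.

    `inserted` is the number of extra bytes (blank-line newlines) known to
    have been added to the corrupted copy.
    """
    # Everything in the corrupted copy that is not an inserted byte
    # must have come from the original file.
    surviving = len(corrupted) - inserted
    return len(pristine) - surviving
```

If the read stops at exactly the original length, the two copies have
the same size and tail_loss() equals the number of inserted bytes.  Any
other result would suggest the reported length differs from the true
length for some additional reason.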

Ben


