GNU bug report logs -
#49925
cat -E interprets sentinel newline at the end of buffer as an actual newline after a \r
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 49925 in the body.
You can then email your comments to 49925 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-coreutils <at> gnu.org: bug#49925; Package coreutils.
(Sat, 07 Aug 2021 13:08:02 GMT)
Full text and
rfc822 format available.
Acknowledgement sent to Michael Debertol <michael.debertol <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org.
(Sat, 07 Aug 2021 13:08:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Hi,
after https://lists.gnu.org/archive/html/coreutils/2021-02/msg00003.html
(unreleased), the behavior of cat -E was changed so that it prints "^M$"
for "\r\n" line endings.
Whenever it sees a \r, "cat -E" checks whether the next byte is a \n. However,
that \n might be the sentinel value that is inserted at the end of a buffer.
This is a problem in two cases:
- When a \r is at the end of the input. `printf "\r" | cat -E` will
print "^M", even though there is no "\n" after the "\r". FWIW,
tests/misc/cat-E.sh expects a "^M" for a trailing "\r", but I think
that's wrong.
- When the file is too big to fit into one buffer. If you try to "cat
-E" a big file (multiple megabytes) that consists of only "\r", cat will
print a few "^M" whenever it hits the end of a buffer in the middle of
the file and at the end.
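
To illustrate the two cases above, here is a simplified Python model of the reported behavior. It is only a sketch and not the code from cat.c: the function name show_ends_buggy and the bufsize parameter are made up for illustration, and the real buffer is far larger than the tiny sizes used here.

```python
def show_ends_buggy(data: bytes, bufsize: int) -> bytes:
    """Simplified model of the reported bug (hypothetical, not cat.c).

    The input is processed in chunks of bufsize bytes. Each chunk gets a
    sentinel LF appended, and a CR whose next byte is LF is shown as ^M,
    even when that LF is only the sentinel.
    """
    CR, LF = 0x0D, 0x0A
    out = bytearray()
    for start in range(0, len(data), bufsize):
        buf = data[start:start + bufsize] + b"\n"  # append the sentinel
        end = len(buf) - 1                         # index of the sentinel
        for i in range(end):
            ch = buf[i]
            if ch == CR:
                # Bug: buf[i + 1] may be the sentinel, not real input.
                out += b"^M" if buf[i + 1] == LF else b"\r"
            elif ch == LF:
                out += b"$\n"
            else:
                out.append(ch)
    return out
```

With this model, show_ends_buggy(b"\r", 4) returns b"^M" although the input contains no "\n", and eight "\r" bytes with a buffer size of 4 come out as b"\r\r\r^M\r\r\r^M", with one "^M" at each buffer end.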
Michael
Reply sent to Pádraig Brady <P <at> draigBrady.com>:
You have taken responsibility.
(Sat, 07 Aug 2021 18:30:01 GMT)
Notification sent to Michael Debertol <michael.debertol <at> gmail.com>:
bug acknowledged by developer.
(Sat, 07 Aug 2021 18:30:02 GMT)
Message #10 received at 49925-done <at> debbugs.gnu.org (full text, mbox):
On 07/08/2021 14:07, Michael Debertol wrote:
> Hi,
>
> after https://lists.gnu.org/archive/html/coreutils/2021-02/msg00003.html
> (unreleased), the behavior of cat -E was changed so that it prints "^M$"
> for "\r\n" line endings.
>
> Whenever it sees a \r, "cat -E" checks whether the next byte is a \n. However,
> that \n might be the sentinel value that is inserted at the end of a buffer.
>
> This is a problem in two cases:
>
> - When a \r is at the end of the input. `printf "\r" | cat -E` will
> print "^M", even though there is no "\n" after the "\r". FWIW,
> tests/misc/cat-E.sh expects a "^M" for a trailing "\r", but I think
> that's wrong.
This was intentional (as per the test): I was thinking we could provide
more info in the edge case where \r is the last char of a file.
However, it's incorrect, as you suggest, since cat can't treat files independently.
> - When the file is too big to fit into one buffer. If you try to "cat
> -E" a big file (multiple megabytes) that consists of only "\r", cat will
> print a few "^M" whenever it hits the end of a buffer in the middle of
> the file and at the end.
That indeed is a bug.
So we need to track handling of \r across buffer and file boundaries.
The attached does that, and I'll apply later.
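
As a rough illustration of what tracking \r across boundaries can look like, here is a minimal Python sketch. It is not the attached patch: the function show_ends and the flag pending_cr are hypothetical names, and the chunks argument stands for successive buffers, which may come from one file or from several files in a row.

```python
from typing import Iterable


def show_ends(chunks: Iterable[bytes]) -> bytes:
    """Sketch of "cat -E" style output with CR state carried across chunks.

    A CR is held back until the next byte is known, so no sentinel is
    consulted. The flag survives chunk boundaries, so a CR that ends one
    buffer (or one file) still pairs with an LF that starts the next.
    """
    CR, LF = 0x0D, 0x0A
    out = bytearray()
    pending_cr = False  # hypothetical name: a CR was seen but not yet printed
    for chunk in chunks:
        for ch in chunk:
            if pending_cr:
                # Decide how to show the held-back CR now that we see ch.
                out += b"^M" if ch == LF else b"\r"
                pending_cr = False
            if ch == CR:
                pending_cr = True
            elif ch == LF:
                out += b"$\n"
            else:
                out.append(ch)
    if pending_cr:
        out += b"\r"  # CR at the very end of all input: no LF follows
    return out
```

In this sketch, show_ends([b"\r"]) returns b"\r", a stream of only "\r" bytes passes through unchanged however it is split, and show_ends([b"a\r", b"\nb"]) returns b"a^M$\nb" because the "\r" and "\n" are adjacent in the concatenated output.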
marking this as done,
thanks!
Pádraig
[cat-E-trailing-CR.patch (text/x-patch, attachment)]
bug archived.
Request was from Debbugs Internal Request <help-debbugs <at> gnu.org>
to internal_control <at> debbugs.gnu.org.
(Sun, 05 Sep 2021 11:24:05 GMT)
This bug report was last modified 3 years and 292 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.