GNU bug report logs -
#49925
cat -E interprets sentinel newline at the end of buffer as an actual newline after a \r
[Message part 1 (text/plain, inline)]
Your bug report
#49925: cat -E interprets sentinel newline at the end of buffer as an actual newline after a \r
which was filed against the coreutils package, has been closed.
The explanation is attached below, along with your original report.
If you require more details, please reply to 49925 <at> debbugs.gnu.org.
--
49925: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=49925
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
[Message part 3 (text/plain, inline)]
On 07/08/2021 14:07, Michael Debertol wrote:
> Hi,
>
> after https://lists.gnu.org/archive/html/coreutils/2021-02/msg00003.html
> (unreleased), the behavior of cat -E was changed so that it prints "^M$"
> for "\r\n" line endings.
>
> Whenever it sees a \r "cat -E" checks if the byte after is a \n, however
> that \n might be the sentinel value that is inserted at the end of a buffer.
>
> This is a problem in two cases:
>
> - When a \r is at the end of the input. `printf "\r" | cat -E` will
>   print "^M", even though there is no "\n" after the "\r". FWIW,
>   tests/misc/cat-E.sh expects a "^M" for a trailing "\r", but I think
>   that's wrong.
This was intentional (as per the test), as I was thinking
we could provide more info in the edge case where \r is the last char of a file.
However, it's incorrect as you suggest, since cat can't treat files independently.
> - When the file is too big to fit into one buffer. If you try to "cat
>   -E" a big file (multiple megabytes) that consists of only "\r", cat will
>   print a few "^M" whenever it hits the end of a buffer in the middle of
>   the file and at the end.
That indeed is a bug.
So we need to track handling of \r across buffer and file boundaries.
The attached does that, and I'll apply later.
Marking this as done,
thanks!
Pádraig
[cat-E-trailing-CR.patch (text/x-patch, attachment)]
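The attached patch is not reproduced in this log. As a rough illustration of the idea stated in the reply (tracking \r across buffer and file boundaries), here is a minimal sketch in C. It is not the coreutils change: the `show_ends_state` struct and the `show_ends_chunk` and `show_ends_finish` functions are hypothetical names, and the sketch assumes that a lone \r is passed through unchanged while only "\r\n" is shown as "^M$".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical state that survives across buffers and across input files. */
struct show_ends_state {
    bool pending_cr; /* a '\r' was seen, but the byte after it is not known yet */
};

/* Copy the NUL-terminated string s to out + len; return the new length. */
static size_t emit(char *out, size_t len, const char *s)
{
    size_t n = strlen(s);
    memcpy(out + len, s, n);
    return len + n;
}

/* Process n bytes of real input (no sentinel byte is ever inspected).
   out must have room for at least 2 * n + 2 bytes.
   Returns the number of bytes written to out. */
size_t show_ends_chunk(struct show_ends_state *st, const char *in,
                       size_t n, char *out)
{
    assert(st != NULL && out != NULL);
    size_t len = 0;
    for (size_t i = 0; i < n; i++) {
        char c = in[i];
        if (st->pending_cr) {
            st->pending_cr = false;
            if (c == '\n') {
                len = emit(out, len, "^M$\n"); /* real "\r\n" pair */
                continue;
            }
            out[len++] = '\r'; /* lone CR: pass it through unchanged */
        }
        if (c == '\r')
            st->pending_cr = true; /* defer the decision to the next byte */
        else if (c == '\n')
            len = emit(out, len, "$\n");
        else
            out[len++] = c;
    }
    return len;
}

/* Call once after the last input file: a still-pending CR had no '\n'. */
size_t show_ends_finish(struct show_ends_state *st, char *out)
{
    if (!st->pending_cr)
        return 0;
    st->pending_cr = false;
    out[0] = '\r';
    return 1;
}
```

Because the decision about a \r is deferred until the next real byte arrives, the same state object can be reused for every buffer of every input file, and `show_ends_finish` is called only once, after the last one.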
[Message part 5 (message/rfc822, inline)]
Hi,
after https://lists.gnu.org/archive/html/coreutils/2021-02/msg00003.html
(unreleased), the behavior of cat -E was changed so that it prints "^M$"
for "\r\n" line endings.
Whenever it sees a \r "cat -E" checks if the byte after is a \n, however
that \n might be the sentinel value that is inserted at the end of a buffer.
This is a problem in two cases:
- When a \r is at the end of the input. `printf "\r" | cat -E` will
print "^M", even though there is no "\n" after the "\r". FWIW,
tests/misc/cat-E.sh expects a "^M" for a trailing "\r", but I think
that's wrong.
- When the file is too big to fit into one buffer. If you try to "cat
-E" a big file (multiple megabytes) that consists of only "\r", cat will
print a few "^M" whenever it hits the end of a buffer in the middle of
the file and at the end.
Michael
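To make the report concrete, here is a minimal sketch in C of how a one-byte lookahead can mistake a sentinel for real input. It assumes, as the report describes, that a '\n' sentinel is stored just past the last byte read. The function names and buffer layout are hypothetical, and this is not the coreutils source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Naive lookahead: trusts whatever byte follows buf[i], which may be
   the sentinel stored at buf[n] rather than a byte from the input. */
bool cr_lf_naive(const char *buf, size_t i)
{
    return buf[i] == '\r' && buf[i + 1] == '\n';
}

/* Bounded lookahead: only inspects the n bytes that were actually read.
   When the '\r' is the last byte read, the answer is "not known yet",
   reported here as false. */
bool cr_lf_bounded(const char *buf, size_t n, size_t i)
{
    assert(i < n);
    return i + 1 < n && buf[i] == '\r' && buf[i + 1] == '\n';
}
```

With a buffer holding three \r bytes followed by the sentinel, `cr_lf_naive(buf, 2)` reports a "\r\n" pair that is not in the input, which matches both cases in the report: a \r at the end of the input, and a \r that happens to fall at the end of a buffer in the middle of a large file.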
This bug report was last modified 3 years and 292 days ago.