GNU bug report logs - #49925
cat -E interprets sentinel newline at the end of buffer as an actual newline after a \r

Package: coreutils

Reported by: Michael Debertol <michael.debertol <at> gmail.com>

Date: Sat, 7 Aug 2021 13:08:01 UTC

Severity: normal

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.


Report forwarded to bug-coreutils <at> gnu.org:
bug#49925; Package coreutils. (Sat, 07 Aug 2021 13:08:02 GMT)

Acknowledgement sent to Michael Debertol <michael.debertol <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sat, 07 Aug 2021 13:08:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Michael Debertol <michael.debertol <at> gmail.com>
To: bug-coreutils <at> gnu.org
Subject: cat -E interprets sentinel newline at the end of buffer as an actual
 newline after a \r
Date: Sat, 7 Aug 2021 15:07:32 +0200
Hi,

after https://lists.gnu.org/archive/html/coreutils/2021-02/msg00003.html 
(unreleased), the behavior of cat -E was changed so that it prints "^M$" 
for "\r\n" line endings.

Whenever it sees a \r "cat -E" checks if the byte after is a \n, however 
that \n might be the sentinel value that is inserted at the end of a buffer.

This is a problem in two cases:

- When a \r is at the end of the input. `printf "\r" | cat -E` will 
print "^M", even though there is no "\n" after the "\r". FWIW, 
tests/misc/cat-E.sh expects a "^M" for a trailing "\r", but I think 
that's wrong.

- When the file is too big to fit into one buffer. If you try to "cat 
-E" a big file (multiple megabytes) that consists of only "\r", cat will 
print a few "^M" whenever it hits the end of a buffer in the middle of 
the file and at the end.

Michael
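
Editorial illustration: the two cases in the report can be reproduced with a small Python model of a sentinel-terminated buffer scan. This is a sketch of the behaviour as described in the message, not the coreutils C source; the function name show_ends_buggy and the chunks parameter (one bytes object per read buffer) are hypothetical.

```python
def show_ends_buggy(chunks):
    """Model of the reported behaviour: each buffer gets a sentinel b"\n"
    appended, and a CR is rendered as "^M" whenever the next byte is LF,
    even if that LF is only the sentinel. Hypothetical sketch, not cat.c."""
    out = bytearray()
    for chunk in chunks:
        buf = chunk + b"\n"          # sentinel newline marks end of buffer
        i = 0
        while True:
            ch = buf[i]
            if ch == 0x0A:
                if i == len(chunk):  # this LF is the sentinel: buffer done
                    break
                out += b"$\n"        # a real newline: mark the line end
            elif ch == 0x0D and buf[i + 1] == 0x0A:
                out += b"^M"         # bug: the LF seen here may be the sentinel
            else:
                out.append(ch)
            i += 1
    return bytes(out)


# Case 1: a lone trailing CR is shown as "^M" although no LF follows it.
print(show_ends_buggy([b"\r"]))
# Case 2: a CR-only input split over two buffers gains "^M" at each buffer end.
print(show_ends_buggy([b"\r\r", b"\r\r"]))
```

In this model a genuine "\r\n" still yields "^M$" followed by a newline; only a CR that happens to be the last byte of a buffer is misreported.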





Reply sent to Pádraig Brady <P <at> draigBrady.com>:
You have taken responsibility. (Sat, 07 Aug 2021 18:30:01 GMT)

Notification sent to Michael Debertol <michael.debertol <at> gmail.com>:
bug acknowledged by developer. (Sat, 07 Aug 2021 18:30:02 GMT)

Message #10 received at 49925-done <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Michael Debertol <michael.debertol <at> gmail.com>, 49925-done <at> debbugs.gnu.org
Subject: Re: bug#49925: cat -E interprets sentinel newline at the end of
 buffer as an actual newline after a \r
Date: Sat, 7 Aug 2021 19:29:06 +0100
On 07/08/2021 14:07, Michael Debertol wrote:
> Hi,
> 
> after https://lists.gnu.org/archive/html/coreutils/2021-02/msg00003.html
> (unreleased), the behavior of cat -E was changed so that it prints "^M$"
> for "\r\n" line endings.
> 
> Whenever it sees a \r "cat -E" checks if the byte after is a \n, however
> that \n might be the sentinel value that is inserted at the end of a buffer.
> 
> This is a problem in two cases:
> 
> - When a \r is at the end of the input. `printf "\r" | cat -E` will
> print "^M", even though there is no "\n" after the "\r". FWIW,
> tests/misc/cat-E.sh expects a "^M" for a trailing "\r", but I think
> that's wrong.

This was intentional (as per the test) as I was thinking
we can provide more info here in the edge case that \r is the last char of a file.
However it's incorrect as you suggest, as cat can't treat files independently.

> - When the file is too big to fit into one buffer. If you try to "cat
> -E" a big file (multiple megabytes) that consists of only "\r", cat will
> print a few "^M" whenever it hits the end of a buffer in the middle of
> the file and at the end.

That indeed is a bug.

So we need to track handling of \r across buffer and file boundaries.
The attached does that, and I'll apply later.

marking this as done,

thanks!
Pádraig
[cat-E-trailing-CR.patch (text/x-patch, attachment)]
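
Editorial illustration: the attached patch is not reproduced on this page. As a rough idea of what "tracking \r across buffer and file boundaries" can mean, the following Python sketch carries a one-bit "pending CR" state from one chunk to the next. It is an assumption about the general approach, not the actual patch; show_ends, chunks and pending_cr are hypothetical names, and each element of chunks stands for one read buffer (from the same file or from consecutive files).

```python
def show_ends(chunks):
    """Render CR LF as "^M$" + LF and a bare LF as "$" + LF, deciding about
    a CR only once the following byte is known, even if that byte arrives
    in a later chunk. Hypothetical sketch, not the coreutils patch."""
    out = bytearray()
    pending_cr = False               # True if the last byte seen was a CR
    for chunk in chunks:
        for ch in chunk:
            if pending_cr:
                pending_cr = False
                if ch == 0x0A:
                    out += b"^M$\n"  # CR LF pair, possibly split across chunks
                    continue
                out.append(0x0D)     # the CR was not followed by LF: emit as-is
            if ch == 0x0D:
                pending_cr = True    # defer until the next byte is seen
            elif ch == 0x0A:
                out += b"$\n"
            else:
                out.append(ch)
    if pending_cr:
        out.append(0x0D)             # CR at the very end of all input
    return bytes(out)


print(show_ends([b"\r"]))            # lone trailing CR passes through unchanged
print(show_ends([b"a\r", b"\nb"]))   # CR LF split across two buffers
```

Because the state outlives each chunk, the result does not depend on where the buffer or file boundaries fall.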

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 05 Sep 2021 11:24:05 GMT)

This bug report was last modified 3 years and 292 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.