GNU bug report logs - #64392
cksum: escaping issues of --check output

Package: coreutils

Reported by: Christoph Anton Mitterer <calestyo <at> scientia.org>

Date: Sat, 1 Jul 2023 00:22:02 UTC

Severity: normal

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 64392 in the body.
You can then email your comments to 64392 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-coreutils <at> gnu.org:
bug#64392; Package coreutils. (Sat, 01 Jul 2023 00:22:02 GMT)

Acknowledgement sent to Christoph Anton Mitterer <calestyo <at> scientia.org>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sat, 01 Jul 2023 00:22:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Christoph Anton Mitterer <calestyo <at> scientia.org>
To: bug-coreutils <at> gnu.org
Subject: cksum: escaping issues of --check output
Date: Sat, 01 Jul 2023 02:10:49 +0200
Hey.

It seems to me that the output of --check mode in cksum (and likely
also in md5sum and friends) suffers from improper escaping (which,
IIRC, is not even documented for that output... but I may be wrong):

$ touch a $'new\nline' '\n' z
$ ls -al
total 0
drwxr-xr-x 1 calestyo calestyo  24 Jul  1 02:01  .
drwxr-xr-x 1 calestyo calestyo 176 Jul  1 01:48  ..
-rw-r--r-- 1 calestyo calestyo   0 Jul  1 02:01  a
-rw-r--r-- 1 calestyo calestyo   0 Jul  1 02:01 'new'$'\n''line'
-rw-r--r-- 1 calestyo calestyo   0 Jul  1 02:01  z
-rw-r--r-- 1 calestyo calestyo   0 Jul  1 02:01 '\n'

$ cksum -a sha512 --tag * > sums.tagged
$ cksum -a sha512 --untagged * > sums.untagged

$ cat sums.tagged 
SHA512 (a) = cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e
\SHA512 (\\n) = cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e
\SHA512 (new\nline) = cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e
SHA512 (z) = cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e
$ cat sums.untagged 
cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e  a
\cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e  \\n
\cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e  new\nline
65bb946645079f3ebfa931460430c1676d656e455e5a6266b85fa0c78f08f63507eb417b70f67106c8ad9cdebeacb29fa770e86b1624763f310f1ebb6bd0542a  sums.tagged
cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e  z

$ cksum -c sums.tagged
a: OK
\n: OK
\new\nline: OK
z: OK
$ cksum -c sums.untagged 
cksum: sums.untagged: no properly formatted checksum lines found

$ sha512sum -c sums.untagged 
a: OK
\n: OK
\new\nline: OK
sums.tagged: OK
z: OK


Assuming the same rules for the --check output as for the sums files, a
leading \ should serve as the escaping indicator.

So for:
   \new\nline: OK
that would be fine but for:
   \n: OK
it's not but would rather need to be:
   \\\n: OK


The failed cases may be similarly affected by this.
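
To make the ambiguity concrete, here is a minimal Python sketch of how a
consumer might decode the name part of a `NAME: STATUS` line if it applied
the same rule as for the sums files (a leading \ marks the line as escaped).
The function name decode_status_name is hypothetical, and rejecting
backslash sequences other than \\, \n and \r is an assumption of this
sketch, not something stated in this report.

```python
def decode_status_name(field: str) -> str:
    """Decode the NAME part of a `NAME: STATUS` line under the sums-file rule.

    Assumption: a leading backslash is the escaping indicator, and only
    the sequences \\\\, \\n and \\r are valid after it.
    """
    if not field.startswith("\\"):
        return field  # no indicator: the name is taken literally

    out = []
    chars = iter(field[1:])  # skip the escaping indicator
    for ch in chars:
        if ch != "\\":
            out.append(ch)
            continue
        nxt = next(chars, None)
        if nxt == "n":
            out.append("\n")
        elif nxt == "r":
            out.append("\r")
        elif nxt == "\\":
            out.append("\\")
        else:
            raise ValueError(f"unsupported escape sequence: \\{nxt}")
    return "".join(out)


# The reported output `\n: OK` is read as the file name "n" ...
print(repr(decode_status_name("\\n")))
# ... whereas `\\\n: OK` round-trips to backslash followed by "n".
print(repr(decode_status_name("\\\\\\n")))
```

Under this reading, the `\new\nline: OK` line decodes to the intended name,
while `\n: OK` does not, which is the inconsistency described above.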


Thanks,
Chris.

btw: Though it's probably too late to change, I think the output format
is rather unfortunate.
It should have been closer to the BSD-style format used for the
sums file, e.g. something like:
   <algo> (<filename>) = <OK|failed|not found|etc.>
again with the optional leading \ to indicate escaping.

The problem with the current format is, in particular, that it's not
possible to determine the algo, which may however be of interest if
there is more than one per file.
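
For illustration only, the suggested status format could be produced along
these lines. This is a hypothetical sketch of the reporter's proposal, not
anything cksum emits: the function name check_line and the status strings
are assumptions, and the escaping follows the sums-file rule (leading \
indicator, with \, newline and carriage return escaped).

```python
def check_line(algo: str, name: str, status: str) -> str:
    """Format one status line as `<algo> (<filename>) = <status>`.

    Hypothetical format from the suggestion above; the line gets a leading
    backslash when the file name needed escaping.
    """
    flagged = any(c in name for c in "\\\n\r")
    # Escape backslashes first so the backslashes added for \n and \r
    # are not doubled afterwards.
    shown = (
        name.replace("\\", "\\\\")
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    prefix = "\\" if flagged else ""
    return f"{prefix}{algo} ({shown}) = {status}"


print(check_line("SHA512", "a", "OK"))
print(check_line("SHA512", "new\nline", "FAILED"))
```

With the algorithm on every line, two results for the same file but
different digests would remain distinguishable.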




Reply sent to Pádraig Brady <P <at> draigBrady.com>:
You have taken responsibility. (Sat, 01 Jul 2023 16:08:02 GMT)

Notification sent to Christoph Anton Mitterer <calestyo <at> scientia.org>:
bug acknowledged by developer. (Sat, 01 Jul 2023 16:08:02 GMT)

Message #10 received at 64392-done <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Christoph Anton Mitterer <calestyo <at> scientia.org>,
 64392-done <at> debbugs.gnu.org
Subject: Re: bug#64392: cksum: escaping issues of --check output
Date: Sat, 1 Jul 2023 17:07:14 +0100
On 01/07/2023 01:10, Christoph Anton Mitterer wrote:
> Hey.
> 
> It seems to me that the output of --check mode in cksum (and likely
> also in md5sum and friends) suffers from improper escaping (which,
> IIRC, is not even documented for that output... but may be wrong):
> 
> $ touch a $'new\nline' '\n' z
> $ ls -al
> total 0
> drwxr-xr-x 1 calestyo calestyo  24 Jul  1 02:01  .
> drwxr-xr-x 1 calestyo calestyo 176 Jul  1 01:48  ..
> -rw-r--r-- 1 calestyo calestyo   0 Jul  1 02:01  a
> -rw-r--r-- 1 calestyo calestyo   0 Jul  1 02:01 'new'$'\n''line'
> -rw-r--r-- 1 calestyo calestyo   0 Jul  1 02:01  z
> -rw-r--r-- 1 calestyo calestyo   0 Jul  1 02:01 '\n'
> 
> $ cksum -a sha512 --tag * > sums.tagged

> $ cksum -c sums.tagged
> a: OK
> \n: OK
> \new\nline: OK
> z: OK

> Assuming the same rules for the --check output as for the sums files, a
> leading \ should serve as the escaping indicator.
> 
> So for:
>     \new\nline: OK
> that would be fine but for:
>     \n: OK
> it's not but would rather need to be:
>     \\\n: OK

Right. We traditionally didn't escape any chars in the --check output,
but that changed with https://github.com/coreutils/coreutils/commit/646902b30
To minimize escaping, that patch only considered the '\n' character,
but we should also have considered file names with a leading '\'.

The attached should address this.

Marking this as done.

thanks,
Pádraig
[cksum-leading-backslash.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#64392; Package coreutils. (Sat, 01 Jul 2023 17:21:01 GMT)

Message #13 received at 64392 <at> debbugs.gnu.org (full text, mbox):

From: Christoph Anton Mitterer <calestyo <at> scientia.org>
To: Pádraig Brady <P <at> draigBrady.com>, 64392 <at> debbugs.gnu.org
Subject: Re: bug#64392: cksum: escaping issues of --check output
Date: Sat, 01 Jul 2023 19:20:38 +0200
On Sat, 2023-07-01 at 17:07 +0100, Pádraig Brady wrote:
> Right. We traditionally didn't escape any chars in the --check
> output,
> but that changed with
> https://github.com/coreutils/coreutils/commit/646902b30
> To minimize escaping, that patch only considered the '\n' character,
> but we should also have considered file names with a leading '\'.
> 
> The attached should address this.

Thanks, but wouldn't it be better to use exactly the same escaping as
in the sums output? I.e. also escaping \r?

Also, documenting the escaping behaviour in info/manpages?


Cheers,
Chris.




Information forwarded to bug-coreutils <at> gnu.org:
bug#64392; Package coreutils. (Sat, 01 Jul 2023 17:28:01 GMT)

Message #16 received at 64392 <at> debbugs.gnu.org (full text, mbox):

From: Christoph Anton Mitterer <calestyo <at> scientia.org>
To: Pádraig Brady <P <at> draigBrady.com>, 64392 <at> debbugs.gnu.org
Subject: Re: bug#64392: cksum: escaping issues of --check output
Date: Sat, 01 Jul 2023 19:27:11 +0200
Oh and I've seen you really escape \ only if it's the first character.

Same here, I'd suggest to apply the same escaping rules as for the
other output, and escape '\\' '\n' and '\r' as soon as any of them
occurs in the output.
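
The three policies discussed so far differ in when a name is treated as
needing escaping. The Python sketch below only restates those trigger
conditions as described in this thread; it is not taken from the patches
themselves, and the function names are hypothetical.

```python
def triggers_newline_only(name: str) -> bool:
    """Policy described for commit 646902b30: only '\\n' is considered."""
    return "\n" in name


def triggers_leading_backslash(name: str) -> bool:
    """Policy described for the first fix: '\\n', or a leading backslash."""
    return "\n" in name or name.startswith("\\")


def triggers_full(name: str) -> bool:
    """Policy suggested here: any backslash, newline or carriage return."""
    return any(c in name for c in "\\\n\r")


for name in ["a", "\\n", "a\\b", "new\nline", "cr\rname"]:
    print(
        repr(name),
        triggers_newline_only(name),
        triggers_leading_backslash(name),
        triggers_full(name),
    )
```

Only the last predicate flags names such as a\b or names containing a
carriage return, which is what makes it match the sums output.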




Information forwarded to bug-coreutils <at> gnu.org:
bug#64392; Package coreutils. (Sat, 01 Jul 2023 17:54:02 GMT)

Message #19 received at 64392 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Christoph Anton Mitterer <calestyo <at> scientia.org>, 64392 <at> debbugs.gnu.org
Subject: Re: bug#64392: cksum: escaping issues of --check output
Date: Sat, 1 Jul 2023 18:53:10 +0100
On 01/07/2023 18:20, Christoph Anton Mitterer wrote:
> On Sat, 2023-07-01 at 17:07 +0100, Pádraig Brady wrote:
>> Right. We traditionally didn't escape any chars in the --check
>> output,
>> but that changed with
>> https://github.com/coreutils/coreutils/commit/646902b30
>> To minimize escaping, that patch only considered the '\n' character,
>> but we should also have considered file names with a leading '\'.
>>
>> The attached should address this.
> 
> Thanks, but wouldn't it be better to use exactly the same escaping as
> in the sums output? I.e. also escaping \r?

Yes maybe. I was thinking this status output would be less likely to be persisted,
and so would not need the same escaping requirements,
but for consistency it may be better to have the same escaping rules,
with the caveat that file names with a literal backslash anywhere
would now be escaped. That's not a common case I suppose,
so I'm amenable to using the consistent escaping here.

> Also, documenting the escaping behaviour in info/manpages?

Info docs already contain:

"Without ‘--zero’, if FILE contains a backslash, newline, or carriage
return, the line is started with a backslash, and each problematic
character in the file name is escaped with a backslash, making the
output unambiguous even in the presence of arbitrary file names."
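
The rule quoted above can be sketched in Python as follows. This is an
illustration of the documented behaviour, not the coreutils implementation;
the helper names escape_name and untagged_line and the shortened digest in
the example are assumptions made for the sketch.

```python
def escape_name(name: str):
    """Return (flagged, escaped_name) following the quoted rule.

    flagged is True when the name contains a backslash, newline or
    carriage return, in which case each of them is backslash-escaped.
    """
    if not any(c in name for c in "\\\n\r"):
        return False, name
    # Backslashes are doubled first so the ones introduced for \n and \r
    # below are left alone.
    escaped = (
        name.replace("\\", "\\\\")
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return True, escaped


def untagged_line(digest: str, name: str) -> str:
    """Build an untagged sums line; a leading backslash marks escaping."""
    flagged, shown = escape_name(name)
    return ("\\" if flagged else "") + f"{digest}  {shown}"


# "cf83" stands in for a full digest to keep the example short.
print(untagged_line("cf83", "a"))
print(untagged_line("cf83", "\\n"))
print(untagged_line("cf83", "new\nline"))
```

The lines printed for the second and third names have the same shape as the
escaped entries in the sums.untagged listing from the original report.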

cheers,
Pádraig





Information forwarded to bug-coreutils <at> gnu.org:
bug#64392; Package coreutils. (Sat, 01 Jul 2023 19:13:02 GMT)

Message #22 received at 64392 <at> debbugs.gnu.org (full text, mbox):

From: Christoph Anton Mitterer <calestyo <at> scientia.org>
To: Pádraig Brady <P <at> draigBrady.com>, 64392 <at> debbugs.gnu.org
Subject: Re: bug#64392: cksum: escaping issues of --check output
Date: Sat, 01 Jul 2023 21:12:41 +0200
On Sat, 2023-07-01 at 18:53 +0100, Pádraig Brady wrote:
> That's not a common case I suppose,
> so I'm amenable to using the consistent escaping here.

Good :-)


> Info docs already contain:
> 
> "Without ‘--zero’, if FILE contains a backslash, newline, or carriage
> return, the line is started with a backslash, and each problematic
> character in the file name is escaped with a backslash, making the
> output unambiguous even in the presence of arbitrary file names."


Well yes, but that's in like the "common" section.

Further down, the description of --tag explicitly mentions the escaping
again, with the leading \ as escaping indicator.

For --untagged and --check there's no such further mentioning ... so at
least it's a bit inconsistent... and could lead people to think it
would happen only with --tag.


Actually I'd describe the escaping algorithm above even more
"definitively", in the sense that any \ \r and \n are escaped, and that
any other \-sequences (like \" \0 \xXX etc.) are explicitly reserved for
future use.
This matters especially because other tools may also use the
tagged/untagged output formats and add their own add-ons, assuming
they're free to do so.


Cheers,
Chris.




Information forwarded to bug-coreutils <at> gnu.org:
bug#64392; Package coreutils. (Tue, 11 Jul 2023 11:28:02 GMT)

Message #25 received at 64392 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Christoph Anton Mitterer <calestyo <at> scientia.org>, 64392 <at> debbugs.gnu.org
Subject: Re: bug#64392: cksum: escaping issues of --check output
Date: Tue, 11 Jul 2023 12:27:08 +0100
On 01/07/2023 20:12, Christoph Anton Mitterer wrote:
> On Sat, 2023-07-01 at 18:53 +0100, Pádraig Brady wrote:
>> That's not a common case I suppose,
>> so I'm amenable to using the consistent escaping here.
> 
> Good :-)
> 
> 
>> Info docs already contain:
>>
>> "Without ‘--zero’, if FILE contains a backslash, newline, or carriage
>> return, the line is started with a backslash, and each problematic
>> character in the file name is escaped with a backslash, making the
>> output unambiguous even in the presence of arbitrary file names."
> 
> 
> Well yes, but that's in like the "common" section.
> 
> Further down, for --tag, it's explicitly mentioned again there, that
> there's the escaping when \ is present as leading escaping indicator.
> 
> For --untagged and --check there's no such further mentioning ... so at
> least it's a bit inconsistent... and could lead people to think it
> would happen only with --tag.
> 
> 
> Actually I'd describe the escaping algorithm above even more
> "definitively", in the sense that any \ \r and \n are escaped, and that
> any other \-sequences (like \" \0 \xXX etc.) are explicitly reserved for
> future use.
> This matters especially because other tools may also use the
> tagged/untagged output formats and add their own add-ons, assuming
> they're free to do so.

Full escaping and doc adjustments pushed at:
https://git.sv.gnu.org/cgit/coreutils.git/commit/?id=86614ba1c

cheers,
Pádraig




Information forwarded to bug-coreutils <at> gnu.org:
bug#64392; Package coreutils. (Tue, 11 Jul 2023 12:34:01 GMT)

Message #28 received at 64392 <at> debbugs.gnu.org (full text, mbox):

From: Christoph Anton Mitterer <calestyo <at> scientia.org>
To: Pádraig Brady <P <at> draigBrady.com>, 64392 <at> debbugs.gnu.org
Subject: Re: bug#64392: cksum: escaping issues of --check output
Date: Tue, 11 Jul 2023 14:33:17 +0200
On Tue, 2023-07-11 at 12:27 +0100, Pádraig Brady wrote:
> Full escaping and doc adjustments pushed at:
> https://git.sv.gnu.org/cgit/coreutils.git/commit/?id=86614ba1c

Thanks :-)

Cheers,
Chris.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 09 Aug 2023 11:24:14 GMT)

This bug report was last modified 2 years and 6 days ago.
