GNU bug report logs - #8766
Bug in sha1sum?

Package: coreutils

Reported by: Theo Band <theo.band <at> greenpeak.com>

Date: Mon, 30 May 2011 17:54:02 UTC

Severity: normal

Tags: notabug

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 8766 in the body.
You can then email your comments to 8766 AT debbugs.gnu.org in the normal way.


Report forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8766; Package coreutils. (Mon, 30 May 2011 17:54:02 GMT)

Acknowledgement sent to Theo Band <theo.band <at> greenpeak.com>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Mon, 30 May 2011 17:54:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Theo Band <theo.band <at> greenpeak.com>
To: <bug-coreutils <at> gnu.org>
Subject: Bug in sha1sum?
Date: Mon, 30 May 2011 12:51:01 +0200
Hi

I'm not sure, but I think I found a bug in sha1sum. It's easy to
reproduce with any file that contains a backslash (\) in the name:
$ echo test > test
$ sha1sum test
4e1243bd22c66e76c2ba9eddc1f91394e57f9f83  test
$ mv test 'test\test'
$ sha1sum 'test\test'
\4e1243bd22c66e76c2ba9eddc1f91394e57f9f83  test\\test

I expect the file sha1sum to be the same after renaming the file (a
backslash is prepended to the otherwise correct result).

$ sha1sum --version
sha1sum (GNU coreutils) 5.97
coreutils-5.97-23.el5_6.4

Kind regards,
Theo Band



Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8766; Package coreutils. (Mon, 30 May 2011 23:04:01 GMT)

Message #8 received at 8766 <at> debbugs.gnu.org:

From: "Alan Curry" <pacman-cu <at> kosh.dhis.org>
To: theo.band <at> greenpeak.com (Theo Band)
Cc: 8766 <at> debbugs.gnu.org
Subject: Re: bug#8766: Bug in sha1sum?
Date: Mon, 30 May 2011 18:03:21 -0500 (GMT+5)
Theo Band writes:
> 
> Hi
> 
> I'm not sure, but I think I found a bug in sha1sum. It's easy to
> reproduce with any file that contains a backslash (\) in the name:
> echo test > test
> $ sha1sum test
> 4e1243bd22c66e76c2ba9eddc1f91394e57f9f83  test
> $ mv test 'test\test'
> $ sha1sum 'test\test'
> \4e1243bd22c66e76c2ba9eddc1f91394e57f9f83  test\\test
> 
> I expect the file sha1sum to be the same after renaming the file (a
> backslash is prepended to the otherwise correct result).

This result violated my expectations too, but it turns out to be a documented
feature:

     For each FILE, `md5sum' outputs the MD5 checksum, a flag indicating
  a binary or text input file, and the file name.  If FILE contains a
  backslash or newline, the line is started with a backslash, and each
  problematic character in the file name is escaped with a backslash,
  making the output unambiguous even in the presence of arbitrary file
  names.  If FILE is omitted or specified as `-', standard input is read.

(the sha*sum utilities all refer back to md5sum's description)

I'd better go fix all my scripts that rely on /^[0-9a-f]{32} /
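
A pattern anchored on a hex digit at the start of the line misses the escaped
form. As a minimal sketch (the helper name digest_only is an illustration, not
part of coreutils), a filter can accept an optional leading backslash and
print only the digest. It assumes the digest is the first field and is 32 to
128 hex digits long, which covers md5sum through sha512sum:

```shell
# Hypothetical helper: print only the hex digest from each checksum line
# read on standard input, whether or not the line starts with the
# backslash that marks an escaped file name.
# Lines that do not look like checksum lines are silently dropped.
digest_only() {
  sed -n 's/^\\\{0,1\}\([0-9a-f]\{32,128\}\) .*/\1/p'
}

# Example use (assumes the file from the report exists):
#   sha1sum 'test\test' | digest_only
```

This leaves the file name out entirely, so it only helps scripts that need
the digest and nothing else.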

-- 
Alan Curry




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8766; Package coreutils. (Tue, 31 May 2011 07:40:03 GMT)

Message #11 received at 8766 <at> debbugs.gnu.org:

From: Theo Band <theo.band <at> greenpeak.com>
To: Alan Curry <pacman-cu <at> kosh.dhis.org>
Cc: 8766 <at> debbugs.gnu.org
Subject: Re: bug#8766: Bug in sha1sum?
Date: Tue, 31 May 2011 09:39:45 +0200
On 05/31/2011 01:03 AM, Alan Curry wrote:
> This result violated my expectations too, but it turns out to be a documented
> feature:
>
>      For each FILE, `md5sum' outputs the MD5 checksum, a flag indicating
>   a binary or text input file, and the file name.  If FILE contains a
>   backslash or newline, the line is started with a backslash, and each
>   problematic character in the file name is escaped with a backslash,
>   making the output unambiguous even in the presence of arbitrary file
>   names.  If FILE is omitted or specified as `-', standard input is read.
>
> (the sha*sum utilities all refer back to md5sum's description)
>
> I better go fix all my scripts that rely on /^[0-9a-f]{32} /
>
man sha1sum, info sha1sum and sha1sum --help don't show me this info.
Instead I read this:

> The default mode is to print a line with checksum, a character
> indicating type (`*' for binary, ` ' for text), and name for each FILE.

Would that mean the documentation in coreutils-5.97-23.el5_6.4 is
outdated? If so, is there perhaps an undocumented option that
suppresses this backslash?
I make an index of all my files to find duplicates. The backslash
doesn't help.

Theo





Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8766; Package coreutils. (Tue, 31 May 2011 07:57:02 GMT)

Message #14 received at 8766 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Theo Band <theo.band <at> greenpeak.com>
Cc: 8766 <at> debbugs.gnu.org, Alan Curry <pacman-cu <at> kosh.dhis.org>
Subject: Re: bug#8766: Bug in sha1sum?
Date: Tue, 31 May 2011 09:56:28 +0200
Theo Band wrote:
> man sha1sum, info sha1sum and sha1sum --help don't show me this info.
> Instead I read this:
>
>> The default mode is to print a line with checksum, a character
>> indicating type (`*' for binary, ` ' for text), and name for each FILE.
>
> Would that mean the documentation in the coreutils-5.97-23.el5_6.4 is
> outdated? If so, is there perhaps an undocumented option that does not
> output this backslash?
> I make an index of all my files to find duplicates. The backslash
> doesn't help.

That feature is required so that the hash of a file whose name contains
a newline can still be checked.  There is no option to disable it.
That omission in the documentation was corrected by COREUTILS-6_8-69-g826ff08.

If you're sure you have no newline-afflicted file name,
you can safely filter out the backslashes with this:

    sed 's/^\\//;s/\\\\/\\/g'

E.g.,

    $ touch a\\b
    $ md5sum a\\b | sed 's/^\\//;s/\\\\/\\/g' | md5sum -c -
    a\b: OK
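
If only the digest is needed, for instance when indexing files to find
duplicates, one way to sidestep the escaping is to feed each file on standard
input: the name sha1sum then reports is `-', which contains nothing to escape.
The following is a sketch under that assumption; the helper names raw_sha1 and
index_line are illustrations, not part of coreutils:

```shell
# Hypothetical helper: hash one file through standard input, so sha1sum
# reports the name as "-" and never sees the real file name.
# The digest is the first space-delimited field of the output.
raw_sha1() {
  sha1sum < "$1" | cut -d ' ' -f 1
}

# Print "<digest>  <name>" with the name exactly as given.
# The result is only safe for line-oriented processing if the name
# contains no newline; it is not meant as input for "sha1sum -c".
index_line() {
  printf '%s  %s\n' "$(raw_sha1 "$1")" "$1"
}
```

Because the name is printed verbatim, the output is for indexing only; lines
for names containing a backslash will not round-trip through `sha1sum -c'
the way the filtered example above does.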




Added tag(s) notabug. Request was from Jim Meyering <jim <at> meyering.net> to control <at> debbugs.gnu.org. (Tue, 31 May 2011 08:02:01 GMT)

bug closed, send any further explanations to 8766 <at> debbugs.gnu.org and Theo Band <theo.band <at> greenpeak.com> Request was from Jim Meyering <jim <at> meyering.net> to control <at> debbugs.gnu.org. (Tue, 31 May 2011 08:02:02 GMT)

Message #19 received at 8766-done <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Theo Band <theo.band <at> greenpeak.com>
Cc: 8766-done <at> debbugs.gnu.org
Subject: Re: bug#8766: Bug in sha1sum?
Date: Tue, 31 May 2011 09:49:52 +0100
tags 8766 notabug

On 30/05/11 11:51, Theo Band wrote:
> Hi
> 
> I'm not sure, but I think I found a bug in sha1sum. It's easy to
> reproduce with any file that contains a backslash (\) in the name:
> echo test > test
> $ sha1sum test
> 4e1243bd22c66e76c2ba9eddc1f91394e57f9f83  test
> $ mv test 'test\test'
> $ sha1sum 'test\test'
> \4e1243bd22c66e76c2ba9eddc1f91394e57f9f83  test\\test
> 
> I expect the file sha1sum to be the same after renaming the file (a
> backslash is prepended to the otherwise correct result).
> 
> sha1sum --version
> sha1sum (GNU coreutils) 5.97
> coreutils-5.97-23.el5_6.4

This is expected.
Here is a shell function I use in FSlint to
clean the output from these utilities
when we know we'll not have files with \n chars.

cleanup_sum() {
  sed '
  # md5sum and sha1sum et al. (from coreutils at least),
  # to deal with newlines in file names, convert any \ and newline
  # chars to \\ and \n (backslash, n) respectively. Currently we ignore
  # files with newlines in their names, so just undo this escaping
  /^\\/{s/^\\//; s/\\\\/\\/g};

  # These utils also add a "*" flag character for normal files
  # on platforms where O_BINARY is significant (like CYGWIN).
  # We always process in binary mode and so remove that flag here
  s/^\([^ ]*\) \*/\1  /;
  '
}

So you can just:

sha1sum test | cleanup_sum
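
Both filters in this thread are only safe when no file name contains a
newline. A minimal sketch of a pre-check for that condition follows; the
function name has_newline_names is an illustration, and the sketch assumes a
find that matches a literal newline in a -name pattern (GNU find does):

```shell
# Hypothetical pre-check: exit status 0 if any path under directory $1
# has a newline in its name, in which case the sed filters are unsafe.
# Note: sets the global variable nl (POSIX sh has no "local").
has_newline_names() {
  nl='
'
  # find prints matching paths; any non-empty output means a match.
  [ -n "$(find "$1" -name "*${nl}*" -print 2>/dev/null | head -n 1)" ]
}

# Example use:
#   has_newline_names . || sha1sum * | cleanup_sum
```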

cheers,
Pádraig.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 28 Jun 2011 11:24:05 GMT)

This bug report was last modified 14 years and 54 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.