GNU bug report logs - #10355
Add an option to {md5,sha*} to ignore directories

Previous Next

Package: coreutils;

Reported by: "Gilles Espinasse" <g.esp <at> free.fr>

Date: Fri, 23 Dec 2011 13:47:02 UTC

Severity: wishlist

Tags: moreinfo, notabug, wontfix

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Pádraig Brady <P <at> draigBrady.com>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#10355: closed (Add an option to {md5,sha*} to ignore directories)
Date: Fri, 23 Dec 2011 17:51:01 +0000
[Message part 1 (text/plain, inline)]
Your message dated Fri, 23 Dec 2011 17:48:05 +0000
with message-id <4EF4BED5.7030607 <at> draigBrady.com>
and subject line Re: bug#10355: Add an option to {md5,sha*} to ignore directories
has caused the debbugs.gnu.org bug report #10355,
regarding Add an option to {md5,sha*} to ignore directories
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
10355: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=10355
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: "Gilles Espinasse" <g.esp <at> free.fr>
To: <bug-coreutils <at> gnu.org>
Subject: Add an option to {md5,sha*} to ignore directories
Date: Fri, 23 Dec 2011 14:45:10 +0100
I was using a way to check md5sum on a lot of file using
 for myfile in `cat ${ALLFILES}`; do if [ -f /${myfile} ]; then md5sum
/$myfile >> $ALLFILES}.md5; fi; done

But this is slow, comparing with xargs md5sum way.
time (for myfile in `cat ${ALLFILES}`; do if [ -f /${myfile} ]; then md5sum
/$myfile >> ${ALLFILES}.md5; fi; done)

real    0m26.907s
user    0m40.019s
sys     0m10.253s

This is faster using xargs md5sum.
time (sed -e '/.\/$/d' -e 's|^.|/&|g' ${ALLFILES} | xargs md5sum
>${ALLFILES}.md5)
md5sum: /etc/ipsec.d/cacerts: Is a directory
md5sum: /etc/ipsec.d/certs: Is a directory
md5sum: /etc/ipsec.d/crls: Is a directory
md5sum: /etc/ppp/chap-secrets: No such file or directory
md5sum: /etc/ppp/pap-secrets: No such file or directory
md5sum: /etc/squid/squid.conf: No such file or directory

real    0m1.176s
user    0m0.780s
sys     0m0.400s

That run mostly 30 times faster.
In the above example, I already skipped most of the directories in the list,
removing lines that end with / but not all directories in my list match on
that condition.

So the fast solution emit errors and end with status 123.
I know I could hide error messages and status error but that start to be
ugly.
sed -e'/.\/$/d' -e 's|^.|/&|g' ${ALLFILES} | xargs md5sum > ${ALLFILES}.md5
2>/dev/null || test $? -eq 123

Would it not be great to support an option in {md5,sha*} to ignore directory
error?
I may even be able to produce a patch if there is a real interest.

Gilles



[Message part 3 (message/rfc822, inline)]
From: Pádraig Brady <P <at> draigBrady.com>
To: Gilles Espinasse <g.esp <at> free.fr>
Cc: 10355-done <at> debbugs.gnu.org
Subject: Re: bug#10355: Add an option to {md5,sha*} to ignore directories
Date: Fri, 23 Dec 2011 17:48:05 +0000
On 12/23/2011 01:45 PM, Gilles Espinasse wrote:
> I was using a way to check md5sum on a lot of file using
>  for myfile in `cat ${ALLFILES}`; do if [ -f /${myfile} ]; then md5sum
> /$myfile >> $ALLFILES}.md5; fi; done
> 
> But this is slow, comparing with xargs md5sum way.
> time (for myfile in `cat ${ALLFILES}`; do if [ -f /${myfile} ]; then md5sum
> /$myfile >> ${ALLFILES}.md5; fi; done)
> 
> real    0m26.907s
> user    0m40.019s
> sys     0m10.253s
> 
> This is faster using xargs md5sum.
> time (sed -e '/.\/$/d' -e 's|^.|/&|g' ${ALLFILES} | xargs md5sum
>> ${ALLFILES}.md5)
> md5sum: /etc/ipsec.d/cacerts: Is a directory
> md5sum: /etc/ipsec.d/certs: Is a directory
> md5sum: /etc/ipsec.d/crls: Is a directory
> md5sum: /etc/ppp/chap-secrets: No such file or directory
> md5sum: /etc/ppp/pap-secrets: No such file or directory
> md5sum: /etc/squid/squid.conf: No such file or directory
> 
> real    0m1.176s
> user    0m0.780s
> sys     0m0.400s
> 
> That run mostly 30 times faster.
> In the above example, I already skipped most of the directories in the list,
> removing lines that end with / but not all directories in my list match on
> that condition.
> 
> So the fast solution emit errors and end with status 123.
> I know I could hide error messages and status error but that start to be
> ugly.
> sed -e'/.\/$/d' -e 's|^.|/&|g' ${ALLFILES} | xargs md5sum > ${ALLFILES}.md5
> 2>/dev/null || test $? -eq 123
> 
> Would it not be great to support an option in {md5,sha*} to ignore directory
> error?
> I may even be able to produce a patch if there is a real interest.
> 
> Gilles

I don't think this is worthwhile TBH, as it is too unusual.
One can easily exclude dirs from the source.
Either trivially with find, or filtering like:

LANG=C xargs -d'\n' -r stat -L -c "%F:%n" < ${ALLFILES} | # decorate
sed '/^directory:/d; s/^[^:]*://' | # filter and undecorate
xargs -d'\n' md5sum # process

cheers,
Pádraig.


This bug report was last modified 13 years and 158 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.