GNU bug report logs -
#10355
Add an option to {md5,sha*} to ignore directories
Reported by: "Gilles Espinasse" <g.esp <at> free.fr>
Date: Fri, 23 Dec 2011 13:47:02 UTC
Severity: wishlist
Tags: moreinfo, notabug, wontfix
Done: Pádraig Brady <P <at> draigBrady.com>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 10355 in the body.
You can then email your comments to 10355 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-coreutils <at> gnu.org: bug#10355; Package coreutils. (Fri, 23 Dec 2011 13:47:02 GMT)
Acknowledgement sent to "Gilles Espinasse" <g.esp <at> free.fr>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Fri, 23 Dec 2011 13:47:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
I was computing md5sum for a lot of files using
for myfile in `cat ${ALLFILES}`; do if [ -f /${myfile} ]; then md5sum /$myfile >> ${ALLFILES}.md5; fi; done
But this is slow compared with the xargs md5sum approach.
time (for myfile in `cat ${ALLFILES}`; do if [ -f /${myfile} ]; then md5sum /$myfile >> ${ALLFILES}.md5; fi; done)
real 0m26.907s
user 0m40.019s
sys 0m10.253s
This is faster using xargs md5sum.
time (sed -e '/.\/$/d' -e 's|^.|/&|g' ${ALLFILES} | xargs md5sum >${ALLFILES}.md5)
md5sum: /etc/ipsec.d/cacerts: Is a directory
md5sum: /etc/ipsec.d/certs: Is a directory
md5sum: /etc/ipsec.d/crls: Is a directory
md5sum: /etc/ppp/chap-secrets: No such file or directory
md5sum: /etc/ppp/pap-secrets: No such file or directory
md5sum: /etc/squid/squid.conf: No such file or directory
real 0m1.176s
user 0m0.780s
sys 0m0.400s
That runs roughly 23 times faster.
In the above example, I already skipped most of the directories in the list
by removing lines that end with /, but not all directories in my list match
that condition.
So the fast solution emits errors and ends with status 123.
I know I could hide the error messages and the exit status, but that starts
to get ugly.
sed -e '/.\/$/d' -e 's|^.|/&|g' ${ALLFILES} | xargs md5sum > ${ALLFILES}.md5 2>/dev/null || test $? -eq 123
Would it not be great to support an option in {md5,sha*}sum to ignore
directory errors?
I could even produce a patch if there is real interest.
Gilles
Information forwarded to bug-coreutils <at> gnu.org: bug#10355; Package coreutils. (Fri, 23 Dec 2011 15:06:01 GMT)
Message #8 received at 10355 <at> debbugs.gnu.org (full text, mbox):
Hi Gilles,
On 12/23/2011 02:45 PM, Gilles Espinasse wrote:
> I was using a way to check md5sum on a lot of file using
> for myfile in `cat ${ALLFILES}`; do if [ -f /${myfile} ]; then md5sum
> /$myfile>> $ALLFILES}.md5; fi; done
>
> But this is slow, comparing with xargs md5sum way.
> time (for myfile in `cat ${ALLFILES}`; do if [ -f /${myfile} ]; then md5sum
> /$myfile>> ${ALLFILES}.md5; fi; done)
>
> real 0m26.907s
> user 0m40.019s
> sys 0m10.253s
>
> This is faster using xargs md5sum.
> time (sed -e '/.\/$/d' -e 's|^.|/&|g' ${ALLFILES} | xargs md5sum
>> ${ALLFILES}.md5)
> md5sum: /etc/ipsec.d/cacerts: Is a directory
> md5sum: /etc/ipsec.d/certs: Is a directory
> md5sum: /etc/ipsec.d/crls: Is a directory
> md5sum: /etc/ppp/chap-secrets: No such file or directory
> md5sum: /etc/ppp/pap-secrets: No such file or directory
> md5sum: /etc/squid/squid.conf: No such file or directory
>
> real 0m1.176s
> user 0m0.780s
> sys 0m0.400s
>
> That run mostly 30 times faster.
> In the above example, I already skipped most of the directories in the list,
> removing lines that end with / but not all directories in my list match on
> that condition.
How do you create the list of files to check?
You could use "find $DIR -type f" to list regular files only.
Erik
Information forwarded to bug-coreutils <at> gnu.org: bug#10355; Package coreutils. (Fri, 23 Dec 2011 17:20:02 GMT)
Message #11 received at 10355 <at> debbugs.gnu.org (full text, mbox):
severity 10355 wishlist
tags 10355 + notabug wontfix moreinfo
thanks
Erik Auerswald wrote:
> Gilles Espinasse wrote:
> >I was using a way to check md5sum on a lot of file using
> > for myfile in `cat ${ALLFILES}`; do if [ -f /${myfile} ]; then md5sum
> >/$myfile>> $ALLFILES}.md5; fi; done
>...
> You could use "find $DIR -type f" to list regular files only.
Yes. Exactly. The capability you ask for is already present.
Please try this:
find . -type f -exec md5sum {} +
Replace '.' above with a directory if you wish it to find files in a
different directory.
Bob
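As a rough illustration of the find approach above, the following sketch builds a small throwaway tree and hashes only its regular files. The directory layout and file names are invented for the demo and are not taken from the reporter's system.

```shell
#!/bin/sh
# Hypothetical demo tree; every name below is illustrative only.
demo=$(mktemp -d)
mkdir -p "$demo/etc/certs"               # an empty directory
printf 'hello\n' > "$demo/etc/a.conf"
printf 'world\n' > "$demo/etc/b.conf"

# -type f selects regular files only, so md5sum is never handed a directory.
# "{} +" passes many file names to each md5sum invocation, much like xargs.
sums=$(cd "$demo" && find . -type f -exec md5sum {} + | sort -k 2)
printf '%s\n' "$sums"

rm -rf "$demo"
```

Only the two .conf files appear in the output; the empty certs directory is never passed to md5sum.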
Severity set to 'wishlist' from 'normal'. Request was from Bob Proulx <bob <at> proulx.com> to control <at> debbugs.gnu.org. (Fri, 23 Dec 2011 17:20:02 GMT)
Added tag(s) notabug, moreinfo, and wontfix. Request was from Bob Proulx <bob <at> proulx.com> to control <at> debbugs.gnu.org. (Fri, 23 Dec 2011 17:20:02 GMT)
Reply sent to Pádraig Brady <P <at> draigBrady.com>: You have taken responsibility. (Fri, 23 Dec 2011 17:51:01 GMT)
Notification sent to "Gilles Espinasse" <g.esp <at> free.fr>: bug acknowledged by developer. (Fri, 23 Dec 2011 17:51:02 GMT)
Message #20 received at 10355-done <at> debbugs.gnu.org (full text, mbox):
On 12/23/2011 01:45 PM, Gilles Espinasse wrote:
> I was using a way to check md5sum on a lot of file using
> for myfile in `cat ${ALLFILES}`; do if [ -f /${myfile} ]; then md5sum
> /$myfile >> $ALLFILES}.md5; fi; done
>
> But this is slow, comparing with xargs md5sum way.
> time (for myfile in `cat ${ALLFILES}`; do if [ -f /${myfile} ]; then md5sum
> /$myfile >> ${ALLFILES}.md5; fi; done)
>
> real 0m26.907s
> user 0m40.019s
> sys 0m10.253s
>
> This is faster using xargs md5sum.
> time (sed -e '/.\/$/d' -e 's|^.|/&|g' ${ALLFILES} | xargs md5sum
>> ${ALLFILES}.md5)
> md5sum: /etc/ipsec.d/cacerts: Is a directory
> md5sum: /etc/ipsec.d/certs: Is a directory
> md5sum: /etc/ipsec.d/crls: Is a directory
> md5sum: /etc/ppp/chap-secrets: No such file or directory
> md5sum: /etc/ppp/pap-secrets: No such file or directory
> md5sum: /etc/squid/squid.conf: No such file or directory
>
> real 0m1.176s
> user 0m0.780s
> sys 0m0.400s
>
> That run mostly 30 times faster.
> In the above example, I already skipped most of the directories in the list,
> removing lines that end with / but not all directories in my list match on
> that condition.
>
> So the fast solution emit errors and end with status 123.
> I know I could hide error messages and status error but that start to be
> ugly.
> sed -e'/.\/$/d' -e 's|^.|/&|g' ${ALLFILES} | xargs md5sum > ${ALLFILES}.md5
> 2>/dev/null || test $? -eq 123
>
> Would it not be great to support an option in {md5,sha*} to ignore directory
> error?
> I may even be able to produce a patch if there is a real interest.
>
> Gilles
I don't think this is worthwhile TBH, as it is too unusual.
One can easily exclude dirs from the source.
Either trivially with find, or filtering like:
LANG=C xargs -d'\n' -r stat -L -c "%F:%n" < ${ALLFILES} | # decorate
sed '/^directory:/d; s/^[^:]*://' | # filter and undecorate
xargs -d'\n' md5sum # process
cheers,
Pádraig.
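The decorate, filter, undecorate pipeline above can be tried on a throwaway list as sketched below. The temporary paths and the list file are assumptions made for the demo; the sketch needs GNU xargs (for -d and -r) and GNU stat (for -c). Entries that do not exist would presumably still make stat complain, so this only addresses the directory case.

```shell
#!/bin/sh
# Hypothetical demo data; paths and the list file name are illustrative.
demo=$(mktemp -d)
mkdir "$demo/certs"                       # a directory that must be skipped
printf 'hello\n' > "$demo/a.conf"         # a regular file that must be hashed
printf '%s\n' "$demo/certs" "$demo/a.conf" > "$demo/list"

# Step 1 (decorate): stat prints "type:name" for every listed entry.
# Step 2 (filter and undecorate): sed drops "directory:" lines, then strips
#         the "type:" prefix from the remaining lines.
# Step 3 (process): md5sum is run once over the surviving names.
sums=$(
  LANG=C xargs -d '\n' -r stat -L -c '%F:%n' < "$demo/list" |
  sed '/^directory:/d; s/^[^:]*://' |
  xargs -d '\n' -r md5sum
)
printf '%s\n' "$sums"

rm -rf "$demo"
```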
Information forwarded to bug-coreutils <at> gnu.org: bug#10355; Package coreutils. (Fri, 23 Dec 2011 17:55:02 GMT)
Message #23 received at 10355 <at> debbugs.gnu.org (full text, mbox):
----- Original Message -----
From: "Bob Proulx" <bob <at> proulx.com>
To: "Gilles Espinasse" <g.esp <at> free.fr>; <10355 <at> debbugs.gnu.org>
Sent: Friday, December 23, 2011 6:17 PM
Subject: Re: bug#10355: Add an option to {md5,sha*} to ignore directories
> severity 10355 wishlist
> tags 10355 + notabug wontfix moreinfo
> thanks
>
> Erik Auerswald wrote:
> > Gilles Espinasse wrote:
> > >I was using a way to check md5sum on a lot of file using
> > > for myfile in `cat ${ALLFILES}`; do if [ -f /${myfile} ]; then md5sum
> > >/$myfile>> $ALLFILES}.md5; fi; done
> >...
> > You could use "find $DIR -type f" to list regular files only.
>
Thanks for the suggestion.
ALLFILES is indirectly made using find $DIR, but I can't use -type f during
that find.
The primary use of that list is to create a tar using --files-from, and
when a directory is empty, you need to include the directory name itself.
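To make the tar constraint concrete, here is a minimal sketch with an invented tree. The use of --no-recursion is an assumption of this example (the message does not say which tar options are used); it makes tar archive exactly the listed names, which is why an empty directory has to appear in the list by name.

```shell
#!/bin/sh
# Hypothetical tree; names are illustrative only.
demo=$(mktemp -d)
mkdir -p "$demo/root/etc/certs"           # stays empty
printf 'hello\n' > "$demo/root/etc/a.conf"

# The list names one file and one empty directory. No file inside
# etc/certs exists to pull the directory into the archive, so it is
# listed explicitly.
printf '%s\n' etc/a.conf etc/certs > "$demo/list"

# Assumption: --no-recursion, so only the listed names are archived.
tar -C "$demo/root" -cf "$demo/out.tar" --no-recursion -T "$demo/list"
members=$(tar -tf "$demo/out.tar")
printf '%s\n' "$members"

rm -rf "$demo"
```

A list built this way necessarily mixes files and directories, which is the situation that trips up a plain xargs md5sum.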
> Yes. Exactly. The capability you ask for is already present.
>
> Please try this:
>
> find . -type f -exec md5sum {} +
>
> Replace '.' above with a directory if you wish it to find files in a
> different directory.
>
> Bob
This does not work in my case either.
I don't want to calculate md5 for every file found in my chroot.
I care only about a shorter list of files to be included in a tar.
The only alternative I found is derived from the slow version, but instead of
running md5sum each time, it adds the real file names to a list that md5sum
processes only once at the end:
rm /tmp/ALLFILES*
time (for myfile in $(cat ${ALLFILES} | sed -e 's/^dev.*//' -e 's/^sys.*//'); do if [ -f /${myfile} ]; then echo /$myfile >>/tmp/ALLFILES; fi; done; xargs md5sum < /tmp/ALLFILES >/tmp/ALLFILES.md5)
real 0m1.967s
user 0m1.368s
sys 0m0.604s
This takes roughly 67% longer than the fast version but does not need
errors hidden (neither the directory messages nor the exit status). That's
fast enough for my needs and cuts the time of the first slow version by a
factor of more than 13.
Gilles
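A variant of the same idea is sketched below: the file test stays in the shell (where [ is normally a builtin, so no process is started per file) and md5sum runs only once at the end. It reads the list line by line rather than through word splitting, so names with spaces survive. The tree, the list contents and the variable names are assumptions made for the demo, and -d and -r require GNU xargs.

```shell
#!/bin/sh
# Hypothetical chroot-like tree and list; names are illustrative only.
demo=$(mktemp -d)
root=$demo/root
mkdir -p "$root/etc/certs"                      # directory: must be skipped
printf 'hello\n' > "$root/etc/a file.conf"      # regular file with a space
printf '%s\n' etc/certs 'etc/a file.conf' etc/missing > "$demo/list"

# Keep only entries that are regular files, then hash them in one batch.
# stderr is captured too, so any md5sum complaint would show up in $sums.
sums=$(
  while IFS= read -r name; do
    if [ -f "$root/$name" ]; then
      printf '%s\n' "$root/$name"
    fi
  done < "$demo/list" | xargs -d '\n' -r md5sum 2>&1
)
status=$?
printf '%s\n' "$sums"

rm -rf "$demo"
```

Neither the directory nor the missing entry reaches md5sum, so there is no error output to hide and no exit status 123 to special-case.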
Information forwarded to bug-coreutils <at> gnu.org: bug#10355; Package coreutils. (Fri, 23 Dec 2011 23:02:02 GMT)
Message #26 received at 10355 <at> debbugs.gnu.org (full text, mbox):
Bob Proulx writes:
>
> severity 10355 wishlist
> tags 10355 + notabug wontfix moreinfo
> thanks
>
> Erik Auerswald wrote:
> > Gilles Espinasse wrote:
> > >I was using a way to check md5sum on a lot of file using
> > > for myfile in `cat ${ALLFILES}`; do if [ -f /${myfile} ]; then md5sum
> > >/$myfile>> $ALLFILES}.md5; fi; done
> >...
> > You could use "find $DIR -type f" to list regular files only.
>
> Yes. Exactly. The capability you ask for is already present.
Do you suppose we can convince GNU grep's maintainer to follow this
philosophy?
$ mkdir d
$ touch d/foo
$ grep foo *
$
It opens and reads, gets EISDIR, and intentionally skips printing it. Grr.
But wait, there's a -d option with 3 alternatives for what to do with
directories! ...and none of the choices is "just print the EISDIR so I'll know
if I accidentally grepped a directory".
--
Alan Curry
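For reference, the three -d (--directories) actions in GNU grep are read, skip and recurse. What the default does with a directory operand may differ between grep releases, so the sketch below only demonstrates the explicit skip action, using an invented directory and file.

```shell
#!/bin/sh
# Hypothetical demo data; names are illustrative only.
demo=$(mktemp -d)
mkdir "$demo/d"
printf 'foo\n' > "$demo/notes.txt"

# -d skip (same as --directories=skip) asks GNU grep to pass over
# directory operands instead of trying to read them.
out=$(cd "$demo" && grep -d skip foo d notes.txt 2>"$demo/stderr.log")
status=$?
err=$(cat "$demo/stderr.log")
printf 'stdout: %s\nstderr: %s\nstatus: %s\n' "$out" "$err" "$status"

rm -rf "$demo"
```

With skip the directory is ignored without any diagnostic, which is exactly the silence the message objects to when it happens by default.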
Information forwarded to bug-coreutils <at> gnu.org: bug#10355; Package coreutils. (Sat, 24 Dec 2011 01:00:02 GMT)
Message #29 received at 10355 <at> debbugs.gnu.org (full text, mbox):
Alan Curry wrote:
> Do you suppose we can convince GNU grep's maintainer to follow this
> philosophy?
Too late. GNU grep already has --recursive. I think adding
--recursive to grep was a mistake. It then requires most of 'find' to
be added to it too. (--include*, --exclude*)
> $ mkdir d
> $ touch d/foo
> $ grep foo *
> $
>
> It opens and reads, gets EISDIR, and intentionally skips printing it. Grr.
All silently. In most cases I think your example would have been a
case of programming error. It would be better to make those cases noisy.
The above seems to be a bug since it violates the documented action of
'read' for directories. It appears to skip by default, even
when --directories=read is specified.
> But wait, there's a -d option with 3 alternatives for what to do with
> directories! ...and none of choices is "just print the EISDIR so I'll know
> if I accidentally grepped a directory".
And the problems just go on and on.
Bob
Information forwarded to bug-coreutils <at> gnu.org: bug#10355; Package coreutils. (Sat, 24 Dec 2011 10:45:02 GMT)
Message #32 received at 10355 <at> debbugs.gnu.org (full text, mbox):
On 12/23/11 14:58, Alan Curry wrote:
> Do you suppose we can convince GNU grep's maintainer to follow this
> philosophy?
We definitely should. I have filed a bug report (with patch) at
<https://savannah.gnu.org/bugs/index.php?35169>.
Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 21 Jan 2012 12:24:03 GMT)
This bug report was last modified 13 years and 158 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.