GNU bug report logs -
#6804
Bug-coreutils Digest, Vol 93, Issue 6
Previous Next
Reported by: belhage <at> midibel.com
Date: Thu, 5 Aug 2010 20:09:01 UTC
Severity: normal
Done: Bob Proulx <bob <at> proulx.com>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 6804 in the body.
You can then email your comments to 6804 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org
:
bug#6804
; Package
coreutils
.
(Thu, 05 Aug 2010 20:09:01 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
belhage <at> midibel.com
:
New bug report received and forwarded. Copy sent to
bug-coreutils <at> gnu.org
.
(Thu, 05 Aug 2010 20:09:01 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
On Thu, 2010-08-05 at 12:03 -0400, bug-coreutils-request <at> gnu.org wrote:
> Send Bug-coreutils mailing list submissions to
> bug-coreutils <at> gnu.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://lists.gnu.org/mailman/listinfo/bug-coreutils
> or, via email, send a message with subject or body 'help' to
> bug-coreutils-request <at> gnu.org
>
> You can reach the person managing the list at
> bug-coreutils-owner <at> gnu.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Bug-coreutils digest..."
>
>
> Today's Topics:
>
> 1. bug#6789: propose renaming gnulib memxfrm to amemxfrm (naming
> collision with coreutils) (Paul Eggert)
> 2. bug#6789: propose renaming gnulib memxfrm to amemxfrm (naming
> collision with coreutils) (Simon Josefsson)
> 3. bug#6789: propose renaming gnulib memxfrm to amemxfrm (naming
> collision with coreutils) (Paolo Bonzini)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 04 Aug 2010 16:21:28 -0700
> From: Paul Eggert <eggert <at> CS.UCLA.EDU>
> Subject: bug#6789: propose renaming gnulib memxfrm to amemxfrm (naming
> collision with coreutils)
> To: Bruno Haible <bruno <at> clisp.org>
> Cc: Bug-gnulib <bug-gnulib <at> gnu.org>, Bug-coreutils
> <bug-coreutils <at> gnu.org>
> Message-ID: <4C59F5F8.9080309 <at> cs.ucla.edu>
> Content-Type: text/plain; charset=UTF-8
>
> On 08/03/10 16:33, Bruno Haible wrote:
> > But when the stack buffer is not sufficient, then the use of coreutils memxfrm
> > is 30% to 70% slower than the use of gnulib memxfrm, with a difference of
> > 700 sec at least.
>
> (Ooo! Ooo! Performance measurements! I love this stuff!)
>
> It depends on the data. In the typical case, "sort" is applied to
> text data, which does not contain NUL bytes. The data in that
> benchmark contained many NUL bytes. If you take the same benchmark
> and uniformly replace "\0" with "\t" in compare.c, then the situation
> is much different: coreutils memxfrm is about 3 times faster than
> gnulib memxfrm on the larger test cases (this platform is Ubuntu
> 10.04, gcc 4.5.0, 2.4 GHz Pentium 4):
>
> 503-penguin $ gcc -std=gnu99 -O2 -Wall coreutils-memxfrm.c gnulib-memxfrm.c compare1.c -I.
> 504-penguin $ ./a.out
> Time for gnulib_memxfrm: 0,032002
> Time for coreutils_memxfrm: 0,028001
> Time for gnulib_memxfrm: 0,024002
> Time for coreutils_memxfrm: 0,024001
> Time for gnulib_memxfrm: 0,036003
> Time for coreutils_memxfrm: 0,032002
> Time for gnulib_memxfrm: 18,2051
> Time for coreutils_memxfrm: 5,48834
> Time for gnulib_memxfrm: 16,045
> Time for coreutils_memxfrm: 5,50034
> Time for gnulib_memxfrm: 15,837
> Time for coreutils_memxfrm: 5,44834
>
> I expect that this performance glitch in gnulib memxfrm could be
> improved, as it shouldn't simply double buffer sizes when they're too
> small, as at that point it already knows what the final buffer size
> should be. Doing this should bring up gnulib memxfrm to be as fast as
> coreutils xmemxfrm for this benchmark. Also, I agree that gnulib
> memxfrm is faster in some important cases. Still, gnulib memxfrm is
> problematic, because it insists on managing memory itself.
>
> Come to think of it, looking at gnulib memxfrm gave me an idea
> to improve the performance of GNU sort by bypassing the need for an
> memxfrm-like function entirely. I pushed a patch to do that at
> <http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=2b49b140cc13cf36ec5ee5acaca5ac7bfeed6366>.
>
> This avoids any potential naming collision for now. The point
> remains, though, that it's confusing that gnulib memxfrm's name begins
> with "mem", as the mem* functions don't allocate memory. Would you
> consider a patch that renames gnulib memxfrm to amemxfrm, or to some
> other such name?
>
>
>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Thu, 05 Aug 2010 01:44:30 +0200
> From: Simon Josefsson <simon <at> josefsson.org>
> Subject: bug#6789: propose renaming gnulib memxfrm to amemxfrm (naming
> collision with coreutils)
> To: Paul Eggert <eggert <at> CS.UCLA.EDU>
> Cc: Bug-gnulib <bug-gnulib <at> gnu.org>, Bug-coreutils
> <bug-coreutils <at> gnu.org>, Bruno Haible <bruno <at> clisp.org>
> Message-ID: <8739uurli9.fsf <at> mocca.josefsson.org>
> Content-Type: text/plain; charset=us-ascii
>
> Paul Eggert <eggert <at> CS.UCLA.EDU> writes:
>
> > Come to think of it, looking at gnulib memxfrm gave me an idea
> > to improve the performance of GNU sort by bypassing the need for an
> > memxfrm-like function entirely. I pushed a patch to do that at
> > <http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=2b49b140cc13cf36ec5ee5acaca5ac7bfeed6366>.
>
> I don't know this code at all, but would your approach lead to problems
> if two different strings have the same MD5 hash? MD5 is broken, and
> finding collisions takes just seconds on normal PC. See:
> http://en.wikipedia.org/wiki/MD5#Security
>
> /Simon
>
>
>
>
>
> ------------------------------
>
> Message: 3
> Date: Thu, 05 Aug 2010 04:58:26 +0200
> From: Paolo Bonzini <bonzini <at> gnu.org>
> Subject: bug#6789: propose renaming gnulib memxfrm to amemxfrm (naming
> collision with coreutils)
> To: bug-gnulib <at> gnu.org, Bug-coreutils <bug-coreutils <at> gnu.org>, Paul
> Eggert <eggert <at> cs.ucla.edu>
> Message-ID: <4C5A28D2.3040102 <at> gnu.org>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> On 08/05/2010 01:44 AM, Simon Josefsson wrote:
> > Paul Eggert<eggert <at> CS.UCLA.EDU> writes:
> >
> >> Come to think of it, looking at gnulib memxfrm gave me an idea
> >> to improve the performance of GNU sort by bypassing the need for an
> >> memxfrm-like function entirely. I pushed a patch to do that at
> >> <http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=2b49b140cc13cf36ec5ee5acaca5ac7bfeed6366>.
> >
> > I don't know this code at all, but would your approach lead to problems
> > if two different strings have the same MD5 hash? MD5 is broken, and
> > finding collisions takes just seconds on normal PC. See:
> > http://en.wikipedia.org/wiki/MD5#Security
>
> MD5 is used simply as a kind of "random key generator", so it doesn't
> matter. I wonder two things however:
>
> 1) why bother with memxfrm as a tie-breaker? isn't memcmp good enough?
>
> 2) maybe there's something cheaper than md5 that can be used? For
> example you could compare a^x and b^x where x is the output of a fast
> 32-bit random number generator? It doesn't need to be cryptographic,
> I'd pick http://en.wikipedia.org/wiki/Xorshift.
>
> Paolo
>
>
>
>
>
> ------------------------------
>
> _______________________________________________
> Bug-coreutils mailing list
> Bug-coreutils <at> gnu.org
> http://lists.gnu.org/mailman/listinfo/bug-coreutils
>
>
> End of Bug-coreutils Digest, Vol 93, Issue 6
> ********************************************
bug closed, send any further explanations to
6804 <at> debbugs.gnu.org and belhage <at> midibel.com
Request was from
Bob Proulx <bob <at> proulx.com>
to
control <at> debbugs.gnu.org
.
(Thu, 25 Aug 2011 05:45:02 GMT)
Full text and
rfc822 format available.
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Thu, 22 Sep 2011 11:24:03 GMT)
Full text and
rfc822 format available.
This bug report was last modified 13 years and 325 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.