GNU bug report logs - #6789
propose renaming gnulib memxfrm to amemxfrm (naming collision with coreutils)
Reported by: Paul Eggert <eggert <at> CS.UCLA.EDU>
Date: Tue, 3 Aug 2010 19:47:01 UTC
Severity: normal
Done: Jim Meyering <jim <at> meyering.net>
Bug is archived. No further changes may be made.
On 08/08/10 05:24, Bruno Haible wrote:
> sort: reduce number of strxfrm calls
Thanks for that suggestion. Amusingly enough, it made 'sort -R'
slower on the first benchmark I tried it on, which was 'sort -R *'.
But that's an unfair benchmark, since '*' expanded to executables and
other non-text files. Overall, it's a good idea. However, the code
need not be quite that long, since there's no need to do size_t
overflow checking. I pushed this:
From 0061819c7e1bbc26586cc5977ea96da016f7cea2 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert <at> cs.ucla.edu>
Date: Sun, 8 Aug 2010 23:14:38 -0700
Subject: [PATCH] sort: speed up -R with long lines in hard locales
* src/sort.c (compare_random): Guess that the output will be
3X the input. This avoids the overhead of calling strxfrm
twice on typical implementations. Suggested by Bruno Haible.
---
src/sort.c | 18 +++++++++++++-----
1 files changed, 13 insertions(+), 5 deletions(-)
diff --git a/src/sort.c b/src/sort.c
index dcfd24f..148ed3e 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -2024,6 +2024,7 @@ compare_random (char *restrict texta, size_t lena,
char stackbuf[4000];
char *buf = stackbuf;
size_t bufsize = sizeof stackbuf;
+ void *allocated = NULL;
uint32_t dig[2][MD5_DIGEST_SIZE / sizeof (uint32_t)];
struct md5_ctx s[2];
s[0] = s[1] = random_md5_state;
@@ -2047,6 +2048,16 @@ compare_random (char *restrict texta, size_t lena,
/* Store the transformed data into a big-enough buffer. */
+ /* A 3X size guess avoids the overhead of calling strxfrm
+ twice on typical implementations. Don't worry about
+ size_t overflow, as the guess need not be correct. */
+ size_t guess_bufsize = 3 * (lena + lenb) + 2;
+ if (bufsize < guess_bufsize)
+ {
+ bufsize = MAX (guess_bufsize, bufsize * 3 / 2);
+ buf = allocated = xrealloc (allocated, bufsize);
+ }
+
size_t sizea =
(texta < lima ? xstrxfrm (buf, texta, bufsize) + 1 : 0);
bool a_fits = sizea <= bufsize;
@@ -2062,9 +2073,7 @@ compare_random (char *restrict texta, size_t lena,
bufsize = sizea + sizeb;
if (bufsize < SIZE_MAX / 3)
bufsize = bufsize * 3 / 2;
- buf = (buf == stackbuf
- ? xmalloc (bufsize)
- : xrealloc (buf, bufsize));
+ buf = allocated = xrealloc (allocated, bufsize);
if (texta < lima)
strxfrm (buf, texta, sizea);
if (textb < limb)
@@ -2119,8 +2128,7 @@ compare_random (char *restrict texta, size_t lena,
diff = xfrm_diff;
}
- if (buf != stackbuf)
- free (buf);
+ free (allocated);
return diff;
}
--
1.7.2
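For readers who want the idea outside of sort.c, here is a minimal standalone sketch of the same pattern: guess that the transformed text is at most 3X the input, call strxfrm once, and repeat the call only if the guess turns out too small. The helper name xfrm_with_guess and its parameters are hypothetical and not part of coreutils; it uses plain realloc rather than xrealloc, and it transforms one string rather than the two that compare_random handles.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not taken from coreutils.  Transform SRC with
   strxfrm, using STACKBUF (STACKSIZE bytes) when the size guess fits
   there and the heap otherwise.  *ALLOCATED must be NULL or a block
   obtained from malloc/realloc; the caller passes it to free()
   unconditionally afterwards.  Returns the buffer holding the
   transformed string, or NULL if allocation fails.  */
static char *
xfrm_with_guess (const char *src, char *stackbuf, size_t stacksize,
                 void **allocated)
{
  assert (stacksize > 0);

  char *buf = stackbuf;
  size_t bufsize = stacksize;

  /* Guess 3X the input plus the terminating NUL.  A wrong guess,
     even one that wrapped around, only costs a second strxfrm call.  */
  size_t guess = 3 * strlen (src) + 1;
  if (bufsize < guess)
    {
      void *p = realloc (*allocated, guess);  /* realloc (NULL, n) acts like malloc (n) */
      if (!p)
        return NULL;
      buf = *allocated = p;
      bufsize = guess;
    }

  /* strxfrm returns the length it needs, excluding the NUL.  */
  size_t need = strxfrm (buf, src, bufsize) + 1;
  if (bufsize < need)
    {
      /* The guess was too small: grow the heap block and redo the call.  */
      void *p = realloc (*allocated, need);
      if (!p)
        return NULL;
      buf = *allocated = p;
      strxfrm (buf, src, need);
    }

  return buf;
}
```

Keeping the heap block in a separate pointer that starts out NULL is what lets the cleanup be a bare free (*allocated), since free (NULL) is a no-op; the caller never has to compare the result against the stack buffer.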
This bug report was last modified 14 years and 6 days ago.