GNU bug report logs - #6600
[PATCH] sort: add --threads option to parallelize internal sort.


Package: coreutils;

Reported by: Pádraig Brady <P <at> draigBrady.com>

Date: Sat, 10 Jul 2010 01:09:02 UTC

Severity: normal

Tags: patch

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.


From: Pádraig Brady <P <at> draigBrady.com>
To: 6600 <at> debbugs.gnu.org
Subject: bug#6600: [PATCH] sort: add --threads option to parallelize internal sort.
Date: Thu, 15 Jul 2010 01:07:16 +0100
On 13/07/10 01:59, Pádraig Brady wrote:
> I've finally applied the patch.
> http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commit;h=9face836
> 
> I made a few comment tweaks and added
> some dependencies for the heap module.
> 
> I also removed the xmemcoll0() calls,
> which are separate from this concurrent functionality.
> I will add those back in Chen's name after updating to
> the latest gnulib.
> 
> Thanks everyone for their work on this!
> 
> Pádraig.

Here's the xmemcoll0 follow-up:

From: Chen Guo <chenguo4 <at> yahoo.com>
Date: Wed, 14 Jul 2010 07:41:05 +0100
Subject: [PATCH] sort: speed up default full line sorting

Don't write NUL after the comparison buffers on each compare.
This improves performance by about 3% for short lines
on a Pentium M with gcc-4.4.1.

* src/sort.c (fillbuf): Delimit input items with NUL.
(write_bytes): Restore the item delimiter char which was
replaced with NUL in fillbuf().
---
 src/sort.c |   18 +++++++++++++++---
 1 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/src/sort.c b/src/sort.c
index 5ea1b34..45cb78f 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -1743,13 +1743,17 @@ fillbuf (struct buffer *buf, FILE *fp, char const *file)
                   if (buf->buf == ptrlim)
                     return false;
                   if (ptrlim[-1] != eol)
-                    *ptrlim++ = eol;
+                    *ptrlim++ = '\0';
                 }
             }

           /* Find and record each line in the just-read input.  */
           while ((p = memchr (ptr, eol, ptrlim - ptr)))
             {
+              /* Delimit the line with NUL. This eliminates the need to
+                 temporarily replace the last byte with NUL when calling
+                 xmemcoll(), which increases performance.  */
+              *p = '\0';
               ptr = p + 1;
               line--;
               line->text = line_start;
@@ -2642,7 +2646,13 @@ compare (const struct line *a, const struct line *b, bool show_debug)
   else if (blen == 0)
     diff = 1;
   else if (hard_LC_COLLATE)
-    diff = xmemcoll (a->text, alen, b->text, blen);
+    {
+      /* Note xmemcoll0 is a performance enhancement as
+         it will not unconditionally write '\0' after the
+         passed in buffers, which was seen to give around
+         a 3% increase in performance for short lines.  */
+      diff = xmemcoll0 (a->text, alen + 1, b->text, blen + 1);
+    }
   else if (! (diff = memcmp (a->text, b->text, MIN (alen, blen))))
     diff = alen < blen ? -1 : alen != blen;

@@ -2652,9 +2662,11 @@ compare (const struct line *a, const struct line *b, bool show_debug)
 static void
 write_bytes (struct line const *line, FILE *fp, char const *output_file)
 {
-  char const *buf = line->text;
+  char *buf = line->text;
   size_t n_bytes = line->length;

+  *(buf + n_bytes - 1) = eolchar;
+
   /* Convert TABs to '>' and \0 to \n when -z specified.  */
   if (debug && fp == stdout)
     {
-- 
1.6.2.5
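
For readers who want to see the idea outside of sort.c, here is a
minimal sketch in C. It is not the gnulib code: coll_temp_nul,
coll_prenul and delimit_lines are hypothetical names, and plain
strcoll() stands in for xmemcoll()/xmemcoll0(), so embedded NUL bytes
and error reporting are ignored. It only contrasts the two approaches
the patch describes: patching a NUL in and out around every compare,
versus delimiting each item with NUL once when the buffer is filled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the older approach: the items are not
   NUL-terminated, so overwrite the byte just past each one, compare,
   then restore it.  a[alen] and b[blen] must be writable.  */
static int
coll_temp_nul (char *a, size_t alen, char *b, size_t blen)
{
  char a_saved = a[alen];
  char b_saved = b[blen];
  a[alen] = '\0';
  b[blen] = '\0';
  int diff = strcoll (a, b);
  a[alen] = a_saved;
  b[blen] = b_saved;
  return diff;
}

/* Hypothetical stand-in for the newer approach: each size already
   counts a trailing NUL, so nothing is written during the compare.  */
static int
coll_prenul (char const *a, size_t asize, char const *b, size_t bsize)
{
  assert (a[asize - 1] == '\0' && b[bsize - 1] == '\0');
  return strcoll (a, b);
}

/* Replace every EOL byte in BUF[0..LEN) with NUL, in the spirit of
   the loop in the patched fillbuf.  Return the number of items found.  */
static size_t
delimit_lines (char *buf, size_t len, char eol)
{
  size_t n_items = 0;
  char *ptr = buf;
  char *lim = buf + len;
  char *p;

  while ((p = memchr (ptr, eol, lim - ptr)))
    {
      *p = '\0';
      ptr = p + 1;
      n_items++;
    }
  return n_items;
}
```

With this layout the output side has to put the delimiter back before
writing an item, which is what the write_bytes hunk above does with
eolchar. Note that strcoll() follows the current locale; without a
call to setlocale() the program runs in the "C" locale.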




