From unknown Mon Jun 23 23:52:43 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#6600: [PATCH] sort: add --threads option to parallelize internal sort.
Resent-From: =?UTF-8?Q?P=C3=A1draig?= Brady
Original-Sender: debbugs-submit-bounces@debbugs.gnu.org
Resent-To: owner@debbugs.gnu.org
Resent-CC: bug-coreutils@gnu.org
Resent-Date: Sat, 10 Jul 2010 01:09:02 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: report 6600
X-GNU-PR-Package: coreutils
X-GNU-PR-Keywords: patch
To: Chen Guo
Cc: eggert@cs.ucla.edu, 6600@debbugs.gnu.org, glen.lenker@gmail.com, mykphyre@gmail.com, quaker4lyf@gmail.com, cdickens@ucla.edu
X-Debbugs-Original-Cc: Paul Eggert , Bug Coreutils , Glen Lenker , Mike Nichols , Gene Auyeung , Chris Dickens
Message-ID: <4C37C7D9.2030909@draigBrady.com>
Date: Sat, 10 Jul 2010 02:07:37 +0100
From: =?UTF-8?Q?P=C3=A1draig?= Brady
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US;
rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3
MIME-Version: 1.0
References: <362522.89643.qm@web180012.mail.gq1.yahoo.com>
<4B94CAE0.6000106@draigBrady.com>
<535984.88146.qm@web180006.mail.gq1.yahoo.com>
In-Reply-To: <535984.88146.qm@web180006.mail.gq1.yahoo.com>
On 08/03/10 10:39, Chen Guo wrote:
> Hi Padraig,
>
>> You previously mentioned a thread bug with memcoll. Is that worked around?
>
> That happened when more than one instance of memcoll is called on the same
> line at once, since memcoll replaces the eolchar with '\0'. Under our approach,
> the same line shouldn't ever be compared at the same time, so we're fine.
> On top of that, Professor Eggert suggested NUL delimiting all lines as they're
> read in, so memcoll doesn't have to; hence the patch to gnulib, which introduces
> xmemcoll_nul and memcoll_nul, for when input is known to be NUL delimited, thus
> no replacement of the eolchar is needed, making memcoll threadsafe.
Note the current xmemcoll0() in gnulib requires the length
_including_ the terminating NUL to be passed, whereas one
usually does not include the terminating char in the length
passed to xmemcoll(). I accordingly updated the lengths
passed to xmemcoll0() by your latest patch.
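To make the two length conventions concrete, here is a minimal sketch. The names coll_len and coll_size0 are hypothetical stand-ins written for illustration, not the gnulib functions, and they ignore embedded NUL bytes, which the real routines handle. The first takes lengths that exclude the terminator and briefly writes to its inputs; the second takes sizes that include the trailing NUL and only reads.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical memcoll-style compare: S1LEN and S2LEN EXCLUDE the
   terminator.  The byte just past each key (s1[s1len], s2[s2len]) must be
   writable; it is overwritten with NUL for the duration of the call and
   then restored.  Because of that write, two threads must not call this
   on the same line at the same time.  */
static int
coll_len (char *s1, size_t s1len, char *s2, size_t s2len)
{
  char c1 = s1[s1len];
  char c2 = s2[s2len];
  s1[s1len] = '\0';
  s2[s2len] = '\0';
  int result = strcoll (s1, s2);
  s1[s1len] = c1;
  s2[s2len] = c2;
  return result;
}

/* Hypothetical xmemcoll0-style compare: S1SIZE and S2SIZE INCLUDE the
   terminating NUL, which must already be present.  Nothing is written,
   so concurrent calls on the same line are harmless.  */
static int
coll_size0 (char const *s1, size_t s1size, char const *s2, size_t s2size)
{
  assert (s1size > 0 && s1[s1size - 1] == '\0');
  assert (s2size > 0 && s2[s2size - 1] == '\0');
  return strcoll (s1, s2);
}
```

For the five-byte key "apple", the first form would be called with a length of 5 and the second with a size of 6.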
However there are still writes done to the source text
in the keycompare() function. So I'm thinking of dropping
the whole xmemcoll0() thing altogether assuming your
statement above is correct, that a particular line will
not be used at the same time by multiple threads.
I did try to copy the text to the stack before comparing,
but that introduced a significant overhead noted below.
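For reference, a copy-before-compare approach might look roughly like the sketch below. This is an assumption about the general shape, not the code that was benchmarked: compare_copy and STACK_KEY_MAX are hypothetical names, and the heap fallback for long keys is added only so the sketch is safe for any input. It shows where the extra cost comes from, since every comparison pays for two copies.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical limit for keys copied onto the stack.  */
enum { STACK_KEY_MAX = 256 };

/* Compare two keys whose lengths exclude the terminator, without writing
   to the source text: each key is copied to a private buffer and
   NUL-terminated there.  */
static int
compare_copy (char const *a, size_t alen, char const *b, size_t blen)
{
  char abuf[STACK_KEY_MAX];
  char bbuf[STACK_KEY_MAX];
  /* Fall back to the heap when a key does not fit in the stack buffer.  */
  char *ac = alen < sizeof abuf ? abuf : malloc (alen + 1);
  char *bc = blen < sizeof bbuf ? bbuf : malloc (blen + 1);
  if (!ac || !bc)
    abort ();

  memcpy (ac, a, alen);
  ac[alen] = '\0';
  memcpy (bc, b, blen);
  bc[blen] = '\0';

  int result = strcoll (ac, bc);

  if (ac != abuf)
    free (ac);
  if (bc != bbuf)
    free (bc);
  return result;
}
```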
Your patch is still performing well on a single core machine:
----------- before ---------------------
$ time ./src/sort < nums.list >/dev/null
real 0m8.644s
user 0m8.307s
sys 0m0.292s
$ time ./src/sort -g < nums.list >/dev/null
real 0m11.046s
user 0m10.652s
sys 0m0.295s
$ time ./src/sort -n < nums.list >/dev/null
real 0m4.909s
user 0m4.567s
sys 0m0.298s
$ time LANG=C ./src/sort < nums.list >/dev/null
real 0m1.959s
user 0m1.657s
sys 0m0.285s
------------ after ---------------------
$ time ./src/sort < nums.list >/dev/null
real 0m8.686s
user 0m8.300s
sys 0m0.232s
$ time ./src/sort -g < nums.list >/dev/null
real 0m10.196s
user 0m9.850s
sys 0m0.221s
$ time ./src/sort -n < nums.list >/dev/null
real 0m2.958s
user 0m2.664s
sys 0m0.221s
$ time LANG=C ./src/sort < nums.list >/dev/null
real 0m1.985s
user 0m1.750s
sys 0m0.217s
After copying the text to the stack as mentioned above
there is a significant performance drop:
$ time ./src/sort -n < nums.list >/dev/null
real 0m4.086s
user 0m3.848s
sys 0m0.218s
cheers,
Pádraig.
From unknown Mon Jun 23 23:52:43 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#6600: [PATCH] sort: add --threads option to parallelize internal sort.
Resent-From: Paul Eggert
Original-Sender: debbugs-submit-bounces@debbugs.gnu.org
Resent-To: owner@debbugs.gnu.org
Resent-CC: bug-coreutils@gnu.org
Resent-Date: Sat, 10 Jul 2010 02:55:01 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 6600
X-GNU-PR-Package: coreutils
X-GNU-PR-Keywords: patch
To: =?UTF-8?Q?P=C3=A1draig?= Brady
Cc: Bug Coreutils , Chen Guo , Glen Lenker , Mike Nichols , Gene Auyeung , Chris Dickens
Message-ID: <4C37E0E2.3060205@cs.ucla.edu>
Date: Fri, 09 Jul 2010 19:54:26 -0700
From: Paul Eggert
Organization: UCLA Computer Science Department
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US;
rv:1.9.1.10) Gecko/20100527 Thunderbird/3.0.5
MIME-Version: 1.0
References: <362522.89643.qm@web180012.mail.gq1.yahoo.com>
<4B94CAE0.6000106@draigBrady.com>
<535984.88146.qm@web180006.mail.gq1.yahoo.com>
<4C37C7D9.2030909@draigBrady.com>
In-Reply-To: <4C37C7D9.2030909@draigBrady.com>
On 07/09/10 18:07, Pádraig Brady wrote:
> Chen Guo wrote:
>> That happened when more than one instance of memcoll is called on the same
>> line at once, since memcoll replaces the eolchar with '\0'. Under our approach,
>> the same line shouldn't ever be compared at the same time, so we're fine.
Ah, sorry, I wasn't aware of that.
> I'm thinking of dropping
> the whole xmemcoll0() thing altogether assuming your
> statement above is correct, that a particular line will
> not be used at the same time by multiple threads.
Yes, that makes sense. We can revert that change from gnulib, since it
makes gnulib bigger unnecessarily.
From unknown Mon Jun 23 23:52:43 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#6600: [PATCH] sort: add --threads option to parallelize internal sort.
Resent-From: =?UTF-8?Q?P=C3=A1draig?= Brady
Original-Sender: debbugs-submit-bounces@debbugs.gnu.org
Resent-To: owner@debbugs.gnu.org
Resent-CC: bug-coreutils@gnu.org
Resent-Date: Sat, 10 Jul 2010 09:31:02 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 6600
X-GNU-PR-Package: coreutils
X-GNU-PR-Keywords: patch
To: Chen Guo
Cc: Paul Eggert , Bug Coreutils , Glen Lenker , Mike Nichols , Gene Auyeung , Chris Dickens
Message-ID: <4C383D5F.4090703@draigBrady.com>
Date: Sat, 10 Jul 2010 10:29:03 +0100
From: =?UTF-8?Q?P=C3=A1draig?= Brady
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US;
rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3
MIME-Version: 1.0
References: <362522.89643.qm@web180012.mail.gq1.yahoo.com> <4B94CAE0.6000106@draigBrady.com> <535984.88146.qm@web180006.mail.gq1.yahoo.com> <4C37C7D9.2030909@draigBrady.com> <4C37E0E2.3060205@cs.ucla.edu>
In-Reply-To:
On 10/07/10 05:23, Chen Guo wrote:
> 2010/7/9 Paul Eggert :
>> On 07/09/10 18:07, Pádraig Brady wrote:
>>> Chen Guo wrote:
>>>> That happened when more than one instance of memcoll is called on the same
>>>> line at once, since memcoll replaces the eolchar with '\0'. Under our approach,
>>>> the same line shouldn't ever be compared at the same time, so we're fine.
>>
>> Ah, sorry, I wasn't aware of that.
>>
>>> I'm thinking of dropping
>>> the whole xmemcoll0() thing altogether assuming your
>>> statement above is correct, that a particular line will
>>> not be used at the same time by multiple threads.
>>
>> Yes, that makes sense. We can revert that change from gnulib, since it
>> makes gnulib bigger unnecessarily.
>>
>
> Actually, the '\0' saves about 5% off runtime last I checked. This is because
> EACH TIME sort compares two lines memcoll would replace the last byte. If we
> set them all to NUL anyway at the start, memcoll_nul wouldn't need to do that
> replacement for each compare. When we output, we'd simply put the \n back.
>
> I could be wrong though, this is going off memory from 4-5 months ago. But 5%
> is about what I remember, when sorting 1M lines on 8 cores.
Well for the whole line comparison where it works it's 2.9% faster
(or 2.3% adding in the NUL checks to xmemcoll0).
Also xmemcoll0() is probably generally useful since it's a readonly function.
So I'll leave it and fix up the keycompare() calls to it,
while documenting that keycompare() must only work on a particular
line in one thread at a time.
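The scheme Chen describes, NUL-terminating every line once when it is read and putting the newline back on output, can be sketched as follows. The function names nul_delimit and emit_line are hypothetical and the code is only an illustration of the idea, not what sort.c does.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Replace every '\n' in BUF[0..LEN) with '\0', so that each line is
   already NUL-terminated and a read-only compare can be used on it.
   Return the number of lines terminated.  */
static size_t
nul_delimit (char *buf, size_t len)
{
  size_t nlines = 0;
  char *end = buf + len;
  char *p = buf;
  while (p < end && (p = memchr (p, '\n', end - p)) != NULL)
    {
      *p++ = '\0';
      nlines++;
    }
  return nlines;
}

/* Write one NUL-terminated LINE to OUT, restoring the newline that
   nul_delimit removed.  Return 0 on success, -1 on a write error.  */
static int
emit_line (FILE *out, char const *line)
{
  return (fputs (line, out) == EOF || putc ('\n', out) == EOF) ? -1 : 0;
}
```

The replacement is then done once per line rather than once per comparison, which is where the saving discussed above would come from.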
thanks,
Pádraig.
From unknown Mon Jun 23 23:52:43 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#6600: [PATCH] sort: add --threads option to parallelize internal sort.
Resent-From: Chen Guo
Original-Sender: debbugs-submit-bounces@debbugs.gnu.org
Resent-To: owner@debbugs.gnu.org
Resent-CC: bug-coreutils@gnu.org
Resent-Date: Sat, 10 Jul 2010 16:30:02 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 6600
X-GNU-PR-Package: coreutils
X-GNU-PR-Keywords: patch
To: Paul Eggert
Cc: Bug Coreutils , Glen Lenker , Mike Nichols , Gene Auyeung , Chris Dickens , =?UTF-8?Q?P=C3=A1draig?= Brady
In-Reply-To: <4C37E0E2.3060205@cs.ucla.edu>
References: <362522.89643.qm@web180012.mail.gq1.yahoo.com>
<4B94CAE0.6000106@draigBrady.com>
<535984.88146.qm@web180006.mail.gq1.yahoo.com>
<4C37C7D9.2030909@draigBrady.com> <4C37E0E2.3060205@cs.ucla.edu>
Date: Fri, 9 Jul 2010 21:23:46 -0700
Message-ID:
From: Chen Guo
2010/7/9 Paul Eggert :
> On 07/09/10 18:07, Pádraig Brady wrote:
>> Chen Guo wrote:
>>> That happened when more than one instance of memcoll is called on the same
>>> line at once, since memcoll replaces the eolchar with '\0'. Under our approach,
>>> the same line shouldn't ever be compared at the same time, so we're fine.
>
> Ah, sorry, I wasn't aware of that.
>
>> I'm thinking of dropping
>> the whole xmemcoll0() thing altogether assuming your
>> statement above is correct, that a particular line will
>> not be used at the same time by multiple threads.
>
> Yes, that makes sense. We can revert that change from gnulib, since it
> makes gnulib bigger unnecessarily.
>
Actually, the '\0' saves about 5% off runtime last I checked. This is because
EACH TIME sort compares two lines memcoll would replace the last byte. If we
set them all to NUL anyway at the start, memcoll_nul wouldn't need to do that
replacement for each compare. When we output, we'd simply put the \n back.
I could be wrong though, this is going off memory from 4-5 months ago. But 5%
is about what I remember, when sorting 1M lines on 8 cores.
From unknown Mon Jun 23 23:52:43 2025
MIME-Version: 1.0
X-Mailer: MIME-tools 5.427 (Entity 5.427)
X-Loop: help-debbugs@gnu.org
From: help-debbugs@gnu.org (GNU bug Tracking System)
To: =?UTF-8?Q?P=C3=A1draig?= Brady
Subject: bug#6600: closed (Re: bug#6600: [PATCH] sort: add --threads
option to parallelize internal sort.)
Message-ID:
References: <4C3BBA60.9020201@draigBrady.com>
<4C37C7D9.2030909@draigBrady.com>
X-Gnu-PR-Message: they-closed 6600
X-Gnu-PR-Package: coreutils
X-Gnu-PR-Keywords: patch
Reply-To: 6600@debbugs.gnu.org
Date: Tue, 13 Jul 2010 01:01:01 +0000
Content-Type: multipart/mixed; boundary="----------=_1278982861-19756-1"
This is a multi-part message in MIME format...
------------=_1278982861-19756-1
Content-Disposition: inline
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"
Your bug report
#6600: [PATCH] sort: add --threads option to parallelize internal sort.
which was filed against the coreutils package, has been closed.
The explanation is attached below, along with your original report.
If you require more details, please reply to 6600@debbugs.gnu.org.
-- 
6600: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=6600
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems
------------=_1278982861-19756-1
Content-Type: message/rfc822
Content-Disposition: inline
Content-Transfer-Encoding: 7bit
Message-ID: <4C3BBA60.9020201@draigBrady.com>
Date: Tue, 13 Jul 2010 01:59:12 +0100
From: =?ISO-8859-1?Q?P=E1draig_Brady?=
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US;
rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3
MIME-Version: 1.0
To: 6600-done@debbugs.gnu.org
Subject: Re: bug#6600: [PATCH] sort: add --threads option to
parallelize internal sort.
References: <362522.89643.qm@web180012.mail.gq1.yahoo.com> <4B94CAE0.6000106@draigBrady.com> <535984.88146.qm@web180006.mail.gq1.yahoo.com> <4C37C7D9.2030909@draigBrady.com> <4C37E0E2.3060205@cs.ucla.edu>
<4C383D5F.4090703@draigBrady.com>
In-Reply-To: <4C383D5F.4090703@draigBrady.com>
X-Enigmail-Version: 1.0.1
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Spam-Score: -2.8 (--)
X-Debbugs-Envelope-To: 6600-done
Cc: Paul Eggert , Chen Guo ,
Glen Lenker , Mike Nichols ,
Gene Auyeung , Chris Dickens
I've finally applied the patch.
http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commit;h=9face836
I made a few comment tweaks and added
some dependencies for the heap module.
I also removed the xmemcoll0() calls
which are separate to this concurrent functionality.
I will add those back in Chen's name after updating to
the latest gnulib.
Thanks everyone for their work on this!
Pádraig.
------------=_1278982861-19756-1--
From unknown Mon Jun 23 23:52:43 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#6600: [PATCH] sort: add --threads option to parallelize internal sort.
Resent-From: =?UTF-8?Q?P=C3=A1draig?= Brady
Original-Sender: debbugs-submit-bounces@debbugs.gnu.org
Resent-To: owner@debbugs.gnu.org
Resent-CC: bug-coreutils@gnu.org
Resent-Date: Tue, 13 Jul 2010 09:15:02 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 6600
X-GNU-PR-Package: coreutils
X-GNU-PR-Keywords: patch
To: 6600@debbugs.gnu.org
Cc: Gene Auyeung , Paul Eggert , Chen Guo
Message-ID: <4C3C2E44.9050009@draigBrady.com>
Date: Tue, 13 Jul 2010 10:13:40 +0100
From: =?UTF-8?Q?P=C3=A1draig?= Brady
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US;
rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3
MIME-Version: 1.0
References: <362522.89643.qm@web180012.mail.gq1.yahoo.com> <4B94CAE0.6000106@draigBrady.com> <535984.88146.qm@web180006.mail.gq1.yahoo.com> <4C37C7D9.2030909@draigBrady.com> <4C37E0E2.3060205@cs.ucla.edu> <4C383D5F.4090703@draigBrady.com>
<4C3BBA60.9020201@draigBrady.com>
In-Reply-To: <4C3BBA60.9020201@draigBrady.com>
Here's a small cleanup I missed.
Alternatively one could make heap_alloc() return NULL rather than aborting,
but since it already uses xmalloc, I'm leaning towards this:
commit cec33eb226df63f406f7eb70cd46d960ee02a060
Author: Pádraig Brady
Date: Tue Jul 13 08:23:52 2010 +0100
maint: heap.c: simplify heap_alloc
* gl/lib/heap.c (heap_alloc): Use the fact that the xalloc
routines will not return NULL. Also remove the redundant
temporary variables.
diff --git a/gl/lib/heap.c b/gl/lib/heap.c
index a37224f..f148434 100644
--- a/gl/lib/heap.c
+++ b/gl/lib/heap.c
@@ -36,22 +36,12 @@ static void heapify_up (void **, size_t,
struct heap *
heap_alloc (int (*compare)(const void *, const void *), size_t n_reserve)
{
- struct heap *heap;
- void *xmalloc_ret = xmalloc (sizeof *heap);
- heap = (struct heap *) xmalloc_ret;
- if (!heap)
- return NULL;
+ struct heap *heap = xmalloc (sizeof *heap);
- if (n_reserve <= 0)
+ if (n_reserve == 0)
n_reserve = 1;
- xmalloc_ret = xmalloc (n_reserve * sizeof *(heap->array));
- heap->array = (void **) xmalloc_ret;
- if (!heap->array)
- {
- free (heap);
- return NULL;
- }
+ heap->array = xmalloc (n_reserve * sizeof *(heap->array));
heap->array[0] = NULL;
heap->capacity = n_reserve;
@@ -84,8 +74,7 @@ heap_insert (struct heap *heap, void *item)
if (heap->capacity - 1 <= heap->count)
{
size_t new_size = (2 + heap->count) * sizeof *(heap->array);
- void *realloc_ret = xrealloc (heap->array, new_size);
- heap->array = (void **) realloc_ret;
+ heap->array = xrealloc (heap->array, new_size);
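The convention this cleanup relies on can be sketched as follows. This is only an illustration: `demo_xmalloc`, `struct demo_heap` and `demo_heap_alloc` are hypothetical names invented here, and `demo_xmalloc` is a stand-in that exits on failure rather than gnulib's actual xmalloc. The point is that once the allocator never returns NULL, the caller needs neither NULL checks nor temporary variables.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an xmalloc-style allocator: it either
   returns usable memory or terminates the program, so the caller
   never sees NULL.  */
static void *
demo_xmalloc (size_t n)
{
  void *p = malloc (n ? n : 1);
  if (!p)
    {
      fputs ("memory exhausted\n", stderr);
      exit (EXIT_FAILURE);
    }
  return p;
}

/* Illustrative heap descriptor, not the real struct heap.  */
struct demo_heap
{
  void **array;
  size_t capacity;
  size_t count;
};

/* With an allocator that cannot return NULL, no error paths remain.  */
static struct demo_heap *
demo_heap_alloc (size_t n_reserve)
{
  struct demo_heap *heap = demo_xmalloc (sizeof *heap);

  if (n_reserve == 0)
    n_reserve = 1;
  assert (n_reserve >= 1);      /* Slot 0 always exists.  */

  heap->array = demo_xmalloc (n_reserve * sizeof *heap->array);
  heap->array[0] = NULL;
  heap->capacity = n_reserve;
  heap->count = 0;
  return heap;
}
```

Since `n_reserve` is a `size_t`, comparing it with `== 0` expresses the same condition as the old `<= 0` test.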
From unknown Mon Jun 23 23:52:43 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#6600: [PATCH] sort: add --threads option to parallelize internal sort.
Resent-From: Jim Meyering
Original-Sender: debbugs-submit-bounces@debbugs.gnu.org
Resent-To: owner@debbugs.gnu.org
Resent-CC: bug-coreutils@gnu.org
Resent-Date: Tue, 13 Jul 2010 14:36:02 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 6600
X-GNU-PR-Package: coreutils
X-GNU-PR-Keywords: patch
To: =?UTF-8?Q?P=C3=A1draig?= Brady
Cc: Gene Auyeung , 6600@debbugs.gnu.org, Paul Eggert , Chen Guo
Received: via spool by 6600-submit@debbugs.gnu.org id=B6600.127903172910972
(code B ref 6600); Tue, 13 Jul 2010 14:36:02 +0000
Received: (at 6600) by debbugs.gnu.org; 13 Jul 2010 14:35:29 +0000
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.69)
(envelope-from )
id 1OYga0-0002qv-QY
for submit@debbugs.gnu.org; Tue, 13 Jul 2010 10:35:29 -0400
Received: from smtp1-g21.free.fr ([212.27.42.1])
by debbugs.gnu.org with esmtp (Exim 4.69)
(envelope-from ) id 1OYgZy-0002ql-6V
for 6600@debbugs.gnu.org; Tue, 13 Jul 2010 10:35:27 -0400
Received: from mx.meyering.net (unknown [82.230.74.64])
by smtp1-g21.free.fr (Postfix) with ESMTP id 5D78A94012A;
Tue, 13 Jul 2010 16:35:26 +0200 (CEST)
Received: by rho.meyering.net (Acme Bit-Twister, from userid 1000)
id E677EDEBA; Tue, 13 Jul 2010 16:35:24 +0200 (CEST)
From: Jim Meyering
In-Reply-To: <4C3C2E44.9050009@draigBrady.com> =?UTF-8?Q?("P=C3=A1draig?=
Brady"'s message of "Tue, 13 Jul 2010 10:13:40 +0100")
References: <362522.89643.qm@web180012.mail.gq1.yahoo.com>
<4B94CAE0.6000106@draigBrady.com>
<535984.88146.qm@web180006.mail.gq1.yahoo.com>
<4C37C7D9.2030909@draigBrady.com> <4C37E0E2.3060205@cs.ucla.edu>
<4C383D5F.4090703@draigBrady.com> <4C3BBA60.9020201@draigBrady.com>
<4C3C2E44.9050009@draigBrady.com>
Date: Tue, 13 Jul 2010 16:35:24 +0200
Message-ID: <87hbk34fgz.fsf@meyering.net>
Lines: 83
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Score: -3.3 (---)
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -3.3 (---)
Pádraig Brady wrote:
> Here's a small cleanup I missed.
> Alternatively one could make heap_alloc() return NULL rather than aborting,
> but since it already uses xmalloc, I'm leaning toward this.
>
> commit cec33eb226df63f406f7eb70cd46d960ee02a060
> Author: Pádraig Brady
> Date: Tue Jul 13 08:23:52 2010 +0100
>
> maint: heap.c: simplify heap_alloc
>
> * gl/lib/heap.c (heap_alloc): Use the fact that the xalloc
> routines will not return NULL. Also remove the redundant
> temporary variables.
>
> diff --git a/gl/lib/heap.c b/gl/lib/heap.c
> index a37224f..f148434 100644
> --- a/gl/lib/heap.c
> +++ b/gl/lib/heap.c
> @@ -36,22 +36,12 @@ static void heapify_up (void **, size_t,
> struct heap *
> heap_alloc (int (*compare)(const void *, const void *), size_t n_reserve)
> {
> - struct heap *heap;
> - void *xmalloc_ret = xmalloc (sizeof *heap);
> - heap = (struct heap *) xmalloc_ret;
> - if (!heap)
> - return NULL;
> + struct heap *heap = xmalloc (sizeof *heap);
>
> - if (n_reserve <= 0)
> + if (n_reserve == 0)
> n_reserve = 1;
>
> - xmalloc_ret = xmalloc (n_reserve * sizeof *(heap->array));
> - heap->array = (void **) xmalloc_ret;
> - if (!heap->array)
> - {
> - free (heap);
> - return NULL;
> - }
> + heap->array = xmalloc (n_reserve * sizeof *(heap->array));
>
> heap->array[0] = NULL;
> heap->capacity = n_reserve;
> @@ -84,8 +74,7 @@ heap_insert (struct heap *heap, void *item)
> if (heap->capacity - 1 <= heap->count)
> {
> size_t new_size = (2 + heap->count) * sizeof *(heap->array);
> - void *realloc_ret = xrealloc (heap->array, new_size);
> - heap->array = (void **) realloc_ret;
> + heap->array = xrealloc (heap->array, new_size);
Thanks.
That looks good. Please push.
I noticed that heap_insert's reallocation was awkward and inefficient.
Using x2nrealloc rather than xrealloc makes the code
cleaner as well as more efficient in the face of a growing
heap, and also handles integer overflow.
diff --git a/gl/lib/heap.c b/gl/lib/heap.c
index a37224f..12a7767 100644
--- a/gl/lib/heap.c
+++ b/gl/lib/heap.c
@@ -82,15 +82,8 @@ int
heap_insert (struct heap *heap, void *item)
{
if (heap->capacity - 1 <= heap->count)
- {
- size_t new_size = (2 + heap->count) * sizeof *(heap->array);
- void *realloc_ret = xrealloc (heap->array, new_size);
- heap->array = (void **) realloc_ret;
- heap->capacity = (2 + heap->count);
-
- if (!heap->array)
- return -1;
- }
+ heap->array = x2nrealloc (heap->array, &heap->capacity,
+ sizeof *(heap->array));
heap->array[++heap->count] = item;
heapify_up (heap->array, heap->count, heap->compare);
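To illustrate why geometric growth with an overflow check helps a steadily growing heap, here is a rough sketch. It is not gnulib's x2nrealloc: `demo_grow`, `demo_die` and the two counting helpers are hypothetical names, and the growth step (`n += n / 2 + 1`) is an assumption chosen only to approximate a 1.5x policy.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

static void
demo_die (void)
{
  fputs ("memory exhausted\n", stderr);
  exit (EXIT_FAILURE);
}

/* Hypothetical helper in the spirit of x2nrealloc: grow an array of
   *PN elements, each S bytes, by roughly 1.5x.  It exits on size
   overflow or allocation failure and stores the new count in *PN.  */
static void *
demo_grow (void *p, size_t *pn, size_t s)
{
  size_t n = *pn;
  assert (s != 0);

  if (!p)
    {
      if (n == 0)
        n = 1;                  /* First allocation: one slot.  */
    }
  else
    {
      size_t inc = n / 2 + 1;
      if (n > SIZE_MAX - inc)   /* The count itself would overflow.  */
        demo_die ();
      n += inc;
    }

  if (n > SIZE_MAX / s)         /* n * s would overflow.  */
    demo_die ();

  p = realloc (p, n * s);
  if (!p)
    demo_die ();

  *pn = n;
  return p;
}

/* Number of reallocations needed to reach TARGET slots, starting
   from one slot, when growing by two slots at a time.  */
static size_t
demo_count_additive (size_t target)
{
  size_t n = 1, steps = 0;
  while (n < target)
    {
      n += 2;
      steps++;
    }
  return steps;
}

/* The same count when growing by roughly 1.5x each time.  */
static size_t
demo_count_geometric (size_t target)
{
  size_t n = 1, steps = 0;
  while (n < target)
    {
      n += n / 2 + 1;
      steps++;
    }
  return steps;
}
```

With additive growth the number of reallocations is linear in the final size, whereas with geometric growth it is logarithmic, which is the efficiency argument made above.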
From unknown Mon Jun 23 23:52:43 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#6600: [PATCH] sort: add --threads option to parallelize internal sort.
Resent-From: =?UTF-8?Q?P=C3=A1draig?= Brady
Original-Sender: debbugs-submit-bounces@debbugs.gnu.org
Resent-To: owner@debbugs.gnu.org
Resent-CC: bug-coreutils@gnu.org
Resent-Date: Tue, 13 Jul 2010 15:11:01 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 6600
X-GNU-PR-Package: coreutils
X-GNU-PR-Keywords: patch
To: Jim Meyering
Cc: Gene Auyeung , 6600@debbugs.gnu.org, Paul Eggert , Chen Guo
Received: via spool by 6600-submit@debbugs.gnu.org id=B6600.127903382011938
(code B ref 6600); Tue, 13 Jul 2010 15:11:01 +0000
Received: (at 6600) by debbugs.gnu.org; 13 Jul 2010 15:10:20 +0000
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.69)
(envelope-from )
id 1OYh7k-00036V-BH
for submit@debbugs.gnu.org; Tue, 13 Jul 2010 11:10:20 -0400
Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98])
by debbugs.gnu.org with smtp (Exim 4.69)
(envelope-from ) id 1OYh7h-00036O-Dk
for 6600@debbugs.gnu.org; Tue, 13 Jul 2010 11:10:18 -0400
Received: (qmail 50470 invoked from network); 13 Jul 2010 15:10:22 -0000
Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218)
by mail1.slb.deg.dub.stisp.net with SMTP; 13 Jul 2010 15:10:22 -0000
Message-ID: <4C3C8194.9020008@draigBrady.com>
Date: Tue, 13 Jul 2010 16:09:08 +0100
From: =?UTF-8?Q?P=C3=A1draig?= Brady
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US;
rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3
MIME-Version: 1.0
References: <362522.89643.qm@web180012.mail.gq1.yahoo.com> <4B94CAE0.6000106@draigBrady.com> <535984.88146.qm@web180006.mail.gq1.yahoo.com> <4C37C7D9.2030909@draigBrady.com>
<4C37E0E2.3060205@cs.ucla.edu> <4C383D5F.4090703@draigBrady.com>
<4C3BBA60.9020201@draigBrady.com> <4C3C2E44.9050009@draigBrady.com>
<87hbk34fgz.fsf@meyering.net>
In-Reply-To: <87hbk34fgz.fsf@meyering.net>
X-Enigmail-Version: 1.0.1
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Spam-Score: -2.8 (--)
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -2.8 (--)
On 13/07/10 15:35, Jim Meyering wrote:
> I noticed that heap_insert's reallocation was awkward and inefficient.
> Using x2nrealloc rather than xrealloc makes the code
> cleaner as well as more efficient in the face of a growing
> heap, and also handles integer overflow.
>
> diff --git a/gl/lib/heap.c b/gl/lib/heap.c
> index a37224f..12a7767 100644
> --- a/gl/lib/heap.c
> +++ b/gl/lib/heap.c
> @@ -82,15 +82,8 @@ int
> heap_insert (struct heap *heap, void *item)
> {
> if (heap->capacity - 1 <= heap->count)
> - {
> - size_t new_size = (2 + heap->count) * sizeof *(heap->array);
> - void *realloc_ret = xrealloc (heap->array, new_size);
> - heap->array = (void **) realloc_ret;
> - heap->capacity = (2 + heap->count);
> -
> - if (!heap->array)
> - return -1;
> - }
> + heap->array = x2nrealloc (heap->array, &heap->capacity,
> + sizeof *(heap->array));
>
> heap->array[++heap->count] = item;
> heapify_up (heap->array, heap->count, heap->compare);
Much cleaner and increases with n *= 1.5 rather than n += 2
Testing here shows no change in performance.
Do you want me to roll that into my patch?
cheers,
Pádraig.
From unknown Mon Jun 23 23:52:43 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#6600: [PATCH] sort: add --threads option to parallelize internal sort.
Resent-From: Jim Meyering
Original-Sender: debbugs-submit-bounces@debbugs.gnu.org
Resent-To: owner@debbugs.gnu.org
Resent-CC: bug-coreutils@gnu.org
Resent-Date: Tue, 13 Jul 2010 15:17:02 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 6600
X-GNU-PR-Package: coreutils
X-GNU-PR-Keywords: patch
To: =?UTF-8?Q?P=C3=A1draig?= Brady
Cc: Gene Auyeung , 6600@debbugs.gnu.org, Paul Eggert , Chen Guo
Received: via spool by 6600-submit@debbugs.gnu.org id=B6600.127903418412112
(code B ref 6600); Tue, 13 Jul 2010 15:17:02 +0000
Received: (at 6600) by debbugs.gnu.org; 13 Jul 2010 15:16:24 +0000
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.69)
(envelope-from )
id 1OYhDc-00039J-2F
for submit@debbugs.gnu.org; Tue, 13 Jul 2010 11:16:24 -0400
Received: from smtp1-g21.free.fr ([212.27.42.1])
by debbugs.gnu.org with esmtp (Exim 4.69)
(envelope-from ) id 1OYhDY-00039E-US
for 6600@debbugs.gnu.org; Tue, 13 Jul 2010 11:16:22 -0400
Received: from mx.meyering.net (unknown [82.230.74.64])
by smtp1-g21.free.fr (Postfix) with ESMTP id 5A9AC940051;
Tue, 13 Jul 2010 17:16:21 +0200 (CEST)
Received: by rho.meyering.net (Acme Bit-Twister, from userid 1000)
id 2EE87E12C; Tue, 13 Jul 2010 17:16:20 +0200 (CEST)
From: Jim Meyering
In-Reply-To: <4C3C8194.9020008@draigBrady.com> =?UTF-8?Q?("P=C3=A1draig?=
Brady"'s message of "Tue, 13 Jul 2010 16:09:08 +0100")
References: <362522.89643.qm@web180012.mail.gq1.yahoo.com>
<4B94CAE0.6000106@draigBrady.com>
<535984.88146.qm@web180006.mail.gq1.yahoo.com>
<4C37C7D9.2030909@draigBrady.com> <4C37E0E2.3060205@cs.ucla.edu>
<4C383D5F.4090703@draigBrady.com> <4C3BBA60.9020201@draigBrady.com>
<4C3C2E44.9050009@draigBrady.com> <87hbk34fgz.fsf@meyering.net>
<4C3C8194.9020008@draigBrady.com>
Date: Tue, 13 Jul 2010 17:16:20 +0200
Message-ID: <87aapv4dkr.fsf@meyering.net>
Lines: 38
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Score: -3.3 (---)
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -3.3 (---)
Pádraig Brady wrote:
> On 13/07/10 15:35, Jim Meyering wrote:
>> I noticed that heap_insert's reallocation was awkward and inefficient.
>> Using x2nrealloc rather than xrealloc makes the code
>> cleaner as well as more efficient in the face of a growing
>> heap, and also handles integer overflow.
>>
>> diff --git a/gl/lib/heap.c b/gl/lib/heap.c
>> index a37224f..12a7767 100644
>> --- a/gl/lib/heap.c
>> +++ b/gl/lib/heap.c
>> @@ -82,15 +82,8 @@ int
>> heap_insert (struct heap *heap, void *item)
>> {
>> if (heap->capacity - 1 <= heap->count)
>> - {
>> - size_t new_size = (2 + heap->count) * sizeof *(heap->array);
>> - void *realloc_ret = xrealloc (heap->array, new_size);
>> - heap->array = (void **) realloc_ret;
>> - heap->capacity = (2 + heap->count);
>> -
>> - if (!heap->array)
>> - return -1;
>> - }
>> + heap->array = x2nrealloc (heap->array, &heap->capacity,
>> + sizeof *(heap->array));
>>
>> heap->array[++heap->count] = item;
>> heapify_up (heap->array, heap->count, heap->compare);
>
> Much cleaner and increases with n *= 1.5 rather than n += 2
> Testing here shows no change in performance.
Thanks for the perf. testing.
> Do you want me to roll that into my patch?
Sure, thanks.
From unknown Mon Jun 23 23:52:43 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#6600: [PATCH] sort: add --threads option to parallelize internal sort.
Resent-From: Chen Guo
Original-Sender: debbugs-submit-bounces@debbugs.gnu.org
Resent-To: owner@debbugs.gnu.org
Resent-CC: bug-coreutils@gnu.org
Resent-Date: Wed, 14 Jul 2010 03:15:02 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 6600
X-GNU-PR-Package: coreutils
X-GNU-PR-Keywords: patch
To: Jim Meyering
Cc: Gene Auyeung , 6600@debbugs.gnu.org, Paul Eggert , =?UTF-8?Q?P=C3=A1draig?= Brady
Received: via spool by 6600-submit@debbugs.gnu.org id=B6600.127907727113514
(code B ref 6600); Wed, 14 Jul 2010 03:15:02 +0000
Received: (at 6600) by debbugs.gnu.org; 14 Jul 2010 03:14:31 +0000
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.69)
(envelope-from )
id 1OYsQY-0003Vr-JQ
for submit@debbugs.gnu.org; Tue, 13 Jul 2010 23:14:30 -0400
Received: from mail-px0-f172.google.com ([209.85.212.172])
by debbugs.gnu.org with esmtp (Exim 4.69)
(envelope-from ) id 1OYsQX-0003VS-2k
for 6600@debbugs.gnu.org; Tue, 13 Jul 2010 23:14:29 -0400
Received: by pxi20 with SMTP id 20so2477168pxi.3
for <6600@debbugs.gnu.org>; Tue, 13 Jul 2010 20:14:35 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
h=domainkey-signature:mime-version:received:received:in-reply-to
:references:date:message-id:subject:from:to:cc:content-type
:content-transfer-encoding;
bh=QVLioVOqoNqbfQlvIbykD4xqldO3cuNQyyrqNNnqA68=;
b=AbhGZI0O8z6S9ln84eZ/0TzwONiw0zOywmJDXM8+6aJIguFC1VY2oFOJvKXGeN6nuu
QWLkz6JDVKtzou9RGGYpuuaAXs+GudwKiEMG7EyK1FhXnLegipFSPpU5l5PXroTeV//Z
2e4Y9vh6+GuTlFyRNc5bk1wTUP1pSX5JSprOA=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
h=mime-version:in-reply-to:references:date:message-id:subject:from:to
:cc:content-type:content-transfer-encoding;
b=EauTYLKEORJLSdjb0htvbpPHrxXRCfdEJiQrYtYUyVFS8JFQ6M8eT9KtgHWjsXpu6F
DZr+qwpCIfxVwWPUS3O9xUUgF8JfBl/EuPReghH9D9c1i2OHRhMLHA1nqAQGFgzDpIhr
6qOcXBRpjK5Ke1Gznm2rd2OdYVOkJIkOefY0U=
MIME-Version: 1.0
Received: by 10.142.134.13 with SMTP id h13mr20461638wfd.119.1279077275341;
Tue, 13 Jul 2010 20:14:35 -0700 (PDT)
Received: by 10.142.241.3 with HTTP; Tue, 13 Jul 2010 20:14:35 -0700 (PDT)
In-Reply-To: <87aapv4dkr.fsf@meyering.net>
References: <362522.89643.qm@web180012.mail.gq1.yahoo.com>
<4B94CAE0.6000106@draigBrady.com>
<535984.88146.qm@web180006.mail.gq1.yahoo.com>
<4C37C7D9.2030909@draigBrady.com> <4C37E0E2.3060205@cs.ucla.edu>
<4C383D5F.4090703@draigBrady.com> <4C3BBA60.9020201@draigBrady.com>
<4C3C2E44.9050009@draigBrady.com> <87hbk34fgz.fsf@meyering.net>
<4C3C8194.9020008@draigBrady.com> <87aapv4dkr.fsf@meyering.net>
Date: Tue, 13 Jul 2010 20:14:35 -0700
Message-ID:
From: Chen Guo
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-Spam-Score: -2.6 (--)
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -2.6 (--)
Thanks a lot for all the hard work reviewing and revising this, Pádraig.
From unknown Mon Jun 23 23:52:43 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#6600: [PATCH] sort: add --threads option to parallelize internal sort.
Resent-From: =?UTF-8?Q?P=C3=A1draig?= Brady
Original-Sender: debbugs-submit-bounces@debbugs.gnu.org
Resent-To: owner@debbugs.gnu.org
Resent-CC: bug-coreutils@gnu.org
Resent-Date: Thu, 15 Jul 2010 00:09:01 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 6600
X-GNU-PR-Package: coreutils
X-GNU-PR-Keywords: patch
To: 6600@debbugs.gnu.org
Received: via spool by 6600-submit@debbugs.gnu.org id=B6600.127915251115475
(code B ref 6600); Thu, 15 Jul 2010 00:09:01 +0000
Received: (at 6600) by debbugs.gnu.org; 15 Jul 2010 00:08:31 +0000
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.69)
(envelope-from )
id 1OZC07-00041Y-4d
for submit@debbugs.gnu.org; Wed, 14 Jul 2010 20:08:31 -0400
Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98])
by debbugs.gnu.org with smtp (Exim 4.69)
(envelope-from ) id 1OZC04-00041P-Cv
for 6600@debbugs.gnu.org; Wed, 14 Jul 2010 20:08:29 -0400
Received: (qmail 76922 invoked from network); 15 Jul 2010 00:08:36 -0000
Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218)
by mail1.slb.deg.dub.stisp.net with SMTP; 15 Jul 2010 00:08:36 -0000
Message-ID: <4C3E5134.3040700@draigBrady.com>
Date: Thu, 15 Jul 2010 01:07:16 +0100
From: =?UTF-8?Q?P=C3=A1draig?= Brady
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US;
rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3
MIME-Version: 1.0
References: <362522.89643.qm@web180012.mail.gq1.yahoo.com> <4B94CAE0.6000106@draigBrady.com> <535984.88146.qm@web180006.mail.gq1.yahoo.com> <4C37C7D9.2030909@draigBrady.com> <4C37E0E2.3060205@cs.ucla.edu> <4C383D5F.4090703@draigBrady.com>
<4C3BBA60.9020201@draigBrady.com>
In-Reply-To: <4C3BBA60.9020201@draigBrady.com>
X-Enigmail-Version: 1.0.1
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Spam-Score: -2.7 (--)
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -2.7 (--)
On 13/07/10 01:59, Pádraig Brady wrote:
> I've finally applied the patch.
> http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commit;h=9face836
>
> I made a few comment tweaks and added
> some dependencies for the heap module.
>
> I also removed the xmemcoll0() calls
> which are separate to this concurrent functionality.
> I will add those back in Chen's name after updating to
> the latest gnulib.
>
> Thanks everyone for their work on this!
>
> Pádraig.
Here's the xmemcoll0 follow up:
From: Chen Guo
Date: Wed, 14 Jul 2010 07:41:05 +0100
Subject: [PATCH] sort: speed up default full line sorting
Avoid writing NUL after the comparison buffers on each compare;
this improves performance by about 3% for short lines
on a Pentium M with gcc-4.4.1.
* src/sort.c (fillbuf): Delimit input items with NUL.
(write_bytes): Restore the item delimiter char which was
replaced with NUL in fillbuf().
---
src/sort.c | 18 +++++++++++++++---
1 files changed, 15 insertions(+), 3 deletions(-)
diff --git a/src/sort.c b/src/sort.c
index 5ea1b34..45cb78f 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -1743,13 +1743,17 @@ fillbuf (struct buffer *buf, FILE *fp, char const *file)
if (buf->buf == ptrlim)
return false;
if (ptrlim[-1] != eol)
- *ptrlim++ = eol;
+ *ptrlim++ = '\0';
}
}
/* Find and record each line in the just-read input. */
while ((p = memchr (ptr, eol, ptrlim - ptr)))
{
+ /* Delimit the line with NUL. This eliminates the need to
+ temporarily replace the last byte with NUL when calling
+ xmemcoll(), which increases performance. */
+ *p = '\0';
ptr = p + 1;
line--;
line->text = line_start;
@@ -2642,7 +2646,13 @@ compare (const struct line *a, const struct line *b, bool show_debug)
else if (blen == 0)
diff = 1;
else if (hard_LC_COLLATE)
- diff = xmemcoll (a->text, alen, b->text, blen);
+ {
+ /* Note xmemcoll0 is a performance enhancement as
+ it will not unconditionally write '\0' after the
+ passed in buffers, which was seen to give around
+ a 3% increase in performance for short lines. */
+ diff = xmemcoll0 (a->text, alen + 1, b->text, blen + 1);
+ }
else if (! (diff = memcmp (a->text, b->text, MIN (alen, blen))))
diff = alen < blen ? -1 : alen != blen;
@@ -2652,9 +2662,11 @@ compare (const struct line *a, const struct line *b, bool show_debug)
static void
write_bytes (struct line const *line, FILE *fp, char const *output_file)
{
- char const *buf = line->text;
+ char *buf = line->text;
size_t n_bytes = line->length;
+ *(buf + n_bytes - 1) = eolchar;
+
/* Convert TABs to '>' and \0 to \n when -z specified. */
if (debug && fp == stdout)
{
--
1.6.2.5
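The idea behind the patch above can be sketched in isolation. The helpers below are hypothetical (`demo_coll_patch`, `demo_coll0`, `demo_delimit`) and use plain strcoll rather than gnulib's xmemcoll or xmemcoll0, so unlike the real routines they do not handle NUL bytes embedded in a line. The contrast is between patching a terminator in and out on every comparison and delimiting each line once when the buffer is filled.

```c
#include <assert.h>
#include <string.h>

/* Older style: the buffers are not NUL-terminated, so temporarily
   overwrite the byte after each one, compare, then restore it.
   The caller must guarantee that a[alen] and b[blen] are writable.  */
static int
demo_coll_patch (char *a, size_t alen, char *b, size_t blen)
{
  char a_saved = a[alen];
  char b_saved = b[blen];
  a[alen] = '\0';
  b[blen] = '\0';
  int diff = strcoll (a, b);
  a[alen] = a_saved;
  b[blen] = b_saved;
  return diff;
}

/* Newer style: lines were NUL-delimited once when the buffer was
   filled, so each comparison performs no stores at all.  */
static int
demo_coll0 (char const *a, char const *b)
{
  return strcoll (a, b);
}

/* Replace each EOL byte in BUF[0..LEN) with NUL and return the
   number of lines delimited this way.  */
static size_t
demo_delimit (char *buf, size_t len, char eol)
{
  size_t nlines = 0;
  char *ptr = buf;
  char *lim = buf + len;
  char *p;

  while ((p = memchr (ptr, eol, lim - ptr)))
    {
      *p = '\0';
      ptr = p + 1;
      nlines++;
    }
  return nlines;
}
```

The cost of this design is that the original delimiter has to be restored when a line is written out, which is what the write_bytes change in the patch does.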
From unknown Mon Jun 23 23:52:43 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#6600: [PATCH] sort: add --threads option to parallelize internal sort.
Resent-From: =?UTF-8?Q?P=C3=A1draig?= Brady
Original-Sender: debbugs-submit-bounces@debbugs.gnu.org
Resent-To: owner@debbugs.gnu.org
Resent-CC: bug-coreutils@gnu.org
Resent-Date: Thu, 15 Jul 2010 11:13:02 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 6600
X-GNU-PR-Package: coreutils
X-GNU-PR-Keywords: patch
To: 6600@debbugs.gnu.org, Ludovic =?UTF-8?Q?Court=C3=A8s?=
Received: via spool by 6600-submit@debbugs.gnu.org id=B6600.12791923582419
(code B ref 6600); Thu, 15 Jul 2010 11:13:02 +0000
Received: (at 6600) by debbugs.gnu.org; 15 Jul 2010 11:12:38 +0000
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.69)
(envelope-from )
id 1OZMMn-0000cy-01
for submit@debbugs.gnu.org; Thu, 15 Jul 2010 07:12:38 -0400
Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98])
by debbugs.gnu.org with smtp (Exim 4.69)
(envelope-from ) id 1OZMMl-0000cq-Ho
for 6600@debbugs.gnu.org; Thu, 15 Jul 2010 07:12:36 -0400
Received: (qmail 65705 invoked from network); 15 Jul 2010 11:12:45 -0000
Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218)
by mail1.slb.deg.dub.stisp.net with SMTP; 15 Jul 2010 11:12:45 -0000
Message-ID: <4C3EECDB.9030907@draigBrady.com>
Date: Thu, 15 Jul 2010 12:11:23 +0100
From: =?UTF-8?Q?P=C3=A1draig?= Brady
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US;
rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3
MIME-Version: 1.0
References: <362522.89643.qm@web180012.mail.gq1.yahoo.com> <4B94CAE0.6000106@draigBrady.com> <535984.88146.qm@web180006.mail.gq1.yahoo.com> <4C37C7D9.2030909@draigBrady.com> <4C37E0E2.3060205@cs.ucla.edu> <4C383D5F.4090703@draigBrady.com> <4C3BBA60.9020201@draigBrady.com>
<4C3E5134.3040700@draigBrady.com>
In-Reply-To: <4C3E5134.3040700@draigBrady.com>
X-Enigmail-Version: 1.0.1
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Spam-Score: -2.7 (--)
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -2.7 (--)
On 15/07/10 01:07, Pádraig Brady wrote:
> On 13/07/10 01:59, Pádraig Brady wrote:
>> I've finally applied the patch.
>> http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commit;h=9face836
>>
>> I made a few comment tweaks and added
>> some dependencies for the heap module.
>>
>> I also removed the xmemcoll0() calls
>> which are separate to this concurrent functionality.
>> I will add those back in Chen's name after updating to
>> the latest gnulib.
>>
>> Thanks everyone for their work on this!
>>
>> Pádraig.
>
> Here's the xmemcoll0 follow up:
And here's a follow-up fix to that, which
fixes 2 test failures noticed on our integration server:
http://hydra.nixos.org/build/486508
commit aadc67dfdb47f28bb8d1fa5e0fe0f52e2a8c51bf
Author: Pádraig Brady
Date: Thu Jul 15 12:06:04 2010 +0100
sort: fix a bug when sorting unterminated lines
* src/sort.c (fillbuf): The previous commit incorrectly
terminated the buffer when the last line of input
didn't contain a terminating character.
diff --git a/src/sort.c b/src/sort.c
index 45cb78f..7d31878 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -1743,7 +1743,7 @@ fillbuf (struct buffer *buf, FILE *fp, char const *file)
if (buf->buf == ptrlim)
return false;
if (ptrlim[-1] != eol)
- *ptrlim++ = '\0';
+ *ptrlim++ = eol;
}
}
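One plausible reading of this fix, sketched under assumptions: the line-scanning loop in fillbuf searches for the eol byte, so a final line that was closed with NUL instead of eol would not be found by that loop. The function below is a hypothetical reduction (`demo_split` and its `append_eol` flag are invented for illustration) that reproduces both behaviors on a small buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Split BUF[0..*LEN) into NUL-delimited lines and return how many
   were found.  BUF must have room for one extra byte.  A missing
   final terminator is supplied as EOL when APPEND_EOL is true (the
   fixed behavior) and as NUL otherwise (the behavior being fixed).
   *LEN is updated to include any byte that was appended.  */
static size_t
demo_split (char *buf, size_t *len, char eol, bool append_eol)
{
  char *lim = buf + *len;

  if (*len != 0 && lim[-1] != eol)
    *lim++ = append_eol ? eol : '\0';

  size_t nlines = 0;
  char *ptr = buf;
  char *p;

  /* Only bytes equal to EOL end a line here, so a trailing NUL
     sentinel leaves the last line unrecorded.  */
  while ((p = memchr (ptr, eol, lim - ptr)))
    {
      *p = '\0';
      ptr = p + 1;
      nlines++;
    }

  *len = lim - buf;
  return nlines;
}
```

When eol is itself NUL, as with sort -z, the two variants behave identically, so the difference only shows up for other delimiters such as newline.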
From unknown Mon Jun 23 23:52:43 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#6600: [PATCH] sort: add --threads option to parallelize internal sort.
Resent-From: Jim Meyering
Original-Sender: debbugs-submit-bounces@debbugs.gnu.org
Resent-To: owner@debbugs.gnu.org
Resent-CC: bug-coreutils@gnu.org
Resent-Date: Thu, 15 Jul 2010 14:09:02 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 6600
X-GNU-PR-Package: coreutils
X-GNU-PR-Keywords: patch
To: =?UTF-8?Q?P=C3=A1draig?= Brady
Cc: 6600@debbugs.gnu.org, Ludovic =?UTF-8?Q?Court=C3=A8s?=
Received: via spool by 6600-submit@debbugs.gnu.org id=B6600.12792029089539
(code B ref 6600); Thu, 15 Jul 2010 14:09:02 +0000
Received: (at 6600) by debbugs.gnu.org; 15 Jul 2010 14:08:28 +0000
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.69)
(envelope-from )
id 1OZP6y-0002To-2k
for submit@debbugs.gnu.org; Thu, 15 Jul 2010 10:08:28 -0400
Received: from smtp1-g21.free.fr ([212.27.42.1])
by debbugs.gnu.org with esmtp (Exim 4.69)
(envelope-from ) id 1OZP6u-0002Td-PJ
for 6600@debbugs.gnu.org; Thu, 15 Jul 2010 10:08:26 -0400
Received: from mx.meyering.net (unknown [82.230.74.64])
by smtp1-g21.free.fr (Postfix) with ESMTP id A7D4894012E;
Thu, 15 Jul 2010 16:08:30 +0200 (CEST)
Received: by rho.meyering.net (Acme Bit-Twister, from userid 1000)
id 47850E267; Thu, 15 Jul 2010 16:08:29 +0200 (CEST)
From: Jim Meyering
In-Reply-To: <4C3EECDB.9030907@draigBrady.com> =?UTF-8?Q?("P=C3=A1draig?=
Brady"'s message of "Thu, 15 Jul 2010 12:11:23 +0100")
References: <362522.89643.qm@web180012.mail.gq1.yahoo.com>
<4B94CAE0.6000106@draigBrady.com>
<535984.88146.qm@web180006.mail.gq1.yahoo.com>
<4C37C7D9.2030909@draigBrady.com> <4C37E0E2.3060205@cs.ucla.edu>
<4C383D5F.4090703@draigBrady.com> <4C3BBA60.9020201@draigBrady.com>
<4C3E5134.3040700@draigBrady.com> <4C3EECDB.9030907@draigBrady.com>
Date: Thu, 15 Jul 2010 16:08:29 +0200
Message-ID: <87oce8u9b6.fsf_-_@meyering.net>
Lines: 51
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Score: -3.3 (---)
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id:
List-Unsubscribe: