From debbugs-submit-bounces@debbugs.gnu.org Wed Jun 08 16:40:24 2011
From: Mina Naguib
Subject: sort occasionally hangs - appears to be in a merge-sort loop
Date: Wed, 8 Jun 2011 16:16:17 -0400
To: bug-coreutils@gnu.org

Hi

I've observed a few instances when `sort` simply "hangs" and never
returns the sorted data.

Currently using coreutils 8.12 on gentoo linux 2.6.25, `locale` outputs:

LANG=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

An example ongoing right now on one of my servers. Sort invoked as:

/usr/bin/sort -t '|' -k1,1 -k6,6n

It was fed via STDIN pipe-delimited line-based records, and STDIN was
closed. The parent process is waiting on sort to start receiving the
output via STDOUT which never happens.

Input was roughly 350 million records, each record looking something
like this:

000WQg16MdC2Puk|direct_count|2049|63581|15090|1306695571

I dug into the sort process with strace and I believe it's doing one of
the final merge sorts:

read(12, "986|61518|14528|1301616354\nnHM5Og"..., 4096) = 4096
read(12, "|1967|61925|14584|1301634780\nnHMk"..., 4096) = 4096
read(12, "533\nnHNQDohSrKT3Ffn|direct_count|"..., 4096) = 4096
write(4, "4488|1302016964\nnGq7jG4hfFS9flL|d"..., 4096) = 4096
write(4, "nGq9POv11Jh7OiQ|direct_count|1965"..., 4096) = 4096
write(4, "05|1299763468\nnGqCgAT3jwkrzhc|dir"..., 4096) = 4096
write(4, "460\nnGqFItcr2F1Pz8n|tag|1995|6275"..., 4096) = 4096
write(4, "JuhUfMkOPC7J|direct_count|1914|57"..., 4096) = 4096
write(4, "|1914|57180|13801|1302640204\nnGqO"..., 4096) = 4096

After watching it do this ditty for 5.5 hours I dug in further. Here are
its open temp files:

$ date; lsof -n -p 31370 | grep tmp | nl
Wed Jun  8 15:37:41 EDT 2011
 1 sort 31370 dmt  4u REG 8,1  3123896320 12705303 /tmp/sort0YtEtm
 2 sort 31370 dmt  5r REG 8,1  1199907358 12705551 /tmp/sortWNOpxy
 3 sort 31370 dmt  7r REG 8,1  1181994889 12705627 /tmp/sortIpm44r
 4 sort 31370 dmt  8r REG 8,1  1199857626 12705636 /tmp/sort2WOPPo
 5 sort 31370 dmt  9r REG 8,1  1196868088 12705640 /tmp/sortCUkW6a
 6 sort 31370 dmt 10r REG 8,1  1199488185 12705644 /tmp/sortSDopQX
 7 sort 31370 dmt 11r REG 8,1  1188267263 12705646 /tmp/sortUdXqke
 8 sort 31370 dmt 12r REG 8,1  1176087055 12705665 /tmp/sortMiavw5
 9 sort 31370 dmt 13r REG 8,1  1184572313 12705666 /tmp/sortUf5q03
10 sort 31370 dmt 14r REG 8,1  1187433621 12705672 /tmp/sorti3mKO9
11 sort 31370 dmt 15r REG 8,1  1184537051 12705680 /tmp/sort0iYrQr
12 sort 31370 dmt 16r REG 8,1  1186874578 12705696 /tmp/sortUwVkpQ
13 sort 31370 dmt 17r REG 8,1  1173767980 12705703 /tmp/sortov5uyz
14 sort 31370 dmt 18r REG 8,1  1172616590 12705711 /tmp/sort2ibXqt
15 sort 31370 dmt 19r REG 8,1  1184319006 12705714 /tmp/sortA1DmGE
16 sort 31370 dmt 20r REG 8,1  1188372691 12705728 /tmp/sortibGbvd
17 sort 31370 dmt 21r REG 8,1  1185407259 12705713 /tmp/sort4Ou1u6

again:

$ date; lsof -n -p 31370 | grep tmp | nl
Wed Jun  8 15:37:58 EDT 2011
 1 sort 31370 dmt  4u REG 8,1  3432607744 12705303 /tmp/sort0YtEtm
 2 sort 31370 dmt  5r REG 8,1  1199907358 12705551 /tmp/sortWNOpxy
 3 sort 31370 dmt  7r REG 8,1  1181994889 12705627 /tmp/sortIpm44r
 4 sort 31370 dmt  8r REG 8,1  1199857626 12705636 /tmp/sort2WOPPo
 5 sort 31370 dmt  9r REG 8,1  1196868088 12705640 /tmp/sortCUkW6a
 6 sort 31370 dmt 10r REG 8,1  1199488185 12705644 /tmp/sortSDopQX
 7 sort 31370 dmt 11r REG 8,1  1188267263 12705646 /tmp/sortUdXqke
 8 sort 31370 dmt 12r REG 8,1  1176087055 12705665 /tmp/sortMiavw5
 9 sort 31370 dmt 13r REG 8,1  1184572313 12705666 /tmp/sortUf5q03
10 sort 31370 dmt 14r REG 8,1  1187433621 12705672 /tmp/sorti3mKO9
11 sort 31370 dmt 15r REG 8,1  1184537051 12705680 /tmp/sort0iYrQr
12 sort 31370 dmt 16r REG 8,1  1186874578 12705696 /tmp/sortUwVkpQ
13 sort 31370 dmt 17r REG 8,1  1173767980 12705703 /tmp/sortov5uyz
14 sort 31370 dmt 18r REG 8,1  1172616590 12705711 /tmp/sort2ibXqt
15 sort 31370 dmt 19r REG 8,1  1184319006 12705714 /tmp/sortA1DmGE
16 sort 31370 dmt 20r REG 8,1  1188372691 12705728 /tmp/sortibGbvd
17 sort 31370 dmt 21r REG 8,1  1185407259 12705713 /tmp/sort4Ou1u6

again several times; you can see the main writable file approach the
combined size of the other files:

Wed Jun  8 15:55:20 EDT 2011
 1 sort 31370 dmt  4u REG 8,1 18965975040 12705303 /tmp/sort0YtEtm
 2 sort 31370 dmt  5r REG 8,1  1199907358 12705551 /tmp/sortWNOpxy
 3 sort 31370 dmt  7r REG 8,1  1181994889 12705627 /tmp/sortIpm44r
 4 sort 31370 dmt  8r REG 8,1  1199857626 12705636 /tmp/sort2WOPPo
 5 sort 31370 dmt  9r REG 8,1  1196868088 12705640 /tmp/sortCUkW6a
 6 sort 31370 dmt 10r REG 8,1  1199488185 12705644 /tmp/sortSDopQX
 7 sort 31370 dmt 11r REG 8,1  1188267263 12705646 /tmp/sortUdXqke
 8 sort 31370 dmt 12r REG 8,1  1176087055 12705665 /tmp/sortMiavw5
 9 sort 31370 dmt 13r REG 8,1  1184572313 12705666 /tmp/sortUf5q03
10 sort 31370 dmt 14r REG 8,1  1187433621 12705672 /tmp/sorti3mKO9
11 sort 31370 dmt 15r REG 8,1  1184537051 12705680 /tmp/sort0iYrQr
12 sort 31370 dmt 16r REG 8,1  1186874578 12705696 /tmp/sortUwVkpQ
13 sort 31370 dmt 17r REG 8,1  1173767980 12705703 /tmp/sortov5uyz
14 sort 31370 dmt 18r REG 8,1  1172616590 12705711 /tmp/sort2ibXqt
15 sort 31370 dmt 19r REG 8,1  1184319006 12705714 /tmp/sortA1DmGE
16 sort 31370 dmt 20r REG 8,1  1188372691 12705728 /tmp/sortibGbvd
17 sort 31370 dmt 21r REG 8,1  1185407259 12705713 /tmp/sort4Ou1u6

then what appears to be finalization and cleanup:

Wed Jun  8 15:55:28 EDT 2011
 1 sort 31370 dmt  4u REG 8,1 18990366720 12705303 /tmp/sort0YtEtm
 2 sort 31370 dmt  5r REG 8,1  1199907358 12705551 /tmp/sortWNOpxy
 3 sort 31370 dmt  7r REG 8,1  1181994889 12705627 /tmp/sortIpm44r
 4 sort 31370 dmt  8r REG 8,1  1199857626 12705636 /tmp/sort2WOPPo
 5 sort 31370 dmt  9r REG 8,1  1196868088 12705640 /tmp/sortCUkW6a
 6 sort 31370 dmt 12r REG 8,1  1176087055 12705665 /tmp/sortMiavw5
 7 sort 31370 dmt 13r REG 8,1  1184572313 12705666 /tmp/sortUf5q03
 8 sort 31370 dmt 14r REG 8,1  1187433621 12705672 /tmp/sorti3mKO9
 9 sort 31370 dmt 15r REG 8,1  1184537051 12705680 /tmp/sort0iYrQr
10 sort 31370 dmt 16r REG 8,1  1186874578 12705696 /tmp/sortUwVkpQ
11 sort 31370 dmt 17r REG 8,1  1173767980 12705703 /tmp/sortov5uyz
12 sort 31370 dmt 18r REG 8,1  1172616590 12705711 /tmp/sort2ibXqt

again, almost done:

Wed Jun  8 15:55:34 EDT 2011
 1 sort 31370 dmt  4u REG 8,1 18990370816 12705303 /tmp/sort0YtEtm
 2 sort 31370 dmt 16r REG 8,1  1186874578 12705696 /tmp/sortUwVkpQ

again, 1 main file left:

Wed Jun  8 15:55:35 EDT 2011
 1 sort 31370 dmt  4u REG 8,1 18990370816 12705303 /tmp/sort0YtEtm

But alas, 16 more readable files pop up and 1 writable starts adding up
again:

Wed Jun  8 15:55:36 EDT 2011
 1 sort 31370 dmt  4u REG 8,1           0 12705551 /tmp/sortevbKK1
 2 sort 31370 dmt  5r REG 8,1  1178392494 12705739 /tmp/sortIIvZhW
 3 sort 31370 dmt  7r REG 8,1  1180102559 12706259 /tmp/sortwZr2sU
 4 sort 31370 dmt  8r REG 8,1  1181439270 12706530 /tmp/sort8B2hbx
 5 sort 31370 dmt  9r REG 8,1  1183275901 12706731 /tmp/sorteGJa35
 6 sort 31370 dmt 10r REG 8,1  1178538478 12706793 /tmp/sortoGXmOo
 7 sort 31370 dmt 11r REG 8,1  1191511729 12706795 /tmp/sortG0RNr9
 8 sort 31370 dmt 12r REG 8,1  1199965207 12706799 /tmp/sort6Q8tAy
 9 sort 31370 dmt 13r REG 8,1  1192944995 12706817 /tmp/sortCGXzT2
10 sort 31370 dmt 14r REG 8,1  1191485559 12706823 /tmp/sortMwsygv
11 sort 31370 dmt 15r REG 8,1  1200144993 12706830 /tmp/sortyA2RWw
12 sort 31370 dmt 16r REG 8,1  1198251727 12706866 /tmp/sortyZlqHC
13 sort 31370 dmt 17r REG 8,1  1184876212 12706871 /tmp/sortCCjKOH
14 sort 31370 dmt 18r REG 8,1  1174916186 12706880 /tmp/sortKrUJSJ
15 sort 31370 dmt 19r REG 8,1  1184800950 12706824 /tmp/sort2Vjy7s
16 sort 31370 dmt 20r REG 8,1  1187817055 12706891 /tmp/sort89xsBm
17 sort 31370 dmt 21r REG 8,1  1185686697 12706944 /tmp/sortwjlE5L

Wed Jun  8 15:55:40 EDT 2011
 1 sort 31370 dmt  4u REG 8,1    62631936 12705551 /tmp/sortevbKK1
 2 sort 31370 dmt  5r REG 8,1  1178392494 12705739 /tmp/sortIIvZhW
 3 sort 31370 dmt  7r REG 8,1  1180102559 12706259 /tmp/sortwZr2sU
 4 sort 31370 dmt  8r REG 8,1  1181439270 12706530 /tmp/sort8B2hbx
 5 sort 31370 dmt  9r REG 8,1  1183275901 12706731 /tmp/sorteGJa35
 6 sort 31370 dmt 10r REG 8,1  1178538478 12706793 /tmp/sortoGXmOo
 7 sort 31370 dmt 11r REG 8,1  1191511729 12706795 /tmp/sortG0RNr9
 8 sort 31370 dmt 12r REG 8,1  1199965207 12706799 /tmp/sort6Q8tAy
 9 sort 31370 dmt 13r REG 8,1  1192944995 12706817 /tmp/sortCGXzT2
10 sort 31370 dmt 14r REG 8,1  1191485559 12706823 /tmp/sortMwsygv
11 sort 31370 dmt 15r REG 8,1  1200144993 12706830 /tmp/sortyA2RWw
12 sort 31370 dmt 16r REG 8,1  1198251727 12706866 /tmp/sortyZlqHC
13 sort 31370 dmt 17r REG 8,1  1184876212 12706871 /tmp/sortCCjKOH
14 sort 31370 dmt 18r REG 8,1  1174916186 12706880 /tmp/sortKrUJSJ
15 sort 31370 dmt 19r REG 8,1  1184800950 12706824 /tmp/sort2Vjy7s
16 sort 31370 dmt 20r REG 8,1  1187817055 12706891 /tmp/sort89xsBm
17 sort 31370 dmt 21r REG 8,1  1185686697 12706944 /tmp/sortwjlE5L

I may be mistaken, but I don't think this is normal behaviour - once the
16-way merge sort is done, I see no reason to observe this whole process
happening over and over.

Ironically, I had upgraded from coreutils 8.5 to 8.12 because other
tools on the same machine invoke sort with "--compress-program=gzip"
and had observed similar-but-different hanging on that tool's
invocation. I dug in and found this commit (
http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=1b31ce6982a9151d9dfe2ea3595ad7595cb9ca86
) which encouraged me to try the upgrade to fix the
hang-while-compressing issue, but I'm now seeing this hanging even in
the non-compressed invocation.

In my case, I think I'll downgrade coreutils back down to 8.5 as well as
disable all tools from invoking it with "--compress-program" to get
predictable non-hanging behaviour across the board.

If I can be of further help please let me know.

Thank you.
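The remark above, that the writable temp file approaches the combined size of the 16 readable ones, can be checked with simple arithmetic. The sketch below is only an illustration: the sizes are copied from the lsof listings in this message, while the variable names (`input_sizes`, `output_size`, `shortfall`) are made up for the example and do not come from `sort` itself.

```python
# Sizes in bytes of the 16 read-only temp files (fds 5r through 21r),
# as reported by lsof between 15:37:41 and 15:55:20.
input_sizes = [
    1199907358, 1181994889, 1199857626, 1196868088,
    1199488185, 1188267263, 1176087055, 1184572313,
    1187433621, 1184537051, 1186874578, 1173767980,
    1172616590, 1184319006, 1188372691, 1185407259,
]

# Last size reported for the writable file (fd 4u), at 15:55:35.
output_size = 18990370816

total_inputs = sum(input_sizes)
shortfall = total_inputs - output_size

print("merge inputs:", len(input_sizes))
print("combined input size:", total_inputs)
print("bytes not yet visible in the output file:", shortfall)
```

The totals agree to within one 4096-byte write, which fits the reading that fd 4 is the output of a single 16-way merge of the other files. Why the last few bytes are missing at that instant is not shown by the listings; unflushed output is one possible guess.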
From debbugs-submit-bounces@debbugs.gnu.org Thu Jun 09 02:46:49 2011
Message-ID: <4DF06C4A.9000809@cs.ucla.edu>
Date: Wed, 08 Jun 2011 23:46:34 -0700
From: Paul Eggert
Organization: UCLA Computer Science Department
To: Mina Naguib
Cc: 8824@debbugs.gnu.org
Subject: Re: bug#8824: sort occasionally hangs - appears to be in a merge-sort loop

On 06/08/11 13:16, Mina Naguib wrote:
> I may be mistaken, but I don't think this is normal behaviour - once
> the 16-way merge sort is done, I see no reason to observe this whole
> process happening over and over.

Thanks for your bug report. If 'sort' is breaking up its input into 4
MiB chunks, sorting them, creating a separate temp file for each
chunk, and then merging the results with a 16-way merge, then the
first level of 16-way merges will produce 64 MiB files, and the second
level will produce 1 GiB temp files, which is about the size you're
observing. Since your input is about 18 GiB in size (is that right?),
I'd expect to see two third-level merges. The first would be a 16-way
merge, generating about 16 GiB total. The second would be a roughly
2-way merge, generating about 2 GiB. Then there would be a single
fourth-level merge of these two big files into the final 18 GiB of
output.

But you're observing a second third-level merge that's the same size
as the first. So something is wrong here.

I just now tried to reproduce your bug on my host (Fedora 14 x86-64
with 8 GiB RAM) as follows:

  shuf -i 1-350000000 -o col1
  shuf -i 1-350000000 -o col2
  paste col1 col2 |
    awk '{printf "%s|direct_count|2049|63581|15090|%s\n", $1, $2}' |
    sort -t '|' -k1,1 -k6,6n |
    cat >sortout

Alas, this didn't reproduce the problem; it worked just fine.

Can you modify the above recipe somehow and make the problem happen on
your host?

How much RAM do you have? Is your host x86 or x86-64 or what?
(That "4 MiB" in my example is an absurdly small number, and
this is a performance bug in 'sort', but that's a different
matter I think.)
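The merge-level arithmetic in this reply can be written out as a short calculation. This is a minimal sketch of the simplified model described in the message (fixed 4 MiB chunks, every merge exactly 16-way until the files run out), not a description of how GNU `sort` actually schedules its merges. The helper names `merged_file_size` and `merge_groups` are hypothetical.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

def merged_file_size(chunk_bytes, fan_in, level):
    """Size of one temp file after `level` rounds of full fan_in-way merges."""
    return chunk_bytes * fan_in ** level

def merge_groups(n_files, fan_in):
    """Split n_files into consecutive groups of at most fan_in files each."""
    full, rest = divmod(n_files, fan_in)
    return [fan_in] * full + ([rest] if rest else [])

chunk = 4 * MIB   # initial sorted chunk size assumed in the message
fan_in = 16       # merge width assumed in the message

level1 = merged_file_size(chunk, fan_in, 1)   # first-level merge output
level2 = merged_file_size(chunk, fan_in, 2)   # second-level merge output

total_input = 18 * GIB                        # input size guessed in the message
level2_files = total_input // level2          # number of ~1 GiB temp files
third_level = [n * level2 for n in merge_groups(level2_files, fan_in)]

print(level1 // MIB, "MiB per first-level file")
print(level2 // GIB, "GiB per second-level file")
print([size // GIB for size in third_level], "GiB per third-level merge")
```

Under these assumptions an 18 GiB input yields one 16 GiB and one 2 GiB third-level merge, so a second third-level merge as large as the first would indeed be unexpected for an input of that size.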
From debbugs-submit-bounces@debbugs.gnu.org Thu Jun 09 04:32:36 2011
Message-ID: <4DF08456.9090200@draigBrady.com>
Date: Thu, 09 Jun 2011 09:29:10 +0100
From: Pádraig Brady
To: Mina Naguib
Cc: 8824@debbugs.gnu.org
Subject: Re: bug#8824: sort occasionally hangs - appears to be in a merge-sort loop

On 08/06/11 21:16, Mina Naguib wrote:
>
> I've observed a few instances when `sort` simply "hangs" and never
> returns the sorted data.

> In my case, I think I'll downgrade coreutils back down to 8.5 as well
> as disable all tools from invoking it with "--compress-program" to
> get predictable non-hanging behaviour across the board.

coreutils 8.6 got the threaded sort implementation,
so maybe that's triggering an obscure bug.
If you were on a multi-processor machine, you could
restrict `sort` to a single thread by adding the
--parallel=1 option. It's worth ruling that out, I think.

thanks for the very detailed bug report,
Pádraig.

From debbugs-submit-bounces@debbugs.gnu.org Fri Jun 10 09:54:20 2011
From: Mina Naguib
Subject: Re: bug#8824: sort occasionally hangs - appears to be in a merge-sort loop
In-Reply-To: <4DF06C4A.9000809@cs.ucla.edu>
Date: Fri, 10 Jun 2011 09:54:08 -0400
Message-Id: <055ED2A0-9F16-4712-A1F3-8BF8913913BE@bloomdigital.com>
References: <4DF06C4A.9000809@cs.ucla.edu>
To: 8824@debbugs.gnu.org

Hi

I realized that there was a mistake on my part in the initial report.

On 2011-06-09, at 2:46 AM, Paul Eggert wrote:

> Thanks for your bug report. If 'sort' is breaking up its input into 4
> MiB chunks, sorting them, creating a separate temp file for each
> chunk, and then merging the results with a 16-way merge, then the
> first level of 16-way merges will produce 64 MiB files, and the second
> level will produce 1 GiB temp files, which is about the size you're
> observing. Since your input is about 18 GiB in size (is that right?),
> I'd expect to see two third-level merges. The first would be a 16-way
> merge, generating about 16 GiB total. The second would be a roughly
> 2-way merge, generating about 2 GiB. Then there would be a single
> fourth-level merge of these two big files into the final 18 GiB of
> output.

The input data is not 350 million records, but significantly more (to
the tune of 2 billion). I'd estimate that at ~56 bytes per record, the
data set's size was roughly 100 GB.

I may have jumped the gun and associated the delay/looping I've observed
with the unrelated "hang" bug I've observed earlier with compressed temp
files in coreutils 8.5.

Given this new information, do you think the behaviour I observed is
reasonable? Or is there still the possibility of a bug worth pursuing?

> How much RAM do you have? Is your host x86 or x86-64 or what?
> (That "4 MiB" in my example is an absurdly small number, and
> this is a performance bug in 'sort', but that's a different
> matter I think.)

This occurred on an x86_64 box with 12 GB RAM and 4 x 2.66 GHz CPUs. The
disks are quite slow, as the input data, the temp files and the eventual
output all live on the same local RAID5 volume.

Thank you.

From debbugs-submit-bounces@debbugs.gnu.org Fri Jun 10 12:34:02 2011
Message-ID: <4DF24760.3030507@cs.ucla.edu>
Date: Fri, 10 Jun 2011 09:33:36 -0700
From: Paul Eggert
Organization: UCLA Computer Science Department
To: Mina Naguib
Cc: 8824@debbugs.gnu.org
Subject: Re: bug#8824: sort occasionally hangs - appears to be in a merge-sort loop
References: <4DF06C4A.9000809@cs.ucla.edu> <055ED2A0-9F16-4712-A1F3-8BF8913913BE@bloomdigital.com>
In-Reply-To: <055ED2A0-9F16-4712-A1F3-8BF8913913BE@bloomdigital.com>

On 06/10/11 06:54, Mina Naguib wrote:
> Given this new information, do you think the behaviour I observed is reasonable?
> Or is there still the possibility of a bug worth pursuing ?

I suspect there is a *performance* bug, but not a correctness bug.

Does the performance improve if you use "sort -S 6G" for the big sort?

From debbugs-submit-bounces@debbugs.gnu.org Thu Oct 11 18:34:30 2018
To: control@debbugs.gnu.org
From: Assaf Gordon
Message-ID: <9feaab7a-6767-723b-785a-b9e39fa507c7@gmail.com>
Date: Thu, 11 Oct 2018 16:34:19 -0600

close 12656
tags 8824 moreinfo
close 8824
tags 8767 + moreinfo
close 8767
tags 8736 wontfix
close 8736
tags 8700 wontfix
close 8700
close 8616
tags 9101 fixed
close 9101
retitle 9129 printf: RFE: reject field width larger than INT_MAX
tags 9129 notabug
close 9129
tags 9140 fixed
close 9140
tags 9207 wontfix
close 9207

From unknown Tue Jun 24 13:59:07 2025
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Fri, 09 Nov 2018 12:24:06 +0000

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator
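As a closing note on the numbers exchanged in this thread: combining the revised input estimate (about 2 billion records at roughly 56 bytes each) with the 16-way, 1 GiB-per-file model from the first reply gives a rough idea of how many large merges one might expect. The sketch below is an illustration under those stated assumptions only; the thread never confirms the actual merge schedule, and all variable names are invented for the example.

```python
GIB = 1024 ** 3

records = 2_000_000_000        # "to the tune of 2 billion"
bytes_per_record = 56          # "~56 bytes per record"
fan_in = 16                    # merge width assumed in the thread
second_level_file = 1 * GIB    # temp-file size from the 4 MiB chunk model

total_bytes = records * bytes_per_record

# Integer ceiling division: number of ~1 GiB second-level temp files.
n_files = -(-total_bytes // second_level_file)

# Group those files into merges of at most fan_in inputs each.
full, rest = divmod(n_files, fan_in)
third_level_merges = [fan_in] * full + ([rest] if rest else [])

print("estimated input size in bytes:", total_bytes)
print("second-level temp files:", n_files)
print("inputs per third-level merge:", third_level_merges)
```

Under this model a data set of that size would need several full 16-way third-level merges in a row, which matches the repeated pattern in the lsof listings and the later assessment that this looks like a performance problem and not a hang.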