From unknown Mon Jun 23 02:21:23 2025
From: bug#17470 <17470@debbugs.gnu.org>
To: bug#17470 <17470@debbugs.gnu.org>
Subject: Status: [PATCH] sort: rotate on ENOSPC while creating tmp files
Date: Mon, 23 Jun 2025 09:21:23 +0000

retitle 17470 [PATCH] sort: rotate on ENOSPC while creating tmp files
reassign 17470 coreutils
submitter 17470 Azat Khuzhin
severity 17470 normal
tag 17470 patch wontfix
thanks

From debbugs-submit-bounces@debbugs.gnu.org Sun May 11 16:44:24 2014
From: Azat Khuzhin
To: bug-coreutils@gnu.org
Cc: Azat Khuzhin
Subject: [PATCH] sort: rotate on ENOSPC while creating tmp files
Date: Mon, 12 May 2014 00:37:15 +0400
Message-Id: <1399840635-19019-1-git-send-email-a3at.mail@gmail.com>
X-Mailer: git-send-email 2.0.0.rc0

This can be useful if you use partitions with different amounts of free
space on them. It would be better to go to the next partition if we don't
have space on the current one, instead of failing.

* src/sort.c (create_temp_file): Go through all available tmp dirs if we
got ENOSPC for the first one, and fail on the last.
---
Here is the RFC, comments are welcome. Thanks.

 src/sort.c | 38 ++++++++++++++++++++++++++------------
 1 file changed, 26 insertions(+), 12 deletions(-)

diff --git a/src/sort.c b/src/sort.c
index 3380be6..47348b7 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -853,22 +853,36 @@ create_temp_file (int *pfd, bool survive_fd_exhaustion)
   static size_t temp_dir_index;
   int fd;
   int saved_errno;
-  char const *temp_dir = temp_dirs[temp_dir_index];
-  size_t len = strlen (temp_dir);
-  struct tempnode *node =
-    xmalloc (offsetof (struct tempnode, name) + len + sizeof slashbase);
-  char *file = node->name;
+  char const *temp_dir;
+  size_t len;
+  struct tempnode *node = NULL;
+  char *file;
   struct cs_status cs;
-
-  memcpy (file, temp_dir, len);
-  memcpy (file + len, slashbase, sizeof slashbase);
-  node->next = NULL;
-  if (++temp_dir_index == temp_dir_count)
-    temp_dir_index = 0;
+  size_t start_dir_index = temp_dir_index;
 
   /* Create the temporary file in a critical section, to avoid races.  */
   cs = cs_enter ();
-  fd = mkstemp (file);
+  do
+    {
+      temp_dir = temp_dirs[temp_dir_index];
+      len = strlen (temp_dir);
+      node =
+        xrealloc (node,
+                  offsetof (struct tempnode, name) + len + sizeof slashbase);
+      file = node->name;
+      memcpy (file, temp_dir, len);
+      memcpy (file + len, slashbase, sizeof slashbase);
+      node->next = NULL;
+
+      if (++temp_dir_index == temp_dir_count)
+        temp_dir_index = 0;
+
+      fd = mkstemp (file);
+
+      if (errno != ENOSPC || temp_dir_index == start_dir_index)
+        break;
+    } while (0 <= fd);
+
   if (0 <= fd)
     {
       *temptail = node;
-- 
2.0.0.rc0

From debbugs-submit-bounces@debbugs.gnu.org Sun May 11 18:26:07 2014
Message-ID: <536FF8F4.8090500@cs.ucla.edu>
Date: Sun, 11 May 2014 15:25:56 -0700
From: Paul Eggert
Organization: UCLA Computer Science Department
To: Azat Khuzhin, 17470@debbugs.gnu.org
Subject: Re: bug#17470: [PATCH] sort: rotate on ENOSPC while creating tmp files
In-Reply-To: <1399840635-19019-1-git-send-email-a3at.mail@gmail.com>

Azat Khuzhin wrote:

> +      fd = mkstemp (file);
> +
> +      if (errno != ENOSPC || temp_dir_index == start_dir_index)

This assumes that when mkstemp succeeds then errno != ENOSPC, which is
not necessarily true.

More generally, it appears that with the patch 'sort' checks whether one
can create a file, but 'sort' will still respond poorly if a write to a
temp file fails due to filesystem space exhaustion.
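
The errno point in the review above can be illustrated with a minimal sketch. It is not the code in src/sort.c: the helper name create_temp_rotating, its parameters, and the "/sortXXXXXX" suffix are assumptions made for illustration. The idea is to look at errno only after mkstemp has actually failed, and to move to the next directory only when that failure is ENOSPC.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of sort.c): try the directories in DIRS,
   starting at *CURSOR, and advance to the next one only when mkstemp
   fails with ENOSPC.  On success return the descriptor and store the
   malloc'd file name in *NAME.  On failure return -1 with errno set by
   the last attempt.  */
int
create_temp_rotating (char const *const *dirs, size_t ndirs, size_t *cursor,
                      char **name)
{
  static char const slashbase[] = "/sortXXXXXX";
  int saved_errno = 0;

  assert (0 < ndirs && *cursor < ndirs);

  for (size_t tries = 0; tries < ndirs; tries++)
    {
      char const *dir = dirs[*cursor];
      size_t len = strlen (dir);
      char *file = malloc (len + sizeof slashbase);
      if (!file)
        return -1;
      memcpy (file, dir, len);
      memcpy (file + len, slashbase, sizeof slashbase);

      /* Round-robin: the next call starts at the following directory.  */
      *cursor = (*cursor + 1) % ndirs;

      int fd = mkstemp (file);
      if (0 <= fd)
        {
          /* Success: errno is deliberately not inspected here.  */
          *name = file;
          return fd;
        }

      /* Only now is errno meaningful; save it before calling free.  */
      saved_errno = errno;
      free (file);
      if (saved_errno != ENOSPC)
        break;
    }

  errno = saved_errno;
  return -1;
}
```

This only covers file creation. As the review notes, a write to the temporary file can still fail later for lack of space, which this sketch does not address.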
From debbugs-submit-bounces@debbugs.gnu.org Sun May 11 18:45:03 2014
Date: Mon, 12 May 2014 02:44:51 +0400
From: Azat Khuzhin
To: Paul Eggert
Cc: 17470@debbugs.gnu.org
Subject: Re: bug#17470: [PATCH] sort: rotate on ENOSPC while creating tmp files
In-Reply-To: <536FF8F4.8090500@cs.ucla.edu>

On 12 May 2014 02:26, "Paul Eggert" wrote:
>
> Azat Khuzhin wrote:
>
>> +      fd = mkstemp (file);
>> +
>> +      if (errno != ENOSPC || temp_dir_index == start_dir_index)
>
> This assumes that when mkstemp succeeds then errno != ENOSPC, which is
> not necessarily true.

Why could that be? Only if errno still holds an old value?
And even if it is true, it will not go to the next iteration because fd >= 0.

> More generally, it appears that with the patch 'sort' checks whether one
> can create a file, but 'sort' will still respond poorly if a write to a
> temp file fails due to filesystem space exhaustion.

The only thing that will slow down 'sort' is going through tmp dirs where
there is not enough space, and I personally don't think that this will be
meaningful, since the most common use case for tmp dirs is sorting data
that will not fit into memory, and if this sort fails because of ENOSPC
you must restart the sort, which may have been running for a day or a week
already, and this is more expensive.
Or did I misunderstand something?

And now I realize that this is not enough, since we are only checking on
creation, and unfortunately I didn't check how write(2) errors are
handled; if it tries to create another file, then it will work.
We could also use fallocate here.

Thanks.
Azat.

From debbugs-submit-bounces@debbugs.gnu.org Sun May 11 19:22:58 2014
Message-ID: <53700645.3000302@draigBrady.com>
Date: Mon, 12 May 2014 00:22:45 +0100
From: Pádraig Brady
To: Azat Khuzhin
Cc: 17470@debbugs.gnu.org, Paul Eggert
Subject: Re: bug#17470: [PATCH] sort: rotate on ENOSPC while creating tmp files
In-Reply-To: <536FF8F4.8090500@cs.ucla.edu>

tag 17470 wontfix
close 17470
stop

On 05/11/2014 11:25 PM, Paul Eggert wrote:
> Azat Khuzhin wrote:
>
>> +      fd = mkstemp (file);
>> +
>> +      if (errno != ENOSPC || temp_dir_index == start_dir_index)
>
> This assumes that when mkstemp succeeds then errno != ENOSPC, which is
> not necessarily true.
>
> More generally, it appears that with the patch 'sort' checks whether one
> can create a file, but 'sort' will still respond poorly if a write to a
> temp file fails due to filesystem space exhaustion.

Yes I agree.

Now one could use fallocate() where available to preallocate a given amount of space,
however allocation management can be done outside of sort(1). As a rule of thumb,
if it's possible to implement outside of a particular functional unit, then it
probably should be done outside.

In this case there are various schemes for coalescing multiple storage locations
to a single mount point (mhddfs, lvm, raid, ...), and since these have a more
system wide view, it would be better to avoid implementing similar but limited
logic within sort.

thanks,
Pádraig.

From debbugs-submit-bounces@debbugs.gnu.org Wed May 14 07:07:20 2014
Date: Wed, 14 May 2014 15:07:02 +0400
From: Azat Khuzhin
To: Pádraig Brady
Cc: 17470@debbugs.gnu.org, Paul Eggert
Subject: Re: bug#17470: [PATCH] sort: rotate on ENOSPC while creating tmp files
Message-ID: <20140514110702.GH10319@azat>
In-Reply-To: <53700645.3000302@draigBrady.com>

On Mon, May 12, 2014 at 12:22:45AM +0100, Pádraig Brady wrote:
> tag 17470 wontfix
> close 17470
> stop
>
> On 05/11/2014 11:25 PM, Paul Eggert wrote:
> > Azat Khuzhin wrote:
> >
> >> +      fd = mkstemp (file);
> >> +
> >> +      if (errno != ENOSPC || temp_dir_index == start_dir_index)
> >
> > This assumes that when mkstemp succeeds then errno != ENOSPC, which is
> > not necessarily true.
> >
> > More generally, it appears that with the patch 'sort' checks whether one
> > can create a file, but 'sort' will still respond poorly if a write to a
> > temp file fails due to filesystem space exhaustion.
>
> Yes I agree.
>
> Now one could use fallocate() where available to preallocate a given amount of space,
> however allocation management can be done outside of sort(1). As a rule of thumb,
> if it's possible to implement outside of a particular functional unit, then it
> probably should be done outside.

Sometimes it's not possible to do this, because it would likely require
erasing data on all involved partitions, or it can make _all_ data
unavailable when one of the disks fails.
And it is simpler to use just sort(1) instead of
fdisk/pvcreate/mdadm/...(1).
Occasionally, even a restart can be painful.

And this patch is relatively small, so this is not even _allocation
management_.
What do you think if I update the patch to do this only under a specific
option?

Thanks,
Azat.

>
> In this case there are various schemes for coalescing multiple storage locations
> to a single mount point (mhddfs, lvm, raid, ...), and since these have a more
> system wide view, it would be better to avoid implementing similar but limited
> logic within sort.
>
> thanks,
> Pádraig.
From debbugs-submit-bounces@debbugs.gnu.org Wed May 14 07:24:56 2014
Message-ID: <5373527E.4070408@draigBrady.com>
Date: Wed, 14 May 2014 12:24:46 +0100
From: Pádraig Brady
To: Azat Khuzhin
Cc: 17470@debbugs.gnu.org, Paul Eggert
Subject: Re: bug#17470: [PATCH] sort: rotate on ENOSPC while creating tmp files
In-Reply-To: <20140514110702.GH10319@azat>

On 05/14/2014 12:07 PM, Azat Khuzhin wrote:
> On Mon, May 12, 2014 at 12:22:45AM +0100, Pádraig Brady wrote:
>> tag 17470 wontfix
>> close 17470
>> stop
>>
>> On 05/11/2014 11:25 PM, Paul Eggert wrote:
>>> Azat Khuzhin wrote:
>>>
>>>> +      fd = mkstemp (file);
>>>> +
>>>> +      if (errno != ENOSPC || temp_dir_index == start_dir_index)
>>>
>>> This assumes that when mkstemp succeeds then errno != ENOSPC, which is
>>> not necessarily true.
>>>
>>> More generally, it appears that with the patch 'sort' checks whether one
>>> can create a file, but 'sort' will still respond poorly if a write to a
>>> temp file fails due to filesystem space exhaustion.
>>
>> Yes I agree.
>>
>> Now one could use fallocate() where available to preallocate a given amount of space,
>> however allocation management can be done outside of sort(1). As a rule of thumb,
>> if it's possible to implement outside of a particular functional unit, then it
>> probably should be done outside.
>
> Sometimes it's not possible to do this, because it would likely require
> erasing data on all involved partitions, or it can make _all_ data
> unavailable when one of the disks fails.

There are many external solutions to that problem.

> And it is simpler to use just sort(1) instead of
> fdisk/pvcreate/mdadm/...(1).
> Occasionally, even a restart can be painful.
>
> And this patch is relatively small, so this is not even _allocation
> management_.
> What do you think if I update the patch to do this only under a specific
> option?

Well it shouldn't need any interface changes as sort already accepts
multiple -T options. However it wouldn't be that simple either, requiring
fallocations, which are not generally available and which don't guarantee
that writes will not give ENOSPC. Also do we always know how much to
fallocate? Would that have efficiency problems compared to dynamic
allocation? What about sort --compress, ...

So still not convinced, sorry.

Pádraig.
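
To make the fallocate caveat above concrete, here is a minimal sketch of a best-effort reservation. The helpers try_reserve and open_reserved_temp are hypothetical names, not anything in sort.c, and the sketch assumes the caller somehow knows how many bytes it needs, which (as the message points out) is not always the case. A successful reservation only helps for the reserved range, and a file system that cannot preallocate simply gets no reservation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: try to reserve LEN bytes for FD.
   Return 1 if the space was reserved, 0 if the file system cannot
   preallocate (the caller proceeds without a reservation), or -1 with
   errno set for other failures such as ENOSPC.  */
int
try_reserve (int fd, off_t len)
{
  assert (0 < len);

  /* posix_fallocate returns an error number; it does not set errno.  */
  int err = posix_fallocate (fd, 0, len);
  if (err == 0)
    return 1;
  if (err == EOPNOTSUPP || err == EINVAL)
    return 0;
  errno = err;
  return -1;
}

/* Hypothetical usage: create a temporary file from TMPL (which must end
   in XXXXXX) and give up on it at once if LEN bytes cannot be reserved,
   so that the caller may try another directory.  */
int
open_reserved_temp (char *tmpl, off_t len)
{
  int fd = mkstemp (tmpl);
  if (fd < 0)
    return -1;
  if (try_reserve (fd, len) < 0)
    {
      int saved_errno = errno;
      close (fd);
      unlink (tmpl);
      errno = saved_errno;
      return -1;
    }
  return fd;
}
```

Because of the "cannot preallocate" case, code using such a helper would still have to handle ENOSPC from later writes.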
From debbugs-submit-bounces@debbugs.gnu.org Wed May 14 10:48:59 2014 Received: (at 17470) by debbugs.gnu.org; 14 May 2014 14:48:59 +0000 Received: from localhost ([127.0.0.1]:35290 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WkaUL-0002ht-SE for submit@debbugs.gnu.org; Wed, 14 May 2014 10:48:58 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:34115) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WkaUI-0002hc-VM for 17470@debbugs.gnu.org; Wed, 14 May 2014 10:48:55 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 09C33A60016; Wed, 14 May 2014 07:48:49 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ng5J8oPG-Bjk; Wed, 14 May 2014 07:48:40 -0700 (PDT) Received: from [192.168.1.9] (pool-108-0-233-62.lsanca.fios.verizon.net [108.0.233.62]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 687B4A6000B; Wed, 14 May 2014 07:48:40 -0700 (PDT) Message-ID: <53738247.80801@cs.ucla.edu> Date: Wed, 14 May 2014 07:48:39 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0 MIME-Version: 1.0 To: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= , Azat Khuzhin Subject: Re: bug#17470: [PATCH] sort: rotate on ENOSPC while creating tmp files References: <1399840635-19019-1-git-send-email-a3at.mail@gmail.com> <536FF8F4.8090500@cs.ucla.edu> <53700645.3000302@draigBrady.com> <20140514110702.GH10319@azat> <5373527E.4070408@draigBrady.com> In-Reply-To: <5373527E.4070408@draigBrady.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Score: -3.0 (---) X-Debbugs-Envelope-To: 17470 Cc: 17470@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.0 (---) Pádraig Brady wrote: > Also do we always know how much to fallocate? Not if we're using compression on the temporaries, no. I think a patch along these lines could be worthwhile, if it was simple and if it actually worked (the current one doesn't). Something along the following lines, say. When multiple -T options are specified (-T FOO, -T FOP, -T FOQ, ...) and one of them runs out of disk space when creating a temporary file FOO/BAR, 'sort' stops creating files in FOO (effectively removing FOO from the option list) creates a file FOP/BAR instead, and redoes the process (whatever it was) that sent output to FOO/BAR, sending the output to FOP/BAR this time. I don't have the energy right now to write that, but if someone else wrote it I'd review it. From debbugs-submit-bounces@debbugs.gnu.org Mon May 26 15:44:47 2014 Received: (at 17470) by debbugs.gnu.org; 26 May 2014 19:44:47 +0000 Received: from localhost ([127.0.0.1]:60739 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wp0pC-00016C-MH for submit@debbugs.gnu.org; Mon, 26 May 2014 15:44:47 -0400 Received: from mail-la0-f54.google.com ([209.85.215.54]:60174) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wp0pA-00015l-FL for 17470@debbugs.gnu.org; Mon, 26 May 2014 15:44:45 -0400 Received: by mail-la0-f54.google.com with SMTP id pv20so5865630lab.41 for <17470@debbugs.gnu.org>; Mon, 26 May 2014 12:44:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:content-transfer-encoding :in-reply-to:user-agent; bh=tosYeD6L7ZayE80tNTwjgAge7z7Hgksxbn6zyYdH9uk=; b=JWTiCf8M3fxOpyPn/KYGeyxWWI9z0TILMv7vWtG9qVwBNNFKvyl9eWTjeLtLi/y8xh 
6mHKq5J/DNRFwJMlB/pirEipeRruIkSifDYATy5h4WiI4zGQGJkEVi0XI7rT5J54OK65 UANAImKs0h6JtmEyAnxKRpa37mQ1qtwUJxRaYPf9Sq8pj0fQWoVapd2XUlXKXVlc+U55 SvnC2P5hUAZab6+/nK1H/PYe+vC1uuHxl3GtzKR0QzG9YqUtJ6nO9usEsrqZNLveoNob 4FFdZF1i6i6LrLyewIHl5jKtbk1vnd7LJ/R7Jjq4UAaB4u+o/q+KyZd1Jt76p8oX5ozC fjYw== X-Received: by 10.112.72.230 with SMTP id g6mr18633984lbv.10.1401133478089; Mon, 26 May 2014 12:44:38 -0700 (PDT) Received: from localhost ([188.134.22.24]) by mx.google.com with ESMTPSA id ax10sm12948101lbc.7.2014.05.26.12.44.36 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 26 May 2014 12:44:36 -0700 (PDT) Date: Mon, 26 May 2014 23:44:29 +0400 From: Azat Khuzhin To: Paul Eggert Subject: Re: bug#17470: [PATCH] sort: rotate on ENOSPC while creating tmp files Message-ID: <20140526194429.GG26948@azat> References: <1399840635-19019-1-git-send-email-a3at.mail@gmail.com> <536FF8F4.8090500@cs.ucla.edu> <53700645.3000302@draigBrady.com> <20140514110702.GH10319@azat> <5373527E.4070408@draigBrady.com> <53738247.80801@cs.ucla.edu> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <53738247.80801@cs.ucla.edu> User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 17470 Cc: 17470@debbugs.gnu.org, =?iso-8859-1?Q?P=E1draig?= Brady X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) On Wed, May 14, 2014 at 07:48:39AM -0700, Paul Eggert wrote: > Pádraig Brady wrote: > >Also do we always know how much to fallocate? > > Not if we're using compression on the temporaries, no. > > I think a patch along these lines could be worthwhile, if it was simple and > if it actually worked (the current one doesn't). 
The current patch only look while files is created, but this is not enough `I agree with you, it must check write(2) and fallback to creating when write(2) will fail with ENOSPC. This is what you mean? > Something along the > following lines, say. When multiple -T options are specified (-T FOO, -T > FOP, -T FOQ, ...) and one of them runs out of disk space when creating a > temporary file FOO/BAR, 'sort' stops creating files in FOO (effectively > removing FOO from the option list) creates a file FOP/BAR instead, and > redoes the process (whatever it was) that sent output to FOO/BAR, sending > the output to FOP/BAR this time. I don't think that redoes is worth it, since when we have ENOSPC it means that we already won't create any more files there, and one file with relatively small size is not a big deal. I also think that dropping direcotory from list is a good idea, user can notice this (using some monitorings) and clean it. For example recently I need to sort relatively huge amount of data, and I don't have enough space for writing all tmp files (why --compress-program not works for me is another story), so I wrote script, that did next: When free space < %3 on some of partitions, I run pkill -STOP sort, and then move some files (that not currently opened by sort(1)) to partition that have more free space, and create symlinks for old locations. When all the free space was eliminated on all available partitions, I archived existed tmp files (that also not opened by sort(1)) and create a pipes, that redirects output of $(gzip -d) into it, and using this hacks sort finished successfully for me. > > I don't have the energy right now to write that, but if someone else wrote > it I'd review it. Thanks for you notes Pádraig. 
-- Respectfully Azat Khuzhin From debbugs-submit-bounces@debbugs.gnu.org Mon May 26 16:44:32 2014 Received: (at 17470) by debbugs.gnu.org; 26 May 2014 20:44:32 +0000 Received: from localhost ([127.0.0.1]:60789 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wp1l2-000312-5B for submit@debbugs.gnu.org; Mon, 26 May 2014 16:44:32 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:48459) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wp1kz-00030g-JB for 17470@debbugs.gnu.org; Mon, 26 May 2014 16:44:30 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 2A2ED39E8014; Mon, 26 May 2014 13:44:23 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id vyvjNR5jigRs; Mon, 26 May 2014 13:44:14 -0700 (PDT) Received: from [192.168.1.9] (pool-108-0-233-62.lsanca.fios.verizon.net [108.0.233.62]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 8C60B39E8012; Mon, 26 May 2014 13:44:14 -0700 (PDT) Message-ID: <5383A79E.1010201@cs.ucla.edu> Date: Mon, 26 May 2014 13:44:14 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0 MIME-Version: 1.0 To: Azat Khuzhin Subject: Re: bug#17470: [PATCH] sort: rotate on ENOSPC while creating tmp files References: <1399840635-19019-1-git-send-email-a3at.mail@gmail.com> <536FF8F4.8090500@cs.ucla.edu> <53700645.3000302@draigBrady.com> <20140514110702.GH10319@azat> <5373527E.4070408@draigBrady.com> <53738247.80801@cs.ucla.edu> <20140526194429.GG26948@azat> In-Reply-To: <20140526194429.GG26948@azat> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -3.0 (---) X-Debbugs-Envelope-To: 17470 Cc: 17470@debbugs.gnu.org, =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= 
X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.0 (---)

Azat Khuzhin wrote:
> The current patch only look while files is created, but this is not
> enough `I agree with you, it must check write(2) and fallback to creating
> when write(2) will fail with ENOSPC.
> This is what you mean?

Yes.

> when we have ENOSPC it
> means that we already won't create any more files there, and one file
> with relatively small size is not a big deal.

OK. The point is that 'sort' shouldn't lose the data (including the
possibly-incomplete trailing line) that's already in the temporary file
when a write to that file fails.

Also, the code could treat EIO like ENOSPC, I suppose, to be more robust
in the presence of bad temporary devices.

But beware file systems that report ENOSPC and EIO in a delayed fashion,
i.e., not immediately upon the failing write, but somewhat later,
typically when closing the output file.
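[Editorial illustration, not part of the original thread.] The behaviour
discussed in the message above can be sketched roughly as follows. This is
Python for brevity, not code from sort.c or from the proposed patch. The
function name write_temp_with_rotation and its write parameter are
hypothetical. The sketch assumes the caller still holds the whole buffer
in memory, so a failed write is simply redone in the next directory. It
treats EIO like ENOSPC and keeps the close inside the same error handling,
since some file systems report these errors only on close.

```python
import errno
import os
import tempfile

# Errors that the thread suggests handling the same way.
RETRYABLE = {errno.ENOSPC, errno.EIO}


def write_temp_with_rotation(temp_dirs, data, write=os.write):
    """Write data to a new temp file in the first directory that accepts it.

    A directory that fails with ENOSPC or EIO is removed from temp_dirs
    (the "drop the directory from the list" idea), and the whole buffer
    is rewritten in the next one. Returns the path of the complete file.
    The write parameter exists only so a failure can be simulated.
    """
    while temp_dirs:
        fd, path = -1, None
        try:
            fd, path = tempfile.mkstemp(prefix="sort", dir=temp_dirs[0])
            view = memoryview(data)
            while view:
                # write() may be partial, so advance by the count returned.
                view = view[write(fd, view):]
            # Delayed ENOSPC/EIO may surface here rather than in write().
            closing, fd = fd, -1
            os.close(closing)
            return path
        except OSError as err:
            # Release the descriptor and the partial file, ignoring
            # secondary failures during cleanup.
            if fd != -1:
                try:
                    os.close(fd)
                except OSError:
                    pass
            if path is not None:
                try:
                    os.unlink(path)
                except OSError:
                    pass
            if err.errno not in RETRYABLE:
                raise
            temp_dirs.pop(0)  # stop using this directory
    raise OSError(errno.ENOSPC, "no usable temporary directory left")
```

Because the partial file is discarded only after the same buffer has been
queued for rewriting elsewhere, no data (including a possibly incomplete
trailing line) is lost in this sketch. A real implementation inside
'sort' would also have to deal with data that is no longer in memory.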
From debbugs-submit-bounces@debbugs.gnu.org Mon May 26 16:56:31 2014 Received: (at 17470) by debbugs.gnu.org; 26 May 2014 20:56:31 +0000 Received: from localhost ([127.0.0.1]:60819 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wp1wc-0003LJ-Oa for submit@debbugs.gnu.org; Mon, 26 May 2014 16:56:31 -0400 Received: from mail-la0-f53.google.com ([209.85.215.53]:57913) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Wp1wa-0003L4-9C for 17470@debbugs.gnu.org; Mon, 26 May 2014 16:56:29 -0400 Received: by mail-la0-f53.google.com with SMTP id ty20so4498668lab.12 for <17470@debbugs.gnu.org>; Mon, 26 May 2014 13:56:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=3EV8cG91QnWCbmTBPyAj8tzlK9lkdKxQx7gjMpVxQdo=; b=XwHNaLbO9UUwOo4oAS1PSZy+WAYv7uYoqEgMCVjZbEPieO8IG7MbnxhZLkg9HV67hC uwDhjI6urnUFtvWEVWVWMd/mKiq+U4QFQrPIrnL5FvNqBgM9+mkhrIeUbyS3zb9ZDEUG 8YrMAX631QddHdKxFaBEGCjZDI0faRg73b51LGLRvBnv4L95okVHrH4oXeccrXGVodJC qcmgeMSpx9f/q3s5BKO2FvIKRQxL3Sl4jTJ6XpRL4+Cyab3PqgRQs3NXYkIVEJb3XS6h tkQ4jMQNKdmjPQRL2tukfw/sT0G9UxGrhZd0gqVTDwanshvbB+4GcXHK6j0L9mfDxDqM 4FWw== X-Received: by 10.112.33.83 with SMTP id p19mr3341338lbi.90.1401137781994; Mon, 26 May 2014 13:56:21 -0700 (PDT) Received: from localhost ([188.134.22.24]) by mx.google.com with ESMTPSA id zx3sm13099028lbc.2.2014.05.26.13.56.20 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 26 May 2014 13:56:21 -0700 (PDT) Date: Tue, 27 May 2014 00:56:11 +0400 From: Azat Khuzhin To: Paul Eggert Subject: Re: bug#17470: [PATCH] sort: rotate on ENOSPC while creating tmp files Message-ID: <20140526205611.GH26948@azat> References: <1399840635-19019-1-git-send-email-a3at.mail@gmail.com> <536FF8F4.8090500@cs.ucla.edu> <53700645.3000302@draigBrady.com> <20140514110702.GH10319@azat> 
<5373527E.4070408@draigBrady.com> <53738247.80801@cs.ucla.edu> <20140526194429.GG26948@azat> <5383A79E.1010201@cs.ucla.edu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5383A79E.1010201@cs.ucla.edu> User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 17470 Cc: 17470@debbugs.gnu.org, =?iso-8859-1?Q?P=E1draig?= Brady X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/)

On Mon, May 26, 2014 at 01:44:14PM -0700, Paul Eggert wrote:
> Azat Khuzhin wrote:
> >The current patch only look while files is created, but this is not
> >enough `I agree with you, it must check write(2) and fallback to creating
> >when write(2) will fail with ENOSPC.
> >This is what you mean?
>
> Yes.
>
> >when we have ENOSPC it
> >means that we already won't create any more files there, and one file
> >with relatively small size is not a big deal.
>
> OK. The point is that 'sort' shouldn't lose the data (including the
> possibly-incomplete trailing line) that's already in the temporary file when
> a write to that file fails.

Thanks for the explanation, I see what you mean.

>
> Also, the code could treat EIO like ENOSPC, I suppose, to be more robust in
> the presence of bad temporary devices.
>
> But beware file systems that report ENOSPC and EIO in a delayed fashion,
> i.e., not immediately upon the failing write, but somewhat later, typically
> when closing the output file.

Yeah, that's a good catch; I will keep it in mind when I start working
on this.

Thanks,
Azat.

From unknown Mon Jun 23 02:21:23 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived.
Date: Tue, 24 Jun 2014 11:24:04 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator