GNU bug report logs - #14752
sort fails to fork() + execlp(compress_program) if overcommit limit is reached

Package: coreutils

Reported by: Petros Aggelatos <petrosagg <at> gmail.com>

Date: Sun, 30 Jun 2013 05:23:01 UTC

Severity: normal

Done: Bernhard Voelker <mail <at> bernhard-voelker.de>

Bug is archived. No further changes may be made.


Message #10 received at 14752-done <at> debbugs.gnu.org:

From: Bernhard Voelker <mail <at> bernhard-voelker.de>
To: Petros Aggelatos <petrosagg <at> gmail.com>
Cc: 14752-done <at> debbugs.gnu.org
Subject: Re: bug#14752: sort fails to fork() + execlp(compress_program) if
 overcommit limit is reached
Date: Mon, 01 Jul 2013 09:16:37 +0200

tag 14752 notabug
close 14752
stop

On 06/30/2013 03:42 AM, Petros Aggelatos wrote:
> I was trying to sort a big file (22GB, 5GB gzipped) with `sort
> --compress-program=gzip -S40% data`. My /tmp filesystem is a 6GB tmpfs
> and the total system RAM is 16GB. The problem was that after a while
> sort would write uncompressed temp files in /tmp causing it to fill up
> and then crash for having no free space.

Thanks for reporting this.  However, I think that your system's memory
is just too small for sorting that file that way (see below).

You already noticed yourself that sort(1) was writing huge chunk files
into the /tmp directory, which is a tmpfs file system, i.e., all that
data reduces the memory available for running processes.
The overhead of spawning a new process is negligible compared to such
an amount of data.
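
To check whether the default temporary directory is memory-backed,
something like the following should do.  This is only a sketch: it
assumes GNU coreutils stat(1) and df(1) on Linux, and the variable names
`tmpdir' and `fstype' are just for illustration.

```shell
# Directory sort(1) uses for temporary files when -T is not given.
tmpdir=${TMPDIR:-/tmp}

# File system type of that directory (GNU stat: -f queries the file
# system, %T prints its type in human-readable form, e.g. "tmpfs").
fstype=$(stat -f -c %T "$tmpdir")
echo "$tmpdir is on: $fstype"

# Space still available there, in human-readable units.
df -h "$tmpdir"
```

If the type shown is tmpfs, every temporary file written there counts
against RAM (and swap) rather than disk.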

In such a case, you're much better off telling sort(1) to use a different
directory for the temporary files.

Here's an excerpt from the texinfo manual
(info coreutils 'sort invocation'):

     If the environment variable `TMPDIR' is set, `sort' uses its value
  as the directory for temporary files instead of `/tmp'.  The
  `--temporary-directory' (`-T') option in turn overrides the environment
  variable.

  ...

  `-T TEMPDIR'
  `--temporary-directory=TEMPDIR'
       Use directory TEMPDIR to store temporary files, overriding the
       `TMPDIR' environment variable.  If this option is given more than
       once, temporary files are stored in all the directories given.  If
       you have a large sort or merge that is I/O-bound, you can often
       improve performance by using this option to specify directories on
       different disks and controllers.
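
Applied to your command line, the two mechanisms from the excerpt above
could look roughly like this.  It is a sketch, not a tested recipe for
your machine: /var/tmp is only an assumed disk-backed location, the
variable `scratch' is made up for illustration, and a tiny generated
file stands in for your 22GB input so that the lines can be run as-is.

```shell
# Assumed disk-backed scratch directory; replace with one that exists
# on your system and has room for the temporary files.
scratch=/var/tmp

# Small stand-in for the real input file.
data=$(mktemp)
printf '%s\n' pear apple mango > "$data"

# Variant 1: -T names the temporary directory for this invocation and
# overrides TMPDIR.
sorted_with_T=$(sort --compress-program=gzip -S40% -T "$scratch" "$data")

# Variant 2: without -T, sort(1) takes the directory from TMPDIR.
sorted_with_env=$(TMPDIR="$scratch" sort --compress-program=gzip -S40% "$data")

rm -f "$data"
```

Both variants produce the same sorted output; they differ only in where
sort(1) would place its temporary files once the input no longer fits
into the buffer given by -S.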

Have a nice day,
Berny






