GNU bug report logs - #11816
sort -o: error comes late if opening the outfile fails

Package: coreutils;

Reported by: Bernhard Voelker <mail <at> bernhard-voelker.de>

Date: Fri, 29 Jun 2012 12:00:02 UTC

Severity: normal

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Bernhard Voelker <mail <at> bernhard-voelker.de>
Subject: bug#11816: closed (Re: bug#11816: sort -o: error comes late if
 opening the outfile fails)
Date: Mon, 02 Jul 2012 10:27:02 +0000
[Message part 1 (text/plain, inline)]
Your bug report

#11816: sort -o: error comes late if opening the outfile fails

which was filed against the coreutils package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 11816 <at> debbugs.gnu.org.

-- 
11816: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=11816
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 11816-done <at> debbugs.gnu.org, Bernhard Voelker <mail <at> bernhard-voelker.de>
Subject: Re: bug#11816: sort -o: error comes late if opening the outfile fails
Date: Mon, 02 Jul 2012 12:21:38 +0200
[Message part 3 (text/plain, inline)]
On 06/30/2012 03:11 PM, Pádraig Brady wrote:
> On 06/30/2012 12:53 PM, Paul Eggert wrote:
>> On 06/29/2012 07:55 AM, Pádraig Brady wrote:
>>> Also the in==out case, you'd like to check for write-ability too.
>>>
>>> Both cases could be handled I think with something like:
>>>
>>> if (access (outfile, W_OK) != 0 && errno != ENOENT)
>>>   error (...);
>>
>> Wouldn't it be better to actually open the output file,
>> but not truncate it?  We can then truncate it just before
>> actually writing to the file.  That would avoid a race
>> condition or two.
>>
>> In the in==out case, we could tune this by opening
>> the file just once, with O_RDWR.  If the file is not
>> a regular file, we might have to give up and open such
>> a file twice, but that should be rare.
>>
> 
> The race would be unlikely, and would
> only fall back to the existing behavior
> of a slower failure.
> 
> Though I suppose opening the file is a
> more direct check and would also obviate the
> need to check for writeability of the containing dir
> in the case of a non existent file.
> 
> OK I'm leaning towards an early open so.
> 
> As for cleaning up an empty created file,
> `sort` already has an exit_cleanup() function,
> so we can unlink there.
> 
> I'm not sure it's worth tuning the in==out case TBH.

So I didn't bother unlinking created empty files,
as this is problematic in the presence of symlinks.
To mitigate this I create the output after all option
validation is done, just before the sort/merge process is started.
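
As an illustration of the early open discussed above (open the output without truncating it, then truncate just before writing), here is a minimal sketch. It is not the attached patch. The helper names `open_output_early` and `truncate_before_write` are hypothetical, and the 0666 creation mode is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open OUTFILE for writing before any sorting is
   done.  The file is created if it does not exist, but existing
   contents are left intact because O_TRUNC is deliberately omitted.
   Returns a file descriptor, or -1 with errno set.  */
static int
open_output_early (char const *outfile)
{
  assert (outfile != NULL);
  return open (outfile, O_WRONLY | O_CREAT, 0666);
}

/* Hypothetical helper: discard the old contents just before the
   results are written.  A failure is ignored for non-regular files
   (pipes, devices), where there is nothing to truncate.
   Returns 0 on success, or -1 with errno set.  */
static int
truncate_before_write (int fd)
{
  if (ftruncate (fd, 0) == 0)
    return 0;

  int saved_errno = errno;
  struct stat st;
  if (fstat (fd, &st) == 0 && !S_ISREG (st.st_mode))
    return 0;

  errno = saved_errno;
  return -1;
}
```

With this ordering, a permission problem on the output shows up at the `open` call, before any time is spent sorting, and the in==out case keeps its input intact until the results are ready.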

Also we must be careful to handle the `sort -o missing missing` case.
I.e. we don't want to create an empty file, resulting in the
above failing to notice the missing file and returning successfully.
So to avoid that I explicitly check all inputs are readable first.
In addition to catering for the above case, it's a general improvement
to avoid redundant processing.  That was already handled in the merge case,
but in the sorting case only a stat was done as a side effect
of input size checking, and that didn't handle the case
where input was present but unreadable.
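
The readability check described above could look roughly like the following. This is a sketch under assumptions rather than the code in the attached patch: the helper name `first_unreadable_input` is hypothetical, and plain `access` is used for simplicity, which tests against the real rather than the effective user ID.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: return the index of the first entry in FILES
   that cannot be read, or NFILES if every input is readable.
   "-" denotes standard input and is skipped.  */
static size_t
first_unreadable_input (char *const *files, size_t nfiles)
{
  assert (files != NULL || nfiles == 0);

  for (size_t i = 0; i < nfiles; i++)
    {
      if (strcmp (files[i], "-") == 0)
        continue;

      /* Catches both a missing file and a present but unreadable one.  */
      if (access (files[i], R_OK) != 0)
        return i;
    }

  return nfiles;
}
```

Running this check before the output file is created means that `sort -o missing missing` still fails on the missing input, instead of reading the empty file that the early open would otherwise have created.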

Patch attached.

cheers,
Pádraig.
[sort-exit-early.diff (text/plain, attachment)]
[Message part 5 (message/rfc822, inline)]
From: Bernhard Voelker <mail <at> bernhard-voelker.de>
To: bug-coreutils <at> gnu.org
Subject: sort -o: error comes late if opening the outfile fails
Date: Fri, 29 Jun 2012 13:54:32 +0200
If opening the output file for writing is not possible (e.g. because the user
doesn't have sufficient privileges), then sort issues an error. The problem
is that the whole computation, useless by then, has already been done.

In the following little example, it took ~15s to sort the data,
and then to realize that it can't write the result:

  $ time seq 1000000 | src/sort --random-sort -o /cantwritehere
  src/sort: open failed: /cantwritehere: Permission denied

  real    0m14.955s
  user    0m14.936s
  sys     0m0.042s

I'd have expected sort to give the error immediately after startup
instead of wasting time on sorting.
Shouldn't sort open the outfile right at the beginning (unless in==out),
or is this behavior required by some standard?

Have a nice day,
Berny




This bug report was last modified 13 years and 16 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.