GNU bug report logs - #47883
sort -o loses data when it crashes


Package: coreutils;

Reported by: "Peter van Dijk" <peter <at> 7bits.nl>

Date: Sun, 18 Apr 2021 22:44:01 UTC

Severity: normal

To reply to this bug, email your comments to 47883 AT debbugs.gnu.org.



Report forwarded to bug-coreutils <at> gnu.org:
bug#47883; Package coreutils. (Sun, 18 Apr 2021 22:44:01 GMT)

Acknowledgement sent to "Peter van Dijk" <peter <at> 7bits.nl>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sun, 18 Apr 2021 22:44:01 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: "Peter van Dijk" <peter <at> 7bits.nl>
To: bug-coreutils <at> gnu.org
Subject: sort -o loses data when it crashes
Date: Sun, 18 Apr 2021 19:46:01 +0200

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sort.html: -o  output
    Specify the name of an output file to be used instead of the standard output. This file can be the same as one of the input files.

https://www.gnu.org/software/coreutils/manual/html_node/sort-invocation.html: "data may be lost if the system crashes or sort encounters an I/O or other serious error while a file is being sorted in place" and "sort with --merge (-m) can open the output file before reading all input"

While the manual (but not the manpage) mentions the data loss, I think it would be great if sort did not have this problem at all, and I think the Open Group text also says it should not have this problem. I looked around, and a lot of software does get this right (by opening a randomly-named temp file to write to, and only moving it into place once it has been written successfully) - GNU sed -i, OpenBSD sort, and surely there are more. As a bonus, doing this would also make the `-o someinputfile -m` case safe.

Reproduction of the data loss is easy:

$ seq 10000 > 10000 ; prlimit --fsize=10 sort -R -o 10000 10000 ; wc -l 10000
File size limit exceeded (core dumped)
2 10000


(coreutils shuf has the same problem even though not all code appears to be shared - for example, sort opens the file for writing even before it opens it for reading, while shuf performs those two operations in the opposite order. That difference does not change the outcome, though.)

-- 
  Peter van Dijk
  peter <at> 7bits.nl




Information forwarded to bug-coreutils <at> gnu.org:
bug#47883; Package coreutils. (Thu, 22 Apr 2021 02:12:01 GMT)

Message #8 received at 47883 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Peter van Dijk <peter <at> 7bits.nl>
Cc: 47883 <at> debbugs.gnu.org
Subject: Re: bug#47883: sort -o loses data when it crashes
Date: Wed, 21 Apr 2021 19:11:14 -0700

On 4/18/21 10:46 AM, Peter van Dijk wrote:
> While the manual (but not the manpage) mentions the data loss, I think it would be great if sort did not have this problem at all, and I think the OpenGroup text also says it should not have this problem.

I don't know of any 'sort' implementation that does not have the problem 
at all. For example, FreeBSD 'sort -o file file' can lose 'file' in some 
(rare) cases. The only portable way to avoid this problem in a shell 
script is to output to some other file first and make sure that worked, 
before attempting to replace the input file.
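
A minimal sketch of that workaround follows. It is an illustration, not code from coreutils: the helper name safe_sort_inplace is hypothetical, and the mktemp utility is assumed to be available (it is common but not required by POSIX).

```shell
#!/bin/sh
# Sketch: sort a file "in place" without touching the original until the
# sorted copy is complete. safe_sort_inplace is a hypothetical helper name.
safe_sort_inplace() {
    file=$1
    # Create the temporary file next to the target so that mv is a rename
    # within one file system rather than a copy. (mktemp is an assumption.)
    tmp=$(mktemp "$(dirname -- "$file")/sort.XXXXXX") || return 1
    if sort -o "$tmp" -- "$file"; then
        # Sorting succeeded: replace the original with the finished copy.
        mv -f -- "$tmp" "$file" || { rm -f -- "$tmp"; return 1; }
    else
        # Sorting failed: keep the original and discard the partial output.
        status=$?
        rm -f -- "$tmp"
        return "$status"
    fi
}

# Usage example on scratch data.
demo_dir=$(mktemp -d) || exit 1
printf '%s\n' 3 1 2 > "$demo_dir/data"
safe_sort_inplace "$demo_dir/data"
cat "$demo_dir/data"
```

Note that replacing the file with mv gives the name a new file: any other hard links keep the old contents, and the permissions become those of the temporary file, which mktemp creates readable and writable by the owner only.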

Also, I don't see where the Open Group spec says what you're saying. On 
the contrary, the spec merely says that '-o output' should cause output 
to be sent to the output file. If there are multiple hard links to the 
output file, this suggests 'sort' should update the output file's 
contents without breaking any hard links. Admittedly the Open Group spec 
is a bit vague in this area, but I certainly don't see anything implying 
that GNU 'sort' does not conform to POSIX in this area.

FreeBSD 'sort' has a problem, in that 'sort -o A B' preserves all hard 
links to A's file, but 'sort -o A A' does not because it breaks the link 
from A. That's confusing.

Traditional Unix 'sort -o A' behaves the way GNU 'sort' does; it 
preserves all hard links to A's file. So there is a compatibility 
argument for doing things the way GNU 'sort' does them, even if that 
might lead to more data loss in rare cases.
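
The hard-link trade-off can be observed with a small experiment. This is a sketch under assumptions: the file names are made up, and the `-ef` test (same file) is assumed to be supported by the shell's `[` builtin, as it is in bash, dash, and ksh.

```shell
#!/bin/sh
# Sketch: compare how two update strategies treat a second hard link.
# All file names here are made up for the demonstration.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1

printf '%s\n' b a > A
ln A A.link                 # A and A.link now name the same file

# Strategy 1: let sort rewrite its own input file.
sort -o A A
if [ A -ef A.link ]; then in_place=kept; else in_place=broken; fi

# Strategy 2: write a new file and rename it over A.
sort -o A.tmp A && mv -f A.tmp A
if [ A -ef A.link ]; then renamed=kept; else renamed=broken; fi

echo "sort -o A A:        link $in_place"
echo "temp file + rename: link $renamed"
```

The rename strategy always leaves A.link pointing at the old file, which is the incompatibility discussed above. The result of the first strategy depends on the sort implementation in use; per the discussion above, GNU sort is expected to keep the link.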




Merged 47059 47883 48002. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Sat, 24 Apr 2021 21:46:02 GMT)

Information forwarded to bug-coreutils <at> gnu.org:
bug#47883; Package coreutils. (Sat, 24 Apr 2021 22:01:02 GMT)

Message #13 received at 47883 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: L A Walsh <coreutils <at> tlinx.org>
Cc: 47883 <at> debbugs.gnu.org, peter <at> 7bits.nl
Subject: Re: bug#47883: sort -o loses data when it crashes
Date: Sat, 24 Apr 2021 15:00:21 -0700

As I wrote you privately last month, the coreutils maintainers (who are 
not me) are pretty busy. The proposed change in bug#47883 would be 
incompatible with longstanding tradition and would almost certainly 
break some existing scripts running on GNU/Linux. This is not something 
to do lightly.

It might be possible to come up with a different change that would 
address the issue raised without being so disruptive. Whatever change 
(if any) is chosen, someone needs to think it through, code it up, 
document it, and test it. Although nobody's found the time to do that, 
perhaps you could volunteer or find someone who could volunteer; that 
would surely accelerate the process.

You mentioned that we have multiple bug reports (now 47059, 47883, 
48002) on basically the same topic, so I have taken the liberty of 
merging them.




Disconnected #47883 from all other report(s). Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Mon, 21 Feb 2022 09:06:02 GMT)

This bug report was last modified 3 years and 113 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.