GNU bug report logs - #47883: sort -o loses data when it crashes
To reply to this bug, email your comments to 47883 AT debbugs.gnu.org.
Report forwarded to bug-coreutils <at> gnu.org: bug#47883; Package coreutils. (Sun, 18 Apr 2021 22:44:01 GMT)
Acknowledgement sent to "Peter van Dijk" <peter <at> 7bits.nl>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sun, 18 Apr 2021 22:44:01 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sort.html: -o output
Specify the name of an output file to be used instead of the standard output. This file can be the same as one of the input files.
https://www.gnu.org/software/coreutils/manual/html_node/sort-invocation.html: "data may be lost if the system crashes or sort encounters an I/O or other serious error while a file is being sorted in place" and "sort with --merge (-m) can open the output file before reading all input"
While the manual (but not the manpage) mentions the data loss, I think it would be great if sort did not have this problem at all, and I think the OpenGroup text also says it should not have this problem. I looked around, and a lot of software does get this right (by opening a randomly-named temp file to write to, and only moving it into place when it is written successfully) - GNU sed -i, OpenBSD sort, and surely there are more. As a bonus, doing this would also make the `-o someinputfile -m` case safe.
Reproduction of the data loss is easy:
$ seq 10000 > 10000 ; prlimit --fsize=10 sort -R -o 10000 10000 ; wc -l 10000
File size limit exceeded (core dumped)
2 10000
(coreutils shuf has the same problem even though not all code appears to be shared - for example, sort opens the file for writing even before it opens it for reading, while shuf reverses the order of those two operations. That difference does not change the outcome, though.)
--
Peter van Dijk
peter <at> 7bits.nl
Information forwarded to bug-coreutils <at> gnu.org: bug#47883; Package coreutils. (Thu, 22 Apr 2021 02:12:01 GMT)
Message #8 received at 47883 <at> debbugs.gnu.org (full text, mbox):
On 4/18/21 10:46 AM, Peter van Dijk wrote:
> While the manual (but not the manpage) mentions the data loss, I think it would be great if sort did not have this problem at all, and I think the OpenGroup text also says it should not have this problem.
I don't know of any 'sort' implementation that does not have the problem
at all. For example, FreeBSD 'sort -o file file' can lose 'file' in some
(rare) cases. The only portable way to avoid this problem in a shell
script is to output to some other file first and make sure that worked,
before attempting to replace the input file.
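A minimal sketch of that workaround follows. The helper name sort_in_place and the temporary file name pattern are made up for illustration; they are not part of coreutils. It also assumes mktemp is available, as it is on GNU and BSD systems.

```sh
#!/bin/sh
# Hypothetical helper (not part of coreutils): sort FILE "in place" by
# writing to a temporary file first and replacing FILE only on success.
sort_in_place() {
    file=$1
    # Create the temporary file next to FILE so that the final mv is a
    # rename within one file system rather than a cross-device copy.
    tmp=$(mktemp "$(dirname -- "$file")/.sort_in_place.XXXXXX") || return 1
    if sort -- "$file" > "$tmp"; then
        # sort succeeded: replace the original. FILE now names a new
        # file, so any other hard links keep the old, unsorted contents.
        mv -f -- "$tmp" "$file"
    else
        # sort failed: the original is untouched; discard partial output.
        rm -f -- "$tmp"
        return 1
    fi
}
```

Called as `sort_in_place somefile`, this leaves somefile unchanged if sort fails partway. The trade-offs are that the replacement file gets the permissions mktemp chose rather than those of the original, and that hard links to the original are broken.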
Also, I don't see where the Open Group spec says what you're saying. On
the contrary, the spec merely says that '-o output' should cause output
to be sent to the output file. If there are multiple hard links to the
output file, this suggests 'sort' should update the output file's
contents without breaking any hard links. Admittedly the Open Group spec
is a bit vague in this area, but I certainly don't see anything implying
that GNU 'sort' does not conform to POSIX in this area.
FreeBSD 'sort' has a problem, in that 'sort -o A B' preserves all hard
links to A's file, but 'sort -o A A' does not because it breaks the link
from A. That's confusing.
Traditional Unix 'sort -o A' behaves the way GNU 'sort' does; it
preserves all hard links to A's file. So there is a compatibility
argument for doing things the way GNU 'sort' does them, even if that
might lead to more data loss in rare cases.
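The hard-link difference can be seen with a small experiment. This is only an illustration: A, B and T are arbitrary scratch file names, and the comment on the first step describes the GNU 'sort' behavior stated above rather than something guaranteed for every implementation.

```sh
#!/bin/sh
# Illustration only: A, B and T are scratch files in a fresh directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '2\n1\n' > A
ln A B                    # B is a second hard link to A's file

# Rewriting the existing file keeps the link. With GNU sort, as described
# in this thread, B shows the sorted data afterwards.
sort -o A A
cat B

# Replacing A by renaming a new file over it breaks the link.
printf '4\n3\n' > A       # written through the shared file, so B changes too
sort A > T && mv T A      # A now names a different file than B
cat A                     # sorted contents
cat B                     # still the unsorted contents
```

After the last step A and B have different inode numbers (compare with `ls -i A B`), which is the behavior a rename-based 'sort -o' would expose to scripts that rely on hard links.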
Information forwarded to bug-coreutils <at> gnu.org: bug#47883; Package coreutils. (Sat, 24 Apr 2021 22:01:02 GMT)
Message #13 received at 47883 <at> debbugs.gnu.org (full text, mbox):
As I wrote you privately last month, the coreutils maintainers (who are
not me) are pretty busy. The proposed change in bug#47883 would be
incompatible with longstanding tradition and would almost certainly
break some existing scripts running on GNU/Linux. This is not something
to do lightly.
It might be possible to come up with a different change that would
address the issue raised without being so disruptive. Whatever change
(if any) is chosen, someone needs to think it through, code it up,
document it, and test it. Although nobody's found the time to do that,
perhaps you could volunteer or find someone who could volunteer; that
would surely accelerate the process.
You mentioned that we have multiple bug reports (now 47059, 47883,
48002) on basically the same topic, so I have taken the liberty of
merging them.
Disconnected #47883 from all other report(s). Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Mon, 21 Feb 2022 09:06:02 GMT)
This bug report was last modified 3 years and 113 days ago.