Anytime you have multiple blank lines in a row,
you have consecutive line feeds.
In the data that sort typically processes, consecutive LFs might be uncommon.
When that case does not occur, the CPU cycles spent on the specialized code could be wasted.
----
I'm sure if you submitted a working patch + documentation
+ rights assigned to GNU, and first born child given to FSF,
the coreutil maintainers would consider it.
My past patches to cat were declined. :(
For modifications that only I want, software I write myself gets immediate approval. :)
By not reading the sort source code,
I avoid accidentally copying it, and I cannot inherit its boons or banes.
Because the outputs differed, the question became: "Why LF before TAB?"
The sort program's performance is fine.
I had no complaint about it.
Every program that is executed carries the same performance cost:
the overhead of fork + execve, or
the overhead of posix_spawnp and its related functions.
Or, put more succinctly,
starting a program costs CPU cycles.
The overhead might seem insignificant,
but in a script it accumulates with every program launch.
Performance is lost.
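
To make the launch cost above concrete, here is a minimal Python sketch that times repeated launches of a trivial program through posix_spawnp. The function names launch_once and seconds_per_launch are hypothetical, the choice of "true" as the trivial program is my assumption, and the numbers it prints will vary by system, so treat it as an illustration rather than a benchmark.

```python
import os
import shutil
import sys
import time


def launch_once(argv):
    """Spawn argv with posix_spawnp, wait for it, and return its exit code."""
    pid = os.posix_spawnp(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    # -1 marks a child that did not exit normally (for example, killed by a signal).
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


def seconds_per_launch(argv, count=20):
    """Return the average wall-clock seconds spent per launch of argv."""
    start = time.perf_counter()
    for _ in range(count):
        launch_once(argv)
    return (time.perf_counter() - start) / count


# Assumption: "true" is on PATH; otherwise fall back to the Python interpreter,
# which is a much heavier program and exaggerates the cost.
TRIVIAL = ["true"] if shutil.which("true") else [sys.executable, "-c", "pass"]

if __name__ == "__main__":
    per_launch = seconds_per_launch(TRIVIAL)
    print(f"{per_launch * 1e6:.0f} microseconds per launch")
```

Multiplying the printed figure by the number of program launches in a script gives a rough idea of how the overhead accumulates.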
In contrast to code provided by a program, code provided by a library can be loaded once and used many times.
Loading a library also takes less time than invoking a program.
When the sort program is invoked to sort a small number of lines, the launch overhead is a considerable part of the total duration.
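
The contrast between a library call and a program invocation can be sketched as follows, again in Python. The names sort_in_process and sort_by_program are hypothetical. The sketch assumes that a sort program is on PATH, that the lines contain no newline bytes, and that LC_ALL=C makes sort compare raw bytes, which is what the in-process version does.

```python
import os
import shutil
import subprocess


def sort_in_process(lines):
    """Sort byte strings with library code already loaded in this process."""
    return sorted(lines)


def sort_by_program(lines):
    """Sort the same byte strings by launching the external sort program."""
    data = b"".join(line + b"\n" for line in lines)
    # LC_ALL=C requests plain byte-wise ordering, matching sorted() on bytes.
    env = dict(os.environ, LC_ALL="C")
    result = subprocess.run(
        ["sort"], input=data, stdout=subprocess.PIPE, env=env, check=True
    )
    # Every output line ends with LF, so the final split element is empty.
    return result.stdout.split(b"\n")[:-1]


if __name__ == "__main__":
    sample = [b"pear", b"apple", b"fig"]
    print(sort_in_process(sample))
    if shutil.which("sort"):
        print(sort_by_program(sample))
```

Both functions return the same result for a small input, but only the second pays for a process launch, a pipe, and a wait on every call.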
From a library perspective, the following potentially complex tasks seem attractive candidates: cp, mv, sort, tsort, wc.
cp and mv implementations can be surprisingly complex.
I have already written my own implementations.
For the cp and mv provided by coreutils, options that purge the used file data from the kernel cache could be useful.
The content of files that were copied or moved is probably not immediately useful again,
yet it lingers in the kernel cache.
When the kernel cache uses almost all available RAM, file copy performance tanks.
With a large kernel cache stocked with irrelevant data, search
performance also tanks.
It seems plausible that the kernel in use is less than ideally configured.
For this task perhaps .../vfs_cache_pressure and
.../drop_caches might not suffice?
The posix_fadvise function seems useful.
But posix_fadvise must be invoked on file descriptors.
For directory cache data, however, posix_fadvise does not seem useful.
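
As a rough illustration of the descriptor-based approach, here is a Python sketch of a copy that afterwards hints to the kernel that the cached file data is no longer needed. The function name copy_and_drop_cache and the chunk size are hypothetical, and this is not how coreutils cp works. The hint is advisory, so the kernel may ignore it. The fsync call is there because pages that have not yet been written out are unaffected by POSIX_FADV_DONTNEED.

```python
import os

CHUNK = 1 << 20  # 1 MiB per read; an arbitrary choice for this sketch


def copy_and_drop_cache(src_path, dst_path):
    """Copy src_path to dst_path, then advise that cached pages are not needed."""
    src = os.open(src_path, os.O_RDONLY)
    try:
        dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            while True:
                chunk = os.read(src, CHUNK)
                if not chunk:
                    break
                view = memoryview(chunk)
                while view:  # os.write may write fewer bytes than requested
                    written = os.write(dst, view)
                    view = view[written:]
            # Flush dirty pages first; unwritten pages are not dropped by the hint.
            os.fsync(dst)
            if hasattr(os, "posix_fadvise"):
                try:
                    # offset 0 and length 0 cover the whole file.
                    os.posix_fadvise(src, 0, 0, os.POSIX_FADV_DONTNEED)
                    os.posix_fadvise(dst, 0, 0, os.POSIX_FADV_DONTNEED)
                except OSError:
                    pass  # the hint is only advisory, so a failure is not fatal
        finally:
            os.close(dst)
    finally:
        os.close(src)
```

The sketch only works because the copy itself holds both descriptors. It does nothing for directory cache data.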
Thanks again for maintaining and sharing coreutils.