GNU bug report logs - #78910
tail does not support -r added by POSIX.1-2024


Package: coreutils

Reported by: Collin Funk <collin.funk1 <at> gmail.com>

Date: Fri, 27 Jun 2025 05:37:03 UTC

Severity: normal


From: Pádraig Brady <P <at> draigBrady.com>
To: Bruno Haible <bruno <at> clisp.org>, 78910 <at> debbugs.gnu.org
Subject: bug#78910: tail does not support -r added by POSIX.1-2024
Date: Mon, 30 Jun 2025 12:48:06 +0100
On 30/06/2025 00:42, Bruno Haible via GNU coreutils Bug Reports wrote:
> Jim Meyering wrote:
>> That is an option no GNU system needs, since they've all had tac since
>> before 1992-era textutils.
> 
> But 'tac' does not have a line-number-limit argument.
> 
> The POSIX rationale [1] has
> 
>    "While both
>       tail -n$n | tac
>     and
>       tac | head -n$n
>     can be used to output a fixed length of reversed line output, the
>     standard developers decided that it was preferable to have a single
>     utility tail -r -n$n for the same purpose."

Right, these are equivalent, so it's only worth considering
the more efficient tail -n$n | tac.

> The second of these alternatives, 'tac | head -n$n' will not work well
> with non-seekable files: it requires 'tac' to buffer the *entire* input
> (as huge as it may be), before extracting a few lines of it.
> 
> The first alternative looks better: 'tail -n$n | tac'. But thinking
> through it, it seems the logic that 'tail' uses for 'tail -n$n' is
> also nearly suitable for 'tail -r -n$n':
>    - In function file_lines(), instead of calling dump_remainder at the
>      end, the loop would call xwrite_stdout once for each line (with
>      special considerations for lines that span more than 1 buffer).
>    - In function pipe_lines(), all the relevant data is in memory at
>      the end. It's only a question of doing the xwrite_stdout calls
>      on smaller pieces and in reverse order.
> 
> When implemented this way, this will be more efficient than to spawn
> 'tac' as a separate subprocess.
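
As a rough illustration of the pipe_lines() idea quoted above, here is a
minimal Python sketch. It is not the coreutils code, and the function name
tail_reversed is hypothetical. It keeps only the last n lines of a
non-seekable stream in memory, then emits them in reverse order:

```python
from collections import deque
import io

def tail_reversed(stream, n):
    """Return the last n lines of stream, newest first (like tail -r -n N)."""
    if n <= 0:
        return []
    # A bounded deque drops the oldest line whenever a new one arrives,
    # so memory use is proportional to n lines, not to the whole input.
    last = deque(stream, maxlen=n)
    return list(reversed(last))

# Stand-in for piped input; any iterable of lines would do.
lines = io.StringIO("a\nb\nc\nd\ne\n")
print("".join(tail_reversed(lines, 3)), end="")
```

The result matches what 'tail -n3 | tac' would produce for the same input,
whereas a 'tac | head -n3' style approach would have to hold every line
before reversing.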

That's not really the Unix model though.
Having separate processes also implicitly leverages multiple processors,
so you'd have to account for that.

Having said all that, I'm not strongly against it,
especially since POSIX standardised it,
but I'm just surprised they standardised it.

Note there are cases where merging functionality can have algorithmic advantages,
in which case there is a much stronger argument for merging.
For example, we have previously mentioned that sort --tail=$n or --head=$n
would be useful (and more commonly required) functionality.
See: https://lists.gnu.org/archive/html/bug-coreutils/2004-04/msg00157.html
It would be especially useful if implemented in O(N log n) complexity
for N input lines, rather than the O(N log N) of a full sort.
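
To make the algorithmic advantage concrete, here is a small Python sketch
of what a merged "sort then keep n lines" operation might do. The sort
options --head and --tail are only proposals in this thread, and the
function names sort_head and sort_tail are hypothetical. The sketch assumes
plain byte-wise string comparison rather than locale-aware collation:

```python
import heapq

def sort_head(lines, n):
    """Return the n smallest lines in ascending order, like sort | head -n N."""
    # heapq.nsmallest tracks at most n candidates at a time, so the work is
    # roughly O(N log n) for N input lines instead of a full O(N log N) sort.
    return heapq.nsmallest(n, lines)

def sort_tail(lines, n):
    """Return the n largest lines in ascending order, like sort | tail -n N."""
    # nlargest yields descending order; reverse it to match sort's output.
    return heapq.nlargest(n, lines)[::-1]

data = ["pear", "apple", "fig", "kiwi", "date"]
print(sort_head(data, 2))
print(sort_tail(data, 2))
```

Only n lines need to be retained at any point, which is where the saving
over a separate sort followed by head or tail would come from.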

cheers,
Padraig




This bug report was last modified 38 days ago.
