GNU bug report logs - #73068
printf: please implement POSIX:2024 argument reordering


Package: coreutils;

Reported by: Bruno Haible <bruno <at> clisp.org>

Date: Fri, 6 Sep 2024 14:07:01 UTC

Severity: normal

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 73068 in the body.
You can then email your comments to 73068 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-coreutils <at> gnu.org:
bug#73068; Package coreutils. (Fri, 06 Sep 2024 14:07:01 GMT)

Acknowledgement sent to Bruno Haible <bruno <at> clisp.org>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Fri, 06 Sep 2024 14:07:01 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Bruno Haible <bruno <at> clisp.org>
To: bug-coreutils <at> gnu.org
Subject: printf: please implement POSIX:2024 argument reordering
Date: Fri, 06 Sep 2024 16:06:34 +0200
Hi,

POSIX:2024 specifies that printf(1) should support numbered conversion
specifications:
https://pubs.opengroup.org/onlinepubs/9799919799/utilities/printf.html
https://austingroupbugs.net/view.php?id=1592

Could this support please be added to GNU coreutils? As of coreutils 9.5,
I still get:

  $ /usr/bin/printf 'abc%2$sdef%1$sxxx\n' 1 2
  abc/usr/bin/printf: %2$: invalid conversion specification
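
For comparison, the C library's printf(3) family already accepts this numbered form on POSIX systems. The following is a minimal sketch, assuming a libc that implements the `%n$` syntax (glibc does); the helper name format_reordered is made up for illustration and is not part of coreutils:

```c
#include <stdio.h>

/* Hypothetical helper: formats A1 and A2 the way the requested
   printf 'abc%2$sdef%1$sxxx\n' A1 A2 would, using the libc's
   support for numbered conversion specifications.  */
int
format_reordered (char *buf, size_t size, char const *a1, char const *a2)
{
  /* %2$s selects the second argument, %1$s the first.  */
  return snprintf (buf, size, "abc%2$sdef%1$sxxx\n", a1, a2);
}
```

With a1 = "1" and a2 = "2", the buffer holds "abc2def1xxx" followed by a newline, which is what the shell command above is expected to print once the utility supports the syntax.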

Rationale: It was pointed out in https://austingroupbugs.net/view.php?id=1592
that these four statements all do the same thing:

1) pid=$$; eval_gettext "Running as process number \$pid."; echo
2) printf_gettext "Running as process number %d." $$; echo
3) printf "`gettext 'Running as process number %d.'`" $$; echo
4) printf "$(gettext 'Running as process number %d.')" $$; echo

The first one has the drawback that it requires the programmer to
add backslashes to their format strings.

The second one has the drawback that it requires a 'printf_gettext'
program (that does not yet exist).

The third and fourth ones (suggested by Jörg Schilling, IIRC) feel more
natural to a shell script programmer. However, they require that
printf support numbered arguments, so that a translated format string
can refer to the arguments in a different order. Initially, we would use
the shorthand
  $printf
where (on most GNU systems) printf='/usr/bin/printf', until bash, dash,
etc. support it as well.

The long-term goal is to be able to change the GNU gettext documentation
https://www.gnu.org/software/gettext/manual/html_node/sh.html
to list:
  "
   Formatting with positions
     printf
  "

Bruno







Information forwarded to bug-coreutils <at> gnu.org:
bug#73068; Package coreutils. (Mon, 09 Sep 2024 18:32:02 GMT)

Message #8 received at 73068 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Bruno Haible <bruno <at> clisp.org>, 73068 <at> debbugs.gnu.org
Subject: Re: bug#73068: printf: please implement POSIX:2024 argument reordering
Date: Mon, 9 Sep 2024 19:30:38 +0100
On 06/09/2024 15:06, Bruno Haible wrote:
> Hi,
> 
> POSIX:2024 specifies that printf(1) should support numbered conversion
> specifications:
> https://pubs.opengroup.org/onlinepubs/9799919799/utilities/printf.html
> https://austingroupbugs.net/view.php?id=1592
> 
> Could this support please be added to GNU coreutils? As of coreutils 9.5,
> I still get:
> 
>    $ /usr/bin/printf 'abc%2$sdef%1$sxxx\n' 1 2
>    abc/usr/bin/printf: %2$: invalid conversion specification


This makes sense to implement.
I see that ksh and FreeBSD, at least, already have it.
I'll have a look at doing this.

thank you,
Pádraig




Reply sent to Pádraig Brady <P <at> draigBrady.com>:
You have taken responsibility. (Thu, 12 Sep 2024 16:18:01 GMT)

Notification sent to Bruno Haible <bruno <at> clisp.org>:
bug acknowledged by developer. (Thu, 12 Sep 2024 16:18:02 GMT)

Message #13 received at 73068-done <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Bruno Haible <bruno <at> clisp.org>, 73068-done <at> debbugs.gnu.org
Subject: Re: bug#73068: printf: please implement POSIX:2024 argument reordering
Date: Thu, 12 Sep 2024 17:15:56 +0100
On 09/09/2024 19:30, Pádraig Brady wrote:
> On 06/09/2024 15:06, Bruno Haible wrote:
>> Hi,
>>
>> POSIX:2024 specifies that printf(1) should support numbered conversion
>> specifications:
>> https://pubs.opengroup.org/onlinepubs/9799919799/utilities/printf.html
>> https://austingroupbugs.net/view.php?id=1592
>>
>> Could this support please be added to GNU coreutils? As of coreutils 9.5,
>> I still get:
>>
>>     $ /usr/bin/printf 'abc%2$sdef%1$sxxx\n' 1 2
>>     abc/usr/bin/printf: %2$: invalid conversion specification
> 
> 
> This make sense to implement.
> I see ksh and FreeBSD at least, already have.
> I'll have a look at doing this.

I'll apply the attached sometime tomorrow.

Marking this as done.

cheers,
Pádraig
[printf-indexed.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#73068; Package coreutils. (Thu, 12 Sep 2024 17:08:02 GMT)

Message #16 received at 73068-done <at> debbugs.gnu.org:

From: Bruno Haible <bruno <at> clisp.org>
To: Pádraig Brady <P <at> draigbrady.com>
Cc: 73068-done <at> debbugs.gnu.org
Subject: Re: bug#73068: printf: please implement POSIX:2024 argument reordering
Date: Thu, 12 Sep 2024 19:06:52 +0200
Pádraig Brady wrote:
> I'll apply the attached sometime tomorrow.

Nice! Thank you.

There seems to be a typo in the unit test, though: It defines a shell
function 'printf_checki_err' but the function it then invokes is
'printf_check_err'.

Bruno







Information forwarded to bug-coreutils <at> gnu.org:
bug#73068; Package coreutils. (Thu, 12 Sep 2024 17:43:02 GMT)

Message #19 received at 73068 <at> debbugs.gnu.org:

From: Collin Funk <collin.funk1 <at> gmail.com>
To: 73068 <at> debbugs.gnu.org
Cc: bruno <at> clisp.org, P <at> draigBrady.com
Subject: Re: bug#73068: printf: please implement POSIX:2024 argument reordering
Date: Thu, 12 Sep 2024 10:40:48 -0700
Hi Pádraig,

Pádraig Brady <P <at> draigBrady.com> writes:

> I'll apply the attached sometime tomorrow.
>
> Marking this as done.

Patch looks good, thanks.

One small comment, though.

> +#define GET_CURR_ARG(POS)				\
> +do {							\
> +  char *arge;						\
> +  intmax_t arg = POS==3 ? 0 : strtoimax (f, &arge, 10);	\
> +  if (0 < arg && arg <= INT_MAX && *arge == '$')	\
> +    /* Process indexed %i$ format.  */			\
> +    /* Note '$' comes before any flags.  */		\

Shouldn't you check errno here, like:

  char *arge;
  errno = 0;
  intmax_t arg = POS==3 ? 0 : strtoimax (f, &arge, 10);
  if (errno == 0 && 0 < arg && arg <= INT_MAX && *arge == '$')
  [...]

I think that would handle all bad cases.

For example, I think "%$" might return 0 but set errno to EINVAL.
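
The suggested check can be sketched as a standalone function. This is an illustration only, not the code from the patch: the name indexed_prefix_ok is hypothetical, and it assumes the string passed in starts just after the '%'. When strtoimax() finds no digits it returns 0 and leaves the end pointer at the start of the string; whether it also sets errno to EINVAL is up to the implementation, so the sketch does not rely on that.

```c
#include <errno.h>
#include <inttypes.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: return true if S begins with an index in
   1..INT_MAX immediately followed by '$'.  errno is cleared first so
   that a stale value cannot be mistaken for a conversion error.  */
bool
indexed_prefix_ok (char const *s)
{
  char *end;
  errno = 0;
  intmax_t n = strtoimax (s, &end, 10);
  /* On overflow strtoimax sets errno to ERANGE; with no digits it
     returns 0, which the 0 < n test rejects.  */
  return errno == 0 && 0 < n && n <= INT_MAX && *end == '$';
}
```

Under these assumptions "2$s" is accepted, while "$s", "0$s", "2s" and an index too large for intmax_t are all rejected.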

Collin




Information forwarded to bug-coreutils <at> gnu.org:
bug#73068; Package coreutils. (Thu, 12 Sep 2024 18:08:02 GMT)

Message #22 received at 73068-done <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Bruno Haible <bruno <at> clisp.org>
Cc: 73068-done <at> debbugs.gnu.org
Subject: Re: bug#73068: printf: please implement POSIX:2024 argument reordering
Date: Thu, 12 Sep 2024 19:06:28 +0100
On 12/09/2024 18:06, Bruno Haible wrote:
> Pádraig Brady wrote:
>> I'll apply the attached sometime tomorrow.
> 
> Nice! Thank you.
> 
> There seems to be a typo in the unit test, though: It defines a shell
> function 'printf_checki_err' but the function it then invokes is
> 'printf_check_err'.

Hah, good catch.
That hid other errors in the test.
Fixed up with the attached.

thanks for the review,
Pádraig
[printf-indexed-adj1.diff (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#73068; Package coreutils. (Thu, 12 Sep 2024 19:06:02 GMT)

Message #25 received at 73068 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Collin Funk <collin.funk1 <at> gmail.com>, 73068 <at> debbugs.gnu.org
Cc: bruno <at> clisp.org
Subject: Re: bug#73068: printf: please implement POSIX:2024 argument reordering
Date: Thu, 12 Sep 2024 20:03:51 +0100
On 12/09/2024 18:40, Collin Funk wrote:
> Hi Pádraig,
> 
> Pádraig Brady <P <at> draigBrady.com> writes:
> 
>> I'll apply the attached sometime tomorrow.
>>
>> Marking this as done.
> 
> Patch looks good, thanks.
> 
> One small comment, though.
> 
>> +#define GET_CURR_ARG(POS)				\
>> +do {							\
>> +  char *arge;						\
>> +  intmax_t arg = POS==3 ? 0 : strtoimax (f, &arge, 10);	\
>> +  if (0 < arg && arg <= INT_MAX && *arge == '$')	\
>> +    /* Process indexed %i$ format.  */			\
>> +    /* Note '$' comes before any flags.  */		\
> 
> Shouldn't you check errno here, like:
> 
>    char *arge;
>    errno = 0;
>    intmax_t arg = POS==3 ? 0 : strtoimax (f, &arge, 10);
>    if (errno == 0 && 0 < arg && arg <= INT_MAX && *arge == '$')
>    [...]
> 
> I think that would handle all bad cases.
> 
> For example, I think "%$" might return 0 but set errno to EINVAL.

A fair point, but note we only accept 1 ... INT_MAX,
so that implicitly excludes any of the possible error returns.
I should at least add a comment.

Though it got me thinking that strtoimax() may be too lenient
in what it accepts, resulting in possible confusion for users.
For example, some printf flags like ' ' or '+' would be accepted
as part of a number, when ideally they should not be.

For example, the user might do:

$ printf '[% 1$d]\n' 1234
[1234]

When they really intended:

$ printf '[%1$ d]\n' 1234
[ 1234]

This is tricky enough that we should be as restrictive as possible here,
so I may resort to strspn(f, "0123456789") to parse instead.
I'll think a bit about it.
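
To make the difference concrete, here is a small sketch comparing the two approaches. Both function names are hypothetical, and f is assumed to point just past the '%'. The strtoimax()-based check skips leading white space and accepts a sign, so it treats the flag in "% 1$d" as part of the number; the strspn()-based check only accepts a run of ASCII digits directly followed by '$'.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lenient check: strtoimax skips leading white space and
   accepts an optional '+' or '-', so " 1$" and "+1$" both pass.  */
bool
lenient_index (char const *f)
{
  char *end;
  intmax_t n = strtoimax (f, &end, 10);
  return 0 < n && *end == '$';
}

/* Hypothetical strict check: F must begin with one or more ASCII
   digits directly followed by '$'.  This is a syntax check only; it
   does not reject an index of zero.  */
bool
strict_index (char const *f)
{
  size_t digits = strspn (f, "0123456789");
  return digits > 0 && f[digits] == '$';
}
```

For " 1$d" and "+1$d" the lenient check succeeds and the strict one fails, while "1$ d" passes both, leaving the space to be handled as a flag.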

thanks!
Pádraig




Information forwarded to bug-coreutils <at> gnu.org:
bug#73068; Package coreutils. (Thu, 12 Sep 2024 19:34:02 GMT)

Message #28 received at 73068 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>,
 Collin Funk <collin.funk1 <at> gmail.com>, 73068 <at> debbugs.gnu.org
Cc: bruno <at> clisp.org
Subject: Re: bug#73068: printf: please implement POSIX:2024 argument reordering
Date: Thu, 12 Sep 2024 12:33:07 -0700
On 2024-09-12 12:03, Pádraig Brady wrote:


> This is tricky enough, that we should be as restrictive as possible here,
> so I may resort to strspn(f, "0123456789") to parse instead.
> I'll think a bit about it.

The code's also assuming INT_MAX < INTMAX_MAX, which POSIX doesn't 
require. You could put in a static_assert to that effect, I suppose, to 
document the assumption.

More importantly, though, if you're not in the C locale, all bets are off
as to what else strtoimax will parse.

When I ran into this problem with GNU tar, I ended up giving up on
strtoimax and wrote my own little integer parser. It does exactly what I
want and I don't have to fire up the strtoimax complexity+locale engine.
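
A hand-rolled parser of that kind might look like the sketch below. It is not the GNU tar code; the name parse_decimal_index and its return convention are assumptions made for illustration. It looks only at the ASCII digits '0' through '9', so it is independent of the locale, and it works in int throughout, which avoids any assumption about how INT_MAX compares with INTMAX_MAX.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: parse the ASCII decimal digits at the start of
   S and store a pointer to the first non-digit in *END.  Return the
   value, or -1 if there are no digits or the value exceeds INT_MAX.  */
int
parse_decimal_index (char const *s, char const **end)
{
  int n = 0;
  bool overflow = false;
  char const *p = s;

  for (; '0' <= *p && *p <= '9'; p++)
    {
      int digit = *p - '0';
      /* n * 10 + digit fits in int iff n <= (INT_MAX - digit) / 10.  */
      if (n > (INT_MAX - digit) / 10)
        overflow = true;
      else
        n = n * 10 + digit;
    }

  *end = p;
  return (p == s || overflow) ? -1 : n;
}
```

A caller would then accept the index only if the result is positive and *end points at '$'. Leading spaces or signs are never consumed, so " 1$" yields -1.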




Information forwarded to bug-coreutils <at> gnu.org:
bug#73068; Package coreutils. (Fri, 13 Sep 2024 12:57:01 GMT)

Message #31 received at 73068 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>, Collin Funk <collin.funk1 <at> gmail.com>,
 73068 <at> debbugs.gnu.org
Cc: bruno <at> clisp.org
Subject: Re: bug#73068: printf: please implement POSIX:2024 argument reordering
Date: Fri, 13 Sep 2024 13:55:26 +0100
On 12/09/2024 20:33, Paul Eggert wrote:
> On 2024-09-12 12:03, Pádraig Brady wrote:
> 
> 
>> This is tricky enough, that we should be as restrictive as possible here,
>> so I may resort to strspn(f, "0123456789") to parse instead.
>> I'll think a bit about it.
> 
> The code's also assuming INT_MAX < INTMAX_MAX, which POSIX doesn't
> require. You could put in a static_assert to that effect, I suppose, to
> document the assumption.

Indeed. We would have incorrectly taken the INTMAX_MAX arg in the
(albeit unlikely) case where INT_MAX >= INTMAX_MAX,
and the provided number overflowed INTMAX_MAX.
To be explicit, strtoimax() returns INTMAX_MAX rather than 0 in that case,
so we need to check for overflow (as Collin suggested).

> More important, though, if you're not in the C locale all bets are off
> as far as what strtoimax will also parse.
> 
> When I ran into this problem with GNU tar, I ended by giving up on
> strtoimax and did my own little integer parser. It does exactly what I
> want and I don't have to fire up the strtoimax complexity+locale engine.

Right, it's best to preparse for the above reason,
and to avoid any confusion re leading spaces etc. like I previously mentioned.

The attached adjustment does the preparse with strspn(),
and only does the strtoimax() for appropriate strings.

cheers,
Pádraig
[printf-indexed-adj2.diff (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#73068; Package coreutils. (Fri, 13 Sep 2024 16:34:02 GMT)

Message #34 received at 73068 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>, Collin Funk <collin.funk1 <at> gmail.com>,
 73068 <at> debbugs.gnu.org
Cc: bruno <at> clisp.org
Subject: Re: bug#73068: printf: please implement POSIX:2024 argument reordering
Date: Fri, 13 Sep 2024 17:32:41 +0100
On 13/09/2024 13:55, Pádraig Brady wrote:
> On 12/09/2024 20:33, Paul Eggert wrote:
>> On 2024-09-12 12:03, Pádraig Brady wrote:
>>
>>
>>> This is tricky enough, that we should be as restrictive as possible here,
>>> so I may resort to strspn(f, "0123456789") to parse instead.
>>> I'll think a bit about it.
>>
>> The code's also assuming INT_MAX < INTMAX_MAX, which POSIX doesn't
>> require. You could put in a static_assert to that effect, I suppose, to
>> document the assumption.
> 
> Indeed. We would have incorrectly taken the INTMAX_MAX arg in the
> (albeit unlikely) case where INT_MAX >= INTMAX_MAX,
> and the provided number overflowed INTMAX_MAX.
> To be explicit, strtol() doesn't return 0 in that case,
> so we need to check overflow (like Colin suggested).
> 
>> More important, though, if you're not in the C locale all bets are off
>> as far as what strtoimax will also parse.
>>
>> When I ran into this problem with GNU tar, I ended by giving up on
>> strtoimax and did my own little integer parser. It does exactly what I
>> want and I don't have to fire up the strtoimax complexity+locale engine.
> 
> Right, it's best to preparse for the above reason,
> and to avoid any confusion re leading spaces etc. like I previously mentioned.
> 
> The attached adjustment does the preparse with strspn(),
> and only does the strtoimax() for appropriate strings.

I adjusted and pushed a further simplification that
clamps %<huge values>$ to %INT_MAX$, which is equivalent
as argc can practically only be <= INT_MAX - 2.
That simplifies the strtoimax() error handling,
and removes any limits on valid decimal numbers.
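
The combination of preparsing and clamping can be sketched as follows. This is an illustration under stated assumptions, not the committed coreutils code: the name clamped_index is hypothetical, f is assumed to point just past the '%', and a return value of 0 is used to mean "no usable index".

```c
#include <inttypes.h>
#include <limits.h>
#include <string.h>

/* Hypothetical helper: if F begins with ASCII digits directly followed
   by '$', return the index clamped to INT_MAX.  Return 0 if there are
   no digits, no '$', or the index itself is zero.  */
int
clamped_index (char const *f)
{
  size_t digits = strspn (f, "0123456789");
  if (digits == 0 || f[digits] != '$')
    return 0;

  /* Only digits precede the '$', so strtoimax sees no white space or
     sign here.  On overflow it returns INTMAX_MAX, which the clamp
     below maps to INT_MAX, so errno need not be inspected.  */
  intmax_t n = strtoimax (f, NULL, 10);
  return n < INT_MAX ? (int) n : INT_MAX;
}
```

Because an index beyond the number of arguments is treated the same way whether it is INT_MAX or larger, clamping loses no information, and any run of decimal digits becomes acceptable input.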

cheers,
Pádraig




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 12 Oct 2024 11:24:05 GMT)

This bug report was last modified 251 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.