GNU bug report logs - #79382
patch needed after gnulib changed

Package: diffutils

Reported by: Bruno Haible <bruno <at> clisp.org>

Date: Wed, 3 Sep 2025 22:55:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

To reply to this bug, email your comments to 79382 AT debbugs.gnu.org.
There is no need to reopen the bug first.

Report forwarded to bug-diffutils <at> gnu.org:
bug#79382; Package diffutils. (Wed, 03 Sep 2025 22:55:02 GMT)

Acknowledgement sent to Bruno Haible <bruno <at> clisp.org>:
New bug report received and forwarded. Copy sent to bug-diffutils <at> gnu.org. (Wed, 03 Sep 2025 22:55:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Bruno Haible <bruno <at> clisp.org>
To: bug-diffutils <at> gnu.org
Subject: patch needed after gnulib changed
Date: Thu, 04 Sep 2025 00:53:51 +0200
Hi,

A change made to gnulib today requires a small change in
packages that use gnulib-tool --with-tests with --makefile-name.
GNU diffutils is one such package.

Currently, './bootstrap' fails like this:
...
autoreconf: running: automake --add-missing --copy --force-missing
gnulib-tests/gnulib.mk:47: error: AM_CFLAGS must be set with '=' before using '+='
gnulib-tests/Makefile.am:1:   'gnulib-tests/gnulib.mk' included from here
autoreconf: error: automake failed with exit status: 1
./bootstrap: autoreconf failed

The attached proposed patch fixes it.

[0001-build-Update-after-gnulib-changed.patch (text/x-patch, attachment)]

Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Fri, 05 Sep 2025 22:56:01 GMT)

Notification sent to Bruno Haible <bruno <at> clisp.org>:
bug acknowledged by developer. (Fri, 05 Sep 2025 22:56:02 GMT)

Message #10 received at 79382-done <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Bruno Haible <bruno <at> clisp.org>
Cc: 79382-done <at> debbugs.gnu.org
Subject: Re: [bug-diffutils] bug#79382: patch needed after gnulib changed
Date: Fri, 5 Sep 2025 15:55:19 -0700
Thanks, I installed that and am marking this bug report as done.

I used the opportunity to sync GNU diff with current Gnulib, causing a 
few minor changes to Gnulib.

I also installed the attached patch, to work around more of the places
where Gnulib drags in some multithreading and/or locale code that GNU
diff (which is single-threaded and not that picky about locales) doesn't
need. Not sure if these suggest any Gnulib changes.
[0001-maint-reduce-Gnulib-module-usage.patch (text/x-patch, attachment)]

Information forwarded to bug-diffutils <at> gnu.org:
bug#79382; Package diffutils. (Sun, 07 Sep 2025 22:02:02 GMT)

Message #13 received at submit <at> debbugs.gnu.org:

From: Bruno Haible <bruno <at> clisp.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: bug-diffutils <at> gnu.org
Subject: Re: commit e124541148d38cd8b7f962aceb72fb44e7cc0aab
Date: Mon, 08 Sep 2025 00:00:21 +0200
Paul Eggert wrote on 2025-09-05:
> I also installed the attached patch, to work around more of the places
> where Gnulib drags in some multithreading and/or locale code that GNU
> diff (which is single-threaded and not that picky about locales) doesn't
> need. Not sure if these suggest any Gnulib changes.

The changes in configure.ac regarding mbrtowc and mbrtoc32
are specific to diffutils, I would say. I'm not even sure what you
mean by "Do not worry about multibyte C locales.": glibc and many
other platforms have a "C.UTF-8" locale, which behaves like the "C"
locale regarding i18n, but is multibyte.

The changes in configure.ac regarding 'threadlib' are a customization
possibility that is already provided by Gnulib.

The two functions hard_locale_LC_MESSAGES and hard_locale_LC_TIME
look like heuristics to me; I wouldn't bet that they are correct
in all situations.

So, I don't see worthwhile Gnulib changes in these areas.

Finally, the change in configure.ac

+AC_DEFINE([SUPPORT_NON_GREG_CALENDARS_IN_STRFTIME], [false],
+  [Do not worry about GNU strftime behavior for non-Gregorian calendars.])

breaks "make distcheck":

make[1]: Entering directory '/home/runner/work/ci-check/ci-check/diffutils'
make my-distcheck
make[2]: Entering directory '/home/runner/work/ci-check/ci-check/diffutils'
make syntax-check
make[3]: Entering directory '/home/runner/work/ci-check/ci-check/diffutils'
GFDL_version
/usr/bin/grep: .gitmodules: No such file or directory
0.01 GFDL_version
GPL_version
/usr/bin/grep: .gitmodules: No such file or directory
0.01 GPL_version
Wundef_boolean
./lib/config.h:2334:#define SUPPORT_NON_GREG_CALENDARS_IN_STRFTIME false
maint.mk: Use 0 or 1 for macro values
make[3]: *** [maint.mk:1458: sc_Wundef_boolean] Error 1
make[3]: Leaving directory '/home/runner/work/ci-check/ci-check/diffutils'
make[2]: *** [dist-check.mk:148: my-distcheck] Error 2
make[2]: Leaving directory '/home/runner/work/ci-check/ci-check/diffutils'
make[1]: *** [Makefile:2879: distcheck-hook] Error 2
make[1]: Leaving directory '/home/runner/work/ci-check/ci-check/diffutils'
make: *** [Makefile:2662: distcheck] Error 1

Either SUPPORT_NON_GREG_CALENDARS_IN_STRFTIME needs to be defined to 0,
not false, or the syntax check sc_Wundef_boolean needs to be tweaked.

Bruno

Information forwarded to bug-diffutils <at> gnu.org:
bug#79382; Package diffutils. (Mon, 08 Sep 2025 00:02:06 GMT)

Message #16 received at submit <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Bruno Haible <bruno <at> clisp.org>
Cc: bug-diffutils <at> gnu.org
Subject: Re: commit e124541148d38cd8b7f962aceb72fb44e7cc0aab
Date: Sun, 7 Sep 2025 17:00:31 -0700
On 2025-09-07 15:00, Bruno Haible wrote:
> I'm not even sure what you
> mean by "Do not worry about multibyte C locales."

I meant to not worry about platforms where the "C" (not "C.utf8") locale
is multibyte. I don't know how diffutils would misbehave in such
locales (other than not being strictly POSIX-conforming in unusual cases
where native tools aren't either), so I wanted Gnulib to not worry about
the possibility.


> The two functions hard_locale_LC_MESSAGES and hard_locale_LC_TIME
> look like heuristics to me; I wouldn't bet that they are correct
> in all situations.

For what it's worth, GNU Emacs has used a similar heuristic for a decade
(see emacs/src/emacs.c's using_utf8) without reported trouble.
Admittedly this part of Emacs is not the mainline, as Emacs normally uses
its own UTF-8 decoder, but I think it unlikely that the mentioned
functions will misbehave in practice (and if they do, surely we can fix
them without needing support for multithreading and locks).

To some extent everything in this area is a heuristic, even Gnulib's 
hard_locale which is what diffutils formerly used. If the heuristic 
works in practice, that's good enough.


> +AC_DEFINE([SUPPORT_NON_GREG_CALENDARS_IN_STRFTIME], [false],
> +  [Do not worry about GNU strftime behavior for non-Gregorian calendars.])

> Either SUPPORT_NON_GREG_CALENDARS_IN_STRFTIME needs to be defined to 0,
> not false, or the syntax check sc_Wundef_boolean needs to be tweaked.

Thanks for mentioning that. lib/strftime.c's comment suggests 'false',
which is why I defined it to 'false'.

These days it should be OK to use 'true' and 'false' due to C23 and the 
near-ubiquitous use of the 'bool' module, so I installed the attached.
[0001-maint-allow-false-true-in-C-macros.patch (text/x-patch, attachment)]
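
For illustration, here is a minimal sketch of why a macro defined to 'false' can work in a preprocessor conditional. The #define below only stands in for the line that AC_DEFINE writes into config.h, and non_greg_calendars_enabled is a hypothetical helper, not something taken from Gnulib or diffutils. With <stdbool.h> (C99 through C17) 'false' is a macro that expands to 0, and in C23 'false' is a keyword that #if accepts directly.

```c
#include <stdbool.h>

/* Stand-in for the definition that a real build would get from the
   generated config.h.  */
#define SUPPORT_NON_GREG_CALENDARS_IN_STRFTIME false

/* Hypothetical helper: report which branch the preprocessor chose.  */
int
non_greg_calendars_enabled (void)
{
#if SUPPORT_NON_GREG_CALENDARS_IN_STRFTIME
  return 1;
#else
  return 0;
#endif
}
```

Without <stdbool.h> a pre-C23 compiler would treat 'false' in #if as an undefined identifier and evaluate it as 0, which happens to give the same result here; the same fallback would silently turn 'true' into 0 as well, which is presumably the kind of mistake a "use 0 or 1" rule is meant to prevent.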

Information forwarded to bug-diffutils <at> gnu.org:
bug#79382; Package diffutils. (Mon, 08 Sep 2025 07:15:02 GMT)

Message #19 received at submit <at> debbugs.gnu.org:

From: Bruno Haible <bruno <at> clisp.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: bug-diffutils <at> gnu.org
Subject: Re: commit e124541148d38cd8b7f962aceb72fb44e7cc0aab
Date: Mon, 08 Sep 2025 09:13:38 +0200
Paul Eggert wrote:
> > I'm not even sure what you
> > mean by "Do not worry about multibyte C locales."
> 
> I meant to not worry about platforms where the "C" (not "C.utf8") locale
> is multibyte. I don't know how diffutils would misbehave in such
> locales (other than not being strictly POSIX-conforming in unusual cases
> where native tools aren't either), so I wanted Gnulib to not worry about
> the possibility.

This possibility actually occurs on Android ≥ 5.0. Comments in gnulib/tests/
say:
     On Android ≥ 5.0, the default locale is the "C.UTF-8" locale, not the
     "C" locale.  Furthermore, when you attempt to set the "C" or "POSIX"
     locale via setlocale(), what you get is a "C" locale with UTF-8 encoding,
     that is, effectively the "C.UTF-8" locale.

> > The two functions hard_locale_LC_MESSAGES and hard_locale_LC_TIME
> > look like heuristics to me; I wouldn't bet that they are correct
> > in all situations.
> 
> For what it's worth, GNU Emacs has used a similar heuristic for a decade 
> (see emacs/src/emacs.c's using_utf8) without reported trouble.

For the LC_CTYPE locale category, the code you refer to w.r.t. Emacs and that
you recently added in Gnulib (modules quotearg, propername-lite) looks safe,
because there are only finitely many locale encodings (and none will be
added in the future, hopefully). But for LC_MESSAGES and LC_TIME, there
are some assumptions:
  * hard_locale_LC_MESSAGES assumes that
      - diffutils.pot contains the strings from lib/version-etc.c
        (which are now actually in gnulib.pot),
      - the translator will not translate "(C)" by "(C)",
      - the user does not use LANGUAGE with a precedence list.
  * hard_locale_LC_TIME assumes that no locale, not even the en_US locale,
    uses the same internal format string for "%c" as the C locale.

Bruno

Information forwarded to bug-diffutils <at> gnu.org:
bug#79382; Package diffutils. (Tue, 09 Sep 2025 17:37:02 GMT)

Message #22 received at submit <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Bruno Haible <bruno <at> clisp.org>
Cc: bug-diffutils <at> gnu.org
Subject: Re: commit e124541148d38cd8b7f962aceb72fb44e7cc0aab
Date: Tue, 9 Sep 2025 10:35:49 -0700
On 2025-09-08 00:13, Bruno Haible wrote:
> Paul Eggert wrote:
>> I meant to not worry about platforms where the "C" (not "C.utf8") locale
>> is multibyte. I don't know how diffutils would misbehave in such
>> locales (other than not being strictly POSIX-conforming in unusual cases
>> where native tools aren't either), so I wanted Gnulib to not worry about
>> the possibility.
> 
> This possibility actually occurs on Android ≥ 5.0.

Yes, and if it causes a real problem in diffutils we should fix that as 
it comes up. I don't offhand know why it'd be a real problem.

>   * hard_locale_LC_TIME assumes that no locale, not even the en_US locale,
>     uses the same internal format string for "%c" as the C locale.

The assumption is a bit different: it's merely that only in the POSIX
locale does "%c" produce that output for that particular date. Even if
another locale uses the same internal format string "%a %b %e %T %Y",
in a non-English locale it's quite likely that the output won't match
POSIX's English-language abbreviations.

I don't know of any platform where hard_locale_LC_TIME incorrectly 
returns false. However, even if it does, diffutils' behavior will still 
be OK: it'll conform to POSIX and users will surely understand the 
output. And if a user complains about this extremely minor glitch I 
assume we can fix that as it comes up.
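
A minimal sketch of this kind of "%c" test follows. It is not the code installed in diffutils: the function name time_locale_looks_hard, the choice of 1970-01-01 as the probe date, and the buffer size are all assumptions made for illustration. The function formats one fixed date with "%c" and treats the LC_TIME locale as "hard" unless the result is exactly what the C/POSIX format "%a %b %e %T %Y" would give.

```c
#include <stdbool.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for the heuristic under discussion.  It never
   calls setlocale; it only inspects the LC_TIME locale already in
   effect.  */
bool
time_locale_looks_hard (void)
{
  /* Thursday 1970-01-01 00:00:00; all other fields stay zero.  */
  struct tm tm = { 0 };
  tm.tm_year = 1970 - 1900;
  tm.tm_mon = 0;
  tm.tm_mday = 1;
  tm.tm_wday = 4;

  char buf[64];
  size_t len = strftime (buf, sizeof buf, "%c", &tm);

  /* In the C locale "%c" behaves like "%a %b %e %T %Y", so the probe
     date formats as below.  Anything else counts as a hard locale.  */
  return ! (len != 0 && strcmp (buf, "Thu Jan  1 00:00:00 1970") == 0);
}
```

A locale that shares both the format string and the English abbreviations would be misclassified as not hard by this sketch, which is the "extremely minor glitch" case described above.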

>        - the translator will not translate "(C)" by "(C)",
>        - the user does not use LANGUAGE with a precedence list.

Not quite following, but it's OK if in unusual cases the program outputs 
"(C)" when "©" would be better, so long as in ordinary cases "©" is 
output when it works, and so long as "©" is not output when it would 
display as gibberish.

>    * hard_locale_LC_MESSAGES assumes that
>        - diffutils.pot contains the strings from lib/version-etc.c
>          (which are now actually in gnulib.pot),

Yes, that's a problem, and thanks for mentioning it. It stems from quite 
a comedy of errors:

(a) diffutils' en translation is not installed.

(b) cmp looks in the wrong catalog for the "(C)" message.

(c) The gnulib.pot/gnulib.mo mechanism is not yet working widely even 
for packages other than diffutils. On current Fedora 42 if I run this 
shell command:

  LC_ALL=en_US.utf8 cat --version

although I see "Torbjörn" in UTF-8 as desired, I also see "Copyright (C) 
2025" which is wrong: it should be "Copyright © 2025". Worse, I see the 
exact same English message when I run this shell command:

  LC_ALL=fr_FR.utf8 cat --version

This is because even though 
/usr/share/locale/fr/LC_MESSAGES/coreutils.mo is installed, there is no 
file /usr/share/locale/fr/LC_MESSAGES/gnulib.mo, and Fedora does not 
supply a gnulib.mo file in any package that I can see. I reported this 
newish bug to Fedora yesterday 
<https://bugzilla.redhat.com/show_bug.cgi?id=2393892>.

(d) In response to that Fedora bug report, Lukáš Zaoral set in motion a
fix. But he asked, "Since gnulib is meant to be bundled, how do you deal
with the situation when the messages in the sources of the bundled
gnulib and gnulib-l10n differ? Do you have some upstream policy to make
sure that they don't diverge?" Do we have an answer for that? I'm not
sure myself.

(e) Even if Fedora started installing a gnulib.mo file, diffutils' "make
install" does not install one, so a standalone build of diffutils with
'./configure --prefix' would still not work.


Given all this configuration mess, for now I took the following 
conservative approach in Diffutils.

(0) Update diffutils to use need-formatstring-macros when calling
AM_GNU_GETTEXT. I discovered this issue while looking into the other
problems. Perhaps need-formatstring-macros should be the only behavior
nowadays? It hardly seems worth the hassle of worrying about older
gettext versions.

(1) Change cmp's hard_locale_LC_MESSAGES to test via setlocale, not via 
gettext. setlocale should work fine if ENABLE_NLS is nonzero in Diffutils.

(2) Remove diffutils' po/en.po file. It is an unused revenant.

(3) Stick with the longstanding approach of having the Diffutils message 
catalog translate all messages, including those taken from Gnulib. This 
has worked for decades, translators are used to it, and the Gnulib part 
of the catalog hardly ever changes.

(4) Modify Gnulib to let Diffutils override the textdomain that Gnulib 
uses. Done via Gnulib commit 
<https://cgit.git.savannah.gnu.org/cgit/gnulib.git/commit/?id=2b2bcdbc3bf3de2838a4b5051e32366e9a94f1e3>.

(5) Use this new Gnulib feature in Diffutils.

I installed the attached patches to Diffutils to do this.
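
As a rough illustration of item (1), a setlocale-based test could look like the sketch below. This is an assumption about the general shape, not the installed patch: the name messages_locale_looks_hard is hypothetical, and LC_MESSAGES comes from POSIX rather than ISO C, so the sketch presumes a POSIX-like <locale.h>.

```c
#include <locale.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical sketch: treat the LC_MESSAGES locale as "hard" unless
   its name is "C" or "POSIX".  */
bool
messages_locale_looks_hard (void)
{
  /* A null second argument queries the current setting without
     changing it.  */
  char const *name = setlocale (LC_MESSAGES, NULL);

  /* If the query fails, conservatively assume a simple locale.  */
  if (!name)
    return false;

  return ! (strcmp (name, "C") == 0 || strcmp (name, "POSIX") == 0);
}
```

Unlike a gettext-based probe, a check of this form does not depend on which message catalog happens to contain the "(C)" string, which is the dependency that problems (a) through (e) made unreliable.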

An alternative to (4) and (5) would be to let config.h specify the "_" 
macro, and have Gnulib define this macro only if it is not already 
defined. This would make for slightly smaller executables. However, it 
would be brittler and more intrusive. Or perhaps you can think of a 
better way to do what is wanted in (3).
[0001-maint-use-need-formatstring-macros.patch (text/x-patch, attachment)]
[0002-cmp-improve-LC_MESSAGES-test.patch (text/x-patch, attachment)]
[0003-maint-remove-po-en.po.patch (text/x-patch, attachment)]
[0004-build-update-gnulib-submodule-to-latest.patch (text/x-patch, attachment)]
[0005-maint-use-our-textdomain-for-Gnulib.patch (text/x-patch, attachment)]

This bug report was last modified 3 days ago.