GNU bug report logs - #79382
patch needed after gnulib changed

Package: diffutils

Reported by: Bruno Haible <bruno <at> clisp.org>

Date: Wed, 3 Sep 2025 22:55:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Message #19 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: bug-diffutils <at> gnu.org
Subject: Re: commit e124541148d38cd8b7f962aceb72fb44e7cc0aab
Date: Mon, 08 Sep 2025 09:13:38 +0200

Paul Eggert wrote:
> > I'm not even sure what you
> > mean by "Do not worry about multibyte C locales."
> 
> I meant to not worry about platforms where the "C" (not "C.utf8") locale 
> is multibyte. I don't know of how diffutils would misbehave in such 
> locales (other than not be strictly POSIX-conforming in unusual cases 
> where native tools aren't either), so I wanted Gnulib to not worry about 
> the possibility.

This possibility actually occurs on Android ≥ 5.0. Comments in gnulib/tests/
say:
     On Android ≥ 5.0, the default locale is the "C.UTF-8" locale, not the
     "C" locale.  Furthermore, when you attempt to set the "C" or "POSIX"
     locale via setlocale(), what you get is a "C" locale with UTF-8 encoding,
     that is, effectively the "C.UTF-8" locale.
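
To see which behavior a given platform has, one can ask the C library at run
time. The following is a minimal sketch, assuming a hosted C99 environment; the
function name c_locale_is_multibyte is hypothetical and is not part of Gnulib
or diffutils. It switches LC_CTYPE to "C", looks at MB_CUR_MAX, and restores
the previous setting.

```c
#include <locale.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from Gnulib or diffutils): return true if
   the "C" locale on this platform uses a multibyte encoding.  */
bool
c_locale_is_multibyte (void)
{
  /* Copy the current LC_CTYPE name, because the string returned by
     setlocale may be overwritten by the next setlocale call.  */
  char *saved = NULL;
  const char *current = setlocale (LC_CTYPE, NULL);
  if (current != NULL)
    {
      saved = malloc (strlen (current) + 1);
      if (saved != NULL)
        strcpy (saved, current);
    }

  /* MB_CUR_MAX is the maximum number of bytes per character in the
     LC_CTYPE locale that is in effect; 1 means single-byte.  */
  bool multibyte = false;
  if (setlocale (LC_CTYPE, "C") != NULL)
    multibyte = MB_CUR_MAX > 1;

  /* Restore the caller's LC_CTYPE setting.  */
  if (saved != NULL)
    {
      setlocale (LC_CTYPE, saved);
      free (saved);
    }
  return multibyte;
}
```

Going by the comment quoted above, a true result would be expected on
Android ≥ 5.0, since the "C" locale there has UTF-8 encoding.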

> > The two functions hard_locale_LC_MESSAGES and hard_locale_LC_TIME
> > look like heuristics to me; I wouldn't bet that they are correct
> > in all situations.
> 
> For what it's worth, GNU Emacs has used a similar heuristic for a decade 
> (see emacs/src/emacs.c's using_utf8) without reported trouble.

For the LC_CTYPE locale category, the code you refer to w.r.t. Emacs and that
you recently added in Gnulib (modules quotearg, propername-lite) looks safe,
because there are only finitely many locale encodings (and none will be
added in the future, hopefully). But for LC_MESSAGES and LC_TIME, there
are some assumptions:
  * hard_locale_LC_MESSAGES assumes that
      - diffutils.pot contains the strings from lib/version-etc.c
        (which are now actually in gnulib.pot),
      - the translator will not translate "(C)" by "(C)",
      - the user does not use LANGUAGE with a precedence list.
  * hard_locale_LC_TIME assumes that no locale, not even the en_US locale,
    uses the same internal format string for "%c" as the C locale.
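
To make the LC_TIME assumption concrete, here is a rough sketch of a heuristic
of this kind. It is not the diffutils or Gnulib code: the function name
time_format_differs_from_c is hypothetical, and comparing
nl_langinfo (D_T_FMT) values is only my assumption about how the "%c" format
string could be inspected on a POSIX system.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical heuristic (not the diffutils implementation): return
   true if the "%c" format string of LOCALE_NAME differs from that of
   the "C" locale.  Side effect: LC_TIME is left set to LOCALE_NAME,
   or to "C" if LOCALE_NAME is not available.  */
bool
time_format_differs_from_c (const char *locale_name)
{
  char c_format[256];

  /* Record the "%c" format string of the "C" locale.  */
  if (setlocale (LC_TIME, "C") == NULL)
    return false;
  const char *fmt = nl_langinfo (D_T_FMT);
  if (strlen (fmt) >= sizeof c_format)
    return false;
  strcpy (c_format, fmt);

  /* Switch to the locale under test; if it is unavailable, there is
     nothing to compare.  */
  if (setlocale (LC_TIME, locale_name) == NULL)
    return false;

  return strcmp (nl_langinfo (D_T_FMT), c_format) != 0;
}
```

If this function returns false for an installed locale other than "C" or
"POSIX", a heuristic built on the format string would take that locale for
the C locale, which is the failure mode the second assumption rules out.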

Bruno