GNU bug report logs - #73660
[PATCH] gexp: Improve support of Unicode characters.


Package: guix-patches

Reported by: Tomas Volf <~@wolfsden.cz>

Date: Sun, 6 Oct 2024 15:44:01 UTC

Severity: normal

Tags: patch

Done: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>

Bug is archived. No further changes may be made.


From: Ludovic Courtès <ludo <at> gnu.org>
To: Tomas Volf <~@wolfsden.cz>
Cc: Josselin Poiret <dev <at> jpoiret.xyz>, Maxim Cournoyer <maxim.cournoyer <at> gmail.com>, Simon Tournier <zimon.toutoune <at> gmail.com>, Mathieu Othacehe <othacehe <at> gnu.org>, Tobias Geerinckx-Rice <me <at> tobias.gr>, Florian Pelz <pelzflorian <at> pelzflorian.de>, 73660 <at> debbugs.gnu.org, Christopher Baines <guix <at> cbaines.net>
Subject: [bug#73660] [PATCH] gexp: Improve support of Unicode characters.
Date: Sun, 12 Jan 2025 16:19:35 +0100

Hello,

Tomas Volf <~@wolfsden.cz> skribis:

> * guix/gexp.scm (computed-file):  Set LANG to C.UTF-8 by default.
> (compiled-modules): Try to `setlocale'.
> (gexp->script), (gexp->file): New `locale' argument defaulting to C.UTF-8.
> (text-file*): Set output port encoding to UTF-8.
> * doc/guix.texi (G-Expressions)[computed-file]: Document the changes.  Use
> @var.  Document #:guile.
> [gexp->script]: Document #:locale.  Fix default value for #:target.
> [gexp->file]: Document #:locale, #:system and #:target.
>
> Change-Id: Ib323b51af88a588b780ff48ddd04db8be7c729fb

[...]

>  (define* (computed-file name gexp
> -                        #:key guile (local-build? #t) (options '()))
> +                        #:key
> +                        guile
> +                        (local-build? #t)
> +                        (options '(#:env-vars (("LANG" . "C.UTF-8")))))

I’d suggest LC_CTYPE (or LC_ALL?) rather than LANG.

Also, what about making it the default for the #:env-vars of
‘gexp->derivation’?  That way it wouldn’t need to be repeated in several
places.

> @@ -1700,6 +1703,9 @@ (define* (compiled-modules modules
>                         (system base target)
>                         (system base compile))
>  
> +          ;; Best effort.  The locale is not installed in all contexts.
> +          (false-if-exception (setlocale LC_ALL "C.UTF-8"))

Sounds good.  I would make it a separate patch.

s/in all contexts/when cross-compiling/

> @@ -1990,7 +1996,8 @@ (define* (gexp->script name exp
>                         #:key (guile (default-guile))
>                         (module-path %load-path)
>                         (system (%current-system))
> -                       (target 'current))
> +                       (target 'current)
> +                       (locale "C.UTF-8"))

I would remove this argument and instead add an explicit, hard-coded:

  (set-port-encoding! port "UTF-8")

in the body of ‘call-with-output-file’ here, just like you did below.
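
To illustrate the idea outside of Guix, here is a minimal plain-Guile sketch, not part of the patch.  The helper name ‘write-utf8-file’ and the temporary file path are assumptions made up for this example; only ‘call-with-output-file’, ‘set-port-encoding!’ and the bytevector procedures are standard Guile.  It sets the port encoding explicitly, so the bytes written should not depend on the locale of the process:

```scheme
(use-modules (ice-9 binary-ports)
             (rnrs bytevectors))

;; Hypothetical helper: write TEXT to FILE, forcing UTF-8 on the port
;; rather than relying on the current locale.
(define (write-utf8-file file text)
  (call-with-output-file file
    (lambda (port)
      (set-port-encoding! port "UTF-8")
      (display text port))))

;; Assumed scratch location for the demonstration.
(define demo-file
  (string-append (or (getenv "TMPDIR") "/tmp") "/gexp-utf8-demo.txt"))

;; "\u03bb" is the Greek small letter lambda, written as an escape so the
;; example does not depend on the encoding of this source file.
(write-utf8-file demo-file "\u03bb")

;; Read the file back as raw bytes to inspect what was written.
(define demo-bytes
  (bytevector->u8-list
   (call-with-input-file demo-file get-bytevector-all #:binary #t)))

(delete-file demo-file)
```

Here ‘demo-bytes’ holds the two-byte UTF-8 encoding of the character, (206 187), whatever LANG or LC_ALL happen to be.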

>  (define* (text-file* name #:rest text)
>    "Return as a monadic value a derivation that builds a text file containing
> @@ -2108,6 +2119,7 @@ (define* (text-file* name #:rest text)
>    (define builder
>      (gexp (call-with-output-file (ungexp output "out")
>              (lambda (port)
> +              (set-port-encoding! port "UTF-8")
>                (display (string-append (ungexp-splicing text)) port)))))

LGTM.  This can be moved to a separate patch.

How does that sound?

Apologies for not replying earlier!

Ludo’.



