Ludovic Courtès writes: > Hello, > > Tomas Volf <~@wolfsden.cz> skribis: > >> * guix/gexp.scm (computed-file): Set LANG to C.UTF-8 by default. >> (compiled-modules): Try to `setlocale'. >> (gexp->script), (gexp->file): New `locale' argument defaulting to C.UTF-8. >> (text-file*): Set output port encoding to UTF-8. >> * doc/guix.texi (G-Expressions)[computed-file]: Document the changes. Use >> @var. Document #:guile. >> [gexp->script]: Document #:locale. Fix default value for #:target. >> [gexp->file]: Document #:locale, #:system and #:target. >> >> Change-Id: Ib323b51af88a588b780ff48ddd04db8be7c729fb > > [...] > >> (define* (computed-file name gexp >> - #:key guile (local-build? #t) (options '())) >> + #:key >> + guile >> + (local-build? #t) >> + (options '(#:env-vars (("LANG" . "C.UTF-8"))))) > > I’d suggest LC_CTYPE (or LC_ALL?) rather than LANG. Oh, yeah, you are right, after reading the specification (and verifying Guile takes the variable into account), LC_CTYPE seems like a better fit. > > Also, what about making it the default for the #:env-vars of > ‘gexp->derivation’? That way it wouldn’t need to be repeated in several > places. I *think* the original motivation was to keep the gexp->derivation as impartial as possible, since I do not know what people are using it for. But it is some time since I wrote this, so I am not fully sure. There are few downsides. Testing any changes now takes a long time since I need to do full bootstrap due to changing gexp->derivation and there are (I presume new and harmless) warnings `warning: failed to install locale: Invalid argument' during the bootstrap. But the change itself is much more localized now and the Unicode support more likely to "just work" for any new gexp forms added in the future. So maybe you are right and it is a right way. Anyway, I followed the suggestion and v2 moves the LC_CTYPE setting to gexp->derivation. The test script from the commit message still (after many hours of bootstrapping) works. > >> @@ -1700,6 +1703,9 @@ (define* (compiled-modules modules >> (system base target) >> (system base compile)) >> >> + ;; Best effort. The locale is not installed in all contexts. >> + (false-if-exception (setlocale LC_ALL "C.UTF-8")) > > Sounds good. I would make it a separate patch. Somewhat done. I have made it a separate commit, but still included in v2. > > s/in all contexts/when cross-compiling/ Interesting, I have modified the comment, however would you be willing to expand on this a bit? Why is the C.UTF-8 locale not available when cross-compiling? The Guile running this script runs on the build host, using build host's glibc and build host's locale definitions no? So I assumed the locale *should* be available. I feel like I am missing something fundamental about how Guix works here. > >> @@ -1990,7 +1996,8 @@ (define* (gexp->script name exp >> #:key (guile (default-guile)) >> (module-path %load-path) >> (system (%current-system)) >> - (target 'current)) >> + (target 'current) >> + (locale "C.UTF-8")) > > I would remove this argument and instead add an explicit, hard-coded: > > (set-port-encoding! port "UTF-8") > > in the body of ‘call-with-output-file’ here, just like you did below. Done. > >> (define* (text-file* name #:rest text) >> "Return as a monadic value a derivation that builds a text file containing >> @@ -2108,6 +2119,7 @@ (define* (text-file* name #:rest text) >> (define builder >> (gexp (call-with-output-file (ungexp output "out") >> (lambda (port) >> + (set-port-encoding! port "UTF-8") >> (display (string-append (ungexp-splicing text)) port))))) > > LGTM. This can be moved to a separate file. By "separate file" you mean separate patch and/or commit? > > How does that sound? > > Apologies for not replying earlier! No worries, thank you for finding the time to look at this. ^_^ The v2 is much smaller. Tomas -- There are only two hard things in Computer Science: cache invalidation, naming things and off-by-one errors.