GNU bug report logs - #35785
'string->uri' fails in sv_SE locale


Package: guile;

Reported by: Einar Largenius <einar.largenius <at> gmail.com>

Date: Fri, 17 May 2019 21:21:01 UTC

Severity: important

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.


Message #24 received at 35785 <at> debbugs.gnu.org:

From: Ricardo Wurmus <rekado <at> elephly.net>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 35785 <at> debbugs.gnu.org, Einar Largenius <einar.largenius <at> gmail.com>
Subject: Re: bug#35785: ‘string->uri’ is locale-dependent and breaks in ‘sv_SE’
Date: Mon, 27 May 2019 13:05:29 +0200

Ludovic Courtès <ludo <at> gnu.org> writes:

> Using the “lower” regexp class instead of “[a-z]” works:
>
> --8<---------------cut here---------------start------------->8---
> scheme@(guile-user)> (string-match "[[:lower:]]" "w")
> $12 = #("w" (0 . 1))
> --8<---------------cut here---------------end--------------->8---
>
> However, it’s not clear to me whether the “lower” class is supposed to
> be the same for all locales or if we’re just lucky:
>
>   http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html
>
> Thoughts?

The “lower” class is much larger than [a-z].  If we only wanted to work
around this particular problem we could spell out every letter of the
range explicitly; such a list would be the same in all locales.
(Obviously, that wouldn’t be pretty.)
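
A minimal sketch of the spelled-out variant, for illustration only.  The
name `ascii-lower-rx' is hypothetical and this is not the code used in
Guile's URI parser; it only shows that a bracket expression listing each
letter avoids the range syntax whose meaning depends on collation order:

```scheme
;; Each ASCII lowercase letter is listed individually, so the bracket
;; expression contains no range that a locale could reinterpret.
;; `ascii-lower-rx' is a hypothetical name chosen for this sketch.
(define ascii-lower-rx
  (make-regexp "[abcdefghijklmnopqrstuvwxyz]"))

;; `regexp-exec' returns a match object on success and #f otherwise.
(display (if (regexp-exec ascii-lower-rx "w") "match" "no match"))
(newline)
(display (if (regexp-exec ascii-lower-rx "5") "match" "no match"))
(newline)
```

The first call should report a match for "w" and the second no match for
"5", whatever locale is installed.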

But can’t URI parts contain more than those characters?  To sidestep
the question of whether the “lower” class is locale-dependent, we could
generate an explicit list of characters from a charset.
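
Roughly, and only as a sketch: the helper `char-set->bracket-expression'
and the variable `ascii-lower' are hypothetical names, and the helper
assumes the charset holds no characters that are special inside a
bracket expression (`]', `^', `-', `[').  The SRFI-14 procedures
themselves are the standard ones:

```scheme
(use-modules (srfi srfi-14))

;; Hypothetical helper: build a bracket expression that lists every
;; member of CS explicitly, in code point order.
;; Assumption: CS contains none of ] ^ - [ (no escaping is done here).
(define (char-set->bracket-expression cs)
  (string-append "["
                 (list->string (sort (char-set->list cs) char<?))
                 "]"))

;; The lowercase letters restricted to ASCII, independent of the locale.
(define ascii-lower
  (char-set-intersection char-set:lower-case char-set:ascii))

(define ascii-lower-pattern
  (char-set->bracket-expression ascii-lower))

(display ascii-lower-pattern)
(newline)

;; The generated string can be handed to `make-regexp' like any other.
(define generated-rx (make-regexp ascii-lower-pattern))
(display (if (regexp-exec generated-rx "w") "match" "no match"))
(newline)
```

A larger charset (say, everything permitted in a given URI component)
could be fed through the same helper, provided the special characters
are handled first.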

--
Ricardo





This bug report was last modified 6 years and 72 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.