From debbugs-submit-bounces@debbugs.gnu.org Sun May 26 16:44:23 2019 Received: (at submit) by debbugs.gnu.org; 26 May 2019 20:44:23 +0000 Received: from localhost ([127.0.0.1]:51886 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hV00F-0004mV-3y for submit@debbugs.gnu.org; Sun, 26 May 2019 16:44:23 -0400 Received: from eggs.gnu.org ([209.51.188.92]:37212) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hV00E-0004mJ-0v for submit@debbugs.gnu.org; Sun, 26 May 2019 16:44:22 -0400 Received: from lists.gnu.org ([209.51.188.17]:52656) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hV008-0001n3-09 for submit@debbugs.gnu.org; Sun, 26 May 2019 16:44:16 -0400 Received: from eggs.gnu.org ([209.51.188.92]:51575) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hV004-0003mr-RH for bug-guile@gnu.org; Sun, 26 May 2019 16:44:15 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hUzzx-0001aX-KY for bug-guile@gnu.org; Sun, 26 May 2019 16:44:11 -0400 Received: from world.peace.net ([64.112.178.59]:39876) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hUzzw-0001Qk-TE for bug-guile@gnu.org; Sun, 26 May 2019 16:44:04 -0400 Received: from mhw by world.peace.net with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.89) (envelope-from ) id 1hUzzj-0000L8-QU; Sun, 26 May 2019 16:43:51 -0400 From: Mark H Weaver To: Christopher Lam Subject: strftime incorrectly assumes that nstrftime will produce UTF-8 References: <877ebt7tc0.fsf@netris.org> <87tvew4efa.fsf@netris.org> <875zrb3ydk.fsf@netris.org> <871s1z3tbq.fsf@netris.org> Date: Sun, 26 May 2019 16:41:57 -0400 In-Reply-To: (Christopher Lam's message of "Sun, 26 May 2019 18:52:16 +0800") Message-ID: <87v9xxq767.fsf_-_@netris.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.2 (gnu/linux) MIME-Version: 1.0 Content-Type: 
text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 64.112.178.59 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Spam-Score: -1.3 (-) X-Debbugs-Envelope-To: submit Cc: bug-guile@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.3 (--)

Hi Christopher,

Christopher Lam <christopher.lck@gmail.com> writes:

> Addendum - wish to confirm if guile bug (guile-2.2 on Windows):
> - set locale to non-Anglo so that (setlocale LC_ALL) returns
> "French_France.1252"
> - call (strftime "%B" 4000000) - that's 4x10^6 -- this should return
> "février 1970"
>
> but the following error arises:
> Throw to key `decoding-error' with args `("scm_from_utf8_stringn" "input
> locale conversion error" 0 #vu8(102 233 118 114 105 101 114 32 49 57 55
> 48))'.
>
> Is this a bug?

Yes. Guile's 'strftime' procedure currently assumes that the underlying 'nstrftime' C function (from Gnulib) will produce output in UTF-8, although it almost certainly produces output in the locale encoding. Indeed, the bytevector #vu8(102 233 118 114 105 101 114 32 49 57 55 48) represents the characters "février 1970" in Windows-1252 encoding.

I'm CC'ing this reply to <bug-guile@gnu.org>, so that a bug ticket will be created. In the future, that's the preferred address for sending bug reports.

Anyway, thanks for letting us know about this. I'll work on it soon.

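To make the diagnosis above concrete, here is a minimal sketch in C that checks whether a byte sequence is well-formed UTF-8. The function name 'is_valid_utf8' and the two arrays are illustrative only and are not part of Guile or Gnulib. The point is simply that byte 233 (0xE9) followed by 118 (0x76) cannot occur in UTF-8, which is consistent with the 'decoding-error' in the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if the LEN bytes at S form well-formed UTF-8.
   Hypothetical helper for illustration; not a Guile or Gnulib API.  */
bool
is_valid_utf8 (const unsigned char *s, size_t len)
{
  size_t i = 0;

  assert (s != NULL || len == 0);

  while (i < len)
    {
      unsigned char c = s[i];
      size_t extra;          /* number of continuation bytes expected */
      unsigned long cp;      /* code point being assembled */

      if (c < 0x80)
        {
          i++;               /* plain ASCII */
          continue;
        }
      else if (c >= 0xC2 && c <= 0xDF)
        { extra = 1; cp = c & 0x1F; }
      else if (c >= 0xE0 && c <= 0xEF)
        { extra = 2; cp = c & 0x0F; }
      else if (c >= 0xF0 && c <= 0xF4)
        { extra = 3; cp = c & 0x07; }
      else
        return false;        /* stray continuation or invalid lead byte */

      if (len - i <= extra)
        return false;        /* sequence truncated at end of input */

      for (size_t k = 1; k <= extra; k++)
        {
          if ((s[i + k] & 0xC0) != 0x80)
            return false;    /* not a continuation byte */
          cp = (cp << 6) | (s[i + k] & 0x3F);
        }

      /* Reject overlong forms, surrogates, and values past U+10FFFF.  */
      if ((extra == 2 && cp < 0x800)
          || (extra == 3 && cp < 0x10000)
          || (cp >= 0xD800 && cp <= 0xDFFF)
          || cp > 0x10FFFF)
        return false;

      i += extra + 1;
    }
  return true;
}

/* The bytes from the error message: "février 1970" in Windows-1252.  */
static const unsigned char reported[] =
  { 102, 233, 118, 114, 105, 101, 114, 32, 49, 57, 55, 48 };

/* The same text encoded as UTF-8 (é becomes 0xC3 0xA9).  */
static const unsigned char as_utf8[] =
  { 102, 0xC3, 0xA9, 118, 114, 105, 101, 114, 32, 49, 57, 55, 48 };
```

Calling 'is_valid_utf8 (reported, sizeof reported)' yields false, while the same call on 'as_utf8' yields true.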
Mark From debbugs-submit-bounces@debbugs.gnu.org Sun May 26 16:55:10 2019 Received: (at 35920) by debbugs.gnu.org; 26 May 2019 20:55:10 +0000 Received: from localhost ([127.0.0.1]:51898 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hV0Ag-00056m-76 for submit@debbugs.gnu.org; Sun, 26 May 2019 16:55:10 -0400 Received: from world.peace.net ([64.112.178.59]:58088) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hV0Ad-00056M-1g for 35920@debbugs.gnu.org; Sun, 26 May 2019 16:55:07 -0400 Received: from mhw by world.peace.net with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.89) (envelope-from ) id 1hV0AX-0000SM-9g; Sun, 26 May 2019 16:55:01 -0400 From: Mark H Weaver To: Christopher Lam Subject: Re: bug#35920: strftime incorrectly assumes that nstrftime will produce UTF-8 References: <877ebt7tc0.fsf@netris.org> <87tvew4efa.fsf@netris.org> <875zrb3ydk.fsf@netris.org> <871s1z3tbq.fsf@netris.org> <87v9xxq767.fsf_-_@netris.org> Date: Sun, 26 May 2019 16:53:08 -0400 In-Reply-To: <87v9xxq767.fsf_-_@netris.org> (Mark H. Weaver's message of "Sun, 26 May 2019 16:41:57 -0400") Message-ID: <87muj9q6nk.fsf@netris.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 35920 Cc: 35920@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) There might also be related problems with 'strptime'. These problems date back to when Guile was first extended to support non-ASCII strings. 
Here's the relevant commit in 2009 that added non-ASCII support to 'strftime' and 'strptime', but did so imperfectly: 587a33556fdef90025c1b7d4d172af649c8ebba8 Mark From debbugs-submit-bounces@debbugs.gnu.org Sun May 26 17:50:28 2019 Received: (at 35920) by debbugs.gnu.org; 26 May 2019 21:50:28 +0000 Received: from localhost ([127.0.0.1]:51999 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hV12C-0006VD-Cw for submit@debbugs.gnu.org; Sun, 26 May 2019 17:50:28 -0400 Received: from world.peace.net ([64.112.178.59]:58144) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hV12B-0006Uy-6N for 35920@debbugs.gnu.org; Sun, 26 May 2019 17:50:27 -0400 Received: from mhw by world.peace.net with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.89) (envelope-from ) id 1hV125-000105-1J; Sun, 26 May 2019 17:50:21 -0400 From: Mark H Weaver To: Christopher Lam Subject: Re: bug#35920: strftime incorrectly assumes that nstrftime will produce UTF-8 References: <877ebt7tc0.fsf@netris.org> <87tvew4efa.fsf@netris.org> <875zrb3ydk.fsf@netris.org> <871s1z3tbq.fsf@netris.org> <87v9xxq767.fsf_-_@netris.org> <87muj9q6nk.fsf@netris.org> Date: Sun, 26 May 2019 17:48:27 -0400 In-Reply-To: <87muj9q6nk.fsf@netris.org> (Mark H. Weaver's message of "Sun, 26 May 2019 16:53:08 -0400") Message-ID: <87imtwrint.fsf@netris.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 35920 Cc: 35920@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Here's a patch that might fix the problem, but I don't have time to test it right now. 
Mark --8<---------------cut here---------------start------------->8--- diff --git a/libguile/stime.c b/libguile/stime.c index b681d7ee3..9a21b61fe 100644 --- a/libguile/stime.c +++ b/libguile/stime.c @@ -662,9 +662,9 @@ SCM_DEFINE (scm_strftime, "strftime", 2, 0, 0, SCM_VALIDATE_STRING (1, format); bdtime2c (stime, &t, SCM_ARG2, FUNC_NAME); - /* Convert string to UTF-8 so that non-ASCII characters in the - format are passed through unchanged. */ - fmt = scm_to_utf8_stringn (format, &len); + /* Convert the format string to the locale encoding, as the underlying + 'strftime' C function expects. */ + fmt = scm_to_locale_stringn (format, &len); /* Ugly hack: strftime can return 0 if its buffer is too small, but some valid time strings (e.g. "%p") can sometimes produce @@ -727,7 +727,7 @@ SCM_DEFINE (scm_strftime, "strftime", 2, 0, 0, #endif } - result = scm_from_utf8_string (tbuf + 1); + result = scm_from_locale_string (tbuf + 1); free (tbuf); free (myfmt); #if HAVE_STRUCT_TM_TM_ZONE @@ -754,16 +754,16 @@ SCM_DEFINE (scm_strptime, "strptime", 2, 0, 0, { struct tm t; char *fmt, *str, *rest; - size_t used_len; + SCM used_len; long zoff; SCM_VALIDATE_STRING (1, format); SCM_VALIDATE_STRING (2, string); - /* Convert strings to UTF-8 so that non-ASCII characters are passed - through unchanged. */ - fmt = scm_to_utf8_string (format); - str = scm_to_utf8_string (string); + /* Convert strings to the locale encoding, as the underlying + 'strptime' C function expects. */ + fmt = scm_to_locale_string (format); + str = scm_to_locale_string (string); /* initialize the struct tm */ #define tm_init(field) t.field = 0 @@ -807,14 +807,14 @@ SCM_DEFINE (scm_strptime, "strptime", 2, 0, 0, zoff = 0; #endif - /* Compute the number of UTF-8 characters. */ - used_len = u8_strnlen ((scm_t_uint8*) str, rest-str); + /* Compute the number of characters parsed. 
*/ + used_len = scm_string_length (scm_from_locale_stringn (str, rest-str)); scm_remember_upto_here_2 (format, string); free (str); free (fmt); return scm_cons (filltime (&t, zoff, NULL), - scm_from_signed_integer (used_len)); + used_len); } #undef FUNC_NAME #endif /* HAVE_STRPTIME */ --8<---------------cut here---------------end--------------->8--- From debbugs-submit-bounces@debbugs.gnu.org Sun May 26 20:29:54 2019 Received: (at submit) by debbugs.gnu.org; 27 May 2019 00:29:54 +0000 Received: from localhost ([127.0.0.1]:52192 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hV3WT-0002Gs-TE for submit@debbugs.gnu.org; Sun, 26 May 2019 20:29:54 -0400 Received: from eggs.gnu.org ([209.51.188.92]:33991) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hV38O-0001am-Sx for submit@debbugs.gnu.org; Sun, 26 May 2019 20:05:01 -0400 Received: from lists.gnu.org ([209.51.188.17]:46558) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hV38J-0005pL-RA for submit@debbugs.gnu.org; Sun, 26 May 2019 20:04:55 -0400 Received: from eggs.gnu.org ([209.51.188.92]:48373) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hV38I-0003UZ-Me for bug-guile@gnu.org; Sun, 26 May 2019 20:04:55 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,FREEMAIL_FROM, HTML_MESSAGE,URIBL_BLOCKED autolearn=disabled version=3.3.2 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hV38H-0005oK-Ib for bug-guile@gnu.org; Sun, 26 May 2019 20:04:54 -0400 Received: from mail-ed1-x531.google.com ([2a00:1450:4864:20::531]:33592) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hV38H-0005nR-BI for bug-guile@gnu.org; Sun, 26 May 2019 20:04:53 -0400 Received: by mail-ed1-x531.google.com with SMTP id n17so24021019edb.0 for ; Sun, 
26 May 2019 17:04:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=7ae3WpF2b3mO0GO/rJ+oR0QNGx3MPn7M0UDfrvJEOm4=; b=R0HfeeMrhhAE25kvOtURwZht4kNcG4aEMs3SFWvemRPcQBaqkyXO2QGQjOOXzjdhb9 H06E3I0vGB0qRFUO7Mj7MgR2Vi/DU3Mxx6Fn6/vAJ/8w8AM/KnbCov9MkSLgQ+UX0e+I qpQKwW31sVZJJhIDjp22g0pW7CsHqAMfwRi+y6a2TG9d7Ge6JtwxYDHF2miC4AoytSmf fVknf1p6AXNM9ciGVJcj/6UxqxX4ORuHJF5TBr9HsOmNuwMNv39P4Fqu5PpJsl5uJdap raqky8YGprRDKOOlgaEoHmFW4+BmjCYJ0v7M0X9d7NU6e8cLi9rvjk+3crL0+hJX1R2r sH1A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=7ae3WpF2b3mO0GO/rJ+oR0QNGx3MPn7M0UDfrvJEOm4=; b=YvZJjWyklAbzGbqS9VQLIp3Ak4UsRNNwGkjA8su/XdMYb45l/gPAyUSiyRxEuogJ6/ 2J+nPSoIl7K1mCzftJy7t+DPofJhYiVyPwfvQMaFC/vDohOK3Rkk8eUDY6h+/cUmlhmH o76zldY6McBwEj6lrg1CMQ3gRXfwguklPmWuC3Q2TQzaLfGWo445oeiDKD3+TdugBe+e jAuhEQHjkH0gEdIVyV48pJFoYy/6VRUzOtPpTQIPRWZNRGp5VAJD3vh9gC6/VJ7mgSQZ rntcg82jhu2ocR4s096plh7xGMj4TUQ9hYbUiqIKxaBNdCHIQYeJdOKGPTpGoNFiYNow 8X/g== X-Gm-Message-State: APjAAAWqeJiq0bnNhSFr5FR9FHe5cL6FgWKATw8+PloMGP1UT4/svr9l ZqBtO25TQcvhWyZkFkdgih7iNb1s3pZZUnzJwCg= X-Google-Smtp-Source: APXvYqwSKPfbiZo3nCklOJHc1HthESvvRcfhgkixl4hT8C4YJ/jRQ9x4ThYImhiBS1H71oZC9hJ6MxjkbLoWM/4U+Jk= X-Received: by 2002:a17:906:31d4:: with SMTP id f20mr52550650ejf.275.1558915490802; Sun, 26 May 2019 17:04:50 -0700 (PDT) MIME-Version: 1.0 References: <877ebt7tc0.fsf@netris.org> <87tvew4efa.fsf@netris.org> <875zrb3ydk.fsf@netris.org> <871s1z3tbq.fsf@netris.org> <87v9xxq767.fsf_-_@netris.org> In-Reply-To: <87v9xxq767.fsf_-_@netris.org> From: Christopher Lam Date: Mon, 27 May 2019 10:04:39 +1000 Message-ID: Subject: Re: strftime incorrectly assumes that nstrftime will produce UTF-8 To: Mark H Weaver Content-Type: multipart/alternative; boundary="00000000000010720a0589d34c1e" 
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2a00:1450:4864:20::531 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Spam-Score: -1.3 (-) X-Debbugs-Envelope-To: submit X-Mailman-Approved-At: Sun, 26 May 2019 20:29:52 -0400 Cc: bug-guile@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.3 (--) --00000000000010720a0589d34c1e Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Thanks! I'm glad to know this. I have adequate fluency in guile now but very basic C hence some bugs are very opaque to me. On Mon., 27 May 2019, 04:43 Mark H Weaver, wrote: > Hi Christopher, > > Christopher Lam writes: > > > Addendum - wish to confirm if guile bug (guile-2.2 on Windows): > > - set locale to non-Anglo so that (setlocale LC_ALL) returns > > "French_France.1252" > > - call (strftime "%B" 4000000) - that's 4x10^6 -- this should return > > "f=C3=A9vrier 1970" > > > > but the following error arises: > > Throw to key `decoding-error' with args `("scm_from_utf8_stringn" "inpu= t > > locale conversion error" 0 #vu8(102 233 118 114 105 101 114 32 49 57 55 > > 48))'. > > > > Is this a bug? > > Yes. Guile's 'strftime' procedure currently assumes that the underlying > 'nstrftime' C function (from Gnulib) will produce output in UTF-8, > although it almost certainly produces output in the locale encoding. > Indeed, the bytevector #vu8(102 233 118 114 105 101 114 32 49 57 55 48) > represents the characters "f=C3=A9vrier 1970" in Windows-1252 encoding. > > I'm CC'ing this reply to , so that a bug ticket will > be created. In the future, that's the preferred address for sending bug > reports. > > Anyway, thanks for letting us know about this. I'll work on it soon. 
> > Mark > --00000000000010720a0589d34c1e Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
--00000000000010720a0589d34c1e-- From debbugs-submit-bounces@debbugs.gnu.org Sun Jun 30 15:51:52 2019 Received: (at 35920-done) by debbugs.gnu.org; 30 Jun 2019 19:51:52 +0000 Received: from localhost ([127.0.0.1]:47323 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hhfrc-0004aE-Fn for submit@debbugs.gnu.org; Sun, 30 Jun 2019 15:51:52 -0400 Received: from eggs.gnu.org ([209.51.188.92]:48698) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hhfra-0004ZX-2F for 35920-done@debbugs.gnu.org; Sun, 30 Jun 2019 15:51:50 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:37841) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hhfrU-0005Sj-J4; Sun, 30 Jun 2019 15:51:44 -0400 Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=40010 helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1hhfrU-000168-4K; Sun, 30 Jun 2019 15:51:44 -0400 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= To: Mark H Weaver Subject: Re: bug#35920: strftime incorrectly assumes that nstrftime will produce UTF-8 References: <877ebt7tc0.fsf@netris.org> <87tvew4efa.fsf@netris.org> <875zrb3ydk.fsf@netris.org> <871s1z3tbq.fsf@netris.org> <87v9xxq767.fsf_-_@netris.org> <87muj9q6nk.fsf@netris.org> <87imtwrint.fsf@netris.org> Date: Sun, 30 Jun 2019 21:51:42 +0200 In-Reply-To: <87imtwrint.fsf@netris.org> (Mark H. 
Weaver's message of "Sun, 26 May 2019 17:48:27 -0400") Message-ID: <874l46svg1.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 35920-done Cc: 35920-done@debbugs.gnu.org, Christopher Lam X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---)

Hi Mark,

Mark H Weaver <mhw@netris.org> skribis:

> Here's a patch that might fix the problem, but I don't have time to test
> it right now.

It works! :-) I wrote tests and pushed it as ab2fd70ef1e36c6532128b73082809ef3c056556.

I forgot to change the commit author to you before pushing, apologies!

Thanks,
Ludo’.

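For readers less familiar with C, the idea behind the pushed patch (decode what 'strftime' produces using the locale encoding instead of assuming UTF-8) can be sketched outside Guile as follows. This is only an illustration under stated assumptions: 'format_time_wide' is a made-up name, and the standard 'mbstowcs' function stands in for Guile's 'scm_from_locale_string', which is what the patch actually calls.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>
#include <wchar.h>

/* Format TM according to FMT with the C library's strftime, then decode
   the resulting bytes from the current locale's multibyte encoding into
   wide characters.  Returns the number of wide characters stored in OUT
   (not counting the terminator), or (size_t) -1 on failure.
   Hypothetical helper for illustration; not part of Guile.  */
size_t
format_time_wide (wchar_t *out, size_t out_len,
                  const char *fmt, const struct tm *tm)
{
  char bytes[256];
  size_t n;

  assert (out != NULL && out_len > 0);

  /* strftime returns 0 both when the buffer is too small and when the
     output is legitimately empty; this sketch treats both as failure.  */
  if (strftime (bytes, sizeof bytes, fmt, tm) == 0)
    return (size_t) -1;

  /* mbstowcs interprets BYTES in the encoding of the current LC_CTYPE
     locale, which is the encoding strftime wrote them in.  */
  n = mbstowcs (out, bytes, out_len);
  if (n == (size_t) -1 || n == out_len)
    return (size_t) -1;   /* invalid sequence, or no room for terminator */

  return n;
}
```

A caller would normally run 'setlocale (LC_ALL, "")' first, so that both the month names and the byte encoding follow the user's locale.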
From debbugs-submit-bounces@debbugs.gnu.org Sun Jun 30 17:13:10 2019 Received: (at 35920) by debbugs.gnu.org; 30 Jun 2019 21:13:10 +0000 Received: from localhost ([127.0.0.1]:47390 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hhh8H-0006fu-Pk for submit@debbugs.gnu.org; Sun, 30 Jun 2019 17:13:09 -0400 Received: from world.peace.net ([64.112.178.59]:55378) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hhh8F-0006f2-MY; Sun, 30 Jun 2019 17:13:08 -0400 Received: from mhw by world.peace.net with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.89) (envelope-from ) id 1hhh89-0000uE-5o; Sun, 30 Jun 2019 17:13:01 -0400 From: Mark H Weaver To: Ludovic =?utf-8?Q?Court=C3=A8s?= Subject: Re: bug#35920: strftime incorrectly assumes that nstrftime will produce UTF-8 In-Reply-To: <874l46svg1.fsf@gnu.org> References: <877ebt7tc0.fsf@netris.org> <87tvew4efa.fsf@netris.org> <875zrb3ydk.fsf@netris.org> <871s1z3tbq.fsf@netris.org> <87v9xxq767.fsf_-_@netris.org> <87muj9q6nk.fsf@netris.org> <87imtwrint.fsf@netris.org> <874l46svg1.fsf@gnu.org> Date: Sun, 30 Jun 2019 17:12:45 -0400 Message-ID: <87blyekcaa.fsf@netris.org> MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 35920 Cc: 35920@debbugs.gnu.org, Christopher Lam X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) reopen 35920 thanks Hi Ludovic, > Mark H Weaver skribis: > >> Here's a patch that might fix the problem, but I don't have time to test >> it right now. > > It works! :-) I wrote tests and pushed it as > ab2fd70ef1e36c6532128b73082809ef3c056556. On my system, I found that my proposed patch caused one of the existing tests to fail. 
The problem is that if the format string includes characters that are not representable in the current locale encoding, it will fail. It seems to me that this could break existing code that currently works. User code that uses 'strftime' might never encode the resulting string in the locale encoding.

I was planning to rewrite the code to scan for the '%' escapes ourselves, to call 'strftime' for each escape sequence (without including the surrounding text), and to concatenate the results.

> I forgot to change the commit author to you before pushing, apologies!

No worries. Thanks for working on it.

      Mark

From unknown Fri Jun 13 11:14:48 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: Did not alter fixed versions and reopened. Date: Sun, 30 Jun 2019 21:14:02 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # Did not alter fixed versions and reopened.
thanks # This fakemail brought to you by your local debbugs # administrator From debbugs-submit-bounces@debbugs.gnu.org Sun Jun 30 18:37:47 2019 Received: (at 35920) by debbugs.gnu.org; 30 Jun 2019 22:37:48 +0000 Received: from localhost ([127.0.0.1]:47473 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hhiSB-0004Xs-HQ for submit@debbugs.gnu.org; Sun, 30 Jun 2019 18:37:47 -0400 Received: from mail-wr1-f50.google.com ([209.85.221.50]:35638) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hhiS9-0004Xa-1A for 35920@debbugs.gnu.org; Sun, 30 Jun 2019 18:37:46 -0400 Received: by mail-wr1-f50.google.com with SMTP id c27so3968113wrb.2 for <35920@debbugs.gnu.org>; Sun, 30 Jun 2019 15:37:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ccil-org.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=AVpDHzVWJwM4LFHconQpMa/dQQjdoHHDLiwPj5XmVQ0=; b=XtE+7Y5fz0EnEoGg0juTxsiOuiWCcgSrYoNrevLp7e5Wusfz5WkzXU1ErUAEeKPLgj MHBIH3AYrl+TJuZG22mdLYj/87/YtjqreczqtNJ8/x5NoPG2wD0kjG4JecJXQ2XQIgIA wqU5gDJZlZKG6rkzK4GW3qyTPm4aIq0Yp8D+7JrKQqqxBtZv05WqeiIcu3ESnXwdybFM 5bpkU1lcghWVynBOd53Xsibjr37j+tEB+wMMh0A7N6uVmfBxVXGA7PmN7VxqbietsT96 9lJPAq2+gtE1PMQun9Iv7F3P36uy5JAxVMjBgIAyixM0whC2X+gi5xZO0tl4M9TmTQli TyQA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=AVpDHzVWJwM4LFHconQpMa/dQQjdoHHDLiwPj5XmVQ0=; b=t2Z3eTBDKwXXcWq5NMiO+C/FicMOkT9YN549n21TSLWA25H6ZYtZEK0Q4MbfHScg3k ZcUuLVRVa9YPXAmOy/Ef0z35NVOa5DIKXO5O3516ez+rFqMwEhDHHj8bsP2/JF/A8tYG 1gwQO9q/XESX0GjNmzgnE2Qb93Pz3wPoCG3P6wIlXAlBznr31/G/VFgL0jBio570/+53 iJ+zAmul3bLR5+XvocrGr4LRUkmxA1PZQZwxw30C5wziiTiYLpyXMpT5D5MPVHEjU0Y0 0IqqlstplWxKccVzvZ5bDPykLvEMEdY2NRRlFI36BusLvLvKKoudZYZz82HRCvsujM9W txCQ== X-Gm-Message-State: APjAAAXW8eOFJk591xraYi5DIgqlQ4ps7+r12FzdZqLFotkyqxQZoahA 
SoCH1H9VCPgZ1kesT6UVh0Oj5va8xuRFT5fRhaC62w== X-Google-Smtp-Source: APXvYqxCNeJqdw1YjQluS1cdrNGxU6hhe7k1DcTg4Mce5uDM8MYKAiOXrIYuSN3RAq4YWzAKR455yMFGtWEESGvTYBY= X-Received: by 2002:adf:f3c7:: with SMTP id g7mr16244664wrp.133.1561934259030; Sun, 30 Jun 2019 15:37:39 -0700 (PDT) MIME-Version: 1.0 References: <877ebt7tc0.fsf@netris.org> <87tvew4efa.fsf@netris.org> <875zrb3ydk.fsf@netris.org> <871s1z3tbq.fsf@netris.org> <87v9xxq767.fsf_-_@netris.org> <87muj9q6nk.fsf@netris.org> <87imtwrint.fsf@netris.org> <874l46svg1.fsf@gnu.org> <87blyekcaa.fsf@netris.org> In-Reply-To: <87blyekcaa.fsf@netris.org> From: John Cowan Date: Sun, 30 Jun 2019 18:37:26 -0400 Message-ID: Subject: Re: bug#35920: strftime incorrectly assumes that nstrftime will produce UTF-8 To: Mark H Weaver Content-Type: multipart/alternative; boundary="000000000000ac22bd058c9228a8" X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 35920 Cc: 35920@debbugs.gnu.org, Christopher Lam , =?UTF-8?Q?Ludovic_Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) --000000000000ac22bd058c9228a8 Content-Type: text/plain; charset="UTF-8" That's a mug's game: I've been there and tried it (not in Scheme). I recommend writing a strftime in Scheme from scratch. It's not that hard; the most annoying thing is getting into the locale files to handle the locale-sensitive directives (month name, weekday name, AM/PM, and the ordering of dates). On Sun, Jun 30, 2019 at 5:14 PM Mark H Weaver wrote: > reopen 35920 > thanks > > Hi Ludovic, > > > Mark H Weaver skribis: > > > >> Here's a patch that might fix the problem, but I don't have time to test > >> it right now. > > > > It works! :-) I wrote tests and pushed it as > > ab2fd70ef1e36c6532128b73082809ef3c056556. 
> > On my system, I found that my proposed patch caused one of the existing > tests to fail. The problem is that if the format string includes > characters that are not representable in the current locale encoding, it > will fail. It seems to me that this could break existing code that > currently works. User code that uses 'strftime' might never encode the > resulting string in the locale encoding. > > I was planning to rewrite the code to scan for the '%' escapes > ourselves, to call 'strftime' for each escape sequence (without > including the surrounding text), and to concatenate the results. > > > I forgot to change the commit author to you before pushing, apologies! > > No worries. Thanks for working on it. > > Mark > > > > --000000000000ac22bd058c9228a8 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
--000000000000ac22bd058c9228a8-- From debbugs-submit-bounces@debbugs.gnu.org Sun Jun 30 19:06:53 2019 Received: (at 35920) by debbugs.gnu.org; 30 Jun 2019 23:06:53 +0000 Received: from localhost ([127.0.0.1]:47477 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hhiuL-0005Dp-29 for submit@debbugs.gnu.org; Sun, 30 Jun 2019 19:06:53 -0400 Received: from world.peace.net ([64.112.178.59]:55486) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hhiuJ-0005DW-8r for 35920@debbugs.gnu.org; Sun, 30 Jun 2019 19:06:52 -0400 Received: from mhw by world.peace.net with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.89) (envelope-from ) id 1hhiuC-0001eq-Rd; Sun, 30 Jun 2019 19:06:44 -0400 From: Mark H Weaver To: John Cowan Subject: Re: bug#35920: strftime incorrectly assumes that nstrftime will produce UTF-8 References: <877ebt7tc0.fsf@netris.org> <87tvew4efa.fsf@netris.org> <875zrb3ydk.fsf@netris.org> <871s1z3tbq.fsf@netris.org> <87v9xxq767.fsf_-_@netris.org> <87muj9q6nk.fsf@netris.org> <87imtwrint.fsf@netris.org> <874l46svg1.fsf@gnu.org> <87blyekcaa.fsf@netris.org> Date: Sun, 30 Jun 2019 19:06:28 -0400 In-Reply-To: (John Cowan's message of "Sun, 30 Jun 2019 18:37:26 -0400") Message-ID: <877e92k70r.fsf@netris.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 35920 Cc: 35920@debbugs.gnu.org, Christopher Lam , Ludovic =?utf-8?Q?Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Hi John, John Cowan writes: > That's a mug's game: I've been there and tried it (not in Scheme). I > recommend writing a strftime in Scheme from scratch. 
It's not that > hard; the most annoying thing is getting into the locale files to > handle the locale-sensitive directives (month name, weekday name, > AM/PM, and the ordering of dates). Is there a portable way to find the relevant locale files and interpret them, on both POSIX and Windows systems? If so, can you point out the relevant documentation? Thanks, Mark From debbugs-submit-bounces@debbugs.gnu.org Sun Jun 30 21:28:38 2019 Received: (at 35920) by debbugs.gnu.org; 1 Jul 2019 01:28:38 +0000 Received: from localhost ([127.0.0.1]:47511 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hhl7V-0008Q2-Rk for submit@debbugs.gnu.org; Sun, 30 Jun 2019 21:28:38 -0400 Received: from mail-wr1-f41.google.com ([209.85.221.41]:37313) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hhl7T-0008Pp-Nk for 35920@debbugs.gnu.org; Sun, 30 Jun 2019 21:28:36 -0400 Received: by mail-wr1-f41.google.com with SMTP id v14so11873732wrr.4 for <35920@debbugs.gnu.org>; Sun, 30 Jun 2019 18:28:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ccil-org.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=8N+GZ2uajQ6riIJqY5mrMpqKE2U/SwvREp1L4IaI3I0=; b=PDBe8M886Rnxh8dPgnQP1erQ7m0OZIsUVDvXp+xGgSg4a5VudxvTtsC9rFvIF2OOch R3XNQdkntBVtzOplyBZFwVqoEWZIaAfoRkElxq5ndzSkJhgYbwntfnZmu6e1QIGqhesD LtjNJsShTtQ5TLjn9+UhSkHihirEp36uPkCo42vp98cy0h8azAyBjAH3KuqzBHBeeFnH ZZ6zAbvGHFjFlNUhr9cuBUK87V+XszYjtNHZwQ8n8RNCgZ85NbYozzX/C4tlFX7nC2a1 YlEf+WhYZvpSi3hFcDFcyPplees2Xwt6QCcIVHNINo831C3cdXuSTijw2d3vxZ/6kGv6 2klQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=8N+GZ2uajQ6riIJqY5mrMpqKE2U/SwvREp1L4IaI3I0=; b=QzR+ZVG8EojnXqXFa8ho0jMSGc4dCIncUSLGJ+04L+ttwj7uyODI2XESjJtkWCGNAt C5mCXv/wObRvqzac8K6Q29OTHofgrV1Uq+dX0UN1YmaZCI5/TybtYVyXA0Y4nhXdn+Uc 
GwmzrmyskcY1f1/5fi3ENfW1SRQRXOjZAaGSm9UmjgORWHa0c097A2YviYn9Wf0ASTCM 1I6T3XFFHlMCCbOWzVCsLvSzYSd2SO2Dyh4PCCDfIqxGoLvyHA04wa04WHcy0MiIiD+6 +uWfXMl1mU0JooY9cDwKPIJyBNE9xlG/nlHRawbTUNH1M/0JN8+dE8fsZA7oqmUhmNJR tfew== X-Gm-Message-State: APjAAAXRljT3rnaYbobiZ0dDTPToJv+9EcghbglWzmPFhNnZ/bUwnZpT vOui301qAYW7zKqWfKEsHs5TdvSqQA9iQydPcffBgA== X-Google-Smtp-Source: APXvYqxmTbxsPIb0pXSGEeM5aYuyjxaVpDOCfPbL/Ewre/j9abae45FzN+rI3GYOTHg/azHjrLvh5iHMDT2CouWg9c0= X-Received: by 2002:a5d:5308:: with SMTP id e8mr7525220wrv.219.1561944509906; Sun, 30 Jun 2019 18:28:29 -0700 (PDT) MIME-Version: 1.0 References: <877ebt7tc0.fsf@netris.org> <87tvew4efa.fsf@netris.org> <875zrb3ydk.fsf@netris.org> <871s1z3tbq.fsf@netris.org> <87v9xxq767.fsf_-_@netris.org> <87muj9q6nk.fsf@netris.org> <87imtwrint.fsf@netris.org> <874l46svg1.fsf@gnu.org> <87blyekcaa.fsf@netris.org> <877e92k70r.fsf@netris.org> In-Reply-To: <877e92k70r.fsf@netris.org> From: John Cowan Date: Sun, 30 Jun 2019 21:28:18 -0400 Message-ID: Subject: Re: bug#35920: strftime incorrectly assumes that nstrftime will produce UTF-8 To: Mark H Weaver Content-Type: multipart/alternative; boundary="000000000000ac13f4058c948b2b" X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 35920 Cc: 35920@debbugs.gnu.org, Christopher Lam , =?UTF-8?Q?Ludovic_Court=C3=A8s?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) --000000000000ac13f4058c948b2b Content-Type: text/plain; charset="UTF-8" On Sun, Jun 30, 2019 at 7:06 PM Mark H Weaver wrote: Is there a portable way to find the relevant locale files and interpret > them, on both POSIX and Windows systems? If so, can you point out the > relevant documentation? > Portable in the sense that the information can be obtained on both Posix and Windows, but not with exactly the same code. 
On Posix, you need the nl_langinfo() and nl_langinfo_l() functions from <langinfo.h>. These functions are documented at <http://pubs.opengroup.org/onlinepubs/9699919799/functions/nl_langinfo.html>, and the constants at <http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/langinfo.h.html>. On Windows, you need to call EnumCalendarInfoExEx if you have dropped support for Vista and earlier versions, or if not, then follow the links from the page about it. The function is documented at <https://docs.microsoft.com/en-us/windows/desktop/api/Winnls/nf-winnls-enumcalendarinfoexex>, and the constants that specify particular pieces of information at <https://docs.microsoft.com/en-us/windows/desktop/Intl/calendar-type-information>. (I have never used these interfaces myself.) I hope this is helpful. John Cowan http://vrici.lojban.org/~cowan cowan@ccil.org Eric Raymond is the Margaret Mead of the Open Source movement. --Bruce Perens, a long time ago --000000000000ac13f4058c948b2b Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
--000000000000ac13f4058c948b2b-- From debbugs-submit-bounces@debbugs.gnu.org Tue Jul 02 04:58:46 2019 Received: (at 35920) by debbugs.gnu.org; 2 Jul 2019 08:58:46 +0000 Received: from localhost ([127.0.0.1]:46558 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hiEcg-0004hC-3U for submit@debbugs.gnu.org; Tue, 02 Jul 2019 04:58:46 -0400 Received: from eggs.gnu.org ([209.51.188.92]:47412) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hiEcf-0004h0-C8 for 35920@debbugs.gnu.org; Tue, 02 Jul 2019 04:58:45 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:45046) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hiEcZ-0002sb-Lr; Tue, 02 Jul 2019 04:58:39 -0400 Received: from [2001:660:6102:320:e120:2c8f:8909:cdfe] (port=50388 helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1hiEcT-0001RY-RB; Tue, 02 Jul 2019 04:58:36 -0400 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= To: Mark H Weaver Subject: Re: bug#35920: strftime incorrectly assumes that nstrftime will produce UTF-8 References: <877ebt7tc0.fsf@netris.org> <87tvew4efa.fsf@netris.org> <875zrb3ydk.fsf@netris.org> <871s1z3tbq.fsf@netris.org> <87v9xxq767.fsf_-_@netris.org> <87muj9q6nk.fsf@netris.org> <87imtwrint.fsf@netris.org> <874l46svg1.fsf@gnu.org> <87blyekcaa.fsf@netris.org> <877e92k70r.fsf@netris.org> X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: 14 Messidor an 227 de la =?utf-8?Q?R=C3=A9volution?= X-PGP-Key-ID: 0x090B11993D9AEBB5 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5 X-OS: x86_64-pc-linux-gnu Date: Tue, 02 Jul 2019 10:58:32 +0200 In-Reply-To: <877e92k70r.fsf@netris.org> (Mark H. 
Weaver's message of "Sun, 30 Jun 2019 19:06:28 -0400") Message-ID: <87woh0ak3r.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 35920 Cc: 35920@debbugs.gnu.org, Christopher Lam , John Cowan X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Hi, Mark H Weaver skribis: > John Cowan writes: > >> That's a mug's game: I've been there and tried it (not in Scheme). I >> recommend writing a strftime in Scheme from scratch. It's not that >> hard; the most annoying thing is getting into the locale files to >> handle the locale-sensitive directives (month name, weekday name, >> AM/PM, and the ordering of dates). > > Is there a portable way to find the relevant locale files and interpret > them, on both POSIX and Windows systems? If so, can you point out the > relevant documentation? The (ice-9 i18n) module provides bindings to nl_langinfo et al. The actual data format is specific to the C library, so I think we cannot portably go deeper than what (ice-9 i18n) does. Ludo’.
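[Editorial illustration, not part of the original message.] The nl_langinfo interface discussed above can be sketched in C as follows, assuming a POSIX system with <langinfo.h>. The helper names copy_item, locale_codeset and month_name are hypothetical and are not taken from Guile or from (ice-9 i18n); the sketch only shows how the locale-sensitive strings and their encoding can be queried without reading the C library's locale files directly.

```c
#include <langinfo.h>
#include <locale.h>   /* callers select a locale with setlocale() first */
#include <stdio.h>
#include <string.h>

/* Copy one langinfo item into buf right away: POSIX allows a later
   nl_langinfo() call to overwrite the string it returned earlier. */
static void copy_item(nl_item item, char *buf, size_t size)
{
    snprintf(buf, size, "%s", nl_langinfo(item));
}

/* Hypothetical helper: name of the encoding used by the current locale.
   This is the encoding in which month and weekday names are returned. */
void locale_codeset(char *out, size_t size)
{
    copy_item(CODESET, out, size);
}

/* Hypothetical helper: full month name for month 1..12 in the current
   LC_TIME locale. Returns 0 on success, -1 if month is out of range. */
int month_name(int month, char *out, size_t size)
{
    static const nl_item months[12] = {
        MON_1, MON_2, MON_3, MON_4,  MON_5,  MON_6,
        MON_7, MON_8, MON_9, MON_10, MON_11, MON_12
    };

    if (month < 1 || month > 12)
        return -1;
    copy_item(months[month - 1], out, size);
    return 0;
}
```

A caller would first run setlocale(LC_ALL, "") to adopt the user's locale, then call month_name(2, buf, sizeof buf) and locale_codeset(cs, sizeof cs); the bytes in buf are encoded as named in cs and would still need converting before being treated as UTF-8.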
From debbugs-submit-bounces@debbugs.gnu.org Tue Jul 02 05:07:11 2019 Received: (at 35920) by debbugs.gnu.org; 2 Jul 2019 09:07:11 +0000 Received: from localhost ([127.0.0.1]:46562 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hiEkp-0004uC-0e for submit@debbugs.gnu.org; Tue, 02 Jul 2019 05:07:11 -0400 Received: from eggs.gnu.org ([209.51.188.92]:48821) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hiEkn-0004tz-I7 for 35920@debbugs.gnu.org; Tue, 02 Jul 2019 05:07:09 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]:45143) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hiEki-0000hP-63; Tue, 02 Jul 2019 05:07:04 -0400 Received: from [2001:660:6102:320:e120:2c8f:8909:cdfe] (port=50390 helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1hiEkh-0006hb-MM; Tue, 02 Jul 2019 05:07:03 -0400 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= To: Mark H Weaver Subject: Re: bug#35920: strftime incorrectly assumes that nstrftime will produce UTF-8 References: <877ebt7tc0.fsf@netris.org> <87tvew4efa.fsf@netris.org> <875zrb3ydk.fsf@netris.org> <871s1z3tbq.fsf@netris.org> <87v9xxq767.fsf_-_@netris.org> <87muj9q6nk.fsf@netris.org> <87imtwrint.fsf@netris.org> <874l46svg1.fsf@gnu.org> <87blyekcaa.fsf@netris.org> X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: 14 Messidor an 227 de la =?utf-8?Q?R=C3=A9volution?= X-PGP-Key-ID: 0x090B11993D9AEBB5 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5 X-OS: x86_64-pc-linux-gnu Date: Tue, 02 Jul 2019 11:07:01 +0200 In-Reply-To: <87blyekcaa.fsf@netris.org> (Mark H. 
Weaver's message of "Sun, 30 Jun 2019 17:12:45 -0400") Message-ID: <87imskajpm.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 35920 Cc: 35920@debbugs.gnu.org, Christopher Lam X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Hi Mark, Mark H Weaver skribis: >> Mark H Weaver skribis: >> >>> Here's a patch that might fix the problem, but I don't have time to test >>> it right now. >> >> It works! :-) I wrote tests and pushed it as >> ab2fd70ef1e36c6532128b73082809ef3c056556. > > On my system, I found that my proposed patch caused one of the existing > tests to fail. Which test? In commit ab2fd70ef1e36c6532128b73082809ef3c056556 I modified the test that passes \u0100 to run in a UTF-8 locale, on the grounds that the previous behavior was fragile: “raw bytes” of the input string would be preserved, but they could be mixed with things like month names in the current locale encoding. The result is rather unpredictable. > The problem is that if the format string includes characters that are > not representable in the current locale encoding, it will fail. It > seems to me that this could break existing code that currently works. > User code that uses 'strftime' might never encode the resulting string > in the locale encoding. In theory yes, but I cannot think of a scenario where the previous behavior would be “useful”, because it’s hard to even describe what it means. 
> I was planning to rewrite the code to scan for the '%' escapes > ourselves, to call 'strftime' for each escape sequence (without > including the surrounding text), and to concatenate the results. I think we should deprecate ‘strftime’ and ‘strptime’: (srfi srfi-19) provides similar functionality, it uses (ice-9 i18n) for the locale stuff, and it has a better API. Perhaps something we can do in 3.0? Thanks, Ludo’. From debbugs-submit-bounces@debbugs.gnu.org Tue Jul 02 12:51:37 2019 Received: (at 35920) by debbugs.gnu.org; 2 Jul 2019 16:51:37 +0000 Received: from localhost ([127.0.0.1]:47800 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hiM0H-000653-DK for submit@debbugs.gnu.org; Tue, 02 Jul 2019 12:51:37 -0400 Received: from mail-wr1-f41.google.com ([209.85.221.41]:46737) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hiM0F-00064p-FK for 35920@debbugs.gnu.org; Tue, 02 Jul 2019 12:51:36 -0400 Received: by mail-wr1-f41.google.com with SMTP id n4so18615523wrw.13 for <35920@debbugs.gnu.org>; Tue, 02 Jul 2019 09:51:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ccil-org.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=cjgwqrjwXQT14hKbrzf1xAuV6cuIynzFgAbiMxZ/Qzs=; b=jBnvOLtHC230E94HJmFTSlEV0BvBSspUJbi47XwR2vddCuOU7/dy3AHNGkcW/FQ+xA D3iJdj+Y/LzF6rNFCSgleXD5Ux7jsI05yraw5bk8zsjyftHeNpfmO62r3mQNijJyfqOX aulJfvKxkMkPe5QgfY1heZW7Y3vLn/snQ3qo+YbHJLYNdNbTJtE3Ol9TBaUc2LLGtBFb ThCz5voJdHxlQ/QEXmgDP/LRqX36afMhuKmhFF8WzKS3JpjBDbgRLOIAUZBECQ0juPCS C34Ls/pVLr3EdZSeObl3yu7tyyldI5IexzB89/f5Ei1TNVLnS/Ur3ou4KstLJlxng4eA f/WQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=cjgwqrjwXQT14hKbrzf1xAuV6cuIynzFgAbiMxZ/Qzs=; 
b=CSKxnSPxASsk4T0Zbaflp2VQSAWJhwrYr6E8A1aI2u5/QjK/PzWCAoDp8nNX3Nla9K G6hbuWhYEgDdP4TBV8MOG1gCv4l4C7PruYqjhuiUtvAeEr78shYPyAFoU6BnSinHCCeq 7LTslroeKTAR2mTn6zQjyRbw5lu28VO+ANq7pGThRkliq9ZP/P5bYVK0SZqVUwDcQU32 U3T/v/ZIKN+QByTq9neaTdleyV77RI10JfRHG8kfFS00s5mkNsbLvdcDem9kmB10lhpe gQLMmLa6kBNazVl7Y2AKFVSztGTdXmwPdpS2I/0K+tekQnijWvsbXZiY0yG0/PUruAJ0 q2YQ== X-Gm-Message-State: APjAAAUK9lI7y7GhraZcDvvrEG9zkqSA9HebUMikjBOHmsOjtLwaKXY8 dBJX01I5bKAGseRqhFB4cmCwGnJjkJ0G6lF5UK1fmw== X-Google-Smtp-Source: APXvYqx+dm7vialvlwlHsbGyhyo8DFt0Cb1NqYlvI143TqPjs8lt2gS9xKkR98J5egcdD4fR0Rms61KUm+bXvbXQlbw= X-Received: by 2002:adf:dc81:: with SMTP id r1mr24145250wrj.298.1562086289620; Tue, 02 Jul 2019 09:51:29 -0700 (PDT) MIME-Version: 1.0 References: <877ebt7tc0.fsf@netris.org> <87tvew4efa.fsf@netris.org> <875zrb3ydk.fsf@netris.org> <871s1z3tbq.fsf@netris.org> <87v9xxq767.fsf_-_@netris.org> <87muj9q6nk.fsf@netris.org> <87imtwrint.fsf@netris.org> <874l46svg1.fsf@gnu.org> <87blyekcaa.fsf@netris.org> <87imskajpm.fsf@gnu.org> In-Reply-To: <87imskajpm.fsf@gnu.org> From: John Cowan Date: Tue, 2 Jul 2019 12:51:18 -0400 Message-ID: Subject: Re: bug#35920: strftime incorrectly assumes that nstrftime will produce UTF-8 To: =?UTF-8?Q?Ludovic_Court=C3=A8s?= Content-Type: multipart/alternative; boundary="00000000000066d424058cb58e81" X-Spam-Score: 0.7 (/) X-Debbugs-Envelope-To: 35920 Cc: 35920@debbugs.gnu.org, Mark H Weaver , Christopher Lam X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.3 (/) --00000000000066d424058cb58e81 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Jul 2, 2019 at 5:08 AM Ludovic Courtès wrote: I think we should deprecate ‘strftime’ and ‘strptime’: (srfi srfi-19) > provides similar 
functionality, it uses (ice-9 i18n) for the locale > stuff, and it has a better API. > Just a heads-up. I don't consider SRFI 19 to have a very good API, and I'm working on a pre-SRFI for dates and times. There is an outline of it (very subject to change) at <https://bitbucket.org/cowan/r7rs-wg1-infra/src/default/TimeAdvancedCowan.md>. Note that it does not do localization except for timezones, however, so is probably not directly relevant. I'd appreciate review comments at cowan@ccil.org anyway. Thanks. John Cowan http://vrici.lojban.org/~cowan cowan@ccil.org Is a chair finely made tragic or comic? Is the portrait of Mona Lisa good if I desire to see it? Is the bust of Sir Philip Crampton lyrical, epical or dramatic? If a man hacking in fury at a block of wood make there an image of a cow, is that image a work of art? If not, why not? --Stephen Dedalus --00000000000066d424058cb58e81 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
--00000000000066d424058cb58e81--
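[Editorial illustration, not part of the original thread.] The approach Mark describes earlier in the thread (scan for the '%' escapes, call strftime for each escape without the surrounding text, and concatenate the results) might look roughly like the C sketch below. It is not Guile's actual code: the name format_piecewise is hypothetical, only plain "%x", "%Ex" and "%Ox" directives are recognised (GNU flags and field widths are ignored), and the step that would convert each piece from the locale encoding is left as a comment.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: expand fmt one directive at a time.
   Literal text is copied byte for byte and never reaches strftime();
   only the output of each directive comes from the C library, so only
   that output is in the locale encoding.
   Returns the length written, or 0 if out is too small (an empty
   format also yields 0). */
size_t format_piecewise(char *out, size_t size, const char *fmt,
                        const struct tm *tm)
{
    size_t used = 0;

    if (size == 0)
        return 0;

    while (*fmt != '\0') {
        if (fmt[0] != '%' || fmt[1] == '\0') {
            /* Literal byte, or a lone trailing '%': copy unchanged. */
            if (used + 1 >= size)
                return 0;
            out[used++] = *fmt++;
            continue;
        }
        if (fmt[1] == '%') {
            /* "%%" stands for a literal percent sign. */
            if (used + 1 >= size)
                return 0;
            out[used++] = '%';
            fmt += 2;
            continue;
        }

        /* Collect one directive: '%', optional E/O modifier, conversion. */
        char directive[4];
        size_t n = 0;
        directive[n++] = *fmt++;
        if ((*fmt == 'E' || *fmt == 'O') && fmt[1] != '\0')
            directive[n++] = *fmt++;
        directive[n++] = *fmt++;
        directive[n] = '\0';

        /* A zero result is treated as empty output; 256 bytes is assumed
           to be ample for a single directive. */
        char piece[256];
        size_t len = strftime(piece, sizeof piece, directive, tm);

        /* A real implementation would convert piece from the locale
           encoding to the string's internal encoding at this point. */
        if (used + len >= size)
            return 0;
        memcpy(out + used, piece, len);
        used += len;
    }

    out[used] = '\0';
    return used;
}
```

Because the literal parts of the format never pass through strftime, a format string containing characters that the locale encoding cannot represent no longer causes a failure; only the directive output has to be decoded from the locale encoding.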