From unknown Sun Jun 15 08:41:30 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#7968: RE : Re: 16-bit wchar_t on Windows and Cygwin
Resent-From: Bastien ROUCARIES
Original-Sender: debbugs-submit-bounces@debbugs.gnu.org
Resent-To: owner@debbugs.gnu.org
Resent-CC: bug-coreutils@gnu.org
Resent-Date: Wed, 02 Feb 2011 19:04:02 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: report 7968
X-GNU-PR-Package: coreutils
X-GNU-PR-Keywords:
To: Bruno Haible
Cc: 7968@debbugs.gnu.org, cygwin@cygwin.com, bug-gnulib@gnu.org, eblake@redhat.com
X-Debbugs-Original-Cc: bug-coreutils , cygwin , bug-gnulib@gnu.org, Eric Blake
Received: via spool by submit@debbugs.gnu.org id=B.129667341820203
(code B ref -1); Wed, 02 Feb 2011 19:04:02 +0000
Received: (at submit) by debbugs.gnu.org; 2 Feb 2011 19:03:38 +0000
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.69)
(envelope-from )
id 1PkhzN-0005Fn-LL
for submit@debbugs.gnu.org; Wed, 02 Feb 2011 14:03:38 -0500
Received: from eggs.gnu.org ([140.186.70.92])
by debbugs.gnu.org with esmtp (Exim 4.69)
(envelope-from ) id 1Pkhjr-0004sF-Tn
for submit@debbugs.gnu.org; Wed, 02 Feb 2011 13:47:36 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
(envelope-from ) id 1Pkhrx-0002XZ-4H
for submit@debbugs.gnu.org; Wed, 02 Feb 2011 13:56:02 -0500
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on eggs.gnu.org
X-Spam-Level:
X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00,FREEMAIL_FROM,
HTML_MESSAGE,RCVD_IN_DNSWL_LOW,T_DKIM_INVALID autolearn=unavailable
version=3.3.1
Received: from lists.gnu.org ([199.232.76.165]:38972)
by eggs.gnu.org with esmtp (Exim 4.71)
(envelope-from ) id 1Pkhrw-0002XV-Vh
for submit@debbugs.gnu.org; Wed, 02 Feb 2011 13:55:57 -0500
Received: from [140.186.70.92] (port=56113 helo=eggs.gnu.org)
by lists.gnu.org with esmtp (Exim 4.43) id 1Pkhrr-0001Np-E1
for bug-coreutils@gnu.org; Wed, 02 Feb 2011 13:55:56 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
(envelope-from ) id 1Pkhrm-0002SB-5g
for bug-coreutils@gnu.org; Wed, 02 Feb 2011 13:55:51 -0500
Received: from mail-ey0-f169.google.com ([209.85.215.169]:34709)
by eggs.gnu.org with esmtp (Exim 4.71)
(envelope-from )
id 1PkhrA-0002K6-RZ; Wed, 02 Feb 2011 13:55:09 -0500
Received: by eyh6 with SMTP id 6so242663eyh.0
for ; Wed, 02 Feb 2011 10:55:07 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
h=domainkey-signature:mime-version:date:message-id:subject:from:to:cc
:content-type; bh=znKKZ+mCSsNPfdA5jc0It7XjEkhzwMj9wrZckFDB05Y=;
b=p/FnlpanLvzC3RvenLi+tj/SPNhKLxbgd4XeF5X1Z8xr7cBlvniBIdNF+plTf7i6Bt
+0vCGWZSpLzM938jiPHTDFjlIpeE85d1T640NBuvHNIZQpjU29cIgaqNpY7Q2NIhb7v6
27kj83iD000yyDVBjf2brtrbVEw5Nukza/NAM=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
h=mime-version:date:message-id:subject:from:to:cc:content-type;
b=MhcoxgXHjQnBEAJOPEAsUcLuZIoyco8ljrd21blknH21IvdZiN+CG4zAMTorez6cag
z7XtrugGji6o4gfMYnwJBKye6mb6iMangjEd6dsN8Es6nqhF2crQ8o1ukvZWuMXkiUhX
SSrepp7H+sUWZaeIWuOlfCbZpXXq2CERCLn+o=
MIME-Version: 1.0
Received: by 10.204.117.77 with SMTP id p13mr8712897bkq.19.1296672823188; Wed,
02 Feb 2011 10:53:43 -0800 (PST)
Received: by 10.204.176.135 with HTTP; Wed, 2 Feb 2011 10:53:43 -0800 (PST)
Received: by 10.204.176.135 with HTTP; Wed, 2 Feb 2011 10:53:43 -0800 (PST)
Date: Wed, 2 Feb 2011 19:53:43 +0100
Message-ID:
From: Bastien ROUCARIES
Content-Type: multipart/alternative; boundary=0016e6d647db041d1a049b512b2a
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 2)
X-Received-From: 209.85.215.169
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 2)
X-Received-From: 199.232.76.165
X-Spam-Score: -5.9 (-----)
X-Mailman-Approved-At: Wed, 02 Feb 2011 14:03:36 -0500
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -5.9 (-----)
--0016e6d647db041d1a049b512b2a
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Using -fno-short-wchar will avoid changing the API.
Bastien
On 2 Feb 2011 at 18:42, "Bruno Haible" <bruno@clisp.org> wrote:
Hello Corinna,
> And, please note the wording in SUSv4, for instance in
> http://calimero.vinschen.de/susv4/functions/iswalpha.html
Likewise in POSIX:2008, at the URL
http://www.opengroup.org/onlinepubs/9699919799/functions/iswalpha.html
> The wc argument is a wint_t, the value of which the application shall
>                      ^^^^^^                         ^^^^^^^^^^^
> ensure is a wide-character code corresponding to a valid character in
>             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> the current locale, or equal to the value of the macro WEOF. If the
> argument has any other value, the behavior is undefined.
Expressed as a formula, this sentence means that when an application
passes a 'wint_t x' to iswalpha(), x has to satisfy
  x == (wint_t) (wchar_t) x || x == WEOF
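A minimal C sketch of this condition, for illustration only. The helper
name wint_fits_wchar is hypothetical (it is not part of POSIX or gnulib),
and the sketch uses WEOF, the macro named in the quoted spec text. It only
checks that the value survives a round trip through wchar_t; whether the
value is also a valid character in the current locale is a separate
question that this check does not answer.

```c
#include <stdbool.h>
#include <wchar.h>

/* Hypothetical helper: true when x is representable as a wchar_t
   (it survives the round trip wint_t -> wchar_t -> wint_t) or is WEOF.
   This is the necessary condition discussed above, not a full
   "valid character in the current locale" test.  */
static bool
wint_fits_wchar (wint_t x)
{
  return x == (wint_t) (wchar_t) x || x == WEOF;
}
```

On a platform with a 16-bit wchar_t, a value such as 0x10400 would be
expected to fail this check, since the cast to wchar_t cannot hold it.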
> iswalpha takes wint_t, not wchar_t. Since sizeof (wint_t) is 4 bytes,
> the function can return the correct value, provided that the application
> converts the UTF-16 surrogate to UTF-32 before calling iswalpha.
When an application does this, it passes an invalid wint_t value to
iswalpha(), according to the spec paragraph that you have just cited.
So the application uses an extension to POSIX functionality, not
POSIX itself.
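For reference, the conversion being discussed can be sketched in C as
below. This is only an illustration of the standard UTF-16 surrogate
arithmetic; the function name utf16_pair_to_utf32 and the UINT32_MAX
error value are assumptions made for this sketch, not code from Cygwin
or gnulib.

```c
#include <stdint.h>

/* Combine a UTF-16 surrogate pair into one UTF-32 code point.
   hi must be a high surrogate (0xD800..0xDBFF) and lo a low surrogate
   (0xDC00..0xDFFF); otherwise UINT32_MAX is returned as an error value
   (a choice made for this sketch only).  */
static uint32_t
utf16_pair_to_utf32 (uint16_t hi, uint16_t lo)
{
  if (hi < 0xD800 || hi > 0xDBFF || lo < 0xDC00 || lo > 0xDFFF)
    return UINT32_MAX;
  /* Each surrogate carries 10 bits; the result is offset by 0x10000.  */
  return 0x10000 + ((uint32_t) (hi - 0xD800) << 10) + (uint32_t) (lo - 0xDC00);
}
```

The result is a value above 0xFFFF, which is exactly the kind of wint_t
value that does not fit in a 16-bit wchar_t.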
I see that Cygwin 1.7.x iswalpha() works in the way you describe (but
mingw's iswalpha() doesn't). So this means that gnulib's proposed
iswwalpha(wwchar_t) function could be implemented using iswalpha()
on Cygwin 1.7.x and will not cause the Unicode based tables to be
included in the executable. This is good and nice.
But if you say that the application should convert UTF-16 surrogates
to UTF-32 before calling iswalpha: That's certainly a requirement
for Cygwin 1.7.x applications that want to support the entire Unicode
character set. But it's outside of POSIX, and many GNU programs will
not want to include this added complexity. Just try to apply this
suggestion to gnulib's quotearg.c, then estimate the time someone
would need to apply it also to regcomp.c, strftime.c, mbscasestr.c,
coreutils/src/wc.c, and so on.
For this reason I propose the wwchar_t type with an API that is similar
to POSIX <wctype.h> but includes the surrogate handling, rather than
pushing it into each application's code.
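A rough C sketch of what such an API could look like, under stated
assumptions. The names wwchar_t and iswwalpha come from the proposal
above; the typedef to uint32_t, the WCHAR_MAX test, and the fallback
branch are assumptions of this sketch, not the actual gnulib design.

```c
#include <stdint.h>
#include <wchar.h>
#include <wctype.h>

/* Assumption: wwchar_t holds one full Unicode code point.  */
typedef uint32_t wwchar_t;

/* Sketch of iswwalpha: the caller passes a whole code point and never
   has to deal with surrogates itself.  */
static int
iswwalpha (wwchar_t wc)
{
#if WCHAR_MAX >= 0x10FFFF
  /* wchar_t is wide enough: delegate directly to iswalpha().  */
  return iswalpha ((wint_t) wc);
#else
  /* 16-bit wchar_t: values that fit can still be delegated.  */
  if (wc <= WCHAR_MAX)
    return iswalpha ((wint_t) wc);
  /* Placeholder only: a real implementation would either call an
     iswalpha() that accepts such values as an extension, or consult
     its own Unicode tables.  */
  return 0;
#endif
}
```

An application would then call iswwalpha (wc) with a code point it has
already decoded, and the platform differences stay inside the library.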
Bruno
--=20
In memoriam Carl Friedrich Goerdeler <
http://en.wikipedia.org/wiki/Carl_Friedrich_Goerdel...
--0016e6d647db041d1a049b512b2a--
From debbugs-submit-bounces@debbugs.gnu.org Thu Aug 30 04:07:08 2012
Received: (at control) by debbugs.gnu.org; 30 Aug 2012 08:07:08 +0000
Received: from localhost ([127.0.0.1]:57030 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.72)
(envelope-from )
id 1T6zmO-0004un-2q
for submit@debbugs.gnu.org; Thu, 30 Aug 2012 04:07:08 -0400
Received: from out1-smtp.messagingengine.com ([66.111.4.25]:33853)
by debbugs.gnu.org with esmtp (Exim 4.72)
(envelope-from ) id 1T6zmL-0004uf-2I
for control@debbugs.gnu.org; Thu, 30 Aug 2012 04:07:06 -0400
Received: from compute5.internal (compute5.nyi.mail.srv.osa [10.202.2.45])
by gateway1.nyi.mail.srv.osa (Postfix) with ESMTP id ED43A20DA5
for ; Thu, 30 Aug 2012 04:05:55 -0400 (EDT)
Received: from web5.nyi.mail.srv.osa ([10.202.2.215])
by compute5.internal (MEProxy); Thu, 30 Aug 2012 04:05:55 -0400
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=
messagingengine.com; h=message-id:from:to:mime-version
:content-transfer-encoding:content-type:subject:date; s=smtpout;
bh=g3x1YkP2vluBEiAMNfVG/RrTs8w=; b=TZifXzPLwB25W+uO3zOoS/hKp7Oe
KhJauDa80k+NMovM0kYUSZt+0p/bpnOh0CKI1A233+hxUgr8wXVw3M1yaiA+SIjp
3lcXH7H2asp6p+g+dl6KDunycJuQGMWel+95+AZrDypuVZfWuzQ8lk/q3aAG3yXQ
TT9u7m6/S5244xc=
Received: by web5.nyi.mail.srv.osa (Postfix, from userid 99)
id BD7B04C0212; Thu, 30 Aug 2012 04:05:55 -0400 (EDT)
Message-Id: <1346313955.6848.140661121418585.20DF5764@webmail.messagingengine.com>
X-Sasl-Enc: T2AqFlIWN6GQ1vxU9Q18H9U0y8luevpcLNcOSgz5PZxR 1346313955
From: era eriksson
To: control@debbugs.gnu.org
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain
X-Mailer: MessagingEngine.com Webmail Interface
Subject: Bug maintenance
Date: Thu, 30 Aug 2012 11:05:55 +0300
X-Spam-Score: -2.6 (--)
X-Debbugs-Envelope-To: control
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.13
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -2.6 (--)
forcemerge 6366 10924
merge 7389 7394
tags 7394 + patch
merge 7948 7963 7968
retitle 9140 [du] broken on OSX 10.7 (Lion) for >4TB file systems
retitle 10003 [df] information differs from GUI
retitle 10013 [ls] document origin of name
tags 10013 + patch
retitle 10054 [cp] 8.13: cp -au may replace newer files [sr #107876]
retitle 10877 [sort] too eager to use temp files
retitle 11760 [mv] data loss after ctrl-C on ntfs -> ntfs move
thanks
I also wanted to retitle 10900 but it's too opaque, and should be
followed up.
I took this out because I'm not sure I was able to summarize it
correctly.
retitle 10639 [cp] test-copy-acl fails on Solaris 64bit + NFS
/* era */
--
If this were a real .signature, it would suck less. Well, maybe not.
From debbugs-submit-bounces@debbugs.gnu.org Fri Oct 19 12:41:51 2018
Received: (at control) by debbugs.gnu.org; 19 Oct 2018 16:41:51 +0000
Received: from localhost ([127.0.0.1]:59698 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from )
id 1gDXqQ-0006AY-SN
for submit@debbugs.gnu.org; Fri, 19 Oct 2018 12:41:51 -0400
Received: from mail-pg1-f180.google.com ([209.85.215.180]:41767)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from ) id 1gDXqP-0006AL-Qr
for control@debbugs.gnu.org; Fri, 19 Oct 2018 12:41:50 -0400
Received: by mail-pg1-f180.google.com with SMTP id 23-v6so15989031pgc.8
for ; Fri, 19 Oct 2018 09:41:49 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
h=to:from:message-id:date:user-agent:mime-version:content-language
:content-transfer-encoding;
bh=7ocZM2/+C97vN3RMbonXlXp8vUHXy31J0HD14OjHeS0=;
b=TVZtKXonLN04sQrD6jN/CKcYsfpusx6XD1YFovlz7i81tmMqKgD9FMocYWYQgdDQTc
RgugDCS4Lbf1cqDQjgr2fKLXYZ4r2Jgeix1cySbeDK3Pf6sen+cM/D6ZT0E5jleOSNtL
+lHkO9xUtUwVP4Ap3jnJfuSltoNgU8DutOx8LRRpgS4r9ajxGckq4vmX//CPf/iXVp6d
QLEvb96j4/xCAYvHPIAXGz6ZCcYAp/MvKayG+C98npoS/Otcd7ceijJEzDmk9lFFvwe+
Ny3BTdng8f/EBuhtZnzyXBFoRpvVqYl8DFOx2XptmmlbFqmudaGin8CfORIOLjgt+M7c
/yHQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=1e100.net; s=20161025;
h=x-gm-message-state:to:from:message-id:date:user-agent:mime-version
:content-language:content-transfer-encoding;
bh=7ocZM2/+C97vN3RMbonXlXp8vUHXy31J0HD14OjHeS0=;
b=WOKYdliw4itbOHiAmpVl2k8OrNt3kleXHbntXVzKCLfcFtZDrN0vtRVCxRpm5kR8E6
qaspS0ANTSCS9Dif9iFLjgLl49hf+SNJ5pW+AgObco5YgXtnmGVzGOMD8kbgeGdhguQ3
5I6C6lf1MJTPkMfbQTdbH/KKHyoGnTJopf6KNAux67PcMa3MgbqMioqU5Fm/gnsmsh3E
TGaUbD+Nzt2embCkLBw5ByF9jn5qwBtQowJsbg6OIKCL1fADoAd+CrllGWNoLUylU+LE
04nmWlahziIGfgKJ93o3MTLBo7LJ0g+wXunvYZvJObK2UfyOLmV3TK335GkQ5HxEZJzd
KADw==
X-Gm-Message-State: ABuFfojow+OGfheMxFMV1jJ6P0EdQDs9EyCbHr/IgR5ES57GqzQlifIc
PnS9z1t/mihV1AZEeD0il/V6Rzvt1lQ=
X-Google-Smtp-Source: ACcGV60yaJkZ6yxzdlLC/8TqHc/ZR4p2aKacScLHwgw9Fsu9/GcTSr6OIyEicTw7NQ2L/ogcYD65xA==
X-Received: by 2002:a62:8685:: with SMTP id
x127-v6mr12183544pfd.252.1539967303166;
Fri, 19 Oct 2018 09:41:43 -0700 (PDT)
Received: from tomato.housegordon.com (moose.housegordon.com. [184.68.105.38])
by smtp.googlemail.com with ESMTPSA id
n63-v6sm29398857pfn.9.2018.10.19.09.41.41
for
(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
Fri, 19 Oct 2018 09:41:41 -0700 (PDT)
To: control@debbugs.gnu.org
From: Assaf Gordon
Message-ID: <07538c27-65e2-2b19-b1ef-de29f919064d@gmail.com>
Date: Fri, 19 Oct 2018 10:41:40 -0600
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
Thunderbird/52.9.1
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
X-Spam-Score: 2.0 (++)
X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org",
has NOT identified this incoming email as spam. The original
message has been attached to this so you can view it or label
similar future email. If you have any questions, see
the administrator of that system for details.
Content preview: severity 7948 wishlist retitle 7948 multibyte: 16-bit wchar_t
on Windows and Cygwin [...]
Content analysis details: (2.0 points, 10.0 required)
pts rule name description
---- ---------------------- --------------------------------------------------
-0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at http://www.dnswl.org/, no
trust [209.85.215.180 listed in list.dnswl.org]
0.0 RCVD_IN_MSPIKE_H3 RBL: Good reputation (+3)
[209.85.215.180 listed in wl.mailspike.net]
0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider
(assafgordon[at]gmail.com)
-0.0 SPF_PASS SPF: sender matches SPF record
0.0 RCVD_IN_MSPIKE_WL Mailspike good senders
1.8 MISSING_SUBJECT Missing Subject: header
0.2 NO_SUBJECT Extra score for no subject
X-Debbugs-Envelope-To: control
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: 1.0 (+)
severity 7948 wishlist
retitle 7948 multibyte: 16-bit wchar_t on Windows and Cygwin