From: bug#32272 <32272@debbugs.gnu.org>
To: bug#32272 <32272@debbugs.gnu.org>
Subject: Status: [PATCH] iscntrl: behavior for chars >= 0x80
Reply-To: bug#32272 <32272@debbugs.gnu.org>
Date: Fri, 20 Jun 2025 12:34:36 +0000

retitle 32272 [PATCH] iscntrl: behavior for chars >= 0x80
reassign 32272 coreutils
submitter 32272 L A Walsh
severity 32272 normal
tag 32272 patch
thanks

Message-ID: <5B58B203.5000106@tlinx.org>
Date: Wed, 25 Jul 2018 10:23:15 -0700
From: L A Walsh
To: Coreutils
Subject: Re: [PATCH] iscntrl: behavior for chars >= 0x80

Pádraig Brady wrote:
> +This function does not support arguments outside of the range of the
> +unsigned char type in locales with large character sets, on some platforms.
> +OS X 10.5 will return non zero for characters >= 0x80 in UTF-8 locales.
>
---
According to Unicode, characters 0x80-0x9F are control characters, but
characters >= 0xA0 are not; they have different classifications (at
least in Unicode). The patch doesn't say whether OS X 10.5 is classifying
them correctly. For example, 0xA0 is a type of space, some are
symbols, some are letters, some are a type of punctuation, etc.

Perhaps OS X is using the Unicode definition for characters defined to
be in a Unicode-compatible encoding?
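
The Unicode classification described above can be checked with a short
sketch. It is written in Python and uses the standard unicodedata module,
not the C iscntrl() function the patch documents. The helper names
category() and is_unicode_control() are made up for this illustration.
It also assumes the values 0x80-0xFF are read as Unicode code points; in a
UTF-8 locale a single byte >= 0x80 is only part of a multibyte sequence, so
this says nothing about what OS X 10.5 actually does.

```python
import unicodedata


def category(cp: int) -> str:
    """Return the Unicode general category of a code point, e.g. 'Cc'."""
    return unicodedata.category(chr(cp))


def is_unicode_control(cp: int) -> bool:
    """True if Unicode puts the code point in category Cc (control)."""
    return category(cp) == "Cc"


if __name__ == "__main__":
    # Collect every code point in 0x80..0xFF that Unicode calls a control.
    controls = [cp for cp in range(0x80, 0x100) if is_unicode_control(cp)]
    print(f"Cc in 0x80..0xFF: U+{controls[0]:04X}..U+{controls[-1]:04X}")

    # A few samples from 0xA0 upward: space, punctuation, symbol, letter.
    for cp in (0xA0, 0xA1, 0xA2, 0xE9):
        print(f"U+{cp:04X} {category(cp)} {unicodedata.name(chr(cp))}")
```

Comparing this against what iscntrl() returns for the same values in a
given locale would show whether a platform follows the Unicode
classification or simply flags everything >= 0x80.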