From unknown Sat Aug 09 09:33:17 2025
From: bug#16048 <16048@debbugs.gnu.org>
To: bug#16048 <16048@debbugs.gnu.org>
Subject: Status: 24.3.50; String compare surprise
Reply-To: bug#16048 <16048@debbugs.gnu.org>
Date: Sat, 09 Aug 2025 16:33:17 +0000
Content-Type: text/plain; charset=utf-8

retitle 16048 24.3.50; String compare surprise
reassign 16048 emacs
submitter 16048 michael.albinus@gmx.de
severity 16048 normal
thanks

From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 04 06:44:46 2013
From: michael.albinus@gmx.de
To: bug-gnu-emacs@gnu.org
Subject: 24.3.50; String compare surprise
Date: Wed, 04 Dec 2013 12:44:18 +0100
Content-Type: text/plain; charset=utf-8

The following form evals to nil:

  (string-equal "\377" "ÿ")

The character code of "ÿ" is

  Char: ÿ (255, #o377, #xff, file ...) point=244 of 5726 (4%) column=23

In GNU Emacs 24.3.50.10 (i686-pc-linux-gnu, GTK+ Version 2.24.10)
 of 2013-12-03 on uw001237
Bzr revision: 115361 rudalics@gmx.at-20131203074554-p6glzuiqh5zp4k97
Windowing system distributor `The X.Org Foundation', version 11.0.11103000
System Description: Ubuntu 12.04.3 LTS

Important settings:
  value of $LC_MONETARY: en_US.UTF-8
  value of $LC_NUMERIC: en_US.UTF-8
  value of $LC_TIME: en_US.UTF-8
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix
  default enable-multibyte-characters: t

Major mode: Group

Minor modes in effect:
  gnus-undo-mode: t
  erc-notify-mode: t
  erc-list-mode: t
  erc-menu-mode: t
  erc-autojoin-mode: t
  erc-ring-mode: t
  erc-networks-mode: t
  erc-pcomplete-mode: t
  erc-track-mode: t
  erc-match-mode: t
  erc-button-mode: t
  erc-fill-mode: t
  erc-stamp-mode: t
  erc-netsplit-mode: t
  erc-irccontrols-mode: t
  erc-noncommands-mode: t
  erc-move-to-prompt-mode: t
  erc-readonly-mode: t
  display-time-mode: t
  shell-dirtrack-mode: t
  iswitchb-mode: t
  icomplete-mode: t
  show-paren-mode: t
  tooltip-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  buffer-read-only: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
C-g y x r e p o r t

Recent messages:
Opening TLS connection to `imap.gmx.net'...done
Opening connection to imap.gmx.net...done
Reading active file via nnml...
Reading incoming mail from file...
nnml: Reading incoming mail (no new mail)...done
Reading active file via nnml...done
Reading active file via nndraft...done
nnimap read 0k from imap.gmx.net
Checking new news...done
Warning: Quit trying to open server nnimap+email.tieto.com

Load-path shadows:
/home/albinmic/src/elpa/packages/debbugs/debbugs hides /home/albinmic/.emacs.d/elpa/debbugs-0.5/debbugs
/home/albinmic/src/elpa/packages/debbugs/debbugs-gnu hides /home/albinmic/.emacs.d/elpa/debbugs-0.5/debbugs-gnu
/home/albinmic/src/elpa/packages/debbugs/debbugs-org hides /home/albinmic/.emacs.d/elpa/debbugs-0.5/debbugs-org
/home/albinmic/src/elpa/packages/debbugs/debbugs-pkg hides /home/albinmic/.emacs.d/elpa/debbugs-0.5/debbugs-pkg
/home/albinmic/src/elpa/packages/debbugs/debbugs-autoloads hides /home/albinmic/.emacs.d/elpa/debbugs-0.5/debbugs-autoloads
~/src/tramp/lisp/tramp-cache hides /home/albinmic/src/emacs/lisp/net/tramp-cache
~/src/tramp/lisp/tramp-cmds hides /home/albinmic/src/emacs/lisp/net/tramp-cmds
~/src/tramp/lisp/tramp-adb hides /home/albinmic/src/emacs/lisp/net/tramp-adb
~/src/tramp/lisp/trampver hides /home/albinmic/src/emacs/lisp/net/trampver
~/src/tramp/lisp/tramp-smb hides /home/albinmic/src/emacs/lisp/net/tramp-smb
~/src/tramp/lisp/tramp hides /home/albinmic/src/emacs/lisp/net/tramp
~/src/tramp/lisp/tramp-ftp hides /home/albinmic/src/emacs/lisp/net/tramp-ftp
~/src/tramp/lisp/tramp-gw hides /home/albinmic/src/emacs/lisp/net/tramp-gw
~/src/tramp/lisp/tramp-gvfs hides /home/albinmic/src/emacs/lisp/net/tramp-gvfs
~/src/tramp/lisp/tramp-uu hides /home/albinmic/src/emacs/lisp/net/tramp-uu
~/src/tramp/lisp/tramp-sh hides /home/albinmic/src/emacs/lisp/net/tramp-sh
~/src/tramp/lisp/tramp-compat hides /home/albinmic/src/emacs/lisp/net/tramp-compat
~/src/tramp/lisp/tramp-loaddefs hides /home/albinmic/src/emacs/lisp/net/tramp-loaddefs

Features:
(shadow sort mail-extr warnings emacsbug utf-7 nndraft nnmh nnml
gnus-agent gnus-srvr gnus-score score-mode nnvirtual gnus-msg gnus-art
mm-uu mml2015 epg-config mm-view mml-smime smime dig mailcap gnus-cache
gnus-sum network-stream starttls nnimap parse-time tls utf7 netrc
smtpmail sendmail gnus-demon nntp gnus-group gnus-undo nnmail
mail-source nnoo gnus-start gnus-spec gnus-int gnus-range message rfc822
mml mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 rfc2047
rfc2045 ietf-drums mailabbrev gmm-utils mailheader gnus-win gnus
gnus-ems nnheader mail-utils erc-notify erc-list erc-menu erc-join
erc-ring erc-networks erc-pcomplete erc-track erc-match erc-button
wid-edit erc-fill erc-stamp erc-netsplit erc-goodies erc erc-backend
erc-compat thingatpt pp cperl-mode info easymenu package time tramp
tramp-compat auth-source eieio byte-opt bytecomp byte-compile cconv
eieio-core gnus-util mm-util mail-prsvr password-cache tramp-loaddefs
cl-macs gv trampver shell pcomplete comint ansi-color ring format-spec
advice help-fns cl cl-loaddefs cl-lib iswitchb jka-compr icomplete paren
ps-print ps-def lpr vc vc-dispatcher dired time-date tooltip electric
uniquify ediff-hook vc-hooks lisp-float-type mwheel x-win x-dnd tool-bar
dnd fontset image regexp-opt fringe tabulated-list newcomment lisp-mode
prog-mode register page menu-bar rfn-eshadow timer select scroll-bar
mouse jit-lock font-lock syntax facemenu font-core frame cham georgian
utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
japanese hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese case-table epa-hook jka-cmpr-hook help simple abbrev
minibuffer nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote make-network-process
dbusbind gfilenotify dynamic-setting system-font-setting
font-render-setting move-toolbar gtk x-toolkit x multi-tty emacs)

From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 04 08:08:06 2013
From: Andreas Schwab
To: michael.albinus@gmx.de
Subject: Re: bug#16048: 24.3.50; String compare surprise
X-Yow: Are we THERE yet?
Date: Wed, 04 Dec 2013 14:07:59 +0100
In-Reply-To: (michael albinus's message of "Wed, 04 Dec 2013 12:44:18 +0100")
Message-ID: <87r49six1s.fsf@igel.home>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)
Content-Type: text/plain; charset=utf-8
Cc: 16048@debbugs.gnu.org

michael.albinus@gmx.de writes:

> The following form evals to nil:
>
>   (string-equal "\377" "ÿ")

"\377" is a unibyte string.  When converted to multibyte it yields
"\x3fffff".

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 04 09:01:21 2013
In-Reply-To: <87r49six1s.fsf@igel.home>
References: <87r49six1s.fsf@igel.home>
From: Josh
Date: Wed, 4 Dec 2013 06:00:46 -0800
Subject: Re: bug#16048: 24.3.50; String compare surprise
To: Andreas Schwab
Cc: 16048@debbugs.gnu.org, Michael Albinus
Content-Type: multipart/alternative; boundary=f46d04462eacc421bb04ecb5d8e5

--f46d04462eacc421bb04ecb5d8e5
Content-Type: text/plain; charset=ISO-8859-1

On Wed, Dec 4, 2013 at 5:07 AM, Andreas Schwab wrote:

> michael.albinus@gmx.de writes:
>
> > The following form evals to nil:
> >
> >   (string-equal "\377" "ÿ")
>
> "\377" is a unibyte string.  When converted to multibyte it yields
> "\x3fffff".

At least as of 24.3, the manual[0] suggests that such a conversion
should not occur in this case:

    You can also use hexadecimal escape sequences (`\xN') and octal
    escape sequences (`\N') in string constants.  *But beware:* If a
    string constant contains hexadecimal or octal escape sequences,
    and these escape sequences all specify unibyte characters (i.e.,
    less than 256), and there are no other literal non-ASCII
    characters or Unicode-style escape sequences in the string, then
    Emacs automatically assumes that it is a unibyte string.  That is
    to say, it assumes that all non-ASCII characters occurring in the
    string are 8-bit raw bytes.

[0] (info "(elisp) Non-ASCII in Strings")

Josh

--f46d04462eacc421bb04ecb5d8e5
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
--f46d04462eacc421bb04ecb5d8e5--

From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 04 09:05:20 2013
From: Michael Albinus
To: Andreas Schwab
Subject: Re: bug#16048: 24.3.50; String compare surprise
In-Reply-To: <87r49six1s.fsf@igel.home> (Andreas Schwab's message of "Wed, 04 Dec 2013 14:07:59 +0100")
References: <87r49six1s.fsf@igel.home>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)
Date: Wed, 04 Dec 2013 15:05:00 +0100
Message-ID: <878uw0ogoj.fsf@gmx.de>
Content-Type: text/plain; charset=utf-8
Cc: 16048-done@debbugs.gnu.org

Andreas Schwab writes:

> michael.albinus@gmx.de writes:
>
>> The following form evals to nil:
>>
>>   (string-equal "\377" "ÿ")
>
> "\377" is a unibyte string.  When converted to multibyte it yields
> "\x3fffff".

Ah, well. In `dbus-unescape-from-identifier', there is

  (lambda (x) (format "%c" (string-to-number (substring x 1) 16)))

If I replace it by

  (lambda (x) (byte-to-string (string-to-number (substring x 1) 16)))

everything works fine.

> Andreas.

Thanks, and best regards, Michael (writing dbus-tests.el).

From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 04 12:30:12 2013
Date: Wed, 04 Dec 2013 19:29:30 +0200
From: Eli Zaretskii
Subject: Re: bug#16048: 24.3.50; String compare surprise
To: Josh
Message-id: <831u1s4j9h.fsf@gnu.org>
Content-type: text/plain; charset=iso-8859-1
References: <87r49six1s.fsf@igel.home>
Cc: 16048@debbugs.gnu.org, michael.albinus@gmx.de, schwab@linux-m68k.org
Reply-To: Eli Zaretskii

> From: Josh
> Date: Wed, 4 Dec 2013 06:00:46 -0800
> Cc: Michael Albinus, 16048@debbugs.gnu.org
>
> On Wed, Dec 4, 2013 at 5:07 AM, Andreas Schwab wrote:
>
> > michael.albinus@gmx.de writes:
> >
> > > The following form evals to nil:
> > >
> > >   (string-equal "\377" "ÿ")
> >
> > "\377" is a unibyte string.  When converted to multibyte it yields
> > "\x3fffff".
>
> At least as of 24.3, the manual[0] suggests that such a conversion
> should not occur in this case:

And it doesn't occur, indeed:

  (multibyte-string-p "\377")

    => nil

>     You can also use hexadecimal escape sequences (`\xN') and octal
>     escape sequences (`\N') in string constants.  *But beware:* If a
>     string constant contains hexadecimal or octal escape sequences,
>     and these escape sequences all specify unibyte characters (i.e.,
>     less than 256), and there are no other literal non-ASCII
>     characters or Unicode-style escape sequences in the string, then
>     Emacs automatically assumes that it is a unibyte string.  That is
>     to say, it assumes that all non-ASCII characters occurring in the
>     string are 8-bit raw bytes.
>
> [0] (info "(elisp) Non-ASCII in Strings")

Best citation contest?  You're on!

 -- Function: string= string1 string2
     This function returns `t' if the characters of the two strings
     match exactly.  Symbols are also allowed as arguments, in which
     case the symbol names are used.  Case is always significant,
     regardless of `case-fold-search'.

     [...]

     For technical reasons, a unibyte and a multibyte string are
     `equal' if and only if they contain the same sequence of character
     codes and all these codes are either in the range 0 through 127
     (ASCII) or 160 through 255 (`eight-bit-graphic').  However, when a
     unibyte string is converted to a multibyte string, all characters
     with codes in the range 160 through 255 are converted to
     characters with higher codes, whereas ASCII characters remain
     unchanged.
     Thus, a unibyte string and its conversion to multibyte are only
     `equal' if the string is all ASCII.

Note the last sentence.

From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 04 12:34:44 2013
Date: Wed, 04 Dec 2013 19:34:37 +0200
From: Eli Zaretskii
Subject: Re: bug#16048: 24.3.50; String compare surprise
In-reply-to: <878uw0ogoj.fsf@gmx.de>
To: Michael Albinus
Message-id: <83zjog34gi.fsf@gnu.org>
References: <87r49six1s.fsf@igel.home> <878uw0ogoj.fsf@gmx.de>
Cc: 16048@debbugs.gnu.org
Reply-To: Eli Zaretskii

> From: Michael Albinus
> Date: Wed, 04 Dec 2013 15:05:00 +0100
> Cc: 16048-done@debbugs.gnu.org
>
> Ah, well. In `dbus-unescape-from-identifier', there is
>
>   (lambda (x) (format "%c" (string-to-number (substring x 1) 16)))
>
> If I replace it by
>
>   (lambda (x) (byte-to-string (string-to-number (substring x 1) 16)))
>
> everything works fine.

Beware: byte-to-string returns a unibyte string.  You do NOT want
unibyte strings in your application code.  The problem you had that
started this thread is a very good demonstration why.

So I would leave dbus-unescape-from-identifier intact, and instead fix
the other side of the string comparison, the one that yields "\377".

From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 04 14:12:18 2013
From: Stefan Monnier
To: Eli Zaretskii
Subject: Re: bug#16048: 24.3.50; String compare surprise
References: <87r49six1s.fsf@igel.home> <878uw0ogoj.fsf@gmx.de> <83zjog34gi.fsf@gnu.org>
Date: Wed, 04 Dec 2013 14:12:15 -0500
In-Reply-To: <83zjog34gi.fsf@gnu.org> (Eli Zaretskii's message of "Wed, 04 Dec 2013 19:34:37 +0200")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)
Content-Type: text/plain
Cc: 16048@debbugs.gnu.org, Michael Albinus

> Beware: byte-to-string returns a unibyte string.  You do NOT want
> unibyte strings in your application code.

IIUC this is dbus code, so it likely handles marshalled data, which
often has to manage bytes rather than chars, so a unibyte string might
be the right thing.

        Stefan "who doesn't actually know what he's talking about"

From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 04 15:14:17 2013
In-Reply-To: <831u1s4j9h.fsf@gnu.org>
References: <87r49six1s.fsf@igel.home> <831u1s4j9h.fsf@gnu.org>
From: Josh
Date: Wed, 4 Dec 2013 12:13:42 -0800
Subject: Re: bug#16048: 24.3.50; String compare surprise
To: Eli Zaretskii
Cc: 16048@debbugs.gnu.org, Michael Albinus, Andreas Schwab
Content-Type: multipart/alternative; boundary=f46d043bdfa2761e2304ecbb0e81

--f46d043bdfa2761e2304ecbb0e81
Content-Type: text/plain; charset=ISO-8859-1

On Wed, Dec 4, 2013 at 9:29 AM, Eli Zaretskii wrote:
> > From: Josh
> > Date: Wed, 4 Dec 2013 06:00:46 -0800
> > Cc: Michael Albinus, 16048@debbugs.gnu.org
> > On Wed, Dec 4, 2013 at 5:07 AM, Andreas Schwab wrote:
> > > michael.albinus@gmx.de writes:
> > >
> > > > The following form evals to nil:
> > > >
> > > >   (string-equal "\377" "ÿ")
> > >
> > > "\377" is a unibyte string.  When converted to multibyte it yields
> > > "\x3fffff".
> >
> >
> > At least as of 24.3, the manual[0] suggests that such a conversion
> > should not occur in this case:
> And it doesn't occur, indeed:
>
>   (multibyte-string-p "\377")
>
>     => nil
>
> >     You can also use hexadecimal escape sequences (`\xN') and octal
> >     escape sequences (`\N') in string constants.  *But beware:* If a
> >     string constant contains hexadecimal or octal escape sequences,
> >     and these escape sequences all specify unibyte characters (i.e.,
> >     less than 256), and there are no other literal non-ASCII
> >     characters or Unicode-style escape sequences in the string, then
> >     Emacs automatically assumes that it is a unibyte string.  That is
> >     to say, it assumes that all non-ASCII characters occurring in the
> >     string are 8-bit raw bytes.
> >
> > [0] (info "(elisp) Non-ASCII in Strings")
> Best citation contest? you're on!

No, thanks.  I haven't entered such contests in many years.

>    -- Function: string= string1 string2
>        This function returns `t' if the characters of the two strings
>        match exactly.  Symbols are also allowed as arguments, in which
>        case the symbol names are used.  Case is always significant,
>        regardless of `case-fold-search'.
>
>    [...]
>
>        For technical reasons, a unibyte and a multibyte string are
>        `equal' if and only if they contain the same sequence of character
>        codes and all these codes are either in the range 0 through 127
>        (ASCII) or 160 through 255 (`eight-bit-graphic').  However, when a
>        unibyte string is converted to a multibyte string, all characters
>        with codes in the range 160 through 255 are converted to
>        characters with higher codes, whereas ASCII characters remain
>        unchanged.  Thus, a unibyte string and its conversion to multibyte
>        are only `equal' if the string is all ASCII.
>
> Note the last sentence.

Yes, I must have misunderstood Andreas' meaning; I believed he was
suggesting that the two strings compared differently due to "\377"
having been converted to a multibyte string and therefore miscomparing
with the unibyte (or so I thought) string "ÿ".  I see now that I had it
exactly backwards.  Thanks for setting me straight.

--f46d043bdfa2761e2304ecbb0e81
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
--f46d043bdfa2761e2304ecbb0e81--

From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 05 02:52:14 2013
Received: (at 16048) by debbugs.gnu.org; 5 Dec 2013 07:52:14 +0000
Received: from localhost ([127.0.0.1]:58897 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1VoTjJ-00035E-2N for submit@debbugs.gnu.org; Thu, 05 Dec 2013 02:52:13 -0500
Received: from mout.gmx.net ([212.227.15.19]:56665) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1VoTj4-00034W-Nm for 16048@debbugs.gnu.org; Thu, 05 Dec 2013 02:52:08 -0500
Received: from detlef.gmx.de ([87.146.37.149]) by mail.gmx.com (mrgmx003) with ESMTPS (Nemesis) id 0LpKrt-1VLBFJ26ZJ-00fBUO for <16048@debbugs.gnu.org>; Thu, 05 Dec 2013 08:51:57 +0100
From: Michael Albinus
To: Stefan Monnier
Subject: Re: bug#16048: 24.3.50; String compare surprise
References: <87r49six1s.fsf@igel.home> <878uw0ogoj.fsf@gmx.de> <83zjog34gi.fsf@gnu.org>
Date: Thu, 05 Dec 2013 08:51:41 +0100
In-Reply-To: (Stefan Monnier's message of "Wed, 04 Dec 2013 14:12:15 -0500")
Message-ID: <87vbz3ybua.fsf@gmx.de>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Provags-ID: V03:K0:R0yGO0sdT9xY8VP6tuzuzGVLPxfCzYyVDnxtRAZ3/Jr0wjKR05s
 y2GHqXTjZlBsTFAtsutlZgeCwlThYvr+4ScUitXkyuIXxdVzt/zpOCR+DkwHZLDMVjiW+YI
 NBgHZlG8NZ/YkUuo2oqBarDZkx+IczXludzQzi8B0xg0TBhLyV6CUw+/HWikJwByCEP4+9R
 afXllyBDfsMCjZNGBEVsw==
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 16048
Cc: Eli Zaretskii , 16048@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: 0.0 (/)

Stefan Monnier writes:

>> Beware: byte-to-string returns a unibyte string. You do NOT want
>> unibyte strings in your application code.
>
> IIUC this is dbus code, so it likely handles marshalled data, which
> often has to manage bytes rather than chars, so a unibyte string might
> be the right thing.

Indeed. My ert test case is

  (should
   (string-equal
    (dbus-unescape-from-identifier
     (dbus-escape-as-identifier "0123abc_xyz\x01\xff"))
    "0123abc_xyz\x01\xff"))

`dbus-unescape-from-identifier' cannot know whether the original string
was unibyte or multibyte. So it must decide on one, and unibyte seems
to be the better decision.

I will add to the docstring of `dbus-unescape-from-identifier' that it
always returns a unibyte string.

> Stefan "who doesn't actually know what he's talking about"

Best regards, Michael.

From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 05 12:39:11 2013
Received: (at 16048) by debbugs.gnu.org; 5 Dec 2013 17:39:11 +0000
Received: from localhost ([127.0.0.1]:60157 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1VoctK-00021K-Ek for submit@debbugs.gnu.org; Thu, 05 Dec 2013 12:39:10 -0500
Received: from mtaout23.012.net.il ([80.179.55.175]:46571) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1VoctG-000219-0y for 16048@debbugs.gnu.org; Thu, 05 Dec 2013 12:39:07 -0500
Received: from conversion-daemon.a-mtaout23.012.net.il by a-mtaout23.012.net.il (HyperSendmail v2007.08) id <0MXC00B00GVAR700@a-mtaout23.012.net.il> for 16048@debbugs.gnu.org; Thu, 05 Dec 2013 19:39:04 +0200 (IST)
Received: from HOME-C4E4A596F7 ([87.69.4.28]) by a-mtaout23.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0MXC00BLAH14OQ60@a-mtaout23.012.net.il>; Thu, 05 Dec 2013 19:39:04 +0200 (IST)
Date: Thu, 05 Dec 2013 19:38:51 +0200
From: Eli Zaretskii
Subject: Re: bug#16048: 24.3.50; String compare surprise
In-reply-to: <87vbz3ybua.fsf@gmx.de>
X-012-Sender: halo1@inter.net.il
To: Michael Albinus
Message-id: <83k3fj2o5w.fsf@gnu.org>
References: <87r49six1s.fsf@igel.home> <878uw0ogoj.fsf@gmx.de> <83zjog34gi.fsf@gnu.org> <87vbz3ybua.fsf@gmx.de>
X-Spam-Score: 1.0 (+)
X-Debbugs-Envelope-To: 16048
Cc: 16048@debbugs.gnu.org, monnier@iro.umontreal.ca
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
Reply-To: Eli Zaretskii
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: 1.0 (+)

> From: Michael Albinus
> Cc: Eli Zaretskii , 16048@debbugs.gnu.org
> Date: Thu, 05 Dec 2013 08:51:41 +0100
>
> > IIUC this is dbus code, so it likely handles marshalled data, which
> > often has to manage bytes rather than chars, so a unibyte string might
> > be the right thing.
>
> Indeed. My ert test case is
>
>   (should
>    (string-equal
>     (dbus-unescape-from-identifier
>      (dbus-escape-as-identifier "0123abc_xyz\x01\xff"))
>     "0123abc_xyz\x01\xff"))

FWIW, I don't see anything in this snippet that requires unibyte
strings.

Just to make it clear: Emacs is perfectly capable of holding raw bytes
in multibyte strings. That's why we have the eight-bit charset.
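
Annotation (not part of the original message): a minimal sketch of the
two representations discussed above, assuming a reasonably recent
Emacs. It uses only the built-ins `multibyte-string-p',
`string-to-multibyte', `string-to-unibyte' and `string-equal'; the
function name `example-unibyte-vs-multibyte' is made up for
illustration. The literal is unibyte because its only non-ASCII
content is a hex escape below 256. Converting it to multibyte keeps
the raw byte as a character of the eight-bit charset, so the two forms
stop comparing equal until one of them is converted back.

```elisp
(defun example-unibyte-vs-multibyte ()
  "Return a list of four checks on a unibyte literal and its multibyte form.
Hypothetical helper for illustration only."
  (let* ((uni "0123abc_xyz\x01\xff")         ; unibyte: only escapes below 256
         (multi (string-to-multibyte uni)))  ; raw byte kept as an eight-bit char
    (list (multibyte-string-p uni)                        ; nil
          (multibyte-string-p multi)                      ; t
          (string-equal uni multi)                        ; nil
          (string-equal uni (string-to-unibyte multi))))) ; t

(example-unibyte-vs-multibyte)
```

Either representation can carry the bytes; what matters for a test such
as the one quoted above is that both arguments to `string-equal' use
the same one.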

From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 05 14:11:15 2013
Received: (at 16048) by debbugs.gnu.org; 5 Dec 2013 19:11:15 +0000
Received: from localhost ([127.0.0.1]:60330 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1VoeKR-0004NU-6f for submit@debbugs.gnu.org; Thu, 05 Dec 2013 14:11:15 -0500
Received: from ironport2-out.teksavvy.com ([206.248.154.181]:49771) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1VoeKP-0004NM-1l for 16048@debbugs.gnu.org; Thu, 05 Dec 2013 14:11:13 -0500
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: Av8EABK/CFFFxL6g/2dsb2JhbABEuzWDWRdzgh4BAQQBViMFCws0EhQYDSSIHgbBLZEKA4hhnBmBXoMV
X-IPAS-Result: Av8EABK/CFFFxL6g/2dsb2JhbABEuzWDWRdzgh4BAQQBViMFCws0EhQYDSSIHgbBLZEKA4hhnBmBXoMV
X-IronPort-AV: E=Sophos;i="4.84,565,1355115600"; d="scan'208";a="41233885"
Received: from 69-196-190-160.dsl.teksavvy.com (HELO pastel.home) ([69.196.190.160]) by ironport2-out.teksavvy.com with ESMTP/TLS/ADH-AES256-SHA; 05 Dec 2013 14:11:12 -0500
Received: by pastel.home (Postfix, from userid 20848) id E3C5060379; Thu, 5 Dec 2013 14:11:11 -0500 (EST)
From: Stefan Monnier
To: Eli Zaretskii
Subject: Re: bug#16048: 24.3.50; String compare surprise
Message-ID:
References: <87r49six1s.fsf@igel.home> <878uw0ogoj.fsf@gmx.de> <83zjog34gi.fsf@gnu.org> <87vbz3ybua.fsf@gmx.de> <83k3fj2o5w.fsf@gnu.org>
Date: Thu, 05 Dec 2013 14:11:11 -0500
In-Reply-To: <83k3fj2o5w.fsf@gnu.org> (Eli Zaretskii's message of "Thu, 05 Dec 2013 19:38:51 +0200")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: 0.3 (/)
X-Debbugs-Envelope-To: 16048
Cc: 16048@debbugs.gnu.org, Michael Albinus
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: 0.3 (/)

> Just to make it clear: Emacs is perfectly capable of holding raw bytes
> in multibyte strings. That's why we have the eight-bit charset.

When manipulating sequences of bytes (as opposed to sequences of chars),
I find it is preferable to use unibyte strings. Indeed, multibyte
strings can work as well, but they can be more tricky to work with
since `aref' returns an "eight-bit byte" character rather than a value
between 128 and 255.

Of course, if your string can contain a mix of bytes and chars, you
don't have a choice.

        Stefan

From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 05 14:19:08 2013
Received: (at 16048) by debbugs.gnu.org; 5 Dec 2013 19:19:08 +0000
Received: from localhost ([127.0.0.1]:60336 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1VoeS4-0004ZJ-81 for submit@debbugs.gnu.org; Thu, 05 Dec 2013 14:19:08 -0500
Received: from mtaout20.012.net.il ([80.179.55.166]:63057) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1VoeS1-0004Z7-Eq for 16048@debbugs.gnu.org; Thu, 05 Dec 2013 14:19:06 -0500
Received: from conversion-daemon.a-mtaout20.012.net.il by a-mtaout20.012.net.il (HyperSendmail v2007.08) id <0MXC00A00LDUZD00@a-mtaout20.012.net.il> for 16048@debbugs.gnu.org; Thu, 05 Dec 2013 21:19:03 +0200 (IST)
Received: from HOME-C4E4A596F7 ([87.69.4.28]) by a-mtaout20.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0MXC00AZKLNRQO90@a-mtaout20.012.net.il>; Thu, 05 Dec 2013 21:19:03 +0200 (IST)
Date: Thu, 05 Dec 2013 21:18:50 +0200
From: Eli Zaretskii
Subject: Re: bug#16048: 24.3.50; String compare surprise
In-reply-to:
X-012-Sender: halo1@inter.net.il
To: Stefan Monnier
Message-id: <838uvz2jj9.fsf@gnu.org>
References: <87r49six1s.fsf@igel.home> <878uw0ogoj.fsf@gmx.de> <83zjog34gi.fsf@gnu.org> <87vbz3ybua.fsf@gmx.de> <83k3fj2o5w.fsf@gnu.org>
X-Spam-Score: 1.0 (+)
X-Debbugs-Envelope-To: 16048
Cc: 16048@debbugs.gnu.org, michael.albinus@gmx.de
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
Reply-To: Eli Zaretskii
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: 1.0 (+)

> From: Stefan Monnier
> Cc: Michael Albinus , 16048@debbugs.gnu.org
> Date: Thu, 05 Dec 2013 14:11:11 -0500
>
> Of course, if your string can contain a mix of bytes and chars, you
> don't have a choice.

"0123abc_xyz\x01\xff" looks just such a mix to me.

From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 05 14:22:29 2013
Received: (at 16048) by debbugs.gnu.org; 5 Dec 2013 19:22:29 +0000
Received: from localhost ([127.0.0.1]:60351 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1VoeVI-0004f9-6V for submit@debbugs.gnu.org; Thu, 05 Dec 2013 14:22:28 -0500
Received: from mout.gmx.net ([212.227.17.21]:54167) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1VoeVF-0004ez-VR for 16048@debbugs.gnu.org; Thu, 05 Dec 2013 14:22:27 -0500
Received: from detlef.gmx.de ([87.146.37.149]) by mail.gmx.com (mrgmx002) with ESMTPS (Nemesis) id 0MXZw6-1W3S3d3Pbq-00WYBU for <16048@debbugs.gnu.org>; Thu, 05 Dec 2013 20:22:24 +0100
From: Michael Albinus
To: Stefan Monnier
Subject: Re: bug#16048: 24.3.50; String compare surprise
References: <87r49six1s.fsf@igel.home> <878uw0ogoj.fsf@gmx.de> <83zjog34gi.fsf@gnu.org> <87vbz3ybua.fsf@gmx.de> <83k3fj2o5w.fsf@gnu.org>
Date: Thu, 05 Dec 2013 20:22:08 +0100
In-Reply-To: (Stefan Monnier's message of "Thu, 05 Dec 2013 14:11:11 -0500")
Message-ID: <878uvz3xy7.fsf@gmx.de>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Provags-ID: V03:K0:P8P55gXn6Pd6H8b6OJKu+f/arWXGjVHxOM8mdFbvj+IqlEqtYJ6
 NGaj0wWLbgGCA6M08KQWmPYZcY9jkkNm8Rb/Qcf0/qNbmerZrg4pKnqMLvyAdyz5IVGZCgG
 JKHvxjF5GiB2V/iPrqnK5mxcYkA7y7ZFJzISRcABjJmdCj8HoxBfHQBiSOxZBEoNMDN7iZf
 QRGt7ThnkLNlDALeRoQGw==
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 16048
Cc: Eli Zaretskii , 16048@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: 0.0 (/)

Stefan Monnier writes:

>> Just to make it clear: Emacs is perfectly capable of holding raw bytes
>> in multibyte strings. That's why we have the eight-bit charset.
>
> When manipulating sequences of bytes (as opposed to sequences of chars),
> I find it is preferable to use unibyte strings.

We are speaking about functions of dbus.el, which convert a string into
something with a C-style identifier syntax, and back. Nothing I would
expect to be a multibyte string in real life. (Agreed, my example looks
strange, but this is for the hard test in dbus-tests.el)

> Of course, if your string can contain a mix of bytes and chars, you
> don't have a choice.

For the other function in dbus.el, which handles arrays of bytes (often
used to marshal whatever strings are on the wire), I added earlier
today the possibility to encode them as unibyte or multibyte. As above,
the function `dbus-byte-array-to-string' cannot itself decide how to
interpret the bytestream, so the caller is requested to decide.

> Stefan

Best regards, Michael.
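
Annotation (not part of the original message): a sketch of letting the
caller choose the representation, as described above.
`example-bytes-to-string' is a hypothetical helper, not the dbus.el
function, and keeping the multibyte case as raw bytes via
`string-to-multibyte' is an assumption made for illustration; dbus.el
may interpret the bytes differently. The second form shows the `aref'
difference mentioned in the quoted text, using only the built-ins
`unibyte-string', `string-to-multibyte' and `aref'.

```elisp
(defun example-bytes-to-string (bytes &optional multibyte)
  "Return BYTES, a list of integers in the range 0..255, as a string.
The result is unibyte unless MULTIBYTE is non-nil, in which case
bytes above 127 are kept as raw eight-bit characters.
Hypothetical helper for illustration only."
  (let ((s (apply #'unibyte-string bytes)))
    (if multibyte (string-to-multibyte s) s)))

(let ((bytes '(97 98 255)))
  (list (aref (example-bytes-to-string bytes) 2)      ; 255
        (aref (example-bytes-to-string bytes t) 2)))  ; #x3FFFFF
```

With the unibyte result the element is the byte value itself; with the
multibyte result it is the corresponding eight-bit character, which
code has to convert back before treating it as a byte.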

From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 05 14:24:40 2013
Received: (at 16048) by debbugs.gnu.org; 5 Dec 2013 19:24:40 +0000
Received: from localhost ([127.0.0.1]:60356 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1VoeXQ-0004iV-6Y for submit@debbugs.gnu.org; Thu, 05 Dec 2013 14:24:40 -0500
Received: from mout.gmx.net ([212.227.17.20]:56610) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1VoeXO-0004iN-OA for 16048@debbugs.gnu.org; Thu, 05 Dec 2013 14:24:39 -0500
Received: from detlef.gmx.de ([87.146.37.149]) by mail.gmx.com (mrgmx001) with ESMTPS (Nemesis) id 0LymHf-1VUONm18ad-016ACv for <16048@debbugs.gnu.org>; Thu, 05 Dec 2013 20:24:38 +0100
From: Michael Albinus
To: Eli Zaretskii
Subject: Re: bug#16048: 24.3.50; String compare surprise
References: <87r49six1s.fsf@igel.home> <878uw0ogoj.fsf@gmx.de> <83zjog34gi.fsf@gnu.org> <87vbz3ybua.fsf@gmx.de> <83k3fj2o5w.fsf@gnu.org> <838uvz2jj9.fsf@gnu.org>
Date: Thu, 05 Dec 2013 20:24:22 +0100
In-Reply-To: <838uvz2jj9.fsf@gnu.org> (Eli Zaretskii's message of "Thu, 05 Dec 2013 21:18:50 +0200")
Message-ID: <874n6n3xuh.fsf@gmx.de>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Provags-ID: V03:K0:8WlFbaUZOXH4JwN8/9snt6JWnJ7hC+1ab/Sqc7QcW8E00cWFkd6
 3qH7ddxZ/R8e3+w/55Bf/F8AT2YALB5suVqSrRd1qXn6IVFGepDWQ7XsxoXi0jfLYL9jC8y
 bOMoQ8lZWXi88GqE5MOKoxi7oCXZuMA0euO65jbK1zEJocA43VI27Vmvs7PL4ZQ0clJVtV7
 px4tliZyb/vvSn3cNpJFQ==
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 16048
Cc: 16048@debbugs.gnu.org, Stefan Monnier
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: 0.0 (/)

Eli Zaretskii writes:

>> Of course, if your string can contain a mix of bytes and chars, you
>> don't have a choice.
>
> "0123abc_xyz\x01\xff" looks just such a mix to me.

It is a hard-core example, not something from the wild. In practice, I
rather expect ASCII strings to be handled.

Best regards, Michael.

From unknown Sat Aug 09 09:33:17 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Fri, 03 Jan 2014 12:24:04 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator