From unknown Mon Jun 23 20:16:25 2025
From: bug#10919 <10919@debbugs.gnu.org>
To: bug#10919 <10919@debbugs.gnu.org>
Subject: Status: emacs-mule/utf-8 difference
Reply-To: bug#10919 <10919@debbugs.gnu.org>
Date: Tue, 24 Jun 2025 03:16:25 +0000

retitle 10919 emacs-mule/utf-8 difference
reassign 10919 emacs
submitter 10919 Tiphaine Turpin
severity 10919 normal
thanks

From debbugs-submit-bounces@debbugs.gnu.org Thu Mar 01 10:41:02 2012
Message-ID: <4F4F984D.2000901@inria.fr>
Date: Thu, 01 Mar 2012 16:39:57 +0100
From: Tiphaine Turpin
To: bug-gnu-emacs@gnu.org
Subject: emacs-mule/utf-8 difference

Hi,

I have a problem regarding coding systems:

I'm using process-send-string to send substrings of a buffer through a
socket, after setting the process encoding and decoding systems to
emacs-mule.
I expect the number of bytes written to match the byte-length of the
substring as obtained by position-bytes, since the specification of
position-bytes in emacs-devel is to always work with the emacs-mule
encoding. From emacs-devel:

"The byte sequence of a buffer after decoded is always in emacs-mule
(in emacs-unicode-2 branch, it's utf-8). So, changing
buffer-file-coding-system or any other coding-system-related variables
doesn't affect position-bytes."

However, this is not the case with 3-byte utf-8 characters:
position-bytes counts them as 3 bytes, but process-send-string writes
4 bytes.

Setting the process coding systems for the socket to utf-8 solves the
problem, but I don't think it will with other coding systems, even if
I used buffer-file-coding-system instead, since position-bytes does
not use it.

What is the real expected behavior of these things, and how can I make
this correct?

Regards,

Tiphaine Turpin

From debbugs-submit-bounces@debbugs.gnu.org Thu Mar 01 10:49:09 2012
Message-ID: <4F4F9A4E.50506@inria.fr>
Date: Thu, 01 Mar 2012 16:48:30 +0100
From: Tiphaine Turpin
To: 10919@debbugs.gnu.org
Subject: Re: emacs-mule/utf-8 difference
References: <4F4F984D.2000901@inria.fr>
In-Reply-To: <4F4F984D.2000901@inria.fr>

I just found a solution which seems to work: using emacs-internal
instead of emacs-mule. So it seems to be just a documentation problem
(or a problem with my reading of it).

Tiphaine

On 01/03/2012 16:39, Tiphaine Turpin wrote:
> Hi,
>
> I have a problem regarding coding systems:
>
> I'm using process-send-string to send substrings of a buffer through a
> socket, after setting the process encoding and decoding systems to
> emacs-mule.
> I expect the number of bytes written to match the byte-length of the
> substring as obtained by position-bytes, since the specification of
> position-bytes in emacs-devel is to always work with the emacs-mule
> encoding. From emacs-devel:
>
> "The byte sequence of a buffer after decoded is always in emacs-mule
> (in emacs-unicode-2 branch, it's utf-8). So, changing
> buffer-file-coding-system or any other coding-system-related variables
> doesn't affect position-bytes."
>
> However, this is not the case with 3-byte utf-8 characters:
> position-bytes counts them as 3 bytes, but process-send-string writes
> 4 bytes.
>
> Setting the process coding systems for the socket to utf-8 solves the
> problem, but I don't think it will with other coding systems, even if
> I used buffer-file-coding-system instead, since position-bytes does
> not use it.
>
> What is the real expected behavior of these things, and how can I make
> this correct?
>
> Regards,
>
> Tiphaine Turpin
>

From debbugs-submit-bounces@debbugs.gnu.org Thu Mar 01 12:45:54 2012
From: Stefan Monnier
To: Tiphaine Turpin
Cc: 10919@debbugs.gnu.org
Subject: Re: bug#10919: emacs-mule/utf-8 difference
References: <4F4F984D.2000901@inria.fr> <4F4F9A4E.50506@inria.fr>
Date: Thu, 01 Mar 2012 12:45:00 -0500
In-Reply-To: <4F4F9A4E.50506@inria.fr> (Tiphaine Turpin's message of "Thu, 01 Mar 2012 16:48:30 +0100")

> I just found a solution which seems to work: using emacs-internal instead of
> emacs-mule. So it seems to be just a documentation problem (or a problem
> with my reading of it).

emacs-mule was the internal representation in Emacs<23; now the
internal representation is a variant of utf-8.
So position-bytes in Emacs<23 should be consistent with emacs-mule, but
in Emacs≥23 it is only consistent with emacs-internal (or utf-8).

        Stefan

From debbugs-submit-bounces@debbugs.gnu.org Thu Mar 01 12:53:21 2012
Date: Thu, 01 Mar 2012 19:54:48 +0200
From: Eli Zaretskii
Subject: Re: bug#10919: emacs-mule/utf-8 difference
In-reply-to: <4F4F984D.2000901@inria.fr>
To: Tiphaine Turpin
Cc: 10919-done@debbugs.gnu.org
Message-id: <83399scil3.fsf@gnu.org>
References: <4F4F984D.2000901@inria.fr>
Reply-To: Eli Zaretskii

> Date: Thu, 01 Mar 2012 16:39:57 +0100
> From: Tiphaine Turpin
>
> From emacs-devel:
>
> "The byte sequence of a buffer after decoded is always in emacs-mule (in
> emacs-unicode-2 branch, it's utf-8).

This is very old info. The emacs-unicode-2 branch was merged with the
mainline when Emacs 23.1 was released.

> So, changing
> buffer-file-coding-system or any other coding-system-related variables
> doesn't affect position-bytes."
>
> However, this is not the case with 3-byte utf-8 characters:
> position-bytes counts them as 3 bytes, but process-send-string writes 4
> bytes.

process-send-string _encodes_ the string; it does not send the
internal representation of the string in the buffer. Using
process-send-string is like writing the string to a disk file: Emacs
encodes it before sending or writing. Therefore,
buffer-file-coding-system _does_ affect what is being sent.

I'm closing this non-bug.

From unknown Mon Jun 23 20:16:25 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Fri, 30 Mar 2012 11:24:02 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator
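
Note on the thread: a minimal Emacs Lisp sketch of the distinction drawn in the replies. position-bytes measures the buffer's internal representation, while process-send-string sends the text after encoding it, so the byte count to compare against is that of the encoded string. The helpers my-internal-byte-length, my-encoded-byte-length and my-process-byte-length are hypothetical names introduced here for illustration; they are not part of Emacs and were not posted in the thread. The built-ins they call (position-bytes, encode-coding-string, string-bytes, process-coding-system) are standard.

```elisp
;;; Hypothetical helpers (not part of Emacs) illustrating the thread.

(defun my-internal-byte-length (start end)
  "Bytes occupied by START..END in the buffer's internal representation."
  (- (position-bytes end) (position-bytes start)))

(defun my-encoded-byte-length (start end coding-system)
  "Bytes occupied by START..END once encoded with CODING-SYSTEM."
  ;; `encode-coding-string' returns a unibyte string, so `string-bytes'
  ;; gives the number of bytes that would be written.
  (string-bytes
   (encode-coding-string (buffer-substring-no-properties start end)
                         coding-system)))

(defun my-process-byte-length (process start end)
  "Bytes `process-send-string' would send to PROCESS for START..END."
  ;; `process-coding-system' returns (DECODING . ENCODING); the cdr is
  ;; the coding system applied to outgoing text.
  (my-encoded-byte-length start end
                          (cdr (process-coding-system process))))

;; Compare the counts for text containing U+20AC (EURO SIGN), which
;; takes three bytes in UTF-8.
(with-temp-buffer
  (insert "a" #x20ac "z")
  (list (my-internal-byte-length (point-min) (point-max))
        (my-encoded-byte-length (point-min) (point-max) 'utf-8)
        (my-encoded-byte-length (point-min) (point-max) 'emacs-mule)))
```

Going by the replies above, in Emacs 23 and later the first two counts should agree, while the emacs-mule count may differ. Code that needs the on-the-wire length would therefore measure the encoded string, as my-process-byte-length does, instead of relying on position-bytes.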