GNU bug report logs - #10919: emacs-mule/utf-8 difference
[Message part 1 (text/plain, inline)]
Your message dated Thu, 01 Mar 2012 19:54:48 +0200
with message-id <83399scil3.fsf <at> gnu.org>
and subject line Re: bug#10919: emacs-mule/utf-8 difference
has caused the debbugs.gnu.org bug report #10919,
regarding emacs-mule/utf-8 difference
to be marked as done.
(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)
--
10919: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=10919
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
Hi,
I have a problem regarding coding systems:
I'm using process-send-string to send substrings of a buffer through a
socket, after setting the process encoding and decoding systems to
emacs-mule.
I expect the number of bytes written to match the byte-length of the
substring as obtained by position-bytes, since the specification of
position-bytes in emacs-devel is to always work with the emacs-mule
encoding. From emacs-devel:
"The byte sequence of a buffer after decoded is always in emacs-mule (in
emacs-unicode-2 branch, it's utf-8). So, changing
buffer-file-coding-system or any other coding-system-related variables
doesn't affects position-bytes."
However, this is not the case with 3-byte UTF-8 characters:
position-bytes counts them as 3 bytes, but process-send-string writes 4
bytes.
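For illustration, the internal byte length of a region can be measured roughly as follows. This is a minimal sketch, assuming Emacs 23 or later (where the internal representation is UTF-8 based); the helper name my-region-internal-bytes is hypothetical and not part of Emacs.

```elisp
;; Hypothetical helper (not part of Emacs): byte length of the text
;; between START and END in the buffer's internal representation.
(defun my-region-internal-bytes (start end)
  "Return the number of internal bytes between positions START and END."
  (- (position-bytes end) (position-bytes start)))

;; "a" takes one byte internally; EURO SIGN (U+20AC) is a 3-byte
;; UTF-8 character, so position-bytes advances by three for it.
(with-temp-buffer
  (insert "a\u20ac")
  (my-region-internal-bytes (point-min) (point-max)))
```

This only measures the in-buffer representation. It says nothing about how many bytes end up on the socket.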
Setting the process coding systems for the socket to utf-8 solves the
problem, but I don't think it will with other coding systems, even if I
used buffer-file-coding-system instead, since position-bytes does not
use it.
What is the actual expected behavior here, and how can I make this work
correctly?
Regards,
Tiphaine Turpin
[Message part 3 (message/rfc822, inline)]
> Date: Thu, 01 Mar 2012 16:39:57 +0100
> From: Tiphaine Turpin <tiphaine.turpin <at> inria.fr>
>
> From emacs-devel:
>
> "The byte sequence of a buffer after decoded is always in emacs-mule (in
> emacs-unicode-2 branch, it's utf-8).
This is very old info. The emacs-unicode-2 branch was merged with the
mainline when Emacs 23.1 was released.
> So, changing
> buffer-file-coding-system or any other coding-system-related variables
> doesn't affects position-bytes."
>
> However, this is not the case with 3-byte UTF-8 characters:
> position-bytes counts them as 3 bytes, but process-send-string writes 4
> bytes.
process-send-string _encodes_ the string, it does not send the
internal representation of the string in the buffer. Using
process-send-string is like writing the string to a disk file: Emacs
encodes it before sending or writing.
Therefore, buffer-file-coding-system _does_ affect what is being sent.
I'm closing this non-bug.
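As a sketch of the point above: to predict how many bytes will be sent, one can encode the string explicitly and count the result, instead of relying on position-bytes. The helper name my-encoded-byte-count is hypothetical. encode-coding-string and string-bytes are standard Emacs functions.

```elisp
;; Hypothetical helper (not part of Emacs): number of bytes STRING
;; occupies after being encoded with CODING-SYSTEM.
(defun my-encoded-byte-count (string coding-system)
  "Return the byte length of STRING once encoded with CODING-SYSTEM."
  ;; encode-coding-string returns a unibyte string, so `length'
  ;; counts bytes here, not characters.
  (length (encode-coding-string string coding-system)))

(let ((s "a\u20ac"))                       ; "a" plus EURO SIGN (U+20AC)
  (list (string-bytes s)                   ; bytes in the internal representation
        (my-encoded-byte-count s 'utf-8)   ; bytes after encoding as utf-8
        ;; bytes after encoding as emacs-mule; this need not match
        ;; the internal count
        (my-encoded-byte-count s 'emacs-mule)))
```

For a real process, the coding system to pass would presumably be the encoding half of the pair returned by process-coding-system, that is, (cdr (process-coding-system PROC)).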
This bug report was last modified 13 years and 88 days ago.