GNU bug report logs -
#10919
emacs-mule/utf-8 difference
[Message part 1 (text/plain, inline)]
Your bug report
#10919: emacs-mule/utf-8 difference
which was filed against the emacs package, has been closed.
The explanation is attached below, along with your original report.
If you require more details, please reply to 10919 <at> debbugs.gnu.org.
--
10919: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=10919
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
> Date: Thu, 01 Mar 2012 16:39:57 +0100
> From: Tiphaine Turpin <tiphaine.turpin <at> inria.fr>
>
> From emacs-devel:
>
> "The byte sequence of a buffer after decoded is always in emacs-mule (in
> emacs-unicode-2 branch, it's utf-8).
This is very old info. The emacs-unicode-2 branch was merged with the
mainline when Emacs 23.1 was released.
> So, changing
> buffer-file-coding-system or any other coding-system-related variables
> doesn't affect position-bytes."
>
> However, this is not the case with 3-byte UTF-8 characters:
> position-bytes counts them as 3 bytes, but process-send-string writes 4
> bytes.
process-send-string _encodes_ the string; it does not send the
internal representation of the string in the buffer. Using
process-send-string is like writing the string to a disk file: Emacs
encodes it before sending or writing.
Therefore, buffer-file-coding-system _does_ affect what is being sent.
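A minimal sketch of the distinction, assuming Emacs 23 or later (where
the internal representation is a superset of UTF-8). The helper name
my-byte-counts and the sample string are hypothetical, chosen only for
illustration. string-bytes reports the internal size, while
encode-coding-string produces the bytes that would actually be sent
under a given coding system:

```elisp
;; Sketch, assuming Emacs 23 or later. `my-byte-counts' is a hypothetical
;; helper written for this illustration, not an Emacs function.
(defun my-byte-counts (string)
  "Return byte counts of STRING as a list (INTERNAL UTF-8 UTF-16LE)."
  (list (string-bytes string)                               ; internal representation
        (length (encode-coding-string string 'utf-8))       ; after encoding as utf-8
        (length (encode-coding-string string 'utf-16le))))  ; after encoding as utf-16le

(my-byte-counts "\u20ac")   ; one character, EURO SIGN
;; => (3 3 2)
```

The internal count matches the encoded count only when the target
coding system happens to use the same byte sequences as the internal
representation, as utf-8 does for this character.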
I'm closing this non-bug.
[Message part 3 (message/rfc822, inline)]
Hi,
I have a problem regarding coding systems:
I'm using process-send-string to send substrings of a buffer through a
socket, after setting the process encoding and decoding systems to
emacs-mule.
I expect the number of bytes written to match the byte length of the
substring as obtained from position-bytes since, according to
emacs-devel, position-bytes always works with the emacs-mule
encoding. From emacs-devel:
"The byte sequence of a buffer after decoded is always in emacs-mule (in
emacs-unicode-2 branch, it's utf-8). So, changing
buffer-file-coding-system or any other coding-system-related variables
doesn't affect position-bytes."
However, this is not the case with 3-byte UTF-8 characters:
position-bytes counts them as 3 bytes, but process-send-string writes 4
bytes.
Setting the process coding systems for the socket to utf-8 solves the
problem, but I don't think it will work with other coding systems, even
if I used buffer-file-coding-system instead, since position-bytes does
not use it.
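The mismatch can be measured directly. A minimal sketch, assuming
Emacs 23 or later; my-region-byte-counts is a hypothetical helper, not
part of Emacs. It compares the byte length derived from position-bytes
with the length of the same text after encoding:

```elisp
;; Sketch: `my-region-byte-counts' is a hypothetical helper, not an Emacs API.
(defun my-region-byte-counts (beg end coding-system)
  "Return (INTERNAL . ENCODED) byte counts for the region BEG..END.
INTERNAL is the difference of `position-bytes' at END and BEG.
ENCODED is the length of the region's text after encoding it with
CODING-SYSTEM."
  (cons (- (position-bytes end) (position-bytes beg))
        (length (encode-coding-string
                 (buffer-substring-no-properties beg end)
                 coding-system))))

(with-temp-buffer
  (insert "a\u20acb")                  ; "a", EURO SIGN, "b"
  (my-region-byte-counts (point-min) (point-max) 'utf-16le))
;; => (5 . 6)
```

For a process, the coding system used for output is the cdr of the
value returned by (process-coding-system PROCESS), so passing that as
CODING-SYSTEM should give the count for what process-send-string
writes, under the assumption that no further conversion is applied.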
What is the expected behavior of these functions, and how can I make
this work correctly?
Regards,
Tiphaine Turpin
This bug report was last modified 13 years and 88 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.