GNU bug report logs - #25288
25.1; term, ansi-term, broken output of utf8 text

Package: emacs;

Reported by: Vjacheslav <fvamail <at> gmail.com>

Date: Wed, 28 Dec 2016 16:58:02 UTC

Severity: normal

Tags: confirmed, fixed, patch

Found in versions 24.5, 25.1

Fixed in version 26.1

Done: npostavs <at> users.sourceforge.net

Bug is archived. No further changes may be made.


Message #10 received at control <at> debbugs.gnu.org:

From: npostavs <at> users.sourceforge.net
To: Vjacheslav <fvamail <at> gmail.com>
Cc: 25288 <at> debbugs.gnu.org
Subject: Re: bug#25288: 25.1; term, ansi-term, broken output of utf8 text
Date: Wed, 28 Dec 2016 14:10:30 -0500
found 25288 24.5
tags 25288 confirmed
quit

Vjacheslav <fvamail <at> gmail.com> writes:

> Trying to use this command from terminal running bash:
>
> [fva <at> localhost ~]$ python -c 'print "ш"*5000'
>
> produces garbage (шшш\321\210шшш) in the output. The terminal needs a
> reset. Possibly this is a bug that was seen in very old Linux (it breaks
> multibyte characters on buffer borders).
>
> default-process-coding-system is OK:
>
> default-process-coding-system is a variable defined in ‘C source code’.
> Its value is (utf-8-unix . utf-8-unix)

It looks like the problem is that the process filter function,
term-emulate-terminal, receives the output in chunks of 4096 bytes[1].  In
UTF-8 the ш character is encoded as 2 bytes (\321\210 in octal), which
means a chunk boundary can fall between them and split the character
across chunks.
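
To make the split concrete, here is a small sketch in Python 3 (chosen only because the reproduction command is Python; this is not Emacs code). The helper names and the leading "$" byte are assumptions, picked so that a 4096-byte boundary lands inside a character. Decoding each chunk on its own leaves stray \xd1 and \x88 bytes, much like the \321\210 in the report.

```python
# Illustration only: plain Python standing in for a process filter.
# All names here are hypothetical and do not exist in term.el.

CHUNK_SIZE = 4096  # chunk size mentioned in the report


def chunked(data, size=CHUNK_SIZE):
    """Split a byte string into consecutive chunks of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def decode_each_chunk(chunks):
    """Decode every chunk independently, the way a naive filter would.

    Bytes that do not form a complete UTF-8 sequence within their own
    chunk are rendered as \\xNN escapes instead of raising an error.
    """
    return "".join(
        chunk.decode("utf-8", errors="backslashreplace") for chunk in chunks
    )


# One ASCII byte in front shifts the 2-byte characters to odd offsets,
# so each 4096-byte boundary cuts a character in half.
payload = ("$" + "ш" * 5000).encode("utf-8")
chunks = chunked(payload)
naive = decode_each_chunk(chunks)

print(len(payload), [len(c) for c in chunks])
print(naive.count("\\xd1"), naive.count("\\x88"))
```

Decoding the whole payload in one go gives back the original text, so the damage comes only from where the chunk boundaries fall.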

Is there a way to recognize incomplete decoding from Lisp?  I can't see
any.
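
One possible approach, sketched in Python 3 rather than Lisp and not necessarily what Emacs ended up doing, is to inspect the last few bytes of each chunk before decoding, hold back any UTF-8 sequence that is cut off, and prepend it to the next chunk. The function names are hypothetical, and the sketch assumes the stream is otherwise valid UTF-8 (it does no validation of its own).

```python
# Sketch under stated assumptions: hypothetical helpers, not term.el code.

def split_incomplete_tail(chunk):
    """Return (complete, tail) for a chunk of UTF-8 bytes.

    `tail` holds the bytes of a multibyte sequence that is cut off at
    the end of `chunk`; it is empty when the chunk ends on a character
    boundary.
    """
    # A UTF-8 sequence is at most 4 bytes long, so an incomplete one
    # occupies at most the last 3 bytes of the chunk.
    for back in range(1, min(3, len(chunk)) + 1):
        byte = chunk[-back]
        if byte & 0xC0 == 0x80:
            continue  # continuation byte: keep looking for the lead byte
        if byte >= 0xF0:
            needed = 4
        elif byte >= 0xE0:
            needed = 3
        elif byte >= 0xC0:
            needed = 2
        else:
            needed = 1  # plain ASCII
        if needed > back:
            return chunk[:-back], chunk[-back:]
        return chunk, b""
    return chunk, b""


def decode_stream(chunks):
    """Decode chunks in order, carrying incomplete tails forward."""
    pending = b""
    pieces = []
    for chunk in chunks:
        complete, pending = split_incomplete_tail(pending + chunk)
        pieces.append(complete.decode("utf-8", errors="replace"))
    if pending:  # stream ended in the middle of a character
        pieces.append(pending.decode("utf-8", errors="replace"))
    return "".join(pieces)


text = "$" + "ш" * 5000
data = text.encode("utf-8")
parts = [data[i:i + 4096] for i in range(0, len(data), 4096)]
print(decode_stream(parts) == text)
```

The cost is that a character split across chunks shows up one chunk late, which seems acceptable for terminal output.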


[1]: It's getting bytes rather than characters because in term-exec-1 we
have:

	;; The process's output contains not just chars but also binary
	;; escape codes, so we need to see the raw output.  We will have to
	;; do the decoding by hand on the parts that are made of chars.
	(coding-system-for-read 'binary))




