#5235 - 23.1; Unibyte keyboard input problem

GNU bug report logs - #5235
23.1; Unibyte keyboard input problem

Package: emacs;

Reported by: Tomasz Zbrożek <scianagoryczy <at> wp.pl>

Date: Wed, 16 Dec 2009 21:25:05 UTC

Severity: normal

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.


From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
To: Jason Rumney <jasonr <at> gnu.org>
Cc: Tomasz Zbrożek <scianagoryczy <at> wp.pl>, 5235 <at> debbugs.gnu.org
Subject: bug#5235: 23.1; Unibyte keyboard input problem
Date: Tue, 29 Dec 2009 13:43:01 -0200

>>> I'll try to explain why I need unibyte mode. I'm maintainer of a C/C++
>>> source code which has comments coded in cp1250 (Polish language) but
>>> strings in code are coded in cp852. So I have two different code
>>> pages in source code file. This is old source code and it was
>>> developed in Windows (that's why comments are in cp1250) but is
>>> compiled to work on MS-DOS (that's why strings are coded in cp852).

>> So what happens if you read those files as binary (i.e. C-x RET
>> r binary RET)?

> At best, he'd end up silently screwing up his files even further, with
> cp1250, cp852 and now utf-8 encoded characters in them. More likely he
> would still get prompted when saving, just as if he'd used cp1250 or cp852
> to read them.

That would be a bug: a file visited as `binary' (or as `raw-text') should
be placed in a unibyte buffer, so it should not screw anything up more than
was already the case to start with.

> The problem here is the files, not Emacs. Basically the reason for using
> unibyte is that it allows the user to bury their head in the sand and
> pretend the problem does not exist.

Of course, but if you start with such files and can't (or don't want to)
recode the parts consistently, we can't do much better.

> I work on similar files in my day job, with Japanese comments in ShiftJIS
> and Chinese comments in GB2312. An easy method of fixing such files would be
> nice, but the best I can think of would be to provide a recode-region
> function, which would still be too much manual work to be worth it to me
> given that I can barely make sense of the Japanese comments and can't make
> any sense of the Chinese ones. The original poster might be more motivated
> to make use of such a function if it existed though.

I'm not sure what would be the best approach in general or in particular
cases, but we could certainly provide a command that recodes comments.
Or another one that looks for invalid byte sequences (i.e. decoded as
eight-bit-bytes) and tries to re-decode them with a secondary coding system.


        Stefan
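
The second idea in the message above (find byte sequences that are invalid in the main coding system and re-decode them with a secondary one) can be illustrated outside Emacs. What follows is only a rough sketch in Python, not an Emacs command and not how Emacs handles eight-bit bytes internally. The function name `decode_with_fallback` and the default pair utf-8/cp1250 are assumptions chosen for illustration.

```python
def decode_with_fallback(data: bytes,
                         primary: str = "utf-8",
                         secondary: str = "cp1250") -> str:
    """Decode `data` with `primary`; byte runs that `primary` rejects
    are re-decoded with `secondary` (hypothetical helper)."""
    pieces = []
    pos = 0
    while pos < len(data):
        try:
            # Everything from `pos` onward is valid in the primary encoding.
            pieces.append(data[pos:].decode(primary))
            break
        except UnicodeDecodeError as err:
            # err.start and err.end are offsets into data[pos:], and
            # delimit the bytes the primary decoder could not handle.
            good = data[pos:pos + err.start]
            bad = data[pos + err.start:pos + err.end]
            pieces.append(good.decode(primary))
            # Bytes unassigned in the secondary encoding become U+FFFD.
            pieces.append(bad.decode(secondary, errors="replace"))
            pos += err.end
    return "".join(pieces)


if __name__ == "__main__":
    mixed = ("// komentarz: ".encode("utf-8")
             + "żółw".encode("cp1250")
             + " ok".encode("utf-8"))
    print(decode_with_fallback(mixed))
```

Note that this heuristic only helps when the primary encoding can actually reject bytes, as utf-8 can. In the case described by the original poster both cp1250 and cp852 are single-byte code pages in which nearly every byte value is assigned, so there are few or no invalid sequences to detect, and recoding by syntactic region (comments versus strings) looks like the more applicable of the two suggestions.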

This bug report was last modified 4 years and 301 days ago.

