GNU bug report logs -
#7962
23.2; capitalize letters ISO-8859-1 with diacritic signs in emacs 23.2.1
Reported by: Emmanuel Bigler <Emmanuel.Bigler <at> ens2m.fr>
Date: Wed, 2 Feb 2011 14:42:03 UTC
Severity: normal
Found in version 23.2
Done: Stefan Monnier <monnier <at> iro.umontreal.ca>
Bug is archived. No further changes may be made.
>>
>> I see this:
>> buffer-file-coding-system is a variable defined in `C source code'.
>> Its value is iso-latin-1-dos
>
> See "M-: (coding-system-priority-list) RET".
>
> The highest-priority encoding is set from your locale, but look what
> is the next one.
>
Hello again.
I think I'm starting to understand what is going on.
A long time ago I created a unibyte file containing the 1-byte
characters I want to test within emacs. The file was created with a
program over which I have total byte-by-byte control, so I know exactly
what is inside the file. I have attached the file to this mail; I am not
sure that this is allowed on the gnu-debug mailing list, but it is a simple
and very short .txt file that reads as follows (this mail itself is
typeset and displayed here as iso-8859-1):
------- mytestchars-224-255-iso-8859.txt ---------------------
224 \340 à 225 \341 á 226 \342 â 227 \343 ã
228 \344 ä 229 \345 å 230 \346 æ 231 \347 ç
232 \350 è 233 \351 é 234 \352 ê 235 \353 ë
236 \354 ì 237 \355 í 238 \356 î 239 \357 ï
240 \360 ð 241 \361 ñ 242 \362 ò 243 \363 ó
244 \364 ô 245 \365 õ 246 \366 ö 247 \367 ÷
248 \370 ø 249 \371 ù 250 \372 ú 251 \373 û
252 \374 ü 253 \375 ý 254 \376 þ 255 \377 ÿ
éèçàù < test strings to see how they behave
Éèçàù
----------------------------------------------------------
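A file with this layout can be regenerated byte by byte. The following is a minimal Python sketch, not the program used to create the attachment: the function name `build_test_file` is hypothetical, and it reproduces only the eight table lines, not the two test strings at the end.

```python
def build_test_file() -> bytes:
    """Build the 224-255 table as raw ISO-8859-1 bytes, four entries per line."""
    lines = []
    for start in range(224, 256, 4):
        entries = []
        for code in range(start, start + 4):
            # Decimal code, the octal escape as literal text, then the single raw byte.
            entries.append(b"%d \\%o " % (code, code) + bytes([code]))
        lines.append(b" ".join(entries))
    return b"\n".join(lines) + b"\n"


data = build_test_file()
```

Writing `data` to disk in binary mode (`open(path, "wb")`) avoids any re-encoding, so each character above 127 occupies exactly one byte.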
I started /usr/local/bin/emacs -Q mytestchars-224-255-iso-8859.txt
under emacs 23.2.93.1 (i686-pc-linux-gnu)
The file displays perfectly correctly. (describe-char (point)) gives me
exactly what I want, i.e. an extended ASCII decimal code between 224 and 255.
Almost all operations (except capitalize, see below) work exactly as I
wish and exactly as in older emacs versions. No mystery there, since the
priority list returned by
M-: (coding-system-priority-list) RET reads as:
(iso-latin-1 utf-8 iso-2022-7bit iso-2022-7bit-lock iso-2022-8bit-ss2
emacs-mule raw-text iso-2022-jp in-is13194-devanagari chinese-iso-8bit
utf-8-auto utf-8-with-signature ...)
Again I'm perfectly happy, since I see that iso-latin-1 comes first. But
is this what I want? Certainly yes;
my locale environment variables look like:
LC_ALL=fr_FR.ISO8859-1
LC_COLLATE=fr_FR.ISO8859-1
LANG=fr_FR.ISO8859-1
GDM_LANG=fr_FR.iso88591
LC_CTYPE=fr_FR.ISO8859-1
XTERM_LOCALE=fr_FR.ISO8859-1
However, in this emacs -Q session, with a correct unibyte display of
a unibyte file, *capitalize does not work*.
At the beginning of this discussion, Sven explained that capitalize
would only work on 2-byte characters. I tested that, of course, and of
course it works, but I simply wish I could continue to capitalize
unibyte words with M-c like in the good old iso-8859 days!
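The difference between raw bytes and decoded characters can be illustrated outside emacs. The Python sketch below is only an analogy and makes no claim about how emacs implements case conversion; the helper name `capitalize_latin1` is hypothetical. It shows that bytes above 127 have no case mapping when treated as raw bytes, while the same bytes decoded as ISO-8859-1 capitalize as expected.

```python
def capitalize_latin1(raw: bytes) -> bytes:
    """Decode as ISO-8859-1, capitalize the text, and re-encode to single bytes."""
    return raw.decode("iso-8859-1").capitalize().encode("iso-8859-1")


raw = bytes([0xE9, 0xE8, 0xE7, 0xE0, 0xF9])  # "éèçàù" as ISO-8859-1 bytes

# Treated as raw bytes, only ASCII letters have case, so nothing changes.
unchanged = raw.capitalize()

# Treated as ISO-8859-1 text, the first byte 0xE9 (é) becomes 0xC9 (É).
capitalized = capitalize_latin1(raw)
```

In ISO-8859-1 the upper-case form of é (0xE9) is É (0xC9), an offset of 0x20, the same offset as between ASCII a and A.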
Additional info: when applying the M-c command to a letter above
decimal code 224, nothing happens on the display, as reported, *although
the buffer is marked as being changed.*
Incidentally, in a good ol' xterm window (fitted with gnu readline and
obeying my LOCALE preferences as listed above), M-c works perfectly as
it should, and if I cut and paste from the xterm to the emacs buffer,
everything looks fine and unibyte ... except that I can no longer change
the case of the pasted string with 'capitalize' or a similar 'case'
command.
Bug, or UTF-8 emacs 23.2 feature?
--
Emmanuel
[mytestchars-224-255-iso-8859.txt (text/plain, attachment)]
This bug report was last modified 14 years and 160 days ago.