GNU bug report logs - #8506
24.0.50; Wrong column count

Package: emacs;

Reported by: Eli Barzilay <eli <at> barzilay.org>

Date: Fri, 15 Apr 2011 11:50:02 UTC

Severity: normal

Tags: notabug

Found in version 24.0.50

Done: Glenn Morris <rgm <at> gnu.org>

Bug is archived. No further changes may be made.


From: Eli Zaretskii <eliz <at> gnu.org>
To: Eli Barzilay <eli <at> barzilay.org>
Cc: 8506 <at> debbugs.gnu.org
Subject: bug#8506: 24.0.50; Wrong column count
Date: Fri, 15 Apr 2011 17:56:21 +0300
> From: Eli Barzilay <eli <at> barzilay.org>
> Date: Fri, 15 Apr 2011 07:48:45 -0400
> 
> Column numbers in the mode line are wrong with certain characters, I've
> seen this with angle brackets.  To reproduce the problem:
> 
>   * Start emacs and go to the scratch buffer
>   * M-x column-number-mode
>   * C-\ sgml RET
>   * enter &lang;
> 
> The column number that is shown is 2.

This is intentional.  The character that is inserted by the above is
〈, U+2329 (LEFT-POINTING ANGLE BRACKET), and that character is marked
as "W" (meaning Wide) in the Unicode Character Database, see here:

 http://www.unicode.org/Public/UNIDATA/EastAsianWidth.txt

Therefore, characters.el has this:

  ;; 2: East Asian Wide and Full-width characters.
  (let ((l '((#x1100 . #x115F)
	     (#x2329 . #x232A)  <<<<<<<<<<<<<<<<<<<<
	     (#x2E80 . #x303E)
	     (#x3040 . #xA4CF)
	     (#xAC00 . #xD7A3)
	     (#xF900 . #xFAFF)
	     (#xFE30 . #xFE6F)
	     (#xFF01 . #xFF60)
	     (#xFFE0 . #xFFE6)
	     (#x20000 . #x2FFFF)
	     (#x30000 . #x3FFFF))))
    (dolist (elt l)
      (set-char-table-range char-width-table elt 2)))

and consequently (aref char-width-table ?\〈) => 2.  That's why Emacs
thinks you are in column 2 after this character: it is told that its
width is 2.
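
As a minimal sketch of how to check this from Lisp (the helper name
`my/width-report` is hypothetical, not something that ships with Emacs),
the following collects the table entry and the widths derived from it.
With the table shown above, each value should come out as 2 for U+2329,
though the result can differ if the table has been altered, for example
by the language environment:

```elisp
;; Hypothetical helper: report the widths Emacs derives for character CH.
(defun my/width-report (ch)
  "Return a list of widths Emacs associates with character CH.
The elements are: the raw `char-width-table' entry, the result of
`char-width', the `string-width' of a one-character string, and the
column reached after inserting CH into an empty buffer."
  (list (aref char-width-table ch)
        (char-width ch)
        (string-width (string ch))
        (with-temp-buffer
          (insert ch)
          (current-column))))

;; U+2329 LEFT-POINTING ANGLE BRACKET, the character from the report.
(my/width-report #x2329)
```

Evaluating the last form in the scratch buffer shows the numbers behind
the column reported in the mode line.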

However, at least on my machine, with "Arial Unicode MS" font, the
character actually displays thinner than normal, so perhaps the font
is wrong.

OTOH, this page:

  http://en.wikipedia.org/wiki/Bracket

says that these two characters "are canonically equivalent to the CJK
code points U+300x and thus likely to render as double-width symbols".

Unless someone comes up with a good reason why we should change the
width of this character in char-width-table, or there are other
characters which somehow defeat the column numbers, I suggest closing
this bug.
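
For illustration only, here is a sketch of what changing the width of
these two characters could look like.  It is not a recommended fix.  The
function name `my/narrow-angle-bracket-width` is hypothetical, and the
change is made on a let-bound copy of char-width-table so the global
table is left untouched:

```elisp
;; Hypothetical experiment: treat U+2329 and U+232A as width 1, but only
;; inside this function, by rebinding `char-width-table' to a copy.
(defun my/narrow-angle-bracket-width ()
  "Return `char-width' of U+2329 under a temporary width-1 table entry."
  (let ((char-width-table (copy-sequence char-width-table)))
    ;; Override the range in the copy only.
    (set-char-table-range char-width-table '(#x2329 . #x232A) 1)
    (char-width #x2329)))

(my/narrow-angle-bracket-width)
```

Under these assumptions the call should evaluate to 1, while
(char-width #x2329) outside the function keeps whatever value the
global table gives it.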





This bug report was last modified 13 years and 231 days ago.


