GNU bug report logs - #13399
24.3.50; Word-wrap can't wrap at zero-width space U-200B
Reported by: martin rudalics <rudalics <at> gmx.at>
Date: Thu, 10 Jan 2013 08:31:02 UTC
Severity: wishlist
Found in version 24.3.50
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
Just to recap the initial problem and your proposal:
>> With emacs -Q evaluate
>>
>> (with-current-buffer (get-buffer-create "*foo*")
>>   (dotimes (i 1000)
>>     (insert "1234\u200b")) ; U-200B
>>   (setq word-wrap t)
>>   (display-buffer "*foo*"))
>>
>> where the character after 1234 is a zero-width space character with
>> unicode code point U-200B. As can be seen in the window showing *foo*,
>> lines are not regularly wrapped at that character.
>
> You mean, not wrapped at all. Witness the continuation bitmaps in the
> fringes, which shouldn't appear when a line is wrapped.
>
>> Doing
>>
>> (with-current-buffer (get-buffer-create "*foo*")
>>   (dotimes (i 1000)
>>     (insert "1234 "))
>>   (setq word-wrap t)
>>   (display-buffer "*foo*"))
>>
>> instead wraps lines as expected.
>
> If anything, this is a missing feature, since word-wrap is explicitly
> coded to break lines only on SPC and TAB characters. See the
> IT_DISPLAYING_WHITESPACE macro in xdisp.c.
>
> If we want to add more characters to the set, we should probably
> arrange a special char-table for this, and have it exposed to Lisp, so
> it could be customized. Patches are welcome.
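
The suggestion quoted above, replacing the hard-coded SPC/TAB test with a table of break characters, can be pictured with a small standalone sketch. It is only an illustration in plain C, not Emacs code: `wrap_chars` and `wrap_char_p` are hypothetical names, and a real implementation would presumably use an Emacs char-table exposed to Lisp rather than a sorted array.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a customizable char-table: a sorted list
   of code points at which word-wrap would be allowed to break.  */
static const uint32_t wrap_chars[] = {
  0x0009, /* TAB */
  0x0020, /* SPC */
  0x200B, /* ZERO WIDTH SPACE */
};

/* Return true if code point C is listed in wrap_chars (binary search).  */
static bool
wrap_char_p (uint32_t c)
{
  size_t lo = 0;
  size_t hi = sizeof wrap_chars / sizeof wrap_chars[0];

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (wrap_chars[mid] == c)
        return true;
      if (wrap_chars[mid] < c)
        lo = mid + 1;
      else
        hi = mid;
    }
  return false;
}
```

With such a table, adding a new break character means adding an entry rather than editing the test itself, which is the point of making the set customizable.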

I now rewrote IT_DISPLAYING_WHITESPACE as

#define IT_DISPLAYING_WHITESPACE(it) \
  ((it->what == IT_CHARACTER \
    && !NILP (CHAR_TABLE_REF (Vword_wrap_chars, it->c))) \
   || ((STRINGP (it->string) \
        && !NILP (CHAR_TABLE_REF \
                  (Vword_wrap_chars, \
                   SREF (it->string, IT_STRING_BYTEPOS (*it))))) \
       || (it->s && !NILP (CHAR_TABLE_REF \
                           (Vword_wrap_chars, \
                            it->s[IT_BYTEPOS (*it)]))) \
       || (IT_BYTEPOS (*it) < ZV_BYTE \
           && !NILP (CHAR_TABLE_REF \
                     (Vword_wrap_chars, \
                      (*BYTE_POS_ADDR (IT_BYTEPOS (*it))))))))

and have a character table called `word-wrap-chars' such that
(aref word-wrap-chars ?\u200b) returns t, but it still doesn't wrap at a
U-200B character. Is there some additional wrinkle, like a hardcoded
space/tab in the word-wrap code, that I have to observe?
Or is my code wrong?
Thanks, martin
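
One wrinkle that may be worth checking: the string and buffer branches of the macro hand the result of SREF and *BYTE_POS_ADDR to CHAR_TABLE_REF. If those yield a single byte rather than a decoded character, a multibyte character such as U-200B could never match the table, while SPC and TAB (one byte each) still would. The standalone sketch below illustrates the difference in plain C. It assumes a UTF-8 style encoding and well-formed input; `decode_utf8` and `zwsp` are hypothetical names, not Emacs functions. In Emacs itself the decoding would presumably go through an existing macro such as STRING_CHAR.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode the UTF-8 sequence starting at P into a code point.
   Minimal sketch: assumes well-formed input of at most 4 bytes.  */
static uint32_t
decode_utf8 (const unsigned char *p)
{
  if (p[0] < 0x80)
    return p[0];
  if ((p[0] & 0xE0) == 0xC0)
    return ((uint32_t) (p[0] & 0x1F) << 6) | (p[1] & 0x3F);
  if ((p[0] & 0xF0) == 0xE0)
    return ((uint32_t) (p[0] & 0x0F) << 12)
           | ((uint32_t) (p[1] & 0x3F) << 6)
           | (p[2] & 0x3F);
  return ((uint32_t) (p[0] & 0x07) << 18)
         | ((uint32_t) (p[1] & 0x3F) << 12)
         | ((uint32_t) (p[2] & 0x3F) << 6)
         | (p[3] & 0x3F);
}

/* U+200B (zero-width space) as a NUL-terminated UTF-8 byte sequence.  */
static const unsigned char zwsp[] = { 0xE2, 0x80, 0x8B, 0 };
```

Here `zwsp[0]` is the lead byte 0xE2, whereas `decode_utf8 (zwsp)` yields the code point 0x200B; a table keyed by character only sees the latter if the bytes are decoded first.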