GNU bug report logs -
#13399
24.3.50; Word-wrap can't wrap at zero-width space U-200B
Reported by: martin rudalics <rudalics <at> gmx.at>
Date: Thu, 10 Jan 2013 08:31:02 UTC
Severity: wishlist
Found in version 24.3.50
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
Message #17 received at 13399 <at> debbugs.gnu.org (full text, mbox):
>> > You mean, not wrapped at all. Witness the continuation bitmaps in the
>> > fringes, which shouldn't appear when a line is wrapped.
>>
>> I thought these bitmaps appear when a line is wrapped.
>
> Not by default. Not unless you customize visual-line-fringe-indicators.
With emacs -Q I see curly arrows in the fringes regardless of whether I
set `visual-line-fringe-indicators' or not. What am I missing?
>> The doc-string of `word-wrap' says
>>
>> When word-wrapping is on, continuation lines are wrapped at the space
>> or tab character nearest to the right window edge
>>
>> Since U+200B is a space character the line should wrap at it.
>
> No, it means literally "the space character", U+0020.
So `word-wrap' is ASCII-only? The doc-string should say so.
>> and Emacs apparently does handle it specially since it reserves a few
>> pixels when drawing it.
>
> See glyphless-char-display and glyphless-char-display-control for why.
IIUC it has a `thin-space' display method entry and I could set this to
`zero-width' (the doc-string of `glyphless-char-display' is ambiguous
about that)? Does this also mean that I can separate text properties of
adjacent words by inserting a zero-width space between them?
> #define IT_DISPLAYING_WHITESPACE(it) \
> /* If the character to be displayed is SPC or TAB */
[...]
> In any case, you can clearly see that it only tests for literal SPC
> and TAB characters.
Even if I don't understand the code I can see that, yes.
>> > If we want to add more characters to the set, we should probably
>> > arrange a special char-table for this, and have it exposed to Lisp, so
>> > it could be customized. Patches are welcome.
>>
>> IIUC all breakable spaces are between U+2000 and U+200B so maybe a
>> character table is not needed.
>
> Who said we want only break at breakable space characters? Who said
> Unicode will never add more such characters in another block? And
> what about low-ASCII characters, which are already in a different
> block?
But implementing a character table and working with it is harder.
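
For comparison, the simpler alternative under discussion (a hard-coded range test instead of a table) might look roughly like the sketch below. The function names are hypothetical and this is not the code in the display engine; it only contrasts the SPC/TAB-only test described above with one that also accepts the range U+2000..U+200B mentioned in the quoted text. Whether every code point in that range should really be a break opportunity is left open here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: the behaviour as described in this thread,
   where only literal SPC and TAB count as wrap points.  */
bool
wrap_char_ascii_only (uint32_t c)
{
  return c == ' ' || c == '\t';
}

/* Hypothetical variant: additionally accept the contiguous range
   U+2000..U+200B.  The range is an assumption taken from the
   discussion, not a vetted list of breakable characters.  */
bool
wrap_char_with_range (uint32_t c)
{
  return wrap_char_ascii_only (c) || (c >= 0x2000 && c <= 0x200B);
}
```

A fixed test like this is short, but any later change to the set of characters means editing and recompiling the C code, which is the objection raised in the quoted reply.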
> In any case, even if you are right, a char-table is a way to store
> character properties efficiently. In particular, it will waste very
> little storage to mark a contiguous range of characters with the same
> property. The advantage of using a char-table is that it will
> dynamically expand as needed if more characters are added to the set.
Is it useful to make a _separate_ table for line-break properties?
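
To illustrate the storage argument in the quoted text, here is a minimal sketch of a growable range list. It is deliberately not Emacs's char-table (which is a different data structure); all names are hypothetical. It only shows the two properties being claimed: a contiguous range of characters costs a single entry, and the set can grow at run time.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* One inclusive range of code points that share the "wrap here" property.  */
struct wrap_range
{
  uint32_t first, last;
};

/* Hypothetical stand-in for a table of wrap characters.
   Initialize with { 0 } before use.  */
struct wrap_table
{
  struct wrap_range *ranges;
  size_t used, capacity;
};

/* Add the inclusive range FIRST..LAST, growing the array as needed.
   Return false if memory could not be allocated.  */
bool
wrap_table_add (struct wrap_table *t, uint32_t first, uint32_t last)
{
  if (t->used == t->capacity)
    {
      size_t new_cap = t->capacity ? 2 * t->capacity : 4;
      struct wrap_range *p = realloc (t->ranges, new_cap * sizeof *p);
      if (!p)
        return false;
      t->ranges = p;
      t->capacity = new_cap;
    }
  t->ranges[t->used].first = first;
  t->ranges[t->used].last = last;
  t->used++;
  return true;
}

/* Return true if C falls inside any stored range (linear scan).  */
bool
wrap_table_contains (const struct wrap_table *t, uint32_t c)
{
  for (size_t i = 0; i < t->used; i++)
    if (c >= t->ranges[i].first && c <= t->ranges[i].last)
      return true;
  return false;
}
```

With this sketch, SPC, TAB and the range U+2000..U+200B would take three entries in total, and further ranges could be added later without touching the lookup code.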
>> Anyway, exposing displayed text to Lisp would be great. We'd just need
>> two functions - one that gets the pixel width of an arbitrary buffer
>> string wrt a specific window, and one that gets the pixel height of an
>> arbitrary buffer string (newlines ignored) wrt a specific window. This
>> way we could get rid of lots of problems currently hidden in the display
>> engine ...
>
> You lost me here. By "exposing to Lisp" I meant expose the char-table
> of word-wrap characters to Lisp.
I only now understand what you meant.
> What did _you_ want exposed to Lisp?
Two functions: One to get the width of some arbitrary buffer text in
pixels and one to get the full height of a buffer text line in pixels.
The former would be used for doing word-wrapping variants in Lisp, the
latter for fitting windows to their buffers.
martin
This bug report was last modified 4 years and 245 days ago.