GNU bug report logs - #13399
24.3.50; Word-wrap can't wrap at zero-width space U-200B
Reported by: martin rudalics <rudalics <at> gmx.at>
Date: Thu, 10 Jan 2013 08:31:02 UTC
Severity: wishlist
Found in version 24.3.50
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
Message #146 received at 13399 <at> debbugs.gnu.org (full text, mbox):
Sorry for not working further on this; I haven't had time. I will
get back to finishing it soon.
> Hmm... not sure why you arrived at this conclusion. E.g., what's
> wrong with the implementation at the bottom of this message?
This was very similar to my first try. Unfortunately, it doesn't work
correctly in whitespace-mode, even with just normal spaces, regressing
on Bug#11341.
(with-current-buffer (get-buffer-create "*bar*")
  (dotimes (i 1000)
    (insert "1234 ")) ; Space
  (setq word-wrap t)
  (whitespace-mode)
  (display-buffer "*bar*"))
The spaces are displayed as `·', so it->c returns 183, none of the
further tests are checked and IT_DISPLAYING_WHITESPACE returns False.
(In the currently used implementation, if it->c is not one of ' ' or '\t'
then the later tests are all checked.)
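
The failure described above can be illustrated with a toy model. This is
only a sketch, not Emacs code: `struct toy_it', `toy_wrap_char_p' and
`toy_display_first' are hypothetical names, and the wrap table is reduced
to space and tab. It assumes a test order in which the displayed
character is consulted first and nothing else is looked at afterwards.

```c
#include <stdbool.h>

/* Toy stand-in for the display iterator; NOT Emacs's struct it.  */
struct toy_it
{
  bool have_display_char;  /* models it->what == IT_CHARACTER */
  int display_char;        /* models it->c, the character being drawn */
  bool have_buffer_char;   /* models IT_BYTEPOS (*it) < ZV_BYTE */
  int buffer_char;         /* models the character stored in the buffer */
};

/* Toy wrap table: only space and tab allow a line break.  */
static bool
toy_wrap_char_p (int c)
{
  return c == ' ' || c == '\t';
}

/* Displayed character is tested first; once a branch is taken,
   the remaining sources are never consulted.  */
static bool
toy_display_first (const struct toy_it *it)
{
  if (it->have_display_char)
    return toy_wrap_char_p (it->display_char);
  else if (it->have_buffer_char)
    return toy_wrap_char_p (it->buffer_char);
  return false;
}
```

With a buffer space that whitespace-mode draws as character 183, the
first branch is taken and the result is false, even though the buffer
character itself is a space.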
I thought about changing the order of the tests to something like the
following (ignoring the special case of ' ' and '\t', here, for
brevity):
static inline bool
IT_DISPLAYING_WHITESPACE (struct it *it) {
  int c;

  if (IT_BYTEPOS (*it) < ZV_BYTE)
    c = FETCH_CHAR (IT_BYTEPOS (*it));
  else if (it->what == IT_CHARACTER)
    c = it->c;
  else if (STRINGP (it->string))
    c = STRING_CHAR (SDATA (it->string) + IT_STRING_BYTEPOS (*it));
  else if (it->s)
    c = STRING_CHAR (it->s + IT_BYTEPOS (*it));
  else
    return false;

  return !NILP (CHAR_TABLE_REF (Vword_wrap_chars, c));
}
which in the case of whitespace-mode does TRT, but I worried that
there might be situations where wrapping on the display character
is correct. The crux (as I had previously, but very unclearly,
written) is that under "normal" circumstances, both
`(it->what == IT_CHARACTER)' and `(IT_BYTEPOS (*it) < ZV_BYTE)'
are true.
Additionally, I wasn't sure whether there should be a fall-through,
since on the one hand, it prevents emacs crashing if (weirdly) all the
previous tests return false, but on the other, it might preclude some magic
compiler optimisation.
Chaining ORs side-stepped both issues, so I settled on keeping it, though
it might have been the wrong decision.
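
One possible reading of "chaining ORs" is sketched below. It is a toy
model under stated assumptions, not the patch itself: `struct
toy_sources', `toy_wrap_p' and `toy_any_source_wraps' are hypothetical
names, and only two character sources are modelled. Each source is
tested in turn, and a source that is available but is not a wrap
character does not stop the later sources from being checked.

```c
#include <stdbool.h>

/* Toy model of two places a character can come from; NOT Emacs's struct it.  */
struct toy_sources
{
  bool have_display_char;  /* models it->what == IT_CHARACTER */
  int display_char;        /* models it->c */
  bool have_buffer_char;   /* models IT_BYTEPOS (*it) < ZV_BYTE */
  int buffer_char;         /* models the character stored in the buffer */
};

/* Toy wrap table: only space and tab allow a line break.  */
static bool
toy_wrap_p (int c)
{
  return c == ' ' || c == '\t';
}

/* True if ANY available source yields a wrap character.  When no
   source is available the whole expression is simply false, so no
   separate fall-through branch is needed.  */
static bool
toy_any_source_wraps (const struct toy_sources *s)
{
  return ((s->have_display_char && toy_wrap_p (s->display_char))
          || (s->have_buffer_char && toy_wrap_p (s->buffer_char)));
}
```

In this model the whitespace-mode case (display character 183, buffer
character space) yields true, and the question of which test comes first
no longer affects the result.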
> > ii) vim's breakat characters (default " ^I!@*-+;:,./?"), since
> > presumably they had given it some thought,
> Maybe. I'm not sure in what modes this would be TRT.
It should almost certainly not be the default in any mode, but it
might, perhaps, be a useful, pre-defined option for some users. (For
instance, when wrapping long URLs or paths in comments:
|;;                                                     |
|https://very.long.url/that-will-not-fit-on-a-single-lin|
|e-anyway-but-could-at-least-start-on-the-same-line-as-t|
|he-comment-sign-and-break-at-slightly-more-logical-plac|
|es                                                     |
looks (IMO at least!) less aesthetically pleasing than:
|;; https://very.long.url/that-will-not-fit-on-a-single-|
|line-anyway-but-could-at-least-start-on-the-same-line- |
|as-the-comment-sign-and-break-at-slightly-more-logical-|
|places                                                 |
where `|' is the margin.
The same sometimes holds for excessively long variable names. I
definitely wouldn't impose this preference on others, but I assume
that some might share it.) Using vim's choice helps avoid
bike-shedding.
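
The breaking behaviour in the second rendering can be sketched as
follows. This is a minimal illustration, not Emacs or vim code:
`toy_breakat', `toy_breakat_p' and `toy_first_line_length' are
hypothetical names, the text is assumed to be ASCII with one column per
byte, and breaking after (rather than before) the break character is an
assumption chosen to match the example, where lines end in `-'.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Vim's default breakat set as quoted above; "^I" is a tab.  */
static const char toy_breakat[] = " \t!@*-+;:,./?";

static bool
toy_breakat_p (char c)
{
  /* strchr would also match the terminating NUL, so exclude it.  */
  return c != '\0' && strchr (toy_breakat, c) != NULL;
}

/* Return how many bytes of TEXT go on the first screen line of
   WIDTH columns.  */
static size_t
toy_first_line_length (const char *text, size_t width)
{
  size_t len = strlen (text);

  if (len <= width)
    return len;

  /* Find the last break character that still fits and break after it.  */
  for (size_t i = width; i > 0; i--)
    if (toy_breakat_p (text[i - 1]))
      return i;

  /* No candidate on this line: hard break at the margin.  */
  return width;
}
```

Applied to the comment line from the example with a width of 55, the
first line ends right after "single-", as in the second rendering.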
> We already import several UCD files, see admin/unidata, where you will
> also find copyright.html from the Unicode Consortium.
Great! That's convenient.
> test/manual is okay.
Thanks!
> This should probably go into simple.el.
I'll move it there.
Thanks for the help!
This bug report was last modified 4 years and 244 days ago.