#13399 - 24.3.50; Word-wrap can't wrap at zero-width space U-200B

GNU bug report logs - #13399
24.3.50; Word-wrap can't wrap at zero-width space U-200B

Package: emacs;

Reported by: martin rudalics <rudalics <at> gmx.at>

Date: Thu, 10 Jan 2013 08:31:02 UTC

Severity: wishlist

Found in version 24.3.50

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

Message #146 received at 13399 <at> debbugs.gnu.org:

From: Adam Tack <adam.tack.513 <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space U-200B
Date: Wed, 13 Dec 2017 04:00:56 +0000

Sorry for not working further on this, but I didn't have time.  I will get back to finishing this, soon.

> Hmm... not sure why you arrived at this conclusion.  E.g., what's
> wrong with the implementation at the bottom of this message?

This was very similar to my first try.  Unfortunately, it doesn't work correctly in whitespace-mode, even with just normal spaces, regressing on Bug#11341.

    (with-current-buffer (get-buffer-create "*bar*")
      (dotimes (i 1000)
        (insert "1234 "))               ; Space
      (setq word-wrap t)
      (whitespace-mode)
      (display-buffer "*bar*"))

The spaces are displayed as `·', so it->c returns 183, none of the further tests are checked and IT_DISPLAYING_WHITESPACE returns false.  (In the currently used implementation, if it->c is not one of ' ' or '\t' then the later tests are all checked.)

I thought about changing the order of the tests to something like the following (ignoring the special case of ' ' and '\t', here, for brevity):

    static inline bool
    IT_DISPLAYING_WHITESPACE (struct it *it)
    {
      int c;

      if (IT_BYTEPOS (*it) < ZV_BYTE)
        c = FETCH_CHAR (IT_BYTEPOS (*it));
      else if (it->what == IT_CHARACTER)
        c = it->c;
      else if (STRINGP (it->string))
        c = STRING_CHAR (SDATA (it->string) + IT_STRING_BYTEPOS (*it));
      else if (it->s)
        c = STRING_CHAR (it->s + IT_BYTEPOS (*it));
      else
        return false;

      return !NILP (CHAR_TABLE_REF (Vword_wrap_chars, c));
    }

which in the case of whitespace-mode does TRT, but I worried that there might be situations where wrapping on the display character is correct.

The crux (as I had previously, but very unclearly, written) is that under "normal" circumstances, both `(it->what == IT_CHARACTER)' and `(IT_BYTEPOS (*it) < ZV_BYTE)' are true.

Additionally, I wasn't sure whether there should be a fall-through, since on the one hand, it prevents emacs crashing if (weirdly) all the previous tests return false, but on the other, it might preclude some magic compiler optimisation.  Chaining ORs side-stepped both issues, so I settled on keeping it, though it might have been the wrong decision.

> > ii) vim's breakat characters (default " ^I!@*-+;:,./?"), since
> > presumably they had given it some thought,
> Maybe.  I'm not sure in what modes this would be TRT.

It should almost certainly not be the default in any mode, but it might, perhaps, be a useful, pre-defined option for some users.

(For instance, when wrapping long URLs or paths in comments:

    |;;                                                     |
    |https://very.long.url/that-will-not-fit-on-a-single-lin|
    |e-anyway-but-could-at-least-start-on-the-same-line-as-t|
    |he-comment-sign-and-break-at-slightly-more-logical-plac|
    |es                                                     |

looks (IMO at least!) less aesthetically pleasing than:

    |;; https://very.long.url/that-will-not-fit-on-a-single-|
    |line-anyway-but-could-at-least-start-on-the-same-line- |
    |as-the-comment-sign-and-break-at-slightly-more-logical-|
    |places                                                 |

where `|' is the margin.  The same sometimes holds for excessively long variable names.  I definitely wouldn't impose this preference on others, but I assume that some might share it.)

Using vim's choice helps avoid bike-shedding.

> We already import several UCD files, see admin/unidata, where you will
> also find copyright.html from the Unicode Consortium.

Great!  That's convenient.

> test/manual is okay.

Thanks!

> This should probably go into simple.el.

I'll move it there.

Thanks for the help!
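The ordering problem described in this message can be illustrated with a small stand-alone model.  This is only a sketch under stated assumptions, not the xdisp.c code: `struct model_it', `model_wrap_char_p' and the two predicate functions are hypothetical names invented for illustration, and the real iterator has many more states (display strings, C strings) than the two fields modelled here.  The sketch assumes that under whitespace-mode a buffer space is displayed as `·' (code 183), as the message reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the display iterator; NOT Emacs's struct it.  */
struct model_it
{
  int buffer_char;   /* character stored in the buffer at this position */
  int display_char;  /* character being displayed (plays the role of it->c) */
};

/* Hypothetical stand-in for a lookup in a word-wrap character table.
   The table contents (space, tab, U+200B) are an assumption.  */
bool
model_wrap_char_p (int c)
{
  static const int wrap_chars[] = { ' ', '\t', 0x200B };
  for (size_t i = 0; i < sizeof wrap_chars / sizeof wrap_chars[0]; i++)
    if (wrap_chars[i] == c)
      return true;
  return false;
}

/* Display-first order: only the displayed character is consulted,
   so a space shown as 183 is not treated as a wrap point.  */
bool
model_whitespace_display_first (const struct model_it *it)
{
  return model_wrap_char_p (it->display_char);
}

/* Buffer-first order: the underlying buffer character is consulted,
   so the displayed glyph no longer matters.  */
bool
model_whitespace_buffer_first (const struct model_it *it)
{
  return model_wrap_char_p (it->buffer_char);
}

/* The whitespace-mode case from the message: buffer ' ', displayed 183.  */
void
model_check_whitespace_mode (void)
{
  struct model_it it = { ' ', 183 };
  assert (!model_whitespace_display_first (&it));
  assert (model_whitespace_buffer_first (&it));
}
```

Calling `model_check_whitespace_mode ()' exercises the regression case.  The converse case, a non-wrap buffer character displayed as a space, makes the two orders disagree in the other direction, which is the situation the message worries about when it asks whether wrapping on the display character is sometimes correct.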

This bug report was last modified 4 years and 297 days ago.

