GNU bug report logs - #13399
24.3.50; Word-wrap can't wrap at zero-width space U-200B


Package: emacs;

Reported by: martin rudalics <rudalics <at> gmx.at>

Date: Thu, 10 Jan 2013 08:31:02 UTC

Severity: wishlist

Found in version 24.3.50

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.



Message #128 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Adam Tack <adam.tack.513 <at> gmail.com>
To: 13399 <at> debbugs.gnu.org
Subject: 24.3.50; Word-wrap can't wrap at zero-width space U-200B
Date: Fri, 8 Dec 2017 01:02:08 +0000
I have a patch for the original issue of word-wrap not wrapping at a
zero-width space.  The implementation uses a character table, and is
closely based on that written by Martin Rudalics
(https://debbugs.gnu.org/cgi/bugreport.cgi?bug=13399#113), with Eli
Zaretskii's suggestions regarding Unicode.

The patch applies cleanly to the latest master, compiles on GNU/Linux
(Ubuntu Xenial), and appears to work: both of the following tests
wrap as expected at the zero-width space character.  The first is
taken from this bug thread; the second, adapted from the first,
checks that Bug#11341 has not regressed:

(with-current-buffer (get-buffer-create "*foo*")
  (dotimes (i 1000)
    (insert "1234\u200B")) ; U+200B ZERO WIDTH SPACE, written as an escape
  (setq word-wrap t)
  (display-buffer "*foo*"))

(with-current-buffer (get-buffer-create "*bar*")
  (dotimes (i 1000)
    (insert "1234\u200B")) ; U+200B ZERO WIDTH SPACE, written as an escape
  (setq word-wrap t)
  (setq whitespace-display-mappings
        '((space-mark 32 [183] [46])      ; SPACE
          (space-mark 160 [164] [95])     ; NO-BREAK SPACE
          (space-mark 8203 [164] [95])    ; ZERO WIDTH SPACE (U+200B)
          (newline-mark 10 [36 10])
          (tab-mark 9 [187 9] [92 9])))
  (whitespace-mode)
  (display-buffer "*bar*"))

Setting other word-wrap characters from Lisp with set-char-table-range
also works as expected in the simple cases that I tested.

However, this is my first foray into modifying a serious C codebase,
so I am not sure if I have done the right thing.  In particular, I
have serious doubts about the second and third cases from
IT_DISPLAYING_WHITESPACE, especially since I don't really know when
they would be applicable.

   || ((STRINGP (it->string)                                    \
        && !NILP (CHAR_TABLE_REF                                \
                  (Vword_wrap_chars,                            \
                   STRING_CHAR (SDATA (it->string)              \
                                + IT_STRING_BYTEPOS (*it)))))   \
       || (it->s                                                \
           && !NILP (CHAR_TABLE_REF                             \
                     (Vword_wrap_chars,                         \
                      STRING_CHAR (it->s + IT_BYTEPOS (*it))))) \

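As a rough illustration of why these cases decode a whole character
before consulting the table, here is a small standalone C sketch.  It
is not code from the patch: wrap_char_p, decode_utf8_at,
wrap_allowed_at and bytewise_whitespace_at are hypothetical names, the
table lookup is replaced by a fixed predicate, and the input is
assumed to be valid UTF-8.  U+200B occupies three bytes (E2 80 8B), so
a test that looks only at the byte at the current position cannot
match it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the char-table lookup: nonzero if a line
   may be wrapped after code point C.  */
int
wrap_char_p (uint32_t c)
{
  return c == ' ' || c == '\t' || c == 0x200B;
}

/* Decode the UTF-8 sequence that starts at S[BYTEPOS].
   Assumption: the bytes form one valid, complete sequence.  */
uint32_t
decode_utf8_at (const unsigned char *s, size_t bytepos)
{
  assert (s != NULL);
  const unsigned char *p = s + bytepos;

  if (p[0] < 0x80)                      /* 1 byte: ASCII */
    return p[0];
  if ((p[0] & 0xE0) == 0xC0)            /* 2 bytes */
    return ((uint32_t) (p[0] & 0x1F) << 6) | (p[1] & 0x3F);
  if ((p[0] & 0xF0) == 0xE0)            /* 3 bytes, e.g. U+200B */
    return ((uint32_t) (p[0] & 0x0F) << 12)
           | ((uint32_t) (p[1] & 0x3F) << 6)
           | (p[2] & 0x3F);
  return ((uint32_t) (p[0] & 0x07) << 18)   /* 4 bytes */
         | ((uint32_t) (p[1] & 0x3F) << 12)
         | ((uint32_t) (p[2] & 0x3F) << 6)
         | (p[3] & 0x3F);
}

/* Character-based test: decode first, then ask the predicate.  */
int
wrap_allowed_at (const unsigned char *s, size_t bytepos)
{
  return wrap_char_p (decode_utf8_at (s, bytepos));
}

/* Byte-based test that only recognizes space and tab.  */
int
bytewise_whitespace_at (const unsigned char *s, size_t bytepos)
{
  return s[bytepos] == ' ' || s[bytepos] == '\t';
}
```

For the byte string "1234" followed by E2 80 8B, the character-based
test accepts byte offset 4 while the byte-based test rejects it.
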
Additionally, I'm not certain whether syms_of_character in character.c
is the right place to define the char-table, or whether the range
U+2000 to U+200B should be in the char-table by default rather than
just space and tab.
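
To make the two candidate defaults concrete, the following standalone
C sketch models each as a list of inclusive code-point ranges.  It
only illustrates the choice and is not how an Emacs char-table is
stored; struct wrap_range, wrap_minimal, wrap_extended and
wrap_range_contains are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One inclusive range of code points after which wrapping is allowed.  */
struct wrap_range
{
  uint32_t lo;
  uint32_t hi;
};

/* Candidate default 1: only tab and space.  */
const struct wrap_range wrap_minimal[] = {
  { 0x09, 0x09 },       /* TAB */
  { 0x20, 0x20 },       /* SPACE */
};

/* Candidate default 2: tab, space, and U+2000..U+200B.  */
const struct wrap_range wrap_extended[] = {
  { 0x09, 0x09 },       /* TAB */
  { 0x20, 0x20 },       /* SPACE */
  { 0x2000, 0x200B },   /* EN QUAD .. ZERO WIDTH SPACE */
};

/* Return 1 if C falls inside any of the N ranges in TABLE, else 0.  */
int
wrap_range_contains (const struct wrap_range *table, size_t n, uint32_t c)
{
  assert (table != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (table[i].lo <= c && c <= table[i].hi)
      return 1;
  return 0;
}
```

With the minimal default, U+200B wraps only after a user adds it to
the table; with the extended default it wraps out of the box, as does
every other character in the range.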


I am aware that if this were to be accepted, I would also need to
update etc/NEWS, and probably the docstring of `word-wrap' and the
Texinfo manual as well.

I have not yet filled out a copyright assignment form, though I will
do so if this patch (modulo changes) is considered acceptable.

Thanks!
[word_wrap_char_table.diff (text/plain, attachment)]

This bug report was last modified 4 years and 244 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.