Package: emacs;
Reported by: martin rudalics <rudalics <at> gmx.at>
Date: Thu, 10 Jan 2013 08:31:02 UTC
Severity: wishlist
Found in version 24.3.50
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
View this message in rfc822 format
From: Lars Ingebrigtsen <larsi <at> gnus.org> To: Adam Tack <adam.tack.513 <at> gmail.com> Cc: Eli Zaretskii <eliz <at> gnu.org>, 13399 <at> debbugs.gnu.org Subject: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space U-200B Date: Fri, 18 Sep 2020 16:55:40 +0200
Adam Tack <adam.tack.513 <at> gmail.com> writes: > I've split out the non-nil char-table case out into a function, as I > think that using a named function slightly improves readability, and > having a macro over 20 lines long, somehow feels "wrong". If the > compiler does actually follow the inline directive, there should be no > additional performance hit. This was the last post in the thread, and the patch no longer applied, so I've respun it for Emacs 28. However, I can't find any copyright assignment on file -- Adam, did you go through with the assignment process? diff --git a/doc/emacs/display.texi b/doc/emacs/display.texi index e7b8745a04..9fcca8c6e6 100644 --- a/doc/emacs/display.texi +++ b/doc/emacs/display.texi @@ -1831,6 +1831,14 @@ Visual Line Mode report. You can add categories to a character using the command @code{modify-category-entry}. +@vindex word-wrap-chars +@findex word-wrap-chars-mode + Word boundaries and hence points at which word wrap can occur are, +by default, considered to occur on the space and tab characters. If +you prefer word-wrap to be permissible at other characters, you can +change the value of the char-table @code{word-wrap-chars}, or use +@code{word-wrap-chars-mode}, which does this for you. + @node Display Custom @section Customization of Display diff --git a/etc/NEWS b/etc/NEWS index 54bad068f8..f3216ed445 100644 --- a/etc/NEWS +++ b/etc/NEWS @@ -73,6 +73,19 @@ its implementation has been removed from the Linux kernel. OpenBSD 5.3 and older releases are no longer supported, as they lack proper pty support that Emacs needs. ++++ +** The characters at which word-wrapping occurs can now be controlled +using the new `word-wrap-chars' char-table. If `word-wrap-chars' is +nil (the default), then word-wrapping will occur only on the space or +tab characters, as has been the case until now. + +The most convenient way to change the characters at which wrap occurs +is customizing the new variable `word-wrap-type' and using the new +`word-wrap-chars-mode' minor mode, which sets `word-wrap-chars' based +on `word-wrap-type', for you. The options for `word-wrap-type' are +ascii-whitespace, unicode-whitespace and a customizable list of +character codes and character code ranges. + * Startup Changes in Emacs 28.1 diff --git a/lisp/simple.el b/lisp/simple.el index 7dc695848b..b881cbc23e 100644 --- a/lisp/simple.el +++ b/lisp/simple.el @@ -7269,6 +7269,117 @@ turn-on-visual-line-mode (define-globalized-minor-mode global-visual-line-mode visual-line-mode turn-on-visual-line-mode) + +(defvar word-wrap-type) + +(defvar word-wrap-chars--saved nil) + +(define-minor-mode word-wrap-chars-mode + "Toggle wrapping using a look-up to `word-wrap-chars'. +The exact choice of characters on which wrapping occurs, depends +on the value of `word-wrap-type'. By default, `word-wrap-type' +is set to unicode-white-space, which allows word wrapping on all +breakable unicode whitespace, not only space and tap. + +For details of other customization options, see +`word-wrap-type'. + +This minor mode has no effect unless `visual-line-mode' is +enabled or `word-wrap' is set to t. + +To toggle wrapping using a look-up, globally, use +`global-word-wrap-chars-mode'." + :group 'visual-line + :lighter " wwc" + (if word-wrap-chars-mode + (progn + (if (local-variable-p 'word-wrap-chars) + (setq-local word-wrap-chars--saved + word-wrap-chars)) + (set-word-wrap-chars)) + (setq-local word-wrap-chars word-wrap-chars--saved))) + +(defun turn-on-word-wrap-chars-mode () + (visual-line-mode 1)) + +(define-globalized-minor-mode global-word-wrap-chars-mode + word-wrap-chars-mode turn-on-word-wrap-chars-mode) + +(defun update-word-wrap-chars () + "Update `word-wrap-chars' upon Customize of `word-wrap-type'. + +Only buffers which use the `word-wrap-chars-mode' are affected." + (mapcar #'(lambda (buf) + (with-current-buffer buf + (if word-wrap-chars-mode + (set-word-wrap-chars)))) + (buffer-list))) + +(defun set-word-wrap-chars () + "Set `word-wrap-chars' locally, based on `word-wrap-type'." + (cond + ((eq word-wrap-type 'ascii-whitespace) + (setq-local word-wrap-chars nil)) + ((eq word-wrap-type 'unicode-whitespace) + (set-word-wrap-chars-from-list + '(9 32 5760 (8192 . 8198) (8200 . 8203) 8287 12288))) + ((listp word-wrap-type) + (set-word-wrap-chars-from-list word-wrap-type)))) + +(defun set-word-wrap-chars-from-list (list) + "Set `word-wrap-chars' locally from a list. +Each element of the list can be a character code (code point) or +a cons of character codes, representing the two (inclusive) +endpoints of the range of characters." + (setq-local + word-wrap-chars + (let ((char-table (make-char-table nil nil))) + (dolist (range list char-table) + (set-char-table-range char-table range t))))) + +(defcustom word-wrap-type + 'unicode-whitespace + "Characters on which word-wrap occurs. +This variable controls the value of `word-wrap-chars' that is set +by `word-wrap-chars-mode`. `word-wrap-chars' determines on what +characters word-wrapping can occur, when `word-wrap' is t or +`visual-line-mode' is enabled. + +Possible values are ascii-whitespace, unicode-whitespace or a +custom list of characters and character ranges. + +If the value is `ascii-whitespace', word-wrap is only on space +and tab. If the value is `unicode-whitespace', word-wrap is on +all the Unicode whitespace characters that permit wrapping, +including but not limited to space and tab. + +If a custom list of characters and ranges is used, word wrap is +on these characters and character ranges. The ranges are +inclusive of both endpoints. + +When you change this without using customize, you need to call +`update-word-wrap-chars' to update the word wrap in current +buffers. For instance: + +(setq word-wrap-type \\='(9 32 ?_)) +(update-word-wrap-chars) + +will set the wrappable characters to space, tab and underscore, +in all buffers in `word-wrap-chars-mode' and using the default +value of `word-wrap-type'. +" + :type '(choice (const :tag "Space and tab" ascii-whitespace) + (const :tag "All unicode spaces" unicode-whitespace) + (repeat :tag "Custom characters or ranges" + :value (9 32) + (choice (character) + (cons :tag "Range" character character)))) + :set (lambda (symbol value) + (set-default symbol value) + (update-word-wrap-chars)) + :group 'visual-line + :version 27.1) + (defun transpose-chars (arg) "Interchange characters around point, moving forward one character. diff --git a/src/buffer.c b/src/buffer.c index 241f2d43a9..5c26323d69 100644 --- a/src/buffer.c +++ b/src/buffer.c @@ -5786,7 +5786,12 @@ syms_of_buffer (void) Visual Line mode. Visual Line mode, when enabled, sets `word-wrap' to t, and additionally redefines simple editing commands to act on visual lines rather than logical lines. See the documentation of -`visual-line-mode'. */); +`visual-line-mode'. + +If `word-wrap-chars' is non-nil and a char-table, continuation lines +are wrapped on the characters in `word-wrap-chars' whose value is t, +rather than the space and tab characters. `word-wrap-chars-mode +provides a convenient interface for using this. */); DEFVAR_PER_BUFFER ("default-directory", &BVAR (current_buffer, directory), Qstringp, diff --git a/src/character.c b/src/character.c index 5860f6a0c8..032f4fc12b 100644 --- a/src/character.c +++ b/src/character.c @@ -1084,4 +1084,14 @@ syms_of_character (void) See The Unicode Standard for the meaning of those values. */); /* The correct char-table is setup in characters.el. */ Vunicode_category_table = Qnil; + + DEFVAR_LISP ("word-wrap-chars", Vword_wrap_chars, + doc: /* A char-table for characters at which word-wrap occurs. +Such characters have value t in this table. If the char-table is nil, +word-wrap occurs only on space and tab. + +For a more user-friendly way of changing the characters at which +word-wrap can occur, consider using `word-wrap-chars-mode' and +customizing `word-wrap-type'. */); + Vword_wrap_chars = Qnil; } diff --git a/src/xdisp.c b/src/xdisp.c index 615f0ca7cf..744b9a52c7 100644 --- a/src/xdisp.c +++ b/src/xdisp.c @@ -494,20 +494,42 @@ #define IT_OVERFLOW_NEWLINE_INTO_FRINGE(it) false #endif /* HAVE_WINDOW_SYSTEM */ /* Test if the display element loaded in IT, or the underlying buffer - or string character, is a space or a TAB character. This is used - to determine where word wrapping can occur. */ + or string character, is a space or tab (by default, to avoid the + unnecessary performance hit of char-table lookup). If + word-wrap-chars is a char-table, then instead check if the relevant + element or character belongs to the char-table. This is used to + determine where word wrapping can occur. */ #define IT_DISPLAYING_WHITESPACE(it) \ - ((it->what == IT_CHARACTER && (it->c == ' ' || it->c == '\t')) \ - || ((STRINGP (it->string) \ - && (SREF (it->string, IT_STRING_BYTEPOS (*it)) == ' ' \ - || SREF (it->string, IT_STRING_BYTEPOS (*it)) == '\t')) \ - || (it->s \ - && (it->s[IT_BYTEPOS (*it)] == ' ' \ - || it->s[IT_BYTEPOS (*it)] == '\t')) \ - || (IT_BYTEPOS (*it) < ZV_BYTE \ - && (*BYTE_POS_ADDR (IT_BYTEPOS (*it)) == ' ' \ - || *BYTE_POS_ADDR (IT_BYTEPOS (*it)) == '\t')))) + (!CHAR_TABLE_P (Vword_wrap_chars) \ + ? ((it->what == IT_CHARACTER && (it->c == ' ' || it->c == '\t')) \ + || ((STRINGP (it->string) \ + && (SREF (it->string, IT_STRING_BYTEPOS (*it)) == ' ' \ + || SREF (it->string, IT_STRING_BYTEPOS (*it)) == '\t')) \ + || (it->s \ + && (it->s[IT_BYTEPOS (*it)] == ' ' \ + || it->s[IT_BYTEPOS (*it)] == '\t')) \ + || (IT_BYTEPOS (*it) < ZV_BYTE \ + && (*BYTE_POS_ADDR (IT_BYTEPOS (*it)) == ' ' \ + || *BYTE_POS_ADDR (IT_BYTEPOS (*it)) == '\t')))) \ + : it_displaying_word_wrap_char(it)) \ + +static inline bool +char_is_word_wrap_char_p (int c) { + return !NILP (CHAR_TABLE_REF (Vword_wrap_chars, c)); +} + +static inline bool +it_displaying_word_wrap_char (struct it *it) { + return ((it->what == IT_CHARACTER && char_is_word_wrap_char_p (it->c)) + || (STRINGP (it->string) && char_is_word_wrap_char_p + (STRING_CHAR + (SDATA (it->string) + IT_STRING_BYTEPOS (*it)))) + || (it->s && char_is_word_wrap_char_p + (STRING_CHAR(it->s + IT_BYTEPOS (*it)))) + || (IT_BYTEPOS (*it) < ZV_BYTE && char_is_word_wrap_char_p + (FETCH_CHAR (IT_BYTEPOS (*it))))); +} /* These are the category sets we use. They are defined by kinsoku.el and chracters.el. */ diff --git a/test/manual/word-wrap-test.el b/test/manual/word-wrap-test.el new file mode 100644 index 0000000000..593c2decc7 --- /dev/null +++ b/test/manual/word-wrap-test.el @@ -0,0 +1,127 @@ +;;; word-wrap-test.el -- tests for word-wrap -*- lexical-binding: t -*- + +;; Copyright (C) 2017 Free Software Foundation, Inc. + +;; This file is part of GNU Emacs. + +;; This program is free software; you can redistribute it and/or modify +;; it under the terms of the GNU General Public License as published by +;; the Free Software Foundation, either version 3 of the License, or +;; (at your option) any later version. + +;; This program is distributed in the hope that it will be useful, +;; but WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +;; GNU General Public License for more details. + +;; You should have received a copy of the GNU General Public License +;; along with this program. If not, see <https://www.gnu.org/licenses/>. + +;;; Commentary: + +;; Run the tests M-x word-wrap-test-[1-4] which correspond to the four +;; combinations: +;; +;; i) whitespace-mode being enabled and disabled, +;; +;; ii) word-wrap-chars being nil and equal to a char-table that +;; specifies U-200B as the only word-wrap character. +;; +;; The tests with whitespace-mode are needed to help avoid a +;; regression on Bug#11341. + +;;; Code: + +(setq whitespace-display-mappings-for-zero-width-space + '((space-mark 32 + [183] + [46]) + (space-mark 160 + [164] + [95]) + (space-mark 8203 + [164] + [95]) + (newline-mark 10 + [36 10]) + (tab-mark 9 + [187 9] + [92 9]))) + +(defun word-wrap-test-1 () + "Check word-wrap for nil `word-wrap-chars'." + (interactive) + (let ((buf (get-buffer-create "*Word-wrap Test 1*"))) + (with-current-buffer buf + (erase-buffer) + (insert "Word wrap should occur for space.\n\n") + (dotimes (i 100) + (insert "1234567 ")) ; Space + (insert "\n\nWord wrap should NOT occur for U-200B.\n\n") + (dotimes (i 100) + (insert "1234567")) ; U-200B + (setq word-wrap t) + (setq-local word-wrap-chars nil) + (whitespace-mode -1) + (display-buffer buf)))) + +(defun word-wrap-test-2 () + "Check word-wrap for nil `word-wrap-chars' with whitespace-mode." + (interactive) + (let ((buf (get-buffer-create "*Word-wrap Test 2*"))) + (with-current-buffer buf + (erase-buffer) + (insert "Word wrap should occur for space (displayed as `·').\n\n") + (dotimes (i 100) + (insert "1234567 ")) ; Space + (insert "\n\nWord wrap should NOT occur for U-200B (displayed as `¤').\n\n") + (dotimes (i 100) + (insert "1234567")) ; U-200B + (setq word-wrap t) + (setq-local word-wrap-chars nil) + (setq-local whitespace-display-mappings + whitespace-display-mappings-for-zero-width-space) + (whitespace-mode) + (display-buffer buf)))) + +(defun word-wrap-test-3 () + "Check word-wrap if `word-wrap-chars' is a char-table." + (interactive) + (let ((buf (get-buffer-create "*Word-wrap Test 3*"))) + (with-current-buffer buf + (erase-buffer) + (insert "Word wrap should NOT occur for space.\n\n") + (dotimes (i 100) + (insert "1234567 ")) ; Space + (insert "\n\nWord wrap should occur for U-200B.\n\n") + (dotimes (i 100) + (insert "1234567")) ; U-200B + (setq word-wrap t) + (setq-local word-wrap-chars + (let ((ct (make-char-table nil nil))) + (set-char-table-range ct 8203 t) + ct)) + (whitespace-mode -1) + (display-buffer buf)))) + +(defun word-wrap-test-4 () + "Check word-wrap if `word-wrap-chars' is a char-table, for whitespace-mode." + (interactive) + (let ((buf (get-buffer-create "*Word-wrap Test 4*"))) + (with-current-buffer buf + (erase-buffer) + (insert "Word wrap should NOT occur for space (displayed as `·').\n\n") + (dotimes (i 100) + (insert "1234567 ")) ; Space + (insert "\n\nWord wrap should occur for U-200B (displayed as `¤').\n\n") + (dotimes (i 100) + (insert "1234567")) ; U-200B + (setq word-wrap t) + (setq-local word-wrap-chars + (let ((ct (make-char-table nil nil))) + (set-char-table-range ct 8203 t) + ct)) + (setq-local whitespace-display-mappings + whitespace-display-mappings-for-zero-width-space) + (whitespace-mode) + (display-buffer buf)))) -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.