Thanks for the very fast replies and the suggestions!

> I think this is okay, but maybe the macro could be converted into an
> inline function, and then fetching the character from the various
> objects separated from looking up the char-table for that character?

I've made the conversion; it's now slightly less messy. Regarding the
separation, I think the most that can be done is to move the look-up
into a separate function. It would be cleaner to first obtain the
character, for example via a chain of if-else clauses, and then look it
up, but regrettably that can't really work, since the cases (in
particular the first and fourth) are not disjoint.

> Well, since it's a char-table, users will probably want to control
> which characters cause word-wrap. One idea would be to have a minor
> mode or some such, providing users an ability to include or exclude
> different groups of related whitespace characters as a whole? This
> could be in follow-up patches, though.

Customisability was the idea. :) I'm not sure how best to expose it in
a reasonably user-friendly way, though. For the time being, allowing
control directly via the char-table might suffice.

> We could also look at LineBreak.txt in the Unicode database for
> inspiration and ideas.

The three main customisation options that I'm considering are:

  i) Unicode whitespace (U+2000 - U+200B),
  ii) vim's breakat characters (default " ^I!@*-+;:,./?"), since
      presumably they have given it some thought,
  iii) the characters in LineBreak.txt (parsing the file shouldn't be
       hard, if there aren't copyright issues).

> But I do think that the default should be only TAB and SPC, as Emacs
> always did, and the rest should be optional, and probably in Lisp, not
> C.
>
> And also a couple of tests (the ones you used would be a good start).

These would presumably have to be in tests/manual, since the position
of the word-wrap depends on too many variables (width of window, font
type, font size)?

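
To make the first two candidate groups from the list of customisation
options concrete, here is a rough, standalone C sketch of membership
tests for them. All names in it (in_unicode_space_range, in_vim_breakat,
wraps_after, the WRAP_GROUP_* flags) are hypothetical and not part of
the patch; in practice the groups would more likely be used to fill the
char-table from Lisp. LineBreak.txt is left out because it needs the
data file.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical group flags, for illustration only.  */
enum wrap_group
{
  WRAP_GROUP_UNICODE_SPACE = 1,   /* U+2000 .. U+200B */
  WRAP_GROUP_VIM_BREAKAT = 2      /* vim's default 'breakat' */
};

/* True for code points in the range U+2000..U+200B.  */
static bool
in_unicode_space_range (int c)
{
  return c >= 0x2000 && c <= 0x200B;
}

/* True for the characters in vim's default 'breakat' value,
   " ^I!@*-+;:,./?", where ^I is TAB.  */
static bool
in_vim_breakat (int c)
{
  /* Reject NUL and non-ASCII first: strchr would otherwise match
     the terminating NUL of the string.  */
  if (c <= 0 || c > 0x7F)
    return false;
  return strchr (" \t!@*-+;:,./?", c) != NULL;
}

/* True if C belongs to any of the groups enabled in GROUPS.  */
static bool
wraps_after (int c, unsigned groups)
{
  assert (c >= 0);
  return ((groups & WRAP_GROUP_UNICODE_SPACE) && in_unicode_space_range (c))
         || ((groups & WRAP_GROUP_VIM_BREAKAT) && in_vim_breakat (c));
}
```

Keeping each group behind its own flag mirrors the suggestion of
including or excluding related characters as a whole.
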
> I will send the forms off-list, thanks.

Thanks!

> One other thought: since TAB and SPC are single-byte characters,
> whereas the other "whitespace" characters are not, supporting the
> non-ASCII whitespace will be associated with some performance hit in
> the display engine, because it requires a char-table look up and
> fetching multibyte characters. So perhaps we should allow the
> word-wrap-chars char-table to be nil (and make that the default), and
> in that case support only TAB and SPC as word-wrap characters. This
> would let the default configuration work as fast as it does now,
> imposing the performance penalty only on those who want to support
> more whitespace characters.
>
> WDYT?

That seems sensible. The old behaviour is now the default, and look-up
via the char-table is only enabled by the global minor mode
`word-wrap-char-table-mode' (suggestions for a catchier name very
welcome). For the time being, its definition is in a new file,
`lisp/word-wrap.el'. Also temporarily, for ease of testing, it allows
wrapping on the Unicode whitespace characters.

The current iteration is attached. Until they've found a proper home,
the slightly updated tests are below.

(require 'word-wrap)

;; Test 1: with whitespace-mode, so that U+200B is displayed visibly.
(with-current-buffer (get-buffer-create "*bar*")
  (dotimes (_ 1000)
    (insert "1234\u200b")) ; each chunk ends with U+200B
  (setq word-wrap t)
  (setq whitespace-display-mappings
        '((space-mark 32 [183] [46])
          (space-mark 160 [164] [95])
          (space-mark 8203 [164] [95]) ; 8203 = U+200B
          (newline-mark 10 [36 10])
          (tab-mark 9 [187 9] [92 9])))
  (whitespace-mode)
  (word-wrap-char-table-mode)
  (display-buffer "*bar*"))

;; Test 2: the same text without whitespace-mode.
(with-current-buffer (get-buffer-create "*foo*")
  (dotimes (_ 1000)
    (insert "1234\u200b")) ; each chunk ends with U+200B
  (setq word-wrap t)
  (word-wrap-char-table-mode)
  (display-buffer "*foo*"))
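
For reference, the nil-table fast path discussed above might be
sketched in standalone C as follows. This is only an illustration
under stated assumptions, not the code in the attached patch: a
function pointer stands in for the char-table, NULL stands in for a
nil word-wrap-chars, and the names wrap_table, sketch_can_wrap_after
and example_table are made up. Whether a non-nil table replaces or
supplements TAB and SPC is also an assumption here (it replaces them).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a char-table: a predicate over code
   points.  NULL plays the role of a nil table.  */
typedef bool (*wrap_table) (int c);

/* True if a line may be wrapped after character C.  With no table,
   only SPC and TAB qualify and no look-up is performed, so the
   default configuration pays no extra cost.  */
static inline bool
sketch_can_wrap_after (int c, wrap_table table)
{
  assert (c >= 0);
  if (table == NULL)
    return c == ' ' || c == '\t';
  return table (c);
}

/* Example table: SPC, TAB and the range U+2000..U+200B.  */
static bool
example_table (int c)
{
  return c == ' ' || c == '\t' || (c >= 0x2000 && c <= 0x200B);
}
```

With a NULL table the ASCII comparison is all that runs; the look-up
cost is paid only once a table has been installed.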