Package: emacs
Reported by: Itai Berli <itai.berli <at> gmail.com>
Date: Sat, 1 Jul 2017 10:00:02 UTC
Severity: wishlist
Found in version 25.1
Fixed in version 29.1
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
From: Itai Berli <itai.berli <at> gmail.com>
To: 27544 <at> debbugs.gnu.org
Subject: bug#27544: 25.1; Visualization of Unicode bidirectional marks
Date: Sat, 1 Jul 2017 12:58:28 +0300
Emacs supports 12 Unicode bidirectional marks (ALM, RLM, LRM, LRE, RLE, LRO, RLO, PDF, FSI, LRI, RLI, and PDI), each of which displays as a very thin space. This raises two problems.

1. On the one hand, the fact that these inherently invisible marks manifest, by default, as thin spaces undermines attempts at precise alignment and positioning. Moreover, in the case of LRM, RLM and ALM, this behavior contradicts explicit directions given in the Unicode Bidirectional Algorithm 8.0.0 specification (section 2.6, Implicit Directional Marks):

> they do not appear in the display

(To my understanding, this is meant to apply to all bidi marks, even if only stated explicitly for LRM, RLM and ALM.)

2. On the other hand, the fact that these spaces are so thin as to be barely noticeable, and the fact that they are indistinguishable from one another, make it difficult to debug and resolve strange and/or erroneous behavior that can happen in a bidi document, an example of which is given below.

The solution to both problems is to make the bidi marks visible in `whitespace` mode only, and to give them glyphs that are (a) easy to notice, (b) distinguishable from other whitespace visualization glyphs, and (c) distinct from one another.

The following example exhibits strange behavior that can arise due to the use of bidi marks. This behavior is difficult to debug without visualizing the bidi marks. Consider the following paragraph.

ILLUSTRATION #1: An English sentence that is formatted from right to left.
http://imgur.com/is1OBtM

The paragraph is entirely in English, so why is it formatted from right to left? Without visible bidi marks, it's hard to tell; however, a savvy Unicode-aware person would realize that this must indicate the presence of a Right-To-Left Mark (U+200F). Therefore, if we position the cursor at the beginning of the paragraph (`C-a`) and delete the following character (`C-d`), the sentence should display normally.

ILLUSTRATION #2: Deleting the first Right-To-Left Mark at the beginning of the paragraph has no effect.
http://imgur.com/eBVpdZA

Against our expectations, nothing appears to have changed. There must be another Right-To-Left Mark at the beginning of the paragraph. Let's delete it as well (`C-d`).

ILLUSTRATION #3: Deleting the second Right-To-Left Mark left-aligns the paragraph, but leaves the comma misplaced.
http://imgur.com/Klj3lZC

The paragraph is now aligned to the left, as it should be, and everything looks normal, except for the comma, which appears at the beginning of the paragraph. But this can be easily remedied: let's delete the comma and then retype it in its proper place. We position the cursor at the beginning of the paragraph (`C-a`) and delete the following character (`C-d`).

ILLUSTRATION #4: After trying to delete the comma, the paragraph is finally displayed correctly.
http://imgur.com/3w73MxM

Instead of deleting the comma, this has shifted the comma to its correct position. If we had been able to visualize the whitespace, we would have realized from the beginning that the sequence of characters in this paragraph was, from left to right:

RLM-RLM-RLO-RLO-H-e-l-l-o-PDF-,-PDF-SPACE-w-o-r-l-d-!

Thus, our first three actions removed the first three characters, leaving us with:

RLO-H-e-l-l-o-PDF-,-PDF-SPACE-w-o-r-l-d-!

We now realize that even the final, seemingly correct form is in fact littered with bidi errors and potential landmines!

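Outside Emacs, the kind of visualization requested here can be approximated with a few lines of Python. The following is only a rough sketch, not a proposal for how `whitespace` mode should implement it: the function name `reveal` and the `<ABBR>` tag notation are made up for illustration, and the sample string assumes that the two leading marks in the example are RLM (U+200F) and that the two terminators are PDF (U+202C).

```python
# Code points of the 12 bidi control characters named in this report,
# mapped to their usual abbreviations.
BIDI_MARKS = {
    "\u061c": "ALM",
    "\u200e": "LRM",
    "\u200f": "RLM",
    "\u202a": "LRE",
    "\u202b": "RLE",
    "\u202c": "PDF",
    "\u202d": "LRO",
    "\u202e": "RLO",
    "\u2066": "LRI",
    "\u2067": "RLI",
    "\u2068": "FSI",
    "\u2069": "PDI",
}


def reveal(text: str) -> str:
    """Return TEXT with each bidi control character shown as a visible <ABBR> tag.

    The function name and the tag notation are hypothetical; they only
    illustrate the idea of giving every mark a distinct visible glyph.
    """
    return "".join(f"<{BIDI_MARKS[ch]}>" if ch in BIDI_MARKS else ch for ch in text)


# Logical-order character sequence from the example (assumed code points).
RLM, RLO, PDF = "\u200f", "\u202e", "\u202c"
sample = RLM + RLM + RLO + RLO + "Hello" + PDF + "," + PDF + " world!"

if __name__ == "__main__":
    print(reveal(sample))      # the paragraph as originally typed
    print(reveal(sample[3:]))  # after the three C-d deletions described above
```

Run as a script, this prints `<RLM><RLM><RLO><RLO>Hello<PDF>,<PDF> world!` followed by `<RLO>Hello<PDF>,<PDF> world!`, which makes the leftover RLO and the unbalanced PDF immediately visible.
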
In GNU Emacs 25.1.1 (x86_64-apple-darwin13.4.0, NS appkit-1265.21 Version 10.9.5 (Build 13F1911))
 of 2016-09-21 built on builder10-9.porkrind.org
Windowing system distributor 'Apple', version 10.3.1504
Configured using:
 'configure --with-ns '--enable-locallisppath=/Library/Application Support/Emacs/${version}/site-lisp:/Library/Application Support/Emacs/site-lisp' --with-modules'

Configured features:
NOTIFY ACL GNUTLS LIBXML2 ZLIB TOOLKIT_SCROLL_BARS NS MODULES

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: TeX/P

Minor modes in effect:
  diff-auto-refine-mode: t
  TeX-PDF-mode: t
  ivy-mode: t
  shell-dirtrack-mode: t
  projectile-mode: t
  helm-descbinds-mode: t
  async-bytecomp-package-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent messages:
Applying style hooks... done
Mark set
C-> is undefined
Mark set [2 times]
Saving file /Users/itaiberli/Documents/GitHub/Thesis/test22.tex...
Wrote /Users/itaiberli/Documents/GitHub/Thesis/test22.tex
Undo!
Auto-saving...done
repeat-complex-command: There are no previous complex commands to repeat
delete-backward-char: Text is read-only

Load-path shadows:
/Users/itaiberli/.emacs.d/elpa/seq-2.20/seq hides /Applications/Emacs.app/Contents/Resources/lisp/emacs-lisp/seq

Features:
(shadow sort mail-extr emacsbug message rfc822 mml mml-sec epg mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums mail-utils vc-git diff-mode tex-bar toolbar-x font-latex plain-tex tex-buf latex tex-ispell tex-style tex crm tex-mode latexenc colir color counsel jka-compr esh-util etags xref project swiper reftex reftex-vars two-column ivy delsel ivy-overlay helm-projectile helm-files rx image-dired tramp tramp-compat tramp-loaddefs trampver shell pcomplete format-spec dired-x dired-aux ffap helm-tags helm-bookmark helm-adaptive helm-info bookmark pp helm-external helm-net browse-url xml url url-proxy url-privacy url-expand url-methods url-history url-cookie url-domsuf url-util url-parse auth-source gnus-util mm-util help-fns mail-prsvr password-cache url-vars mailcap helm-buffers helm-grep helm-regexp helm-utils helm-locate helm-help helm-types projectile grep compile comint ansi-color ring ibuf-ext ibuffer thingatpt helm-descbinds helm easy-mmode helm-source cl-seq eieio-compat eieio eieio-core helm-multi-match helm-lib dired helm-config helm-easymenu cl-macs async-bytecomp async advice edmacro kmacro finder-inf tex-site info package epg-config seq byte-opt gv bytecomp byte-compile cl-extra help-mode easymenu cconv cl-loaddefs pcase cl-lib time-date mule-util tooltip eldoc electric uniquify ediff-hook vc-hooks lisp-float-type mwheel ns-win ucs-normalize term/common-win tool-bar dnd fontset image regexp-opt fringe tabulated-list newcomment elisp-mode lisp-mode prog-mode register page menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock font-lock syntax facemenu font-core frame cl-generic cham georgian utf-8-lang misc-lang
vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese charscript case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer cl-preloaded nadvice loaddefs button faces cus-face macroexp files text-properties overlay sha1 md5 base64 format env code-pages mule custom widget hashtable-print-readable backquote kqueue cocoa ns multi-tty make-network-process emacs)

Memory information:
((conses 16 359730 16119) (symbols 48 34262 0) (miscs 40 100 221) (strings 32 65306 15883) (string-bytes 1 1997869) (vectors 16 60314) (vector-slots 8 1721804 214602) (floats 8 589 398) (intervals 56 269 0) (buffers 976 19))