GNU bug report logs - #51292
27.2; Reversing strings with unicode combining characters
Reported by: Howard Melman <hmelman <at> gmail.com>
Date: Tue, 19 Oct 2021 19:17:02 UTC
Severity: normal
Tags: wontfix
Found in version 27.2
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
Message #30 received at 51292 <at> debbugs.gnu.org:
> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Date: Tue, 19 Oct 2021 21:26:31 +0200
> Cc: 51292 <at> debbugs.gnu.org
>
> Howard Melman <hmelman <at> gmail.com> writes:
>
> > Reversing a string fails to account for unicode combining characters
> >
> > (reverse "nai\u0308ve")
> > "ev̈ian"
> >
> > Note the diaeresis is now on the v and not the i. s-reverse gets it right:
> >
> > (s-reverse "nai\u0308ve")
> > "evïan"
>
> So I wondered what s-reverse did, and indeed:
>
> (defun s-reverse (s)
>   "Return the reverse of S."
>   (declare (pure t) (side-effect-free t))
>   (save-match-data
>     (if (multibyte-string-p s)
>         (let ((input (string-to-list s))
>               output)
>           (require 'ucs-normalize)
>           (while input
>             ;; Handle entire grapheme cluster as a single unit
>             (let ((grapheme (list (pop input))))
>               (while (memql (car input) ucs-normalize-combining-chars)
>                 (push (pop input) grapheme))
>               (setq output (nconc (nreverse grapheme) output))))
>           (concat output))
>       (concat (nreverse (string-to-list s))))))
>
> Emacs has string-reverse, obsolete since 25.1. Perhaps we should
> reintroduce it and use the definition from s?
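For reference, the difference between the two results in the report can be seen by looking at the strings as lists of code points. This is only a small sketch of the behaviour described above, assuming an Emacs recent enough that `reverse' accepts strings (as in the reported 27.2); the numbers are the code points of the characters in the example, with 776 being U+0308 COMBINING DIAERESIS.

```elisp
;; "naive" with a combining diaeresis (U+0308 = 776) placed after the "i" (105).
(string-to-list "nai\u0308ve")
;; => (110 97 105 776 118 101)

;; `reverse' treats every character independently, so after reversal
;; U+0308 follows the "v" (118) instead of the "i" (105).
(string-to-list (reverse "nai\u0308ve"))
;; => (101 118 776 105 97 110)
```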
I don't understand the use case(s) where this could be useful. If
this is for display, then displaying text needs much more than just
combining accents with the base characters. E.g., what if the accent
should not combine when the order is reversed, i.e. the composition
rules depend on the following characters as well? And what if
character composition is not due to normalization rules?  Or what if
the text includes bidirectional scripts, whose reversal rules are
either very complex or simply undefined?
If this is not for display, then where is this useful and why?
If someone can describe real-life use cases, we could reason whether
doing something like that could be useful enough. Without that, the
code in s-reverse seems like an incomplete semi-feature which supports
some limited use cases that someone needed in some specific situation,
not a useful general feature that handles the issue anywhere close to
completeness.
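
One way to make the concern about composition that does not come from normalization rules concrete is the sketch below. It is not the code from s.el: `my-combining-p' and `my-reverse-clusters' are hypothetical names, and the test for "combining" is an assumption that stands in for `ucs-normalize-combining-chars' by checking for a nonzero canonical combining class. Under that assumption, a combining diaeresis stays with its base, but a variation selector (U+FE0F, combining class 0) is split from the character it modifies.

```elisp
(defun my-combining-p (ch)
  "Hypothetical helper: non-nil if CH has a nonzero canonical combining class."
  (> (get-char-code-property ch 'canonical-combining-class) 0))

(defun my-reverse-clusters (s)
  "Hypothetical sketch: reverse S, keeping combining marks after their base."
  (let ((input (string-to-list s))
        output)
    (while input
      ;; Collect one base character plus any combining marks that follow it.
      (let ((cluster (list (pop input))))
        (while (and input (my-combining-p (car input)))
          (push (pop input) cluster))
        ;; Restore the cluster's internal order and prepend it to the result.
        (setq output (nconc (nreverse cluster) output))))
    (concat output)))

;; The diaeresis (776) still follows the "i" (105).
(string-to-list (my-reverse-clusters "nai\u0308ve"))
;; => (101 118 105 776 97 110)

;; U+2764 HEAVY BLACK HEART followed by U+FE0F VARIATION SELECTOR-16:
;; the selector has combining class 0, so it ends up before its base.
(string-to-list (my-reverse-clusters "\u2764\uFE0F"))
;; => (65039 10084)
```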
This bug report was last modified 3 years and 271 days ago.