GNU bug report logs - #51292
27.2; Reversing strings with unicode combining characters
Reported by: Howard Melman <hmelman <at> gmail.com>
Date: Tue, 19 Oct 2021 19:17:02 UTC
Severity: normal
Tags: wontfix
Found in version 27.2
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
Message #8 received at 51292 <at> debbugs.gnu.org (full text, mbox):
Howard Melman <hmelman <at> gmail.com> writes:
> Reversing a string fails to account for unicode combining characters
>
> (reverse "nai\u0308ve")
> "ev̈ian"
>
> Note the diaeresis is now on the v and not the i. s-reverse gets it right:
>
> (s-reverse "nai\u0308ve")
> "evïan"
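To see why the diaeresis moves, it may help to look at the code points rather than the rendered text. The snippet below is only an illustration using the string from the report: U+0308 (decimal 776) is a separate character that attaches to whatever precedes it, and `reverse' flips characters one at a time.

```elisp
;; Code points of the original string: U+0308 (776) follows "i" (105).
(string-to-list "nai\u0308ve")
;; => (110 97 105 776 118 101)

;; `reverse' works character by character, so U+0308 now follows "v" (118)
;; and is displayed on top of it.
(string-to-list (reverse "nai\u0308ve"))
;; => (101 118 776 105 97 110)
```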
So I wondered what s-reverse did, and indeed:
(defun s-reverse (s)
  "Return the reverse of S."
  (declare (pure t) (side-effect-free t))
  (save-match-data
    (if (multibyte-string-p s)
        (let ((input (string-to-list s))
              output)
          (require 'ucs-normalize)
          (while input
            ;; Handle entire grapheme cluster as a single unit
            (let ((grapheme (list (pop input))))
              (while (memql (car input) ucs-normalize-combining-chars)
                (push (pop input) grapheme))
              (setq output (nconc (nreverse grapheme) output))))
          (concat output))
      (concat (nreverse (string-to-list s))))))
Emacs has string-reverse, obsolete since 25.1. Perhaps we should
reintroduce it and use the definition from s?
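As a rough sketch of the same idea without s.el, the cluster-at-a-time loop could look like the following. The names `my-combining-mark-p' and `my-reverse-clusters' are hypothetical, and instead of `ucs-normalize-combining-chars' this version tests the Unicode general category (Mn, Mc, Me), which is a swapped-in assumption rather than what s-reverse does. It only approximates grapheme clusters as "base character plus following marks" and does not attempt to handle things like emoji joined with ZWJ.

```elisp
(defun my-combining-mark-p (char)
  "Return non-nil if CHAR has Unicode general category Mn, Mc, or Me."
  (memq (get-char-code-property char 'general-category) '(Mn Mc Me)))

(defun my-reverse-clusters (s)
  "Return S reversed, keeping combining marks after their base character.
Hypothetical helper; approximates grapheme clusters as base + marks."
  (let ((input (string-to-list s))
        output)
    (while input
      ;; Collect one base character plus any marks that follow it.
      (let ((cluster (list (pop input))))
        (while (and input (my-combining-mark-p (car input)))
          (push (pop input) cluster))
        ;; CLUSTER was built back to front; restore its order and
        ;; prepend it to OUTPUT so clusters end up reversed as units.
        (setq output (nconc (nreverse cluster) output))))
    (concat output)))

;; The diaeresis (776) stays directly after "i" (105).
(string-to-list (my-reverse-clusters "nai\u0308ve"))
;; => (101 118 105 776 97 110)
```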
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
This bug report was last modified 3 years and 271 days ago.