GNU bug report logs - #51292
27.2; Reversing strings with unicode combining characters

Package: emacs;

Reported by: Howard Melman <hmelman <at> gmail.com>

Date: Tue, 19 Oct 2021 19:17:02 UTC

Severity: normal

Tags: wontfix

Found in version 27.2

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

Message #8 received at 51292 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Howard Melman <hmelman <at> gmail.com>
Cc: 51292 <at> debbugs.gnu.org
Subject: Re: bug#51292: 27.2; Reversing strings with unicode combining
 characters
Date: Tue, 19 Oct 2021 21:26:31 +0200

Howard Melman <hmelman <at> gmail.com> writes:

> Reversing a string fails to account for Unicode combining characters:
>
>     (reverse "nai\u0308ve")
>     "ev̈ian"
>
> Note the diaeresis is now on the v and not the i.  s-reverse gets it right:
>
>     (s-reverse "nai\u0308ve")
>     "evïan"

So I wondered what s-reverse did, and indeed:

(defun s-reverse (s)
  "Return the reverse of S."
  (declare (pure t) (side-effect-free t))
  (save-match-data
    (if (multibyte-string-p s)
        (let ((input (string-to-list s))
              output)
          (require 'ucs-normalize)
          (while input
            ;; Handle entire grapheme cluster as a single unit
            (let ((grapheme (list (pop input))))
              (while (memql (car input) ucs-normalize-combining-chars)
                (push (pop input) grapheme))
              (setq output (nconc (nreverse grapheme) output))))
          (concat output))
      (concat (nreverse (string-to-list s))))))
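
To make the difference concrete, here is a minimal sketch that only inspects code points. It assumes nothing beyond built-in `reverse' and `string-to-list'; the values in the comments are the code points of "nai\u0308ve", where 776 is U+0308 COMBINING DIAERESIS.

```elisp
;; Plain `reverse' works code point by code point, so the combining
;; diaeresis (776) ends up after the "v" (118) instead of the "i" (105).
(let ((s "nai\u0308ve"))
  (list (string-to-list s)              ; (110 97 105 776 118 101)
        (string-to-list (reverse s))))  ; (101 118 776 105 97 110)
```

The s-reverse definition above instead keeps each base character together with the combining characters that follow it, so for this input the 105 776 pair stays in that order in the reversed result.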

Emacs has string-reverse, obsolete since 25.1.  Perhaps we should
reintroduce it and use the definition from s?
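
For illustration only, a cluster-aware reverse could also be sketched without ucs-normalize. The helper name `my-reverse-by-clusters' is hypothetical; it is not the s.el definition and not anything in Emacs. It swaps in a different test for "combining": it treats any character whose Unicode general category is Mn, Mc or Me as belonging to the preceding base character. That is an approximation rather than full grapheme cluster segmentation (it ignores ZWJ sequences, Hangul jamo, and so on).

```elisp
;; Hypothetical helper, not part of Emacs or s.el.
(defun my-reverse-by-clusters (s)
  "Return S reversed, keeping combining marks after their base character."
  (let ((input (string-to-list s))
        output)
    (while input
      ;; Start a cluster with the next character, then absorb any
      ;; following marks (general category Mn, Mc or Me).
      (let ((cluster (list (pop input))))
        (while (and input
                    (memq (get-char-code-property (car input)
                                                  'general-category)
                          '(Mn Mc Me)))
          (push (pop input) cluster))
        ;; CLUSTER was built back to front; restore its order and put
        ;; it in front of what has been collected so far.
        (setq output (nconc (nreverse cluster) output))))
    (concat output)))

(my-reverse-by-clusters "nai\u0308ve")  ; => "evi\u0308an"
```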

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




This bug report was last modified 3 years and 271 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.