GNU bug report logs - #78690
31.0.50; split string: args out of range with TRIM

Previous Next

Package: emacs;

Reported by: Michael Heerdegen <michael_heerdegen <at> web.de>

Date: Wed, 4 Jun 2025 02:35:02 UTC

Severity: normal

Found in version 31.0.50

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: Eli Zaretskii <eliz <at> gnu.org>
To: michael_heerdegen <at> web.de
Cc: 78690 <at> debbugs.gnu.org
Subject: bug#78690: 31.0.50; split string: args out of range with TRIM
Date: Sat, 07 Jun 2025 22:32:15 +0300
> Cc: 78690 <at> debbugs.gnu.org
> Date: Fri, 06 Jun 2025 10:10:25 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> 
> > From: Michael Heerdegen <michael_heerdegen <at> web.de>
> > Cc: 78690 <at> debbugs.gnu.org
> > Date: Fri, 06 Jun 2025 03:53:47 +0200
> > 
> > Eli Zaretskii <eliz <at> gnu.org> writes:
> > 
> > > It is quite obvious that split-string is not prepared to deal with a
> > > situation where the argument STRING begins with a match for
> > > SEPARATORS.  The breakage here happens because the match for
> > > SEPARATORS at the very beginning of STRING also matches TRIM, but even
> > > if that is not so, a match for SEPARATORS at the beginning of STRING
> > > sets THIS-START incorrectly for the first call to push-one inside the
> > > while-loop.
> > 
> > I read that as "confirmed, a bug".  Ok, thanks for the analysis.
> 
> It's more than that: I'm working on this bug.  It just takes time to
> unlock all the subtleties of the implementation and understand how to
> fix it in a most economical and safe way.  I've just succeeded to
> understand what was the root cause when I ran out of time.
> 
> The interim analysis was intended to attract others to the problem and
> perhaps nudge someone to work out a solution.  Also to serve a
> reminder to myself when I get to look at this next time.

The patch below seems to fix the problem, and passes all the tests.
Does it look reasonable?

I took the opportunity to also fix the (dangerous, IMO) calls to
match-beginning in the 'while' condition, since I couldn't be certain
it cannot be affected by the calls to string-match inside push-one.

diff --git a/lisp/subr.el b/lisp/subr.el
index 729f8b3..41aaadf 100644
--- a/lisp/subr.el
+++ b/lisp/subr.el
@@ -5785,7 +5785,9 @@ split-string
 	 (start 0)
 	 this-start this-end
 	 notfirst
+         match-beg
 	 (list nil)
+         (strlen (length string))
 	 (push-one
 	  ;; Push the substring in range THIS-START to THIS-END
 	  ;; onto LIST, trimming it and perhaps discarding it.
@@ -5794,6 +5796,7 @@ split-string
 	      ;; Discard the trim from start of this substring.
 	      (let ((tem (string-match trim string this-start)))
 		(and (eq tem this-start)
+                     (not (eq this-start this-end))
 		     (setq this-start (match-end 0)))))
 
 	    (when (or keep-nulls (< this-start this-end))
@@ -5811,18 +5814,22 @@ split-string
 
     (while (and (string-match rexp string
 			      (if (and notfirst
-				       (= start (match-beginning 0))
-				       (< start (length string)))
+				       (= start match-beg)
+				       (< start strlen))
 				  (1+ start) start))
-		(< start (length string)))
-      (setq notfirst t)
-      (setq this-start start this-end (match-beginning 0)
-	    start (match-end 0))
+		(< start strlen))
+      (setq notfirst t
+            match-beg (match-beginning 0))
+      (if (= start match-beg)
+          (setq this-start (match-end 0)
+                this-end this-start)
+        (setq this-start start this-end match-beg))
+      (setq start (match-end 0))
 
       (funcall push-one))
 
     ;; Handle the substring at the end of STRING.
-    (setq this-start start this-end (length string))
+    (setq this-start start this-end strlen)
     (funcall push-one)
 
     (nreverse list)))
diff --git a/test/lisp/subr-tests.el b/test/lisp/subr-tests.el
index 024cbe8..2e8cbec 100644
--- a/test/lisp/subr-tests.el
+++ b/test/lisp/subr-tests.el
@@ -1505,5 +1505,14 @@ hash-table-contains-p
     (should (hash-table-contains-p 'cookie h))
     (should (hash-table-contains-p 'milk h))))
 
+(ert-deftest subr-test-split-string ()
+  (let ((text "-*- lexical-binding: t; -*-")
+        (seps "-\\*-")
+        (trim "[ \t\n\r-]+"))
+    (should (equal (split-string text seps nil trim)
+                   '("" "lexical-binding: t;" "")))
+    (should (equal (split-string text seps t trim)
+                   '("lexical-binding: t;")))))
+
 (provide 'subr-tests)
 ;;; subr-tests.el ends here




This bug report was last modified 21 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.