GNU bug report logs -
#78690
31.0.50; split string: args out of range with TRIM
Previous Next
Full log
View this message in rfc822 format
> Cc: 78690 <at> debbugs.gnu.org
> Date: Fri, 06 Jun 2025 10:10:25 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
>
> > From: Michael Heerdegen <michael_heerdegen <at> web.de>
> > Cc: 78690 <at> debbugs.gnu.org
> > Date: Fri, 06 Jun 2025 03:53:47 +0200
> >
> > Eli Zaretskii <eliz <at> gnu.org> writes:
> >
> > > It is quite obvious that split-string is not prepared to deal with a
> > > situation where the argument STRING begins with a match for
> > > SEPARATORS. The breakage here happens because the match for
> > > SEPARATORS at the very beginning of STRING also matches TRIM, but even
> > > if that is not so, a match for SEPARATORS at the beginning of STRING
> > > sets THIS-START incorrectly for the first call to push-one inside the
> > > while-loop.
> >
> > I read that as "confirmed, a bug". Ok, thanks for the analysis.
>
> It's more than that: I'm working on this bug. It just takes time to
> unlock all the subtleties of the implementation and understand how to
> fix it in a most economical and safe way. I've just succeeded to
> understand what was the root cause when I ran out of time.
>
> The interim analysis was intended to attract others to the problem and
> perhaps nudge someone to work out a solution. Also to serve a
> reminder to myself when I get to look at this next time.
The patch below seems to fix the problem, and passes all the tests.
Does it look reasonable?
I took the opportunity to also fix the (dangerous, IMO) calls to
match-beginning in the 'while' condition, since I couldn't be certain
it cannot be affected by the calls to string-match inside push-one.
diff --git a/lisp/subr.el b/lisp/subr.el
index 729f8b3..41aaadf 100644
--- a/lisp/subr.el
+++ b/lisp/subr.el
@@ -5785,7 +5785,9 @@ split-string
(start 0)
this-start this-end
notfirst
+ match-beg
(list nil)
+ (strlen (length string))
(push-one
;; Push the substring in range THIS-START to THIS-END
;; onto LIST, trimming it and perhaps discarding it.
@@ -5794,6 +5796,7 @@ split-string
;; Discard the trim from start of this substring.
(let ((tem (string-match trim string this-start)))
(and (eq tem this-start)
+ (not (eq this-start this-end))
(setq this-start (match-end 0)))))
(when (or keep-nulls (< this-start this-end))
@@ -5811,18 +5814,22 @@ split-string
(while (and (string-match rexp string
(if (and notfirst
- (= start (match-beginning 0))
- (< start (length string)))
+ (= start match-beg)
+ (< start strlen))
(1+ start) start))
- (< start (length string)))
- (setq notfirst t)
- (setq this-start start this-end (match-beginning 0)
- start (match-end 0))
+ (< start strlen))
+ (setq notfirst t
+ match-beg (match-beginning 0))
+ (if (= start match-beg)
+ (setq this-start (match-end 0)
+ this-end this-start)
+ (setq this-start start this-end match-beg))
+ (setq start (match-end 0))
(funcall push-one))
;; Handle the substring at the end of STRING.
- (setq this-start start this-end (length string))
+ (setq this-start start this-end strlen)
(funcall push-one)
(nreverse list)))
diff --git a/test/lisp/subr-tests.el b/test/lisp/subr-tests.el
index 024cbe8..2e8cbec 100644
--- a/test/lisp/subr-tests.el
+++ b/test/lisp/subr-tests.el
@@ -1505,5 +1505,14 @@ hash-table-contains-p
(should (hash-table-contains-p 'cookie h))
(should (hash-table-contains-p 'milk h))))
+(ert-deftest subr-test-split-string ()
+ (let ((text "-*- lexical-binding: t; -*-")
+ (seps "-\\*-")
+ (trim "[ \t\n\r-]+"))
+ (should (equal (split-string text seps nil trim)
+ '("" "lexical-binding: t;" "")))
+ (should (equal (split-string text seps t trim)
+ '("lexical-binding: t;")))))
+
(provide 'subr-tests)
;;; subr-tests.el ends here
This bug report was last modified 21 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.