GNU bug report logs - #70076
28.3; xml-escape-string parse issue


Package: emacs;

Reported by: "D. Schmudde" <d <at> schmud.de>

Date: Fri, 29 Mar 2024 16:03:04 UTC

Severity: normal

Tags: notabug

Found in version 28.3

Done: Stefan Kangas <stefankangas <at> gmail.com>

Bug is archived. No further changes may be made.

From: "D. Schmudde" <d <at> schmud.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: public <at> protesilaos.com, 70076 <at> debbugs.gnu.org
Subject: bug#70076: 28.3; xml-escape-string parse issue
Date: Sun, 31 Mar 2024 13:15:29 +0200
Okay, good to know. Thanks for taking a look.

Here is some additional context. The error occurs when I run Elfeed's
~elfeed-export-opml~ on my list of RSS feeds. It seems the library
relies on ~xml-escape-string~ to escape each element. It's worth
noting that this happens with several feeds, not just the
leancrew.com feed shown below.

I can file a bug with the package maintainers, but I wasn't sure
whether the XML parser was a better place to start. Here is the
specific backtrace, in case it's useful:

Debugger entered--Lisp error: (xml-invalid-character 4194274 11)
 signal(xml-invalid-character (4194274 11))
 xml-escape-string("And now it\342\200\231s all this")
 xml-debug-print-internal((outline ((xmlUrl . "https://leancrew.com/all-this/feed/") (title . "And now it\342\200\231s all this"))) "    ")
 ...

/David

Eli Zaretskii <eliz <at> gnu.org> writes:

>> Cc: Protesilaos Stavrou <public <at> protesilaos.com>
>> From: "D. Schmudde" <d <at> schmud.de>
>> Date: Fri, 29 Mar 2024 16:44:48 +0100
>>
>> Starting with `emacs -Q`:
>>
>> (require 'xml)
>> (xml-escape-string "And now it\342\200\231s all this")
>>
>> The result is: `xml-escape-string: Invalid XML character: 4194274, 11`
>>
>> I expect that the string will parse correctly with these escape
>> characters. Or is this expectation wrong?
>
> Your expectation is wrong, AFAIU: you are inserting a unibyte string
> (a string made out of raw bytes) instead of inserting a non-ASCII
> multibyte string, which is what XML expects.
>
> Why did you need to insert those bytes, and where did they come from?
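
A minimal sketch of one way around the error, assuming the raw bytes are UTF-8 (\342\200\231 is the UTF-8 encoding of U+2019, the right single quotation mark). The helper name ~my/xml-escape-utf8-bytes~ is hypothetical and not part of xml.el or Elfeed; it decodes a unibyte string into a multibyte one before handing it to ~xml-escape-string~:

```elisp
(require 'xml)

;; Hypothetical helper, for illustration only; not part of xml.el or Elfeed.
(defun my/xml-escape-utf8-bytes (string)
  "Escape STRING for XML, decoding it as UTF-8 first if it is unibyte.
Assumes that any raw bytes in STRING are UTF-8 encoded text."
  (xml-escape-string
   (if (multibyte-string-p string)
       ;; Already a multibyte (character) string: pass it through as-is.
       string
     ;; Unibyte (raw byte) string: turn the bytes into characters.
     (decode-coding-string string 'utf-8))))

;; The literal from the report is unibyte because it contains only
;; ASCII characters and octal escapes.
(my/xml-escape-utf8-bytes "And now it\342\200\231s all this")
```

With the decoded string, ~xml-escape-string~ sees the single character U+2019 rather than three raw bytes, so it should no longer signal ~xml-invalid-character~. Where the decoding belongs (in Elfeed, or in whatever stored the title as bytes) is a separate question that this sketch does not answer.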


--
w: http://schmud.de
e: d <at> schmud.de
t: @dschmudde




This bug report was last modified 1 year and 22 days ago.