From unknown Wed Jun 18 22:49:22 2025
From: bug#26533 <26533@debbugs.gnu.org>
To: bug#26533 <26533@debbugs.gnu.org>
Subject: Status: 26.0.50; xml-parse-region's symbol-qname argument is ignored
Reply-To: bug#26533 <26533@debbugs.gnu.org>
Date: Thu, 19 Jun 2025 05:49:22 +0000

retitle 26533 26.0.50; xml-parse-region's symbol-qname argument is ignored
reassign 26533 emacs
submitter 26533 Christopher Wellons
severity 26533 normal
thanks

From debbugs-submit-bounces@debbugs.gnu.org Sun Apr 16 08:48:12 2017
From: Christopher Wellons
To: bug-gnu-emacs@gnu.org
Subject: 26.0.50; xml-parse-region's symbol-qname argument is ignored
Date: Sun, 16 Apr 2017 08:36:07 -0400
Message-ID: <87fuh8bkco.fsf@tengu.zeus.nullprogram.com>

A bug was introduced in aea67018 that causes the special "symbol-qnames"
value for PARSE-NS to be ignored, as if it were nil. This information is
discarded by the change to xml-parse-attlist, so functions further down
the line see the argument as if it was set to nil.

Here's an example of the bug:

    (with-temp-buffer
      (insert "<root a:b='c'/>")
      (let ((xml-default-ns ()))
        (xml-parse-region nil nil nil nil 'symbol-qnames)))

Prior to this commit (Emacs 25.1 and earlier) the result is:

    ((root ((b . "c"))))

After this commit:

    ((root ((a:b . "c"))))

This is the same as PARSE-NS being set to nil.

From debbugs-submit-bounces@debbugs.gnu.org Mon Apr 17 11:33:14 2017
From: David Engster
To: Christopher Wellons
Cc: 26533-done@debbugs.gnu.org
Subject: Re: bug#26533: 26.0.50; xml-parse-region's symbol-qname argument is ignored
In-Reply-To: <87fuh8bkco.fsf@tengu.zeus.nullprogram.com> (Christopher Wellons's message of "Sun, 16 Apr 2017 08:36:07 -0400")
References: <87fuh8bkco.fsf@tengu.zeus.nullprogram.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)
Date: Mon, 17 Apr 2017 17:33:07 +0200
Message-ID: <874lxncamk.fsf@engster.org>

Christopher Wellons writes:

> A bug was introduced in aea67018 that causes the special "symbol-qnames"
> value for PARSE-NS to be ignored, as if it were nil. This information is
> discarded by the change to xml-parse-attlist, so functions further down
> the line see the argument as if it was set to nil.
>
> Here's an example of the bug:
>
>     (with-temp-buffer
>       (insert "<root a:b='c'/>")
>       (let ((xml-default-ns ()))
>         (xml-parse-region nil nil nil nil 'symbol-qnames)))
>
> Prior to this commit (Emacs 25.1 and earlier) the result is:
>
>     ((root ((b . "c"))))
>
> After this commit:
>
>     ((root ((a:b . "c"))))
>
> This is the same as PARSE-NS being set to nil.

Thanks for the report. You are right that the fix for bug #23440 was not
correct. I now pushed a hopefully better version to master.

Note however that your test above has two problems: First, it's invalid
XML since you're using an undeclared prefix (so the parser should rather
throw an error, but I'm not eager to make the xml parser more strict, as
there's a lot of invalid XML in the wild). Second, I don't understand
why you let-bind `xml-default-ns' to nil. This will break namespace
expansion, and it will actually do this for the whole Emacs session if
xml.el gets autoloaded during the above.

-David

From debbugs-submit-bounces@debbugs.gnu.org Mon Apr 17 12:29:20 2017
From: Christopher Wellons
To: David Engster
Cc: 26533-done@debbugs.gnu.org
Subject: Re: bug#26533: 26.0.50; xml-parse-region's symbol-qname argument is ignored
In-Reply-To: <874lxncamk.fsf@engster.org>
References: <87fuh8bkco.fsf@tengu.zeus.nullprogram.com> <874lxncamk.fsf@engster.org>
Date: Mon, 17 Apr 2017 12:29:15 -0400
Message-ID: <87efwr576s.fsf@wellocc1-ares.jhuapl.edu>

Thanks, David! Your fix works fine as far as I can tell.

I'm using this trick in Elfeed (a syndication feed reader) as a fast
method to strip all namespaces from the XML as it's being parsed. As you
said, there's a lot of invalid XML in the wild. I've found it works a
lot better to ignore namespaces and strictness, instead extracting the
required information heuristically as long as it's reasonably close.
Otherwise there would be a whole lot more feeds that wouldn't work well,
or at all, in Elfeed.

I had noticed with symbol-qnames that xml-parse-region drops unknown
namespaces. Since this information comes from an alist, that seemed like
reasonable behavior and I assumed it was intentional -- though signaling
an error would also be reasonable. To tightly control which namespaces
are stripped, I bind xml-default-ns to my own alist for that call. This
feels like the natural and lispy way to use this function.

The file that binds xml-default-ns requires the xml package explicitly,
so there's no risk of it autoloading while it's bound. Though that's an
interesting consequence I hadn't considered before. I _have_ seen
similar issues with accept-process-output when arbitrary process events
are handled while the stack is in an unusual state.

From unknown Wed Jun 18 22:49:22 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Tue, 16 May 2017 11:24:04 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator
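
One way to check for the regression discussed in this thread is to parse
the same input twice, once with PARSE-NS set to nil and once with
`symbol-qnames', and compare the results. The sketch below is only an
illustration under stated assumptions: the helper name
`my/parse-sample' is hypothetical, and the sample document is inferred
from the results quoted in the report rather than taken from xml.el's
own tests. As in the last reply, xml is required explicitly before
`xml-default-ns' is let-bound.

```elisp
;; Load xml.el up front so it cannot be autoloaded while
;; `xml-default-ns' is let-bound below.
(require 'xml)

(defun my/parse-sample (parse-ns)
  "Parse a one-element sample document, passing PARSE-NS through.
Hypothetical helper; the sample input is an assumption."
  (with-temp-buffer
    ;; Assumed input: one attribute carrying the undeclared prefix "a".
    (insert "<root a:b='c'/>")
    ;; Mirror the report: no default namespaces for this call only.
    (let ((xml-default-ns ()))
      (xml-parse-region (point-min) (point-max) nil nil parse-ns))))

;; If the two parses are `equal', `symbol-qnames' is being treated
;; the same as nil, which is the behavior described in the report.
(equal (my/parse-sample nil)
       (my/parse-sample 'symbol-qnames))
```

Going by the results quoted in the report, this comparison should be
non-nil on a build with the regression and nil on Emacs 25.1 or on a
build that includes the fix.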