From unknown Sat Sep 06 05:55:22 2025 X-Loop: help-debbugs@gnu.org Subject: bug#71685: [PATCH] fix shr rendering in tables without tbody Resent-From: JD Smith Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Thu, 20 Jun 2024 19:16:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 71685 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: 71685@debbugs.gnu.org X-Debbugs-Original-To: bug-gnu-emacs@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.17189109544921 (code B ref -1); Thu, 20 Jun 2024 19:16:01 +0000 Received: (at submit) by debbugs.gnu.org; 20 Jun 2024 19:15:54 +0000 Received: from localhost ([127.0.0.1]:41037 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1sKNGE-0001HI-6c for submit@debbugs.gnu.org; Thu, 20 Jun 2024 15:15:54 -0400 Received: from lists.gnu.org ([209.51.188.17]:40796) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1sKNGD-0001H9-0b for submit@debbugs.gnu.org; Thu, 20 Jun 2024 15:15:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sKNG8-0002CZ-K0 for bug-gnu-emacs@gnu.org; Thu, 20 Jun 2024 15:15:48 -0400 Received: from mail-qk1-x72a.google.com ([2607:f8b0:4864:20::72a]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sKNG7-0002bB-4t for bug-gnu-emacs@gnu.org; Thu, 20 Jun 2024 15:15:48 -0400 Received: by mail-qk1-x72a.google.com with SMTP id af79cd13be357-7955ddc6516so82984685a.1 for ; Thu, 20 Jun 2024 12:15:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1718910945; x=1719515745; darn=gnu.org; h=to:date:message-id:subject:mime-version:from:from:to:cc:subject :date:message-id:reply-to; bh=qLKJymAKFAeuxDGBj4ZYmy+ySmIBlVUs0B4f1zKW2Fw=; 
b=PKns+vHEX07yuFZlM16XzzS237qmRu1WEuej7cA8SbdPyMPeJedlfRMWVkNYnuC75P 7kRemp+DeXAvN0SOp7+KlYOavCeb4VcKMHgnb588Q/yjZuVnb+WDehVFggyXxw18cr7V K5ZDGcBH8Yodk9+flAcKEcwu+iZjtSlT/grN8ZW3rj6+E/DY+RcJxNSg8HsvhDh3+vQj 2ZsA4cuW8uggLmNZFE0BfjlExc3dWX7cHPxAoBBWgVxuPVVs7z7uh93FVX7RCMr0x3Tv 2nYbNnuDmBx/BIPKxdkQeV1S5iFwPbklIDlB8onRGfTmZVEpiNnssuSzJfouC9tgLMqI lGVQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1718910945; x=1719515745; h=to:date:message-id:subject:mime-version:from:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=qLKJymAKFAeuxDGBj4ZYmy+ySmIBlVUs0B4f1zKW2Fw=; b=OsQQk0FA/bTqa+jywvaY94rQGsoswrErqvFCPulyL6Mw4uBieTcIQO5a1DD/mo/3Vh 6o593TQAjRYPlirP7nSxwJeVblsU7gLlr0RI2DEAtxu8ukeC3XpI4OioPJhOymQL5LoN 59V6o+u7hWDD5dRnM2PGmbE56TUEO1SRrvveBGhJnS1qg1aUA3GDXk0dT7yVPlL0B/ef yLVKdfO89D6trmsyyMVdWLlVqU6mxht0GIpzJJbBTtGIFbHwqAkpml6esZi+Ag5WZ5R0 x81yqkCT5sRLtdDIUymXMXblkbK1UryJ+iEbAk7leXi4mZUwwhTtlkiJ7p2dzF86zxfn Xxcw== X-Gm-Message-State: AOJu0Yx0vGOKC08IK8Wpm7cPuu+1kVlNkvZFMWhlsMsE8hBkhz5NSVwY czO15OW3GLLsLB3Q3XDyLk04d1NWt/ACzYC9evyYIYAU9lZTp9v6/tuwUw== X-Google-Smtp-Source: AGHT+IE084ZCpSG/uzN4U5xhpw3obYrvMaJ/ZlxYayluXUgWPGD0YjxudM2j5spRUJDkP9XePoQIQw== X-Received: by 2002:a05:620a:440c:b0:795:5dab:9f26 with SMTP id af79cd13be357-79bb3e3e77bmr628946285a.28.1718910944704; Thu, 20 Jun 2024 12:15:44 -0700 (PDT) Received: from smtpclient.apple ([131.183.131.33]) by smtp.gmail.com with ESMTPSA id af79cd13be357-79bce8c3574sm2582985a.69.2024.06.20.12.15.43 for (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Thu, 20 Jun 2024 12:15:43 -0700 (PDT) From: JD Smith Content-Type: multipart/mixed; boundary="Apple-Mail=_F340A71F-91F4-4FA3-AA30-73E276B59B1F" Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3774.600.62\)) Message-Id: <65EAF7A2-697E-46E2-998C-AB99CC54E89C@gmail.com> Date: Thu, 20 Jun 2024 15:15:32 -0400 X-Mailer: Apple Mail (2.3774.600.62) Received-SPF: pass client-ip=2607:f8b0:4864:20::72a; 
envelope-from=jdtsmith@gmail.com; helo=mail-qk1-x72a.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Score: -1.3 (-) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.3 (--) --Apple-Mail=_F340A71F-91F4-4FA3-AA30-73E276B59B1F Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii

It is very common for HTML tables to include a header (<thead>) and/or footer (<tfoot>) without using <tbody>. Modern browsers simply supply an implicit <tbody>..</tbody> around all the unparented rows in a table. `shr' does not handle this common case correctly. Below is an example with <thead> but not <tbody>. It prints the header, but then subsumes it again inside the derived body, printing the header again in a single cell.

The relevant function which should handle this is `shr--fix-tbody'. The included patch to this function simply avoids including `thead` and `tfoot` children in the implicit body.

(let ((shr-table-vertical-line ?|)
      (shr-table-horizontal-line ?-))
  (shr-insert-document
   (with-temp-buffer
     (insert "
AB
12
34
") (libxml-parse-html-region)))) --------- =20 | --- -- | ||A |B | | | --- -- | ||AB | | | --- -- | ||1 |2 | | | --- -- | ||3 |4 | | | --- -- | --------- =20 --Apple-Mail=_F340A71F-91F4-4FA3-AA30-73E276B59B1F Content-Disposition: attachment; filename=shr_fix_tbody.patch Content-Type: application/octet-stream; x-unix-mode=0644; name="shr_fix_tbody.patch" Content-Transfer-Encoding: 7bit --- shr.el 2024-06-20 15:03:52 +++ shr_fix_tbody.el 2024-06-20 15:00:49 @@ -2053,8 +2053,9 @@ (defun shr--fix-tbody (tbody) (nconc (list 'tbody (dom-attributes tbody)) (cl-loop for child in (dom-children tbody) - collect (if (or (stringp child) - (not (eq (dom-tag child) 'tr))) + for tag = (and (not (stringp child)) (dom-tag child)) + unless (or (eq tag 'thead) (eq tag 'tfoot)) + collect (if (not (eq tag 'tr)) (list 'tr nil (list 'td nil child)) child)))) --Apple-Mail=_F340A71F-91F4-4FA3-AA30-73E276B59B1F-- From unknown Sat Sep 06 05:55:22 2025 X-Loop: help-debbugs@gnu.org Subject: bug#71685: [PATCH] fix shr rendering in tables without tbody Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 06 Jul 2024 07:37:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 71685 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: JD Smith Cc: 71685@debbugs.gnu.org Received: via spool by 71685-submit@debbugs.gnu.org id=B71685.172025141520162 (code B ref 71685); Sat, 06 Jul 2024 07:37:01 +0000 Received: (at 71685) by debbugs.gnu.org; 6 Jul 2024 07:36:55 +0000 Received: from localhost ([127.0.0.1]:45486 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1sPzyZ-0005F8-1Y for submit@debbugs.gnu.org; Sat, 06 Jul 2024 03:36:55 -0400 Received: from eggs.gnu.org ([209.51.188.92]:59858) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1sPzyY-0005Eu-69 for 71685@debbugs.gnu.org; Sat, 06 Jul 2024 03:36:54 -0400 Received: from fencepost.gnu.org 
([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sPzyP-0001pO-CR; Sat, 06 Jul 2024 03:36:45 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=YE3MkwEAK37j70ogMZ9LmRlHHMxQPVMmybuM/hhIGsk=; b=XiN9J+LX2lG+ YtJx5sQdBpFl3sA0HWPaYgkeWo6SRiodlNydkEcyhmNnXwlmQU3NkE/LyUtv6xTY6q7vJ+r1c/iZR VuJMkHBqVq0CuiTaiYOi9S/4rrHctjzNcBjDQaN76pickIVq3QxAc4gB7gHR/IXStqClENamrwWBi EgIvsVoDgXI7+EfwBirXbop4/7szjvzXBVeresx/dF0JvQ9g3hlb9KYuvngFtR0FWFANEK3evhM3c 2wOmNSMqc6Ur9yFY2u1wgZoShWKpMLh/owfaM5XZDuXhc0c617TDjPvATH382koilJIZuTdtdKSYK aP/tbnahGvjUqKP9jRIr2w==; Date: Sat, 06 Jul 2024 10:36:31 +0300 Message-Id: <86h6d355vk.fsf@gnu.org> From: Eli Zaretskii In-Reply-To: <65EAF7A2-697E-46E2-998C-AB99CC54E89C@gmail.com> (message from JD Smith on Thu, 20 Jun 2024 15:15:32 -0400) References: <65EAF7A2-697E-46E2-998C-AB99CC54E89C@gmail.com> X-Spam-Score: -2.3 (--) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---)

> From: JD Smith
> Date: Thu, 20 Jun 2024 15:15:32 -0400
>
> It is very common for HTML tables to include a header (<thead>) and/or footer (<tfoot>) without using <tbody>. Modern browsers simply supply an implicit <tbody>..</tbody> around all the unparented rows in a table. `shr' does not handle this common case correctly. Below is an example with <thead> but not <tbody>. It prints the header, but then subsumes it again inside the derived body, printing the header again in a single cell.
>
> The relevant function which should handle this is `shr--fix-tbody'. The included patch to this function simply avoids including `thead` and `tfoot` children in the implicit body.

Thanks.
I don't see any experts chiming in, so if you are confident in the patch, and it doesn't fail the existing tests, please install on the emacs-30 branch, and thanks. Bonus points for adding a test for this case. From unknown Sat Sep 06 05:55:22 2025 X-Loop: help-debbugs@gnu.org Subject: bug#71685: [PATCH] fix shr rendering in tables without tbody Resent-From: JD Smith Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 06 Jul 2024 18:15:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 71685 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Eli Zaretskii Cc: 71685@debbugs.gnu.org Received: via spool by 71685-submit@debbugs.gnu.org id=B71685.172028969432340 (code B ref 71685); Sat, 06 Jul 2024 18:15:01 +0000 Received: (at 71685) by debbugs.gnu.org; 6 Jul 2024 18:14:54 +0000 Received: from localhost ([127.0.0.1]:46731 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1sQ9vx-0008PY-Ik for submit@debbugs.gnu.org; Sat, 06 Jul 2024 14:14:53 -0400 Received: from mail-qv1-f45.google.com ([209.85.219.45]:47592) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1sQ9vw-0008PK-1F for 71685@debbugs.gnu.org; Sat, 06 Jul 2024 14:14:52 -0400 Received: by mail-qv1-f45.google.com with SMTP id 6a1803df08f44-6b5f46191b5so10277086d6.3 for <71685@debbugs.gnu.org>; Sat, 06 Jul 2024 11:14:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1720289623; x=1720894423; darn=debbugs.gnu.org; h=references:to:cc:in-reply-to:date:subject:mime-version:message-id :from:from:to:cc:subject:date:message-id:reply-to; bh=NZbVaGzqg6Zg62PTTuRfYVwxQcbA37EnSjwgWFPG46M=; b=UeBgyBW/01/akpKNvZJ3qYcglRdoEmgZJQd1qb0hBDoiUHl0tdAzVepg89WxHxn/zT ZFkmx3ZglPGATugWKw9whwJaPqxOpG7YgHYSbnRIOmlSezqsmGKD84jpnzhlveISSjSZ snObPhs+JdxfxcE8HhT1ry2JrpdV1ugCTqkOkk9Aj/9dxD7DopO04IhMeGnMbD2qzZR8 BB/dcL4TAnlpGqDiw/RyV7d+OuO+F9yIS4dzGl9SGZIULJRsXqta9V3Bp0uOxGFpatN6 
Dfws4NAwzwZ6aQpe51Gg6Lb4etiMsiExSsf+fRlnFzP3EsCoKS4ieL4I/IoNGUGF2CK8 pJyw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1720289623; x=1720894423; h=references:to:cc:in-reply-to:date:subject:mime-version:message-id :from:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=NZbVaGzqg6Zg62PTTuRfYVwxQcbA37EnSjwgWFPG46M=; b=SjlQ11PpXi8pAmvkYCB/w6bppaJ/sTKeyqfl8+dCmR1sMrY5YFu6lKpPzHsH8JDIgK kACtMlLG3XUCntpXglvDXiyAXNma/WCj+JjKS+fffiw/Ev74kykg2H/SE7WfhrRakaMC NzoAy9wBlTEuZl6GFixYLeGjbwkePSw7dq7cw2Ojaqu+dTzKW7OYnZZyQ3T5lU5qrlc5 OJo7cil5/YYDXEW8XF34MzKXipfT+zXogzhc7QNTkYTMjIorYwFQe4GqqNBB/B0b32mf WoFp6XLmoLdefBP6Oyyt3uksIeknnkMp+G6Mjuhua6aABLTqsLi653c5rmyZJ5TupU+A +vWg== X-Gm-Message-State: AOJu0YyGK+Q4yJLTz9eYoTBzgi7rMq4b2kuZRJRMce6vD65OsfKLogid ClYr4xupSEipa4x8zsE9drD/F4+UZLGqzAfmhjYOhODeRhJQBFKO X-Google-Smtp-Source: AGHT+IG0UkeJO7W5Fspq5IsMxoawNAtDuRmsHCxJDAKKL4GvaeRY1KX5QRAdlzDWuRGD9bUkDFCvzA== X-Received: by 2002:a05:6214:19cc:b0:6b5:dea6:f376 with SMTP id 6a1803df08f44-6b5ed0873a5mr116296746d6.61.1720289622553; Sat, 06 Jul 2024 11:13:42 -0700 (PDT) Received: from smtpclient.apple (cm-24-53-187-34.buckeyecom.net. 
[24.53.187.34]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6b5ee6f0c27sm25967836d6.72.2024.07.06.11.13.41 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Sat, 06 Jul 2024 11:13:41 -0700 (PDT) From: JD Smith Message-Id: Content-Type: multipart/mixed; boundary="Apple-Mail=_EDB2603C-B276-45FC-9781-E2A4C9E78D9E" Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3774.600.62\)) Date: Sat, 6 Jul 2024 14:13:30 -0400 In-Reply-To: <86h6d355vk.fsf@gnu.org> References: <65EAF7A2-697E-46E2-998C-AB99CC54E89C@gmail.com> <86h6d355vk.fsf@gnu.org> X-Mailer: Apple Mail (2.3774.600.62) X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) --Apple-Mail=_EDB2603C-B276-45FC-9781-E2A4C9E78D9E Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8

> On Jul 6, 2024, at 3:36 AM, Eli Zaretskii wrote:
>
>> From: JD Smith
>> Date: Thu, 20 Jun 2024 15:15:32 -0400
>>
>> It is very common for HTML tables to include a header (<thead>) and/or footer (<tfoot>) without using <tbody>. Modern browsers simply supply an implicit <tbody>..</tbody> around all the unparented rows in a table. `shr' does not handle this common case correctly. Below is an example with <thead> but not <tbody>. It prints the header, but then subsumes it again inside the derived body, printing the header again in a single cell.
>>
>> The relevant function which should handle this is `shr--fix-tbody'. The included patch to this function simply avoids including `thead` and `tfoot` children in the implicit body.
>
> Thanks. I don't see any experts chiming in, so if you are confident
> in the patch, and it doesn't fail the existing tests, please install
> on the emacs-30 branch, and thanks. Bonus points for adding a test
> for this case.

Thanks.
I'm afraid I don't have write access on savannah. I've added a test and formatted the patch, below. All shr tests succeed.

--Apple-Mail=_EDB2603C-B276-45FC-9781-E2A4C9E78D9E Content-Disposition: attachment; filename=0001-Fix-formatting-of-tables-with-thead-tfoot-but-no-tbo.patch Content-Type: application/octet-stream; x-unix-mode=0644; name="0001-Fix-formatting-of-tables-with-thead-tfoot-but-no-tbo.patch" Content-Transfer-Encoding: quoted-printable

From 623ecf07dc1b215cbc98f5804d58b571a649e9ba Mon Sep 17 00:00:00 2001
From: JD Smith <93749+jdtsmith@users.noreply.github.com>
Date: Sat, 6 Jul 2024 09:22:33 -0400
Subject: [PATCH] Fix formatting of tables with thead/tfoot but no tbody

Correctly handle formatting of tables containing thead and/or tfoot, but
without any tbody, to prevent including thead/tfoot content twice within
the table's derived body (Bug#71685).
* lisp/net.shr.el (shr--fix-tbody): Omit thead/tfoot from implicit body
* test/lisp/net/shr-resources/table.html:
* test/lisp/net/shr-resources/table.txt:
Added table rendering test.
---
 lisp/net/shr.el                        | 5 +++--
 test/lisp/net/shr-resources/table.html | 7 +++++++
 test/lisp/net/shr-resources/table.txt  | 5 +++++
 3 files changed, 15 insertions(+), 2 deletions(-)
 create mode 100644 test/lisp/net/shr-resources/table.html
 create mode 100644 test/lisp/net/shr-resources/table.txt

diff --git a/lisp/net/shr.el b/lisp/net/shr.el
index 3dadcb9a09b..fb72ea6aa67 100644
--- a/lisp/net/shr.el
+++ b/lisp/net/shr.el
@@ -2261,8 +2261,9 @@ shr-table-body
 (defun shr--fix-tbody (tbody)
   (nconc (list 'tbody (dom-attributes tbody))
          (cl-loop for child in (dom-children tbody)
-                  collect (if (or (stringp child)
-                                  (not (eq (dom-tag child) 'tr)))
+		  for tag = (and (not (stringp child)) (dom-tag child))
+		  unless (or (eq tag 'thead) (eq tag 'tfoot))
+		  collect (if (not (eq tag 'tr))
                               (list 'tr nil (list 'td nil child))
                             child))))
 
diff --git a/test/lisp/net/shr-resources/table.html b/test/lisp/net/shr-resources/table.html
new file mode 100644
index 00000000000..c5e8875ac91
--- /dev/null
+++ b/test/lisp/net/shr-resources/table.html
@@ -0,0 +1,7 @@
+
+
+
+
+5678
+
+
AB
12
34
AB
diff --git a/test/lisp/net/shr-resources/table.txt b/test/lisp/net/shr-resources/table.txt
new file mode 100644
index 00000000000..70939effb63
--- /dev/null
+++ b/test/lisp/net/shr-resources/table.txt
@@ -0,0 +1,5 @@
+  A  B    
+  1  2    
+  3  4    
+  5678     
+  A  B    
-- 
2.43.0

--Apple-Mail=_EDB2603C-B276-45FC-9781-E2A4C9E78D9E--

From unknown Sat Sep 06 05:55:22 2025 MIME-Version: 1.0 X-Mailer: MIME-tools 5.505 (Entity 5.505) X-Loop: help-debbugs@gnu.org From: help-debbugs@gnu.org (GNU bug Tracking System) To: JD Smith Subject: bug#71685: closed (Re: bug#71685: [PATCH] fix shr rendering in tables without tbody) Message-ID: References: <65EAF7A2-697E-46E2-998C-AB99CC54E89C@gmail.com> X-Gnu-PR-Message: they-closed 71685 X-Gnu-PR-Package: emacs X-Gnu-PR-Keywords: patch Reply-To: 71685@debbugs.gnu.org Date: Sat, 06 Jul 2024 19:13:02 +0000 Content-Type: multipart/mixed; boundary="----------=_1720293182-5601-1" This is a multi-part message in MIME format... ------------=_1720293182-5601-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

Your bug report #71685: [PATCH] fix shr rendering in tables without tbody which was filed against the emacs package, has been closed. The explanation is attached below, along with your original report. If you require more details, please reply to 71685@debbugs.gnu.org.
-- 71685: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=71685 GNU Bug Tracking System Contact help-debbugs@gnu.org with problems ------------=_1720293182-5601-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at 71685-done) by debbugs.gnu.org; 6 Jul 2024 19:12:13 +0000 Received: from localhost ([127.0.0.1]:46755 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1sQApR-0001RN-98 for submit@debbugs.gnu.org; Sat, 06 Jul 2024 15:12:13 -0400 Received: from mail-lj1-f171.google.com ([209.85.208.171]:58830) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1sQApP-0001R9-W5 for 71685-done@debbugs.gnu.org; Sat, 06 Jul 2024 15:12:12 -0400 Received: by mail-lj1-f171.google.com with SMTP id 38308e7fff4ca-2ee92f7137bso19577521fa.1 for <71685-done@debbugs.gnu.org>; Sat, 06 Jul 2024 12:12:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1720293062; x=1720897862; darn=debbugs.gnu.org; h=cc:to:subject:message-id:date:mime-version:references:in-reply-to :from:from:to:cc:subject:date:message-id:reply-to; bh=Q1sR4W+yFjENmnCcvolesedr/AKnBau1OotP/ujzv3c=; b=ckFhh1bA0FIbXOCLS3UoRUtXkvaQ0nodejsSLHvFqiramUcxMUYjHZ9c49eHqRnpup dcXtJclDuhxypsmkdqzpAMv7/c2QJLUgGCURgyjqQULQzD2uOFNjpAgiRUN4JfNYWmBL UG2hVt0u6O+4j7BXgw+ams/3hfqC9hXXfco1iE41ySfeWAFLEex+qMxRCjFaN6NcpPqw 4z6DJ+Pcx4OcD8obSTK2MIb0g9ND2oLH7grgg3y4D1shfNONUY3+JGoUB+N0nKhRML5c UZEXyY50ZfybEi2wxl53dqM1SM42TmrP1rJF/9N1wWLJzr/35hVU7khzbHSuLVx+Cy4L sQnA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1720293062; x=1720897862; h=cc:to:subject:message-id:date:mime-version:references:in-reply-to :from:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Q1sR4W+yFjENmnCcvolesedr/AKnBau1OotP/ujzv3c=; b=pzSNNthcU+EZosGrXSKvLINS3Z0uMyNCduurtI4ELmeQfiWDje8rzH16mzWHyc2ljd xCCPeyTewgHZcZm/aTurkX7aPmW0RI8RcEzkkSpexgSxM6xTcfgx9JrkvcxyElchtKNZ
BBylNkKppdtFKSG8/mW9DaIRvvT/tlilA7z6FDwX+N01yFSlihgWU2BPkfcSrWATAAB3 IBvqgGhvqC/PMOXD/YhH+peMbEDZRRcOigv3acg8BEejRVP5FVT+Jr7B0P1MxmyicwNI rVe4WS6cfKLBABUu7RqBxZIXe79QwjQ6FpNhAN8r9lhv02ijWrofe6wZTikKmVyeM73f e3kQ== X-Gm-Message-State: AOJu0Yx4BqU/Tf8/r/0jqyr7hoZPCJT5HhUU7EO4LtbuVxO8AqUivkSF ywVMx9vySMaxXTah9It1RmwEe8Hx3pxjqxA/9+C6+kQO3U6dV3EBv96td8foj/DCSPe0olAfGbV TT0vtHVqTd88adH3K4KK8Q5i+0s0= X-Google-Smtp-Source: AGHT+IEepRYzNuKVoPNTkn0K+UxsTa00p9ARP5qz5S1+eU5GF44SUCxqaLwemLKGGRleSXMlW1HY7OHkbaHo9iDfUmg= X-Received: by 2002:a2e:9006:0:b0:2e9:768a:12b0 with SMTP id 38308e7fff4ca-2ee8ee13b14mr53083441fa.50.1720293061759; Sat, 06 Jul 2024 12:11:01 -0700 (PDT) Received: from 753933720722 named unknown by gmailapi.google.com with HTTPREST; Sat, 6 Jul 2024 19:11:00 +0000 From: Stefan Kangas In-Reply-To: References: <65EAF7A2-697E-46E2-998C-AB99CC54E89C@gmail.com> <86h6d355vk.fsf@gnu.org> MIME-Version: 1.0 Date: Sat, 6 Jul 2024 19:11:00 +0000 Message-ID: Subject: Re: bug#71685: [PATCH] fix shr rendering in tables without tbody To: JD Smith , Eli Zaretskii Content-Type: text/plain; charset="UTF-8" X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 71685-done Cc: 71685-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Version: 30.1 JD Smith writes: > I've added a test and formatted the patch, below. All shr tests > succeed. Thanks, installed on emacs-30 as commit 9625e4af994. I'm therefore closing this bug report. 
------------=_1720293182-5601-1--