From unknown Mon Jun 16 23:40:19 2025
From: bug#77314 <77314@debbugs.gnu.org>
To: bug#77314 <77314@debbugs.gnu.org>
Subject: Status: [PATCH] Gnus HTML washing: Allow href surrounded by single-quotes
Reply-To: bug#77314 <77314@debbugs.gnu.org>
Date: Tue, 17 Jun 2025 06:40:19 +0000

retitle 77314 [PATCH] Gnus HTML washing: Allow href surrounded by single-quotes
reassign 77314 emacs
submitter 77314 Nuno Silva
severity 77314 normal
tag 77314 patch
thanks

From debbugs-submit-bounces@debbugs.gnu.org Thu Mar 27 11:16:03 2025
Date: Thu, 27 Mar 2025 11:47:51 +0000
From: Nuno Silva
To: bug-gnu-emacs@gnu.org
Subject: [PATCH] Gnus HTML washing: Allow href surrounded by single-quotes
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="QZL8tVliXSlcbkIJ"
Content-Disposition: inline

--QZL8tVliXSlcbkIJ
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Gnus, both in the version of Emacs I use locally, and in the development
repository, has the following in gnus-html-wash-tags
(lisp/gnus/gnus-html.el):

      (when (string-match "href=\"\\([^\"]+\\)" parameters)

This, in turn, makes it so that, at least when this function is involved
with displaying HTML parts, links that have single quotes surrounding
their target in the code are not linkified in the article display.

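As a minimal sketch of the behaviour described above (not part of the
attached patch): the two helper functions and the example.org URLs below
are hypothetical and exist only for illustration; the regexps themselves
are the one currently in gnus-html-wash-tags and the alternative added
by the patch.

```elisp
;; Hypothetical helpers, for illustration only.  They isolate the href
;; matching step so it can be evaluated outside of Gnus.

(defun my-href-target-old (parameters)
  "Return the href target in PARAMETERS using the current regexp, or nil."
  (when (string-match "href=\"\\([^\"]+\\)" parameters)
    (match-string 1 parameters)))

(defun my-href-target-new (parameters)
  "Return the href target in PARAMETERS, also accepting single quotes."
  (when (or (string-match "href=\"\\([^\"]+\\)" parameters)
            (string-match "href='\\([^']+\\)" parameters))
    ;; The match data belongs to whichever `string-match' succeeded.
    (match-string 1 parameters)))

(my-href-target-old "href=\"https://example.org/a\"") ; => "https://example.org/a"
(my-href-target-old "href='https://example.org/b'")   ; => nil
(my-href-target-new "href='https://example.org/b'")   ; => "https://example.org/b"
```

Evaluating the three calls at the end (for instance with C-x C-e in the
*scratch* buffer) shows that only the second form of the test picks up
the single-quoted target.
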
I noticed this when reading RSS/Atom feeds via feedbase.org, which uses
single quotes for the link associated with each post, as opposed to
Gmane's Gwene (which does not show this issue and uses double quotes for
said links), in an instance of Gnus which is using

      (setq mm-text-html-renderer 'gnus-w3m)

The attached patch has the modification that fixes this for me locally,
applied to git HEAD; please let me know if something should be done
differently.

For further context, check the subthread on the gmane.discuss group at
news.gmane.io (over netnews) starting at [1], especially [2] and [3].
The patch is almost identical to the change mentioned in [3], but does
not escape the second ' in the new regexp.

[1] Message-ID: , news://news.gmane.io/vs0nno$vkt$1@ciao.gmane.io
[2] Message-ID: , news://news.gmane.io/vs0r45$j1f$1@ciao.gmane.io
[3] Message-ID: , news://news.gmane.io/vs1it5$12p2$2@ciao.gmane.io

-- 
Nuno Silva (aka njsg)

--QZL8tVliXSlcbkIJ
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="gnus-html-href-regex-single-quote.patch"

>From 64f1969fd1b728f40bb1f41150f17068dcddabe8 Mon Sep 17 00:00:00 2001
From: "Nuno J. Silva"
Date: Thu, 27 Mar 2025 11:25:01 +0000
Subject: [PATCH] Gnus html washing: allow href attribute surrounded by single
 quotes

---
 lisp/gnus/gnus-html.el | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lisp/gnus/gnus-html.el b/lisp/gnus/gnus-html.el
index 435ea11..cc63e96 100644
--- a/lisp/gnus/gnus-html.el
+++ b/lisp/gnus/gnus-html.el
@@ -258,17 +258,18 @@ Use ALT-TEXT for the image string."
           (delete-region (match-beginning 0) (match-end 0)))
         (setq end (point))
         (cond
          ;; Fetch and insert a picture.
          ((equal tag "img_alt"))
          ;; Add a link.
          ((or (equal tag "a")
               (equal tag "A"))
-          (when (string-match "href=\"\\([^\"]+\\)" parameters)
+          (when (or (string-match "href=\"\\([^\"]+\\)" parameters)
+                    (string-match "href='\\([^']+\\)" parameters))
             (setq url (match-string 1 parameters))
             (gnus-message 8 "gnus-html-wash-tags: fetching link URL %s" url)
             (gnus-article-add-button start end
                                      'browse-url (mm-url-decode-entities-string url)
                                      url)
             (let ((overlay (make-overlay start end)))
               (overlay-put overlay 'evaporate t)
               (overlay-put overlay 'gnus-button-url url)

--QZL8tVliXSlcbkIJ--