GNU bug report logs - #73312
31.0.50; textsec test failure because of UTS #46 changes

Package: emacs

Reported by: Robert Pluim <rpluim <at> gmail.com>

Date: Tue, 17 Sep 2024 10:13:02 UTC

Severity: normal

Tags: fixed

Found in version 31.0.50

Fixed in version 31.1

Done: Robert Pluim <rpluim <at> gmail.com>

Bug is archived. No further changes may be made.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#73312; Package emacs. (Tue, 17 Sep 2024 10:13:02 GMT)

Message #3 received at submit <at> debbugs.gnu.org:

From: Robert Pluim <rpluim <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 31.0.50; textsec test failure because of UTS #46 changes
Date: Tue, 17 Sep 2024 12:11:50 +0200

Following the update to Unicode 16, the textsec tests now fail:

  GEN      lisp/international/textsec-tests.log
Running 12 tests (2024-09-17 11:51:51+0200, selector `(not (or (tag :unstable) (tag :nativecomp)))')
   passed   1/12  test-confusable (0.001562 sec)
   passed   2/12  test-minimal-scripts (0.000155 sec)
   passed   3/12  test-mixed-numbers (0.000846 sec)
   passed   4/12  test-resolved (0.000120 sec)
   passed   5/12  test-restriction-level (0.000259 sec)
   passed   6/12  test-scripts (0.000354 sec)
   passed   7/12  test-suspicious-email (0.001587 sec)
   passed   8/12  test-suspicious-link (0.015283 sec)
   passed   9/12  test-suspicious-local (0.000522 sec)
   passed  10/12  test-suspicious-name (0.000420 sec)
   passed  11/12  test-suspicious-url (0.000498 sec)
Test test-suspiction-domain backtrace:
  signal(ert-test-failed (((should (textsec-domain-suspicious-p "foo/b
  ert-fail(((should (textsec-domain-suspicious-p "foo/bar.org")) :form
  (if (unwind-protect (setq value-222 (apply fn-220 args-221)) (setq f
  (let (form-description-224) (if (unwind-protect (setq value-222 (app
  (let ((value-222 'ert-form-evaluation-aborted-223)) (let (form-descr
  (let* ((fn-220 #'textsec-domain-suspicious-p) (args-221 (condition-c
  #f(lambda () [t] (let* ((fn-220 #'textsec-domain-suspicious-p) (args
  #f(compiled-function () #<bytecode -0x167cc0e1752f76aa>)()
  handler-bind-1(#f(compiled-function () #<bytecode -0x167cc0e1752f76a
  ert--run-test-internal(#s(ert--test-execution-info :test #s(ert-test
  ert-run-test(#s(ert-test :name test-suspiction-domain :documentation
  ert-run-or-rerun-test(#s(ert--stats :selector ... :tests ... :test-m
  ert-run-tests((not (or (tag :unstable) (tag :nativecomp))) #f(compil
  ert-run-tests-batch((not (or (tag :unstable) (tag :nativecomp))))
  ert-run-tests-batch-and-exit((not (or (tag :unstable) (tag :nativeco
  eval((ert-run-tests-batch-and-exit '(not (or (tag :unstable) (tag :n
  command-line-1(("-L" ":." "-l" "ert" "--eval" "(setq treesit-extra-l
  command-line()
  normal-top-level()
Test test-suspiction-domain condition:
    (ert-test-failed
     ((should (textsec-domain-suspicious-p "foo/bar.org")) :form
      (textsec-domain-suspicious-p "foo/bar.org") :value nil))
   FAILED  12/12  test-suspiction-domain (0.000228 sec) at lisp/international/textsec-tests.el:114

Ran 12 tests, 11 results as expected, 1 unexpected (2024-09-17 11:51:51+0200, 0.102143 sec)

1 unexpected results:
   FAILED  test-suspiction-domain

This is because UTS #46, in its infinite wisdom, has changed the rules
for deciding which characters are allowed in a domain name.
Previously, IdnaMappingTable.txt contained e.g.:

002F          ; disallowed_STD3_valid                  # 1.1  SOLIDUS

but now it contains:

002F          ; valid      ;      ; NV8    # 1.1  SOLIDUS

with a change to section 4.1 of UTS #46 saying that only [a-z0-9-] are
allowed for ASCII. Note that they've helpfully marked characters that
are valid in the table but invalid in IDNA with either NV8 or XV8, but
then have unhelpfully said that those markings are not normative. <sigh>
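
To make the difference concrete, here is a minimal sketch (not the
actual unidata-gen.el code) that applies the extended regexp from the
patch below to the two sample lines quoted above. The names
my-idna-line-regexp and my-idna-line-disallowed-p are made up for
illustration, and treating NV8/XV8 as "disallowed" is the same
assumption the patch makes, since UTS #46 calls those flags
non-normative.

```elisp
;; Hypothetical helper names; the regexp mirrors the one in the patch.
(defconst my-idna-line-regexp
  (concat "^\\([0-9A-F]+\\)\\(?:\\.\\.\\([0-9A-F]+\\)\\)? +; +\\([^ ]+\\) +"
          "\\(?:; +\\([ 0-9A-F]+\\)\\)?"
          "\\(?:; \\(NV8\\|XV8\\)\\)?")
  "Regexp for one data line of IdnaMappingTable.txt.")

(defun my-idna-line-disallowed-p (line)
  "Return t if LINE marks its code point(s) as not allowed in a domain."
  (when (string-match my-idna-line-regexp line)
    ;; Grab both groups before calling anything else.
    (let ((status (match-string 3 line))
          (idna-status (match-string 5 line)))
      (if (or (string-prefix-p "disallowed" status)
              ;; Assumption: NV8/XV8 mean "not valid in IDNA".
              (member idna-status '("NV8" "XV8")))
          t
        nil))))

;; Old-style line: the status itself starts with "disallowed".
(my-idna-line-disallowed-p
 "002F          ; disallowed_STD3_valid                  # 1.1  SOLIDUS")
;; => t

;; New-style line: status is "valid", only the NV8 flag gives it away.
(my-idna-line-disallowed-p
 "002F          ; valid      ;      ; NV8    # 1.1  SOLIDUS")
;; => t
```

Without the check on the fifth match group, the second call would see
only "valid" and return nil, which is why "foo/bar.org" stopped being
reported as suspicious.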

Anyway, willfully ignoring their verbiage about normative markings,
the following fixes it for me, at least until the next version of UTS
#46, I guess.

diff --git a/admin/unidata/unidata-gen.el b/admin/unidata/unidata-gen.el
index 7be03fe63af..adbe9c83670 100644
--- a/admin/unidata/unidata-gen.el
+++ b/admin/unidata/unidata-gen.el
@@ -1598,15 +1598,21 @@ unidata-gen-idna-mapping
   (let ((map (make-char-table nil)))
     (with-temp-buffer
       (unidata-gen--insert-file "IdnaMappingTable.txt")
-      (while (re-search-forward "^\\([0-9A-F]+\\)\\(?:\\.\\.\\([0-9A-F]+\\)\\)? +; +\\([^ ]+\\) +\\(?:; +\\([ 0-9A-F]+\\)\\)?"
+      (while (re-search-forward "^\\([0-9A-F]+\\)\\(?:\\.\\.\\([0-9A-F]+\\)\\)? +; +\\([^ ]+\\) +\\(?:; +\\([ 0-9A-F]+\\)\\)?\\(?:; \\(NV8\\|XV8\\)\\)?"
                                 nil t)
         (let ((start (match-string 1))
               (end (match-string 2))
               (status (match-string 3))
-              (mapped (match-string 4)))
+              (mapped (match-string 4))
+              (idna-status (match-string 5)))
           ;; Make reading the file slightly faster by using `t'
           ;; instead of `disallowed' all over the place.
-          (when (string-match-p "\\`disallowed" status)
+          (when (or (string-match-p "\\`disallowed" status)
+                    ;; UTS#46 messed us about with "status = valid" for
+                    ;; invalid characters, so we need to check for "NV8" or
+                    ;; "XV8".
+                    (string= idna-status "NV8")
+                    (string= idna-status "XV8"))
             (setq status "t"))
           (unless (or (equal status "valid")
                       (equal status "deviation"))
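
For checking the result by hand, the failing assertion can be isolated
in a small ERT test. This is only a sketch: the test name below is
hypothetical, and whether it passes depends on the generated Unicode
data having been rebuilt with the patched unidata-gen.el.

```elisp
(require 'ert)
(require 'textsec)

;; Hypothetical test name; the assertion is the one from the failing
;; test in lisp/international/textsec-tests.el.
(ert-deftest my-slash-in-domain-is-suspicious ()
  "A domain name containing SOLIDUS should be flagged as suspicious."
  (should (textsec-domain-suspicious-p "foo/bar.org")))
```

It can be run with M-x ert RET my-slash-in-domain-is-suspicious RET.
In the log above the form returned nil, which is what made the
`should' fail.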



Robert
-- 




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73312; Package emacs. (Tue, 17 Sep 2024 13:12:02 GMT)

Message #6 received at 73312 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Robert Pluim <rpluim <at> gmail.com>
Cc: 73312 <at> debbugs.gnu.org
Subject: Re: bug#73312: 31.0.50; textsec test failure because of UTS #46 changes
Date: Tue, 17 Sep 2024 16:10:38 +0300

> From: Robert Pluim <rpluim <at> gmail.com>
> Date: Tue, 17 Sep 2024 12:11:50 +0200
> 
> Following the update to Unicode 16, the textsec tests now fail:
> 
>   GEN      lisp/international/textsec-tests.log
> Running 12 tests (2024-09-17 11:51:51+0200, selector `(not (or (tag :unstable) (tag :nativecomp)))')
>    passed   1/12  test-confusable (0.001562 sec)
>    passed   2/12  test-minimal-scripts (0.000155 sec)
>    passed   3/12  test-mixed-numbers (0.000846 sec)
>    passed   4/12  test-resolved (0.000120 sec)
>    passed   5/12  test-restriction-level (0.000259 sec)
>    passed   6/12  test-scripts (0.000354 sec)
>    passed   7/12  test-suspicious-email (0.001587 sec)
>    passed   8/12  test-suspicious-link (0.015283 sec)
>    passed   9/12  test-suspicious-local (0.000522 sec)
>    passed  10/12  test-suspicious-name (0.000420 sec)
>    passed  11/12  test-suspicious-url (0.000498 sec)
> Test test-suspiction-domain backtrace:
>   signal(ert-test-failed (((should (textsec-domain-suspicious-p "foo/b
>   ert-fail(((should (textsec-domain-suspicious-p "foo/bar.org")) :form
>   (if (unwind-protect (setq value-222 (apply fn-220 args-221)) (setq f
>   (let (form-description-224) (if (unwind-protect (setq value-222 (app
>   (let ((value-222 'ert-form-evaluation-aborted-223)) (let (form-descr
>   (let* ((fn-220 #'textsec-domain-suspicious-p) (args-221 (condition-c
>   #f(lambda () [t] (let* ((fn-220 #'textsec-domain-suspicious-p) (args
>   #f(compiled-function () #<bytecode -0x167cc0e1752f76aa>)()
>   handler-bind-1(#f(compiled-function () #<bytecode -0x167cc0e1752f76a
>   ert--run-test-internal(#s(ert--test-execution-info :test #s(ert-test
>   ert-run-test(#s(ert-test :name test-suspiction-domain :documentation
>   ert-run-or-rerun-test(#s(ert--stats :selector ... :tests ... :test-m
>   ert-run-tests((not (or (tag :unstable) (tag :nativecomp))) #f(compil
>   ert-run-tests-batch((not (or (tag :unstable) (tag :nativecomp))))
>   ert-run-tests-batch-and-exit((not (or (tag :unstable) (tag :nativeco
>   eval((ert-run-tests-batch-and-exit '(not (or (tag :unstable) (tag :n
>   command-line-1(("-L" ":." "-l" "ert" "--eval" "(setq treesit-extra-l
>   command-line()
>   normal-top-level()
> Test test-suspiction-domain condition:
>     (ert-test-failed
>      ((should (textsec-domain-suspicious-p "foo/bar.org")) :form
>       (textsec-domain-suspicious-p "foo/bar.org") :value nil))
>    FAILED  12/12  test-suspiction-domain (0.000228 sec) at lisp/international/textsec-tests.el:114
> 
> Ran 12 tests, 11 results as expected, 1 unexpected (2024-09-17 11:51:51+0200, 0.102143 sec)
> 
> 1 unexpected results:
>    FAILED  test-suspiction-domain
> 
> This is because UTS #46 in their infinite wisdom have decided to change
> the rules on how to check what is considered an allowed character in a
> domain name. Previously, IdnaMappingTable.txt contained eg:
> 
> 002F          ; disallowed_STD3_valid                  # 1.1  SOLIDUS
> 
> but now it contains
> 
> 002F          ; valid      ;      ; NV8    # 1.1  SOLIDUS
> 
> with a change to section 4.1 of UTS#46 saying that only [a-z0-9-] are
> allowed for ASCII. Note that theyʼve helpfully marked
> valid-but-invalid-in-idna characters with either NV8 or XV8, but then
> have unhelpfully said that those markings are not normative. <sigh>
> 
> Anyway, willfully ignoring their verbiage about normative markings,
> the following fixes it for me, at least until the next version of UTS
> #46, I guess.

Please install, and thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#73312; Package emacs. (Tue, 17 Sep 2024 13:55:01 GMT)

Message #9 received at 73312 <at> debbugs.gnu.org:

From: Robert Pluim <rpluim <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 73312 <at> debbugs.gnu.org
Subject: Re: bug#73312: 31.0.50; textsec test failure because of UTS #46 changes
Date: Tue, 17 Sep 2024 15:52:59 +0200

tags 73312 fixed
close 73312 31.1
quit

>>>>> On Tue, 17 Sep 2024 16:10:38 +0300, Eli Zaretskii <eliz <at> gnu.org> said:


    Eli> Please install, and thanks.

Closing.
Committed as 7d365a2d72d

Robert
-- 




Added tag(s) fixed. Request was from Robert Pluim <rpluim <at> gmail.com> to control <at> debbugs.gnu.org. (Tue, 17 Sep 2024 13:55:02 GMT)

Bug marked as fixed in version 31.1; send any further explanations to 73312 <at> debbugs.gnu.org and Robert Pluim <rpluim <at> gmail.com>. Request was from Robert Pluim <rpluim <at> gmail.com> to control <at> debbugs.gnu.org. (Tue, 17 Sep 2024 13:55:02 GMT)

Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 16 Oct 2024 11:24:05 GMT)
