GNU bug report logs - #51733
27.1; Detect impossible email addresses better

Packages: emacs, gnus;

Reported by: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>

Date: Wed, 10 Nov 2021 00:29:01 UTC

Severity: wishlist

Found in version 27.1

Fixed in version 29.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

From: Eli Zaretskii <eliz <at> gnu.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 51733 <at> debbugs.gnu.org, jidanni <at> jidanni.org
Subject: bug#51733: 27.1; Detect impossible email addresses better
Date: Wed, 19 Jan 2022 16:57:38 +0200
> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Cc: 51733 <at> debbugs.gnu.org,  jidanni <at> jidanni.org
> Date: Wed, 19 Jan 2022 15:28:51 +0100
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Why? .ru is a top-level domain, it doesn't affect what should be
> > before the dot, I think?
> >
> > If you replace "Сгсе.ru" with "Cгсе.ru", you do get a warning.
> 
> Yes.  But "Сгсе.ru" is a whole-script confusable with "Crce.ru", and is
> therefore suspicious.

OK, but why do you think "Сгсе.ru" is confusable?  The SLD part
consists entirely of characters from a single script, and UTS#39
explicitly allows that:

  [...] it can be perfectly legitimate to have scripts in a SLD
  (second level domain) not be the same as scripts in a TLD (top-level
  domain), such as:

    Cyrillic labels in a domain name with a TLD of .ru or .рф 

That's your case, isn't it?
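
To make the single-script point concrete, here is a minimal sketch in
Python.  It is not the Emacs implementation.  Python's standard library
has no Unicode Script property, so the first word of each character
name is used as a rough stand-in for the script; that proxy, and the
helper names label_scripts and sld_is_single_script, are assumptions
made only for this illustration.

```python
import unicodedata


def label_scripts(label):
    """Return the set of (approximate) scripts used by the letters in LABEL.

    Assumption: the first word of the Unicode character name ("CYRILLIC",
    "LATIN", ...) is used as a crude proxy for the Script property.
    """
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()
    }


def sld_is_single_script(domain):
    """Return True if the second-level label of DOMAIN uses one script.

    Assumes DOMAIN has at least two dot-separated labels.
    """
    labels = domain.split(".")
    return len(label_scripts(labels[-2])) == 1


# All four letters of the SLD are Cyrillic: ES, GHE, ES, IE.
all_cyrillic = "\u0421\u0433\u0441\u0435.ru"
# Same string, but with a Latin "C" in front: mixed-script SLD.
mixed = "C\u0433\u0441\u0435.ru"

print(sld_is_single_script(all_cyrillic))
print(sld_is_single_script(mixed))
```

Under this proxy the all-Cyrillic SLD counts as single-script even
though the TLD is Latin, while the variant starting with a Latin "C" is
mixed-script, which matches the case that does produce a warning.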

> >> Is that what they mean here?
> >
> > I'm not sure I understand the purpose of finding which scripts
> > "contain a whole-script confusable with a string X".  What are we
> > supposed to do with the resulting list?
> 
> I think this standard was written by somebody with a PhD in Philosophy,
> and not a programmer, so the language is very high falutin'.
> 
> So they're not actually suggesting that a list should be made, but the
> result should be mathematically equivalent with the result of the
> mathematical algorithm described.  I just don't understand what he's
> saying here.

Regardless of what they are saying, I don't think the above is
suitable for production.  I think it should be enough to see whether
there could be confusion with the corresponding ASCII characters from
confusables.txt.
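
A minimal sketch of that simpler check, in Python, could look as
follows.  It is only an illustration, not the code that went into
Emacs.  The table ASCII_LOOKALIKES is a tiny hand-picked subset chosen
for this example; a real check would load its mappings from
confusables.txt.  The function names are likewise hypothetical.

```python
# Assumption: illustrative hand-picked subset, NOT parsed from
# confusables.txt.  Maps a non-ASCII character to an ASCII look-alike.
ASCII_LOOKALIKES = {
    "\u0421": "C",  # CYRILLIC CAPITAL LETTER ES
    "\u0433": "r",  # CYRILLIC SMALL LETTER GHE
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
}


def ascii_skeleton(label):
    """Return the ASCII look-alike of LABEL.

    ASCII characters are kept as they are.  Return None as soon as a
    non-ASCII character has no entry in the table, since the label then
    cannot be mistaken for a pure-ASCII one.
    """
    out = []
    for ch in label:
        if ch.isascii():
            out.append(ch)
        elif ch in ASCII_LOOKALIKES:
            out.append(ASCII_LOOKALIKES[ch])
        else:
            return None
    return "".join(out)


def could_be_confused_with_ascii(label):
    """Return True if LABEL is non-ASCII but every character of it has
    an ASCII look-alike."""
    if label.isascii():
        return False
    return ascii_skeleton(label) is not None


sld = "\u0421\u0433\u0441\u0435"  # the all-Cyrillic label from this thread
print(ascii_skeleton(sld))
print(could_be_confused_with_ascii(sld))
```

With this table the all-Cyrillic label maps to "Crce", so it would be
flagged; a label containing any character without an ASCII look-alike
would not be.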




This bug report was last modified 3 years and 124 days ago.
