GNU bug report logs - #51733
27.1; Detect impossible email addresses better


Packages: emacs, gnus;

Reported by: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>

Date: Wed, 10 Nov 2021 00:29:01 UTC

Severity: wishlist

Found in version 27.1

Fixed in version 29.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.


From: Eli Zaretskii <eliz <at> gnu.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 51733 <at> debbugs.gnu.org, jidanni <at> jidanni.org
Subject: bug#51733: 27.1; Detect impossible email addresses better
Date: Sun, 16 Jan 2022 20:14:08 +0200
> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Cc: 51733 <at> debbugs.gnu.org,  jidanni <at> jidanni.org
> Date: Sun, 16 Jan 2022 18:03:23 +0100
> 
> https://www.unicode.org/reports/tr24/tr24-32.html#Scripts_and_Blocks
> 
>    As a result, using the block names as simplistic substitute for
>    script identity generally leads to poor results.
> 
> It looks like we're doing that, though?

No, not really.  We group the various blocks that belong to the same
script under a single script name.
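
As a small illustration of that grouping, here is a sketch that checks
whether two code points from different Unicode blocks map to the same
entry in char-script-table.  The helper name my-same-script-p is
hypothetical, not something Emacs provides, and the two code points are
just examples I picked.

```elisp
;; U+03B1 is in the "Greek and Coptic" block, U+1F00 in "Greek Extended".
;; Hypothetical helper; `aref' on `char-script-table' performs the same
;; kind of lookup as the `elt' call quoted in this message.
(defun my-same-script-p (char1 char2)
  "Return non-nil if CHAR1 and CHAR2 share a `char-script-table' entry."
  (eq (aref char-script-table char1)
      (aref char-script-table char2)))

;; Expected to be non-nil if both blocks are filed under one script.
(my-same-script-p #x03B1 #x1F00)
```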

> And indeed:
> 
> (elt char-script-table #xAB65)
> => latin
> 
> which is wrong, because that's
> 
> GREEK LETTER SMALL CAPITAL OMEGA
> 
> So we should be populating char-script-table from
> http://www.unicode.org/Public/UCD/latest/ucd/Scripts.txt instead of
> Blocks.txt.  So I'll be doing that, too.

Beware: the Unicode Script property is not identical to ours!  Before
throwing away what we have, please check how many deviations we
actually have in practice; if there are only a few, let's just fix
those individually.  It's easy.  You will have to add some manual
heuristics even if you do use the Unicode Scripts.txt as the basis.
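
A minimal sketch of the "fix them individually" approach, under stated
assumptions: the helper my-script-deviations is hypothetical, and the
list of expected values is assumed to be compiled by hand from
Scripts.txt, since mapping Unicode script names onto our script symbols
is itself a manual step.  It works on a copy of char-script-table so
that evaluating it does not change the running session; it is not meant
as the actual change to make in the sources.

```elisp
;; Hypothetical helper, not part of Emacs: report code points whose
;; script entry differs from what the caller expects.
;; EXPECTED is a list of (CODEPOINT . SCRIPT-SYMBOL) pairs.
(defun my-script-deviations (expected &optional table)
  "Return the entries of EXPECTED that TABLE disagrees with.
TABLE defaults to `char-script-table'.  Each element of the
result has the form (CODEPOINT ACTUAL WANTED)."
  (let ((table (or table char-script-table))
        (result nil))
    (dolist (pair expected (nreverse result))
      (let ((actual (aref table (car pair))))
        (unless (eq actual (cdr pair))
          (push (list (car pair) actual (cdr pair)) result))))))

;; Work on a copy so the sketch leaves the session's table untouched.
(let ((table (copy-sequence char-script-table))
      ;; U+AB65 is GREEK LETTER SMALL CAPITAL OMEGA.
      (expected '((#xAB65 . greek))))
  ;; Fix each reported deviation individually.
  (dolist (dev (my-script-deviations expected table))
    (set-char-table-range table (car dev) (nth 2 dev)))
  ;; No deviation should remain after the fixes.
  (my-script-deviations expected table))
```

If the list of deviations produced this way turns out to be short, each
entry can be turned into a targeted correction instead of regenerating
the whole table.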




This bug report was last modified 3 years and 124 days ago.
