From debbugs-submit-bounces@debbugs.gnu.org Thu Jan 05 08:46:22 2017 Received: (at submit) by debbugs.gnu.org; 5 Jan 2017 13:46:22 +0000 Received: from localhost ([127.0.0.1]:41821 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cP8N4-00057X-9w for submit@debbugs.gnu.org; Thu, 05 Jan 2017 08:46:22 -0500 Received: from eggs.gnu.org ([208.118.235.92]:48423) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cP8N3-00057G-18 for submit@debbugs.gnu.org; Thu, 05 Jan 2017 08:46:21 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cP8Mw-0007wE-Rs for submit@debbugs.gnu.org; Thu, 05 Jan 2017 08:46:15 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: * X-Spam-Status: No, score=1.1 required=5.0 tests=BAYES_50, FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,T_DKIM_INVALID autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:38366) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1cP8Mw-0007w8-OP for submit@debbugs.gnu.org; Thu, 05 Jan 2017 08:46:14 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39377) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cP8Mv-0005eX-Db for bug-gnu-emacs@gnu.org; Thu, 05 Jan 2017 08:46:14 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cP8Mr-0007uE-6x for bug-gnu-emacs@gnu.org; Thu, 05 Jan 2017 08:46:13 -0500 Received: from mail-wj0-x234.google.com ([2a00:1450:400c:c01::234]:35797) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1cP8Mr-0007tH-16 for bug-gnu-emacs@gnu.org; Thu, 05 Jan 2017 08:46:09 -0500 Received: by mail-wj0-x234.google.com with SMTP id i20so34967306wjn.2 for ; Thu, 05 Jan 2017 05:46:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; 
h=from:to:subject:date:message-id:mime-version :content-transfer-encoding; bh=DnIQCtb0yJsM64t1bSCsP2Ldl1GsBEa2FJu2c1gjiiY=; b=m5dBF2kE01t+Padaf2KD7TrdOTEmIau6//Ely8gWYPHTKcdk29mxn9QShx3kTxVbcb jhFoc/8EdeUkwb9NtZYOIzIF1E8vBno4XiNrcQYcKNRUKar14krzlohstJ3a926p/eZm epD0ajNUUtf0ZWABP5iULFa2P9LsaDoYEV7JHQYfU+Su+iQBGWW7yGXWr0xc4iak3CKP rKy+OjdwfS1eZU+3dfwBf4+8+LQ3YVpoGAufXANz4fDMNQNhWAIywWgVJqsxKxL1xnaU aFNt6i0M7Jck06RhUYtcYQYBIjLfUWqgp/uF4iWfEaMBZEN7I6ua9bY+QaczbcscT9Gj ycRw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:mime-version :content-transfer-encoding; bh=DnIQCtb0yJsM64t1bSCsP2Ldl1GsBEa2FJu2c1gjiiY=; b=FvEnRFMQ0plPHdlotrnToh/taOHzA+xZmX5yTE7gTxKW3tvituGnREo92WauPVkl1T dFfAqyOnlvXdwpFHLs2VLXWN6nQ8C6408hWU7mYFsCge2bcT6zuaMARIHsgjWqSN6qIe uM9BpiiBHUz4JWD5mUNh3iHaJCDqurApu5nVrFiJjMjxgajR/PdzMDlJfU7OPXj8i8sb Z8E9aTh0eZAD1wiYNMXq+8ktqNOXALT8lz2uihUTMnGSyzLFy9maD7Z9beFoTsG/dRM6 EU6tGWmi49kDKAxrz/LAOdIflYeIvWpN018Xkc9rB+1XsP8Ca9Q3BXD1Xg+fotfZZEEE iKIQ== X-Gm-Message-State: AIkVDXKo7CfWXs6O00cdykVDDf4obkaoRjyjoYtq75er58822p1YLBfohLEQi+aWUAmo+w== X-Received: by 10.194.205.103 with SMTP id lf7mr8417134wjc.202.1483623965918; Thu, 05 Jan 2017 05:46:05 -0800 (PST) Received: from a.muc.corp.google.com ([2a00:79e0:15:4:8c89:a479:a923:f6b2]) by smtp.gmail.com with ESMTPSA id e3sm68709499wjm.12.2017.01.05.05.46.05 for (version=TLS1_2 cipher=AES128-SHA bits=128/128); Thu, 05 Jan 2017 05:46:05 -0800 (PST) From: Philipp Stephani To: bug-gnu-emacs@gnu.org Subject: 26.0.50; [:blank:] character class should match all Unicode horizontal whitespace Date: Thu, 05 Jan 2017 14:46:01 +0100 Message-ID: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -3.8 (---) 
X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.8 (---)

(string-match-p "[[:blank:]]" "\N{HAIR SPACE}")
=> nil, expected 0

[[:blank:]] should be the same as \h in PCRE.

In GNU Emacs 26.0.50.26 (x86_64-unknown-linux-gnu, GTK+ Version 3.10.8)
 of 2017-01-05 built on unknown
Repository revision: d88cdad2847726438c7d1de9fd2651c4be9243aa
Windowing system distributor 'The X.Org Foundation', version 11.0.11501000
System Description: Ubuntu 14.04 LTS

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Entering debugger...
Back to top level

Configured using:
 'configure --with-modules --enable-checking
 --enable-check-lisp-object-type 'CFLAGS=-ggdb3 -O0''

Configured features:
XPM JPEG TIFF GIF PNG SOUND GSETTINGS NOTIFY GNUTLS FREETYPE XFT ZLIB
TOOLKIT_SCROLL_BARS GTK3 X11 MODULES

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message subr-x puny seq byte-opt gv bytecomp byte-compile cl-extra cconv dired dired-loaddefs format-spec rfc822 mml mml-sec password-cache epa derived epg epg-config gnus-util rmail rmail-loaddefs mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils help-mode easymenu cl-loaddefs pcase cl-lib debug time-date mule-util tooltip eldoc electric uniquify ediff-hook vc-hooks lisp-float-type mwheel term/x-win x-win term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe tabulated-list replace newcomment text-mode elisp-mode lisp-mode prog-mode register page menu-bar rfn-eshadow isearch timer select scroll-bar mouse jit-lock font-lock syntax facemenu font-core term/tty-colors frame cl-generic cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese composite charscript case-table epa-hook jka-cmpr-hook help simple abbrev obarray minibuffer cl-preloaded nadvice loaddefs button faces cus-face macroexp files text-properties overlay sha1 md5 base64 format env code-pages mule custom widget hashtable-print-readable backquote inotify dynamic-setting system-font-setting font-render-setting move-toolbar gtk x-toolkit x multi-tty make-network-process emacs)

Memory information:
((conses 16 182571 10570) (symbols 48 31257 1) (miscs 40 340 231) (strings 32 71112 6419) (string-bytes 1 1678721) (vectors 16 14561) (vector-slots 8 529555 10250) (floats 8 183 150) (intervals 56 250 6) (buffers 976 13) (heap 1024 36602 1391))

-- 
Google Germany GmbH
Erika-Mann-Straße 33
80636 München

Court of registration and number: Hamburg, HRB 86891
Registered office: Hamburg
Managing directors: Matthew Scott Sucherman, Paul Terence Manicle

This e-mail is confidential. If you are not the right addressee please do not forward it, please inform the sender, and please erase this e-mail including any attachments. Thanks.

From debbugs-submit-bounces@debbugs.gnu.org Thu Jan 05 10:50:16 2017 Received: (at 25366) by debbugs.gnu.org; 5 Jan 2017 15:50:17 +0000 Received: from localhost ([127.0.0.1]:43995 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cPAIy-0000kU-NY for submit@debbugs.gnu.org; Thu, 05 Jan 2017 10:50:16 -0500 Received: from eggs.gnu.org ([208.118.235.92]:46139) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cPAIx-0000kH-Ch for 25366@debbugs.gnu.org; Thu, 05 Jan 2017 10:50:15 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cPAIo-0004Zb-7i for 25366@debbugs.gnu.org; Thu, 05 Jan 2017 10:50:10 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=BAYES_50,RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:60580) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cPAIo-0004ZU-4N; Thu, 05 Jan 2017 10:50:06 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:4967 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1cPAIn-00066W-8D; Thu, 05 Jan 2017 10:50:05 -0500 Date: Thu, 05 Jan 2017 17:50:21 +0200 Message-Id: <838tqpecaq.fsf@gnu.org> From: Eli Zaretskii To: Philipp Stephani In-reply-to: (message from Philipp Stephani on Thu, 05 Jan 2017 14:46:01 +0100) Subject: Re: bug#25366: 26.0.50; [:blank:] character class should match all Unicode horizontal whitespace References: 
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -8.2 (--------) X-Debbugs-Envelope-To: 25366 Cc: 25366@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -8.2 (--------)

> From: Philipp Stephani
> Date: Thu, 05 Jan 2017 14:46:01 +0100
>
> (string-match-p "[[:blank:]]" "\N{HAIR SPACE}")
> => nil, expected 0
>
> [[:blank:]] should be the same as \h in PCRE.

We are consistent with our documentation, but I agree that it would be
good to extend [:blank:], as proposed here:

  http://www.unicode.org/reports/tr18/tr18-19.html#Compatibility_Properties

Patches to that effect are welcome.

From debbugs-submit-bounces@debbugs.gnu.org Thu Jan 05 18:07:14 2017 Received: (at control) by debbugs.gnu.org; 5 Jan 2017 23:07:14 +0000 Received: from localhost ([127.0.0.1]:44079 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cPH7p-00039k-Qz for submit@debbugs.gnu.org; Thu, 05 Jan 2017 18:07:13 -0500 Received: from mail-io0-f175.google.com ([209.85.223.175]:36552) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cPH7o-00039X-4x for control@debbugs.gnu.org; Thu, 05 Jan 2017 18:07:12 -0500 Received: by mail-io0-f175.google.com with SMTP id p127so44846846iop.3 for ; Thu, 05 Jan 2017 15:07:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:subject:references:date:in-reply-to:message-id :user-agent:mime-version; bh=g4HxgLrAXxZ91lZ2sxXC7pAdrZ0cHnCryOh5wZD30dE=; b=HQdiHElBWD/AlokzhwwCa6LTC8uYtBpDJzX2g05u7Df812KCS/4+/reazQiiu78fbg Aq1o45NlHCA9fIYg6rGkv1GwKCKWiVrsxh/FoYX7xdM0Hgk+Km9tRzhD6upw2WEJGwyl 
dk4nLNp6KWCap1C3J0NW0ZX9/kE5P5rOYSk9JZm8avGEGtbQxpa9r6LE/YktSLSUjCgp u0GRWr3Y3Lv2rTcMc3NLYroaTfuEMhnTTN7WwIZUzbH02bVqJr8RwPVae89rLQFxgl+U rShhBfsUS/YHOTQGfycNiDu6jDniv0Kxq39EMoQbnU2bUuu1nTbXNYVUp8xnljv43ZGD Pb7w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:subject:references:date :in-reply-to:message-id:user-agent:mime-version; bh=g4HxgLrAXxZ91lZ2sxXC7pAdrZ0cHnCryOh5wZD30dE=; b=WwPnzcsQeuEoxyQt4B6b2A3QuLYUOE6ZWh2OL1qjgvb3xcccVzTrcIlDRz0Gzl0Mhm XoVQaAFk9TtY2BYQP2c+m1KL8mqXRM0iDE7ZbQilis6BL7y0KIB648ig7AdxCq3xNoEm Wx7uf2RVlqppEImkN1/p7oqIZNFBwx/ufM4ogNJuIwnF3oqB9s1QwonUIiQwW2kiu8EI R7ZqXnYP/CHG261aTmbOyIA9gbx654Zt9gK6SnHgeW2xXWznLEDsqpF/l6mBeSv6Bd0K WpkKTZ6mi9GL2kJcpOreMoFLpq9zxV5JByezuSrXALZRPMANZFA+YLyF/1vsnMq6bZER ZjXg== X-Gm-Message-State: AIkVDXI8pw+LVB4+CyXq9fvOahmmpeQWsEhEhCT4q9SJZe37tglxbfeUnj52wWoO4R9Pug== X-Received: by 10.107.138.228 with SMTP id c97mr60382169ioj.77.1483657626390; Thu, 05 Jan 2017 15:07:06 -0800 (PST) Received: from zony ([45.2.7.65]) by smtp.googlemail.com with ESMTPSA id b65sm19400747iob.41.2017.01.05.15.07.05 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Thu, 05 Jan 2017 15:07:05 -0800 (PST) From: npostavs@users.sourceforge.net To: control@debbugs.gnu.org Subject: Re: bug#25366: 26.0.50; [:blank:] character class should match all Unicode horizontal whitespace References: <838tqpecaq.fsf@gnu.org> Date: Thu, 05 Jan 2017 18:08:08 -0500 In-Reply-To: <838tqpecaq.fsf@gnu.org> (Eli Zaretskii's message of "Thu, 05 Jan 2017 17:50:21 +0200") Message-ID: <87wpe93y1z.fsf@users.sourceforge.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: -0.6 (/) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.6 (/)

severity 25366 wishlist
tags 25366 confirmed
quit

Eli Zaretskii writes:

>> From: Philipp Stephani
>> Date: Thu, 05 Jan 2017 14:46:01 +0100
>>
>> (string-match-p "[[:blank:]]" "\N{HAIR SPACE}")
>> => nil, expected 0
>>
>> [[:blank:]] should be the same as \h in PCRE.
>
> We are consistent with our documentation, but I agree that it would be
> good to extend [:blank:], as proposed here:
>
>   http://www.unicode.org/reports/tr18/tr18-19.html#Compatibility_Properties
>
> Patches to that effect are welcome.

From debbugs-submit-bounces@debbugs.gnu.org Fri Jan 06 10:00:40 2017 Received: (at 25366) by debbugs.gnu.org; 6 Jan 2017 15:00:40 +0000 Received: from localhost ([127.0.0.1]:45252 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cPW0W-0004tT-5L for submit@debbugs.gnu.org; Fri, 06 Jan 2017 10:00:40 -0500 Received: from mail-oi0-f41.google.com ([209.85.218.41]:33980) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cPW0U-0004tG-V2 for 25366@debbugs.gnu.org; Fri, 06 Jan 2017 10:00:39 -0500 Received: by mail-oi0-f41.google.com with SMTP id 3so434728783oih.1 for <25366@debbugs.gnu.org>; Fri, 06 Jan 2017 07:00:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=YrY7JHXC5sxUJvz9M9sGCYn3/CUqSHEBdDQQFbEr9DA=; b=R0M90OGop1pjRv9KxbiNQ3FOO0UCWAgkBhkdv1xB9RVl/fjZASfMOyf/4pycdjy9zW FT0+5hEfDR0HvVdF/oQxcrGs0wXOKlQcAXjKNYPXhryqOeeFAWg1cAWthdgeka9ZyfkR RkGV7317L555x3D833PRIUzPJjx71lgLLn71b1Q4/Z+3js9/Gs5JAtLRqd/ICBreBC7C IGpl94sput8eltJlbL5e/WaDAHxaUqlke8s/UR8AYnrcIHAIysYHahyaZ6gcCIDJRNhj bF6xixNqyR3FpWV/izBl6Ao0PqBoCI93+ET0YOaMeNfFAgrTu35NhRGdYbNjR5CeUPi2 4FsQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date 
:message-id:subject:to:cc; bh=YrY7JHXC5sxUJvz9M9sGCYn3/CUqSHEBdDQQFbEr9DA=; b=trwtTuoZtcqwl5WwVPrKRtaeXoc5jew4t8ZwKHLxM3hOljz1qV04yUtEBiFrZLlmjj 78soFJfsuDaQXs0nbifKW/axBSVN6VWkF/UyypDGLhuErAWGIKowjKHgkg4hVwNwt67q n7PiEadocQ+lTd8BMFN0HtUAyhSbNHEDo8V+kHxrmWf5RrvQM/U2s3oVWs6CP5jAvuCE jLhIybYZvfy/Y8pEP5ilpdf4QFE54eJXJzeHyt0qEe6sWUEgO/cnydtclQLxwC26xWL9 H5IuS/ovwpFHpJD498DteVwvtiIC1HzUlFdf4ZmDN/QedYxKsG7SnDdCszOyRPcYJ+Xm avCw== X-Gm-Message-State: AIkVDXKx1OxXOEnYOjKyejKzAC+hiPfRz+kxjHE4V1P+9S2od6tIS0CGYM2mclNtH9BfBozH6Y4EkdbIrNJuMw== X-Received: by 10.157.9.65 with SMTP id 59mr1492142otp.184.1483714833087; Fri, 06 Jan 2017 07:00:33 -0800 (PST) MIME-Version: 1.0 References: <838tqpecaq.fsf@gnu.org> In-Reply-To: <838tqpecaq.fsf@gnu.org> From: Philipp Stephani Date: Fri, 06 Jan 2017 15:00:22 +0000 Message-ID: Subject: Re: bug#25366: 26.0.50; [:blank:] character class should match all Unicode horizontal whitespace To: Eli Zaretskii Content-Type: multipart/mixed; boundary=94eb2c0c57ae9322a505456e470e X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 25366 Cc: 25366@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.0 (/)

--94eb2c0c57ae9322a505456e470e
Content-Type: multipart/alternative; boundary=94eb2c0c57ae93229e05456e470c

--94eb2c0c57ae93229e05456e470c
Content-Type: text/plain; charset=UTF-8

Eli Zaretskii wrote on Thu, Jan 5, 2017 at 16:50:

> > From: Philipp Stephani
> > Date: Thu, 05 Jan 2017 14:46:01 +0100
> >
> > (string-match-p "[[:blank:]]" "\N{HAIR SPACE}")
> > => nil, expected 0
> >
> > [[:blank:]] should be the same as \h in PCRE.
>
> We are consistent with our documentation, but I agree that it would be
> good to extend [:blank:], as proposed here:
>
>
> http://www.unicode.org/reports/tr18/tr18-19.html#Compatibility_Properties
>
> Patches to that effect are welcome.
>

Here's a patch.

--94eb2c0c57ae93229e05456e470c
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable


--94eb2c0c57ae93229e05456e470c-- --94eb2c0c57ae9322a505456e470e Content-Type: text/plain; charset=US-ASCII; name="0001-Add-support-for-Unicode-whitespace-in-blank.txt" Content-Disposition: attachment; filename="0001-Add-support-for-Unicode-whitespace-in-blank.txt" Content-Transfer-Encoding: base64 Content-ID: <159744b6dc8ec3375541> X-Attachment-Id: 159744b6dc8ec3375541 RnJvbSBjOGNjOTJkYTE3ZjhlMzNlZDg4NmQzNDExZjYzMTM0N2VmMWM1NWZmIE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBQaGlsaXBwIFN0ZXBoYW5pIDxwaHN0QGdvb2dsZS5jb20+CkRh dGU6IEZyaSwgNiBKYW4gMjAxNyAxNTo1Njo1MSArMDEwMApTdWJqZWN0OiBbUEFUQ0hdIEFkZCBz dXBwb3J0IGZvciBVbmljb2RlIHdoaXRlc3BhY2UgaW4gWzpibGFuazpdCgpTZWUgQnVnIzI1MzY2 LgoKKiBzcmMvY2hhcmFjdGVyLmMgKGJsYW5rcCk6IE5ldyBmdW5jdGlvbiBmb3IgY2hlY2tpbmcg VW5pY29kZQpob3Jpem9udGFsIHdoaXRlc3BhY2UuCiogc3JjL3JlZ2V4LmMgKElTQkxBTkspOiBV c2UgJ2JsYW5rcCcgZm9yIG5vbi1BU0NJSSBob3Jpem9udGFsCndoaXRlc3BhY2UuCihCSVRfQkxB TkspOiBOZXcgYml0IGZvciByYW5nZSB0YWJsZS4KKHJlX3djdHlwZV90b19iaXQsIGV4ZWN1dGVf Y2hhcnNldCk6IFVzZSBpdC4KKiB0ZXN0L2xpc3Avc3Vici10ZXN0cy5lbCAoc3Vici10ZXN0cy0t c3RyaW5nLW1hdGNoLXAtLWJsYW5rKTogQWRkCnVuaXQgdGVzdCBmb3IgWzpibGFuazpdIGNoYXJh Y3RlciBjbGFzcy4KKiBkb2MvbGlzcHJlZi9zZWFyY2hpbmcudGV4aSAoQ2hhciBDbGFzc2VzKTog RG9jdW1lbnQgbmV3IFVuaWNvZGUKYmVoYXZpb3IgZm9yIFs6Ymxhbms6XS4KLS0tCiBkb2MvbGlz cHJlZi9zZWFyY2hpbmcudGV4aSB8ICA1ICsrKystCiBldGMvTkVXUyAgICAgICAgICAgICAgICAg ICB8ICA1ICsrKysrCiBzcmMvY2hhcmFjdGVyLmMgICAgICAgICAgICB8IDE1ICsrKysrKysrKysr KysrKwogc3JjL2NoYXJhY3Rlci5oICAgICAgICAgICAgfCAgMSArCiBzcmMvcmVnZXguYyAgICAg ICAgICAgICAgICB8IDEyICsrKysrKysrLS0tLQogdGVzdC9saXNwL3N1YnItdGVzdHMuZWwgICAg fCAxMCArKysrKysrKysrCiA2IGZpbGVzIGNoYW5nZWQsIDQzIGluc2VydGlvbnMoKyksIDUgZGVs ZXRpb25zKC0pCgpkaWZmIC0tZ2l0IGEvZG9jL2xpc3ByZWYvc2VhcmNoaW5nLnRleGkgYi9kb2Mv bGlzcHJlZi9zZWFyY2hpbmcudGV4aQppbmRleCBiMDExZDE0ZWUzLi4zOGQyMTIxNmQ2IDEwMDY0 NAotLS0gYS9kb2MvbGlzcHJlZi9zZWFyY2hpbmcudGV4aQorKysgYi9kb2MvbGlzcHJlZi9zZWFy Y2hpbmcudGV4aQpAQCAtNTUzLDcgKzU1MywxMCBAQCBDaGFyIENsYXNzZXMKIChAcHhyZWZ7Q2hh 
cmFjdGVyIFByb3BlcnRpZXN9KSBpbmRpY2F0ZXMgdGhleSBhcmUgYWxwaGFiZXRpYwogY2hhcmFj dGVycy4KIEBpdGVtIFs6Ymxhbms6XQotVGhpcyBtYXRjaGVzIHNwYWNlIGFuZCB0YWIgb25seS4K K1RoaXMgbWF0Y2hlcyBob3Jpem9udGFsIHdoaXRlc3BhY2UsIGFzIGRlZmluZWQgYnkgVW5pY29k ZSBUZWNobmljYWwKK1N0YW5kYXJkICMxOC4gIEluIHBhcnRpY3VsYXIsIGl0IG1hdGNoZXMgdGFi cyBhbmQgY2hhcmFjdGVycyB3aG9zZQorVW5pY29kZSBAc2FtcHtnZW5lcmFsLWNhdGVnb3J5fSBw cm9wZXJ0eSAoQHB4cmVme0NoYXJhY3RlcgorUHJvcGVydGllc30pIGluZGljYXRlcyB0aGV5IGFy ZSBzcGFjaW5nIHNlcGFyYXRvcnMuCiBAaXRlbSBbOmNudHJsOl0KIFRoaXMgbWF0Y2hlcyBhbnkg QGFjcm9ueW17QVNDSUl9IGNvbnRyb2wgY2hhcmFjdGVyLgogQGl0ZW0gWzpkaWdpdDpdCmRpZmYg LS1naXQgYS9ldGMvTkVXUyBiL2V0Yy9ORVdTCmluZGV4IGQ5MTIwNGIyMWIuLjlhN2FhMjA3YmMg MTAwNjQ0Ci0tLSBhL2V0Yy9ORVdTCisrKyBiL2V0Yy9ORVdTCkBAIC03MTAsNiArNzEwLDExIEBA IG9mIGN1cnZlZCBxdW90ZXMgaW4gZm9ybWF0IGFyZ3VtZW50cyB0byBmdW5jdGlvbnMgbGlrZSAn bWVzc2FnZScgYW5kCiBub3cgZ2VuZXJhdGUgbGVzcyBjaGF0dGVyIGFuZCBtb3JlLWNvbXBhY3Qg ZGlhZ25vc3RpY3MuICBUaGUgYXV4aWxpYXJ5CiBmdW5jdGlvbiAnY2hlY2stZGVjbGFyZS1lcnJt c2cnIGhhcyBiZWVuIHJlbW92ZWQuCiAKKysrKworKiogVGhlIHJlZ3VsYXIgZXhwcmVzc2lvbiBj aGFyYWN0ZXIgY2xhc3MgWzpibGFuazpdIG5vdyBtYXRjaGVzCitVbmljb2RlIGhvcml6b250YWwg d2hpdGVzcGFjZSBhcyBkZWZpbmVkIGluCitodHRwOi8vd3d3LnVuaWNvZGUub3JnL3JlcG9ydHMv dHIxOC90cjE4LTE5Lmh0bWwjYmxhbmsuCisKIAwKICogTGlzcCBDaGFuZ2VzIGluIEVtYWNzIDI2 LjEKIApkaWZmIC0tZ2l0IGEvc3JjL2NoYXJhY3Rlci5jIGIvc3JjL2NoYXJhY3Rlci5jCmluZGV4 IGI1OTRhZjA0MGMuLjc0ZDY0MTBmYzcgMTAwNjQ0Ci0tLSBhL3NyYy9jaGFyYWN0ZXIuYworKysg Yi9zcmMvY2hhcmFjdGVyLmMKQEAgLTEwMzgsNiArMTAzOCwyMSBAQCBwcmludGFibGVwIChpbnQg YykKIAkgICAgfHwgZ2VuX2NhdCA9PSBVTklDT0RFX0NBVEVHT1JZX0NuKSk7IC8qIHVuYXNzaWdu ZWQgKi8KIH0KIAorLyogUmV0dXJuIHRydWUgaWYgQyBpcyBhIGhvcml6b250YWwgd2hpdGVzcGFj ZSBjaGFyYWN0ZXIsIGFzIGRlZmluZWQKKyAgIGJ5IGh0dHA6Ly93d3cudW5pY29kZS5vcmcvcmVw b3J0cy90cjE4L3RyMTgtMTkuaHRtbCNibGFuay4gICovCitib29sCitibGFua3AgKGludCBjKQor eworICBpZiAoYyA9PSAnXHQnKQorICAgIHJldHVybiB0cnVlOworCisgIExpc3BfT2JqZWN0IGNh 
dGVnb3J5ID0gQ0hBUl9UQUJMRV9SRUYgKFZ1bmljb2RlX2NhdGVnb3J5X3RhYmxlLCBjKTsKKyAg aWYgKCEgSU5URUdFUlAgKGNhdGVnb3J5KSkKKyAgICByZXR1cm4gZmFsc2U7CisKKyAgcmV0dXJu IFhJTlQgKGNhdGVnb3J5KSA9PSBVTklDT0RFX0NBVEVHT1JZX1pzOyAvKiBzZXBhcmF0b3IsIHNw YWNlICovCit9CisKIHZvaWQKIHN5bXNfb2ZfY2hhcmFjdGVyICh2b2lkKQogewpkaWZmIC0tZ2l0 IGEvc3JjL2NoYXJhY3Rlci5oIGIvc3JjL2NoYXJhY3Rlci5oCmluZGV4IGZjOGEwZGQ3NGQuLjYy ZDI1MmU5MWIgMTAwNjQ0Ci0tLSBhL3NyYy9jaGFyYWN0ZXIuaAorKysgYi9zcmMvY2hhcmFjdGVy LmgKQEAgLTY4MCw2ICs2ODAsNyBAQCBleHRlcm4gYm9vbCBhbHBoYWJldGljcCAoaW50KTsKIGV4 dGVybiBib29sIGFscGhhbnVtZXJpY3AgKGludCk7CiBleHRlcm4gYm9vbCBncmFwaGljcCAoaW50 KTsKIGV4dGVybiBib29sIHByaW50YWJsZXAgKGludCk7CitleHRlcm4gYm9vbCBibGFua3AgKGlu dCk7CiAKIC8qIFJldHVybiBhIHRyYW5zbGF0aW9uIHRhYmxlIG9mIGlkIG51bWJlciBJRC4gICov CiAjZGVmaW5lIEdFVF9UUkFOU0xBVElPTl9UQUJMRShpZCkgXApkaWZmIC0tZ2l0IGEvc3JjL3Jl Z2V4LmMgYi9zcmMvcmVnZXguYwppbmRleCBhZTNmZGU4MGM5Li43ZTcwYzQ5NGY0IDEwMDY0NAot LS0gYS9zcmMvcmVnZXguYworKysgYi9zcmMvcmVnZXguYwpAQCAtMzEwLDExICszMTAsMTIgQEAg ZW51bSBzeW50YXhjb2RlIHsgU3doaXRlc3BhY2UgPSAwLCBTd29yZCA9IDEsIFNzeW1ib2wgPSAy IH07CiAJCSAgICAgfHwgKChjKSA+PSAnYScgJiYgKGMpIDw9ICdmJykJXAogCQkgICAgIHx8ICgo YykgPj0gJ0EnICYmIChjKSA8PSAnRicpKQogCi0vKiBUaGlzIGlzIG9ubHkgdXNlZCBmb3Igc2lu Z2xlLWJ5dGUgY2hhcmFjdGVycy4gICovCi0jIGRlZmluZSBJU0JMQU5LKGMpICgoYykgPT0gJyAn IHx8IChjKSA9PSAnXHQnKQotCiAvKiBUaGUgcmVzdCBtdXN0IGhhbmRsZSBtdWx0aWJ5dGUgY2hh cmFjdGVycy4gICovCiAKKyMgZGVmaW5lIElTQkxBTksoYykgKElTX1JFQUxfQVNDSUkgKGMpICAg ICAgICAgICAgICAgICAgXAorICAgICAgICAgICAgICAgICAgICAgPyAoKGMpID09ICcgJyB8fCAo YykgPT0gJ1x0JykgICAgICBcCisgICAgICAgICAgICAgICAgICAgICA6IGJsYW5rcCAoYykpCisK ICMgZGVmaW5lIElTR1JBUEgoYykgKFNJTkdMRV9CWVRFX0NIQVJfUCAoYykJCQkJXAogCQkgICAg ID8gKGMpID4gJyAnICYmICEoKGMpID49IDAxNzcgJiYgKGMpIDw9IDAyNDApCVwKIAkJICAgICA6 IGdyYXBoaWNwIChjKSkKQEAgLTE3OTAsNiArMTc5MSw3IEBAIHN0cnVjdCByYW5nZV90YWJsZV93 b3JrX2FyZWEKICNkZWZpbmUgQklUX0FMTlVNCTB4ODAKICNkZWZpbmUgQklUX0dSQVBICTB4MTAw 
CiAjZGVmaW5lIEJJVF9QUklOVAkweDIwMAorI2RlZmluZSBCSVRfQkxBTksgICAgICAgMHg0MDAK IAwKIAogLyogU2V0IHRoZSBiaXQgZm9yIGNoYXJhY3RlciBDIGluIGEgbGlzdC4gICovCkBAIC0y MDY2LDggKzIwNjgsOSBAQCByZV93Y3R5cGVfdG9fYml0IChyZV93Y3R5cGVfdCBjYykKICAgICBj YXNlIFJFQ0NfU1BBQ0U6IHJldHVybiBCSVRfU1BBQ0U7CiAgICAgY2FzZSBSRUNDX0dSQVBIOiBy ZXR1cm4gQklUX0dSQVBIOwogICAgIGNhc2UgUkVDQ19QUklOVDogcmV0dXJuIEJJVF9QUklOVDsK KyAgICBjYXNlIFJFQ0NfQkxBTks6IHJldHVybiBCSVRfQkxBTks7CiAgICAgY2FzZSBSRUNDX0FT Q0lJOiBjYXNlIFJFQ0NfRElHSVQ6IGNhc2UgUkVDQ19YRElHSVQ6IGNhc2UgUkVDQ19DTlRSTDoK LSAgICBjYXNlIFJFQ0NfQkxBTks6IGNhc2UgUkVDQ19VTklCWVRFOiBjYXNlIFJFQ0NfRVJST1I6 IHJldHVybiAwOworICAgIGNhc2UgUkVDQ19VTklCWVRFOiBjYXNlIFJFQ0NfRVJST1I6IHJldHVy biAwOwogICAgIGRlZmF1bHQ6CiAgICAgICBhYm9ydCAoKTsKICAgICB9CkBAIC00NjU4LDYgKzQ2 NjEsNyBAQCBleGVjdXRlX2NoYXJzZXQgKGNvbnN0X3JlX2NoYXIgKipwcCwgdW5zaWduZWQgYywg dW5zaWduZWQgY29yaWcsIGJvb2wgdW5pYnl0ZSkKIAkgIChjbGFzc19iaXRzICYgQklUX0FMTlVN ICYmIElTQUxOVU0gKGMpKSB8fAogCSAgKGNsYXNzX2JpdHMgJiBCSVRfQUxQSEEgJiYgSVNBTFBI QSAoYykpIHx8CiAJICAoY2xhc3NfYml0cyAmIEJJVF9TUEFDRSAmJiBJU1NQQUNFIChjKSkgfHwK KyAgICAgICAgICAoY2xhc3NfYml0cyAmIEJJVF9CTEFOSyAmJiBJU0JMQU5LIChjKSkgfHwKIAkg IChjbGFzc19iaXRzICYgQklUX1dPUkQgICYmIElTV09SRCAgKGMpKSB8fAogCSAgKChjbGFzc19i aXRzICYgQklUX1VQUEVSKSAmJgogCSAgIChJU1VQUEVSIChjKSB8fCAoY29yaWcgIT0gYyAmJgpk aWZmIC0tZ2l0IGEvdGVzdC9saXNwL3N1YnItdGVzdHMuZWwgYi90ZXN0L2xpc3Avc3Vici10ZXN0 cy5lbAppbmRleCAzYzVkYmNkYmQ3Li5hM2IwOGU5Njk3IDEwMDY0NAotLS0gYS90ZXN0L2xpc3Av c3Vici10ZXN0cy5lbAorKysgYi90ZXN0L2xpc3Avc3Vici10ZXN0cy5lbApAQCAtMjcxLDUgKzI3 MSwxNSBAQCBzdWJyLXRlc3QtLWZyYW1lcy0xCiAgIChsZXQgKChmcmFtZS1saXN0cyAoc3Vici10 ZXN0LS1mcmFtZXMtMSAnc3Vici10ZXN0LS1mcmFtZXMtMikpKQogICAgIChzaG91bGQgKGVxdWFs IChjYXIgZnJhbWUtbGlzdHMpIChjZHIgZnJhbWUtbGlzdHMpKSkpKQogCisoZXJ0LWRlZnRlc3Qg c3Vici10ZXN0cy0tc3RyaW5nLW1hdGNoLXAtLWJsYW5rICgpCisgICJUZXN0IHRoYXQgWzpibGFu azpdIG1hdGNoZXMgaG9yaXpvbnRhbCB3aGl0ZXNwYWNlLCBjZi4gQnVnIzI1MzY2LiIKKyAgKHNo 
b3VsZCAoZXF1YWwgKHN0cmluZy1tYXRjaC1wICJcXGBbWzpibGFuazpdXVxcJyIgIiAiKSAwKSkK KyAgKHNob3VsZCAoZXF1YWwgKHN0cmluZy1tYXRjaC1wICJcXGBbWzpibGFuazpdXVxcJyIgIlx0 IikgMCkpCisgIChzaG91bGQtbm90IChzdHJpbmctbWF0Y2gtcCAiXFxgW1s6Ymxhbms6XV1cXCci ICJcbiIpKQorICAoc2hvdWxkLW5vdCAoc3RyaW5nLW1hdGNoLXAgIlxcYFtbOmJsYW5rOl1dXFwn IiAiYSIpKQorICAoc2hvdWxkIChlcXVhbCAoc3RyaW5nLW1hdGNoLXAgIlxcYFtbOmJsYW5rOl1d XFwnIiAiXE57SEFJUiBTUEFDRX0iKSAwKSkKKyAgKHNob3VsZCAoZXF1YWwgKHN0cmluZy1tYXRj aC1wICJcXGBbWzpibGFuazpdXVxcJyIgIlx1MzAwMCIpIDApKQorICAoc2hvdWxkLW5vdCAoc3Ry aW5nLW1hdGNoLXAgIlxcYFtbOmJsYW5rOl1dXFwnIiAiXE57TElORSBTRVBBUkFUT1J9IikpKQor CiAocHJvdmlkZSAnc3Vici10ZXN0cykKIDs7OyBzdWJyLXRlc3RzLmVsIGVuZHMgaGVyZQotLSAK Mi4xMS4wCgo= --94eb2c0c57ae9322a505456e470e-- From debbugs-submit-bounces@debbugs.gnu.org Fri Jan 06 10:11:39 2017 Received: (at 25366) by debbugs.gnu.org; 6 Jan 2017 15:11:39 +0000 Received: from localhost ([127.0.0.1]:45270 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cPWB9-0005AZ-6x for submit@debbugs.gnu.org; Fri, 06 Jan 2017 10:11:39 -0500 Received: from eggs.gnu.org ([208.118.235.92]:46370) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cPWB8-0005AM-DL for 25366@debbugs.gnu.org; Fri, 06 Jan 2017 10:11:38 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cPWB0-00089u-1H for 25366@debbugs.gnu.org; Fri, 06 Jan 2017 10:11:33 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=BAYES_50,RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:59662) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cPWAz-00089h-Tj; Fri, 06 Jan 2017 10:11:29 -0500 Received: from 84.94.185.246.cable.012.net.il ([84.94.185.246]:1028 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 
1cPWAz-0004Vg-6j; Fri, 06 Jan 2017 10:11:29 -0500 Date: Fri, 06 Jan 2017 17:11:48 +0200 Message-Id: <83bmvkcjez.fsf@gnu.org> From: Eli Zaretskii To: Philipp Stephani In-reply-to: (message from Philipp Stephani on Fri, 06 Jan 2017 15:00:22 +0000) Subject: Re: bug#25366: 26.0.50; [:blank:] character class should match all Unicode horizontal whitespace References: <838tqpecaq.fsf@gnu.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -8.2 (--------) X-Debbugs-Envelope-To: 25366 Cc: 25366@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -8.2 (--------)

> From: Philipp Stephani
> Date: Fri, 06 Jan 2017 15:00:22 +0000
> Cc: 25366@debbugs.gnu.org
>
>  http://www.unicode.org/reports/tr18/tr18-19.html#Compatibility_Properties
>
>  Patches to that effect are welcome.
>
> Here's a patch.

Thanks.  A few minor comments below.

> +/* Return true if C is a horizontal whitespace character, as defined
> +   by http://www.unicode.org/reports/tr18/tr18-19.html#blank.  */
> +bool
> +blankp (int c)
> +{
> +  if (c == '\t')
> +    return true;

Why does this test explicitly only for a TAB?  What about SPC, for
example?

> --- a/doc/lispref/searching.texi
> +++ b/doc/lispref/searching.texi
> @@ -553,7 +553,10 @@ Char Classes
>  (@pxref{Character Properties}) indicates they are alphabetic
>  characters.
>  @item [:blank:]
> -This matches space and tab only.
> +This matches horizontal whitespace, as defined by Unicode Technical
> +Standard #18.  In particular, it matches tabs and characters whose
> +Unicode @samp{general-category} property (@pxref{Character
> +Properties}) indicates they are spacing separators.

Similarly here: I find the lack of reference to a space potentially
confusing.

> +** The regular expression character class [:blank:] now matches
> +Unicode horizontal whitespace as defined in
> +http://www.unicode.org/reports/tr18/tr18-19.html#blank.

The reference to a particular version of UTS#18 might become obsolete
when a new version is released.  So I suggest to provide a general
reference to the report and its section, not an exact URL.

From debbugs-submit-bounces@debbugs.gnu.org Fri Jan 06 14:11:16 2017 Received: (at 25366) by debbugs.gnu.org; 6 Jan 2017 19:11:16 +0000 Received: from localhost ([127.0.0.1]:45407 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cPZv2-0000xx-7g for submit@debbugs.gnu.org; Fri, 06 Jan 2017 14:11:16 -0500 Received: from mail-oi0-f47.google.com ([209.85.218.47]:33576) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cPZv0-0000xl-2N for 25366@debbugs.gnu.org; Fri, 06 Jan 2017 14:11:14 -0500 Received: by mail-oi0-f47.google.com with SMTP id 128so431167650oig.0 for <25366@debbugs.gnu.org>; Fri, 06 Jan 2017 11:11:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=v0tOk71aO0N1nUMY+Cu5R/FcOdOrwyrvs8xsODMABUI=; b=HkQscL+fUwnv+FVTKp1U2Y+ffX42XKiWWgV6HDKRhnPPu95LG8qGt0PYhhvCxtJ9g4 Rv9fRH08LyLZR6gvOdynUULB56qQMUyMHfuwzmt5aoLZWSjLlNqbr0pSsrN5anR4dq19 WeyUaugh7EysnMTY4eYlQUkwJyxi3ABx4dsqKBRyIC1HDFrvbJC1GiuQgejM163SMhV8 fnfDB814ZRYJXrHeb+WkFg+eRnZNqhrDO7U/pz4OOQBzFHFmHTp+hKq4psky3649tjZ9 6C3JP+wtrhXSPqaSrXf6aaiblsR4KLHEmqdA95IyHMWiZSc6jIE0YiEf5vVqjSxZJDkh NBOA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=v0tOk71aO0N1nUMY+Cu5R/FcOdOrwyrvs8xsODMABUI=; b=sHXPFe/C+msxS04e2N2U4jrT/TykzJo68lbmmI/xVmbKaGPgTC5vNa7uNzAEBYik67 
C/YG8ght6Dt/utA784/NwzbNA61c8j4tbqwCVUmwMY1tMZp6eFlJqnM6nsuyAmtocEgc oqiCoKdfQ1Uvn4eNmPVOMPLyxYV9zgBr69qNmtjWW0CaRgoVFnM/bA55T6TzNB3y3aFs r28X3XM7f/gCi+65pjw09z7TQ49c10sqSyOegmHRKbvGlmsDMsJVkYbZzhRxPfCE3UKR wqXBjW3t2wHUL2NXQHOEFTiIfQLxyXenBT9wGQ5hDfIKwF6ViERgNj+lNONpDl2ldfwl gYsQ== X-Gm-Message-State: AIkVDXIAMJjX5bUXUFlnqsmNb2NB+70qrs0bn0rBLkcWa1yyDBXbZOIwWyx0szfmdpB5hYE/pGM1cArN6xooXw== X-Received: by 10.157.17.212 with SMTP id y20mr1748549oty.230.1483729868346; Fri, 06 Jan 2017 11:11:08 -0800 (PST) MIME-Version: 1.0 References: <838tqpecaq.fsf@gnu.org> <83bmvkcjez.fsf@gnu.org> In-Reply-To: <83bmvkcjez.fsf@gnu.org> From: Philipp Stephani Date: Fri, 06 Jan 2017 19:10:57 +0000 Message-ID: Subject: Re: bug#25366: 26.0.50; [:blank:] character class should match all Unicode horizontal whitespace To: Eli Zaretskii Content-Type: multipart/alternative; boundary=94eb2c18ff8abed9c5054571c723 X-Spam-Score: -0.4 (/) X-Debbugs-Envelope-To: 25366 Cc: 25366@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.4 (/) --94eb2c18ff8abed9c5054571c723 Content-Type: text/plain; charset=UTF-8 Eli Zaretskii schrieb am Fr., 6. Jan. 2017 um 16:11 Uhr: > > From: Philipp Stephani > > Date: Fri, 06 Jan 2017 15:00:22 +0000 > > Cc: 25366@debbugs.gnu.org > > > > > http://www.unicode.org/reports/tr18/tr18-19.html#Compatibility_Properties > > > > Patches to that effect are welcome. > > > > Here's a patch. > > Thanks. A few minor comments below. > > > +/* Return true if C is a horizontal whitespace character, as defined > > + by http://www.unicode.org/reports/tr18/tr18-19.html#blank. */ > > +bool > > +blankp (int c) > > +{ > > + if (c == '\t') > > + return true; > > Why does this test explicitly only for a TAB? What about SPC, for > example? 
>

Because TAB is the only character that is blank, but doesn't have the
general category Zs.
I've now also included space and added a comment. The risk that the
general category of space will ever be changed seems very small.

>
> > --- a/doc/lispref/searching.texi
> > +++ b/doc/lispref/searching.texi
> > @@ -553,7 +553,10 @@ Char Classes
> >  (@pxref{Character Properties}) indicates they are alphabetic
> >  characters.
> >  @item [:blank:]
> > -This matches space and tab only.
> > +This matches horizontal whitespace, as defined by Unicode Technical
> > +Standard #18.  In particular, it matches tabs and characters whose
> > +Unicode @samp{general-category} property (@pxref{Character
> > +Properties}) indicates they are spacing separators.
>
> Similarly here: I find the lack of reference to a space potentially
> confusing.
>

Added.

>
> > +** The regular expression character class [:blank:] now matches
> > +Unicode horizontal whitespace as defined in
> > +http://www.unicode.org/reports/tr18/tr18-19.html#blank.
>
> The reference to a particular version of UTS#18 might become obsolete
> when a new version is released.  So I suggest to provide a general
> reference to the report and its section, not an exact URL.
>

Done.

--94eb2c18ff8abed9c5054571c723
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable


--94eb2c18ff8abed9c5054571c723-- From debbugs-submit-bounces@debbugs.gnu.org Fri Jan 06 14:21:23 2017 Received: (at 25366-done) by debbugs.gnu.org; 6 Jan 2017 19:21:23 +0000 Received: from localhost ([127.0.0.1]:45413 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cPa4p-0001DR-At for submit@debbugs.gnu.org; Fri, 06 Jan 2017 14:21:23 -0500 Received: from mail-oi0-f54.google.com ([209.85.218.54]:32920) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1cPa4o-0001DD-6z for 25366-done@debbugs.gnu.org; Fri, 06 Jan 2017 14:21:22 -0500 Received: by mail-oi0-f54.google.com with SMTP id 128so431472243oig.0 for <25366-done@debbugs.gnu.org>; Fri, 06 Jan 2017 11:21:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=gTlP+i3pcqgLDzVnKWcodyQ9nfeD8DtBUJXR6925rQg=; b=S6SpmJ0J8aR503TIbbFN5NgYHD4uHsu21ERf2oWto5Dn1ew8YURcaEfm5/KMnUdpcw CcrkHDlFRiDrEe20nnIm6WsDsx3Ooy5O6XUp0n91G4x6U6TcgH+zeKECkM5tVo3mjjow lXzWhudfg+YOH/CkCsJSPXCOgDc2CKJdkRKLBiyizSP5PA29Y4i2D8GBdfTIEzQzEhdi esGhhAX0nIV0FWaPEkE1QYYCxPUJMou6ikKgNGMaFYARjfOukc+6bpcb8SLkPXIc3G9M Zn6UpOtK+CE8qTd5hxcmY/XCOTrX7tDZwxysfYEbqWQsTL8Bm5LGR7640jiB2NgZlTf0 iuLw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=gTlP+i3pcqgLDzVnKWcodyQ9nfeD8DtBUJXR6925rQg=; b=MlE0wjUut+vmDz4sXn8pbibqEaxZO5H8jwA2L2kXWP7jI+/ZQa5xETDccoZ99kozhx l86msfXz6bcTn08/6iLSJu3tdeVc5BjhGm5C/tPlZJa9P1japoSA+MsZmSttlEnZDZN3 ByKK7N2g1iFQcEGrwYzMrg5PeomAFXNTfXQkToiPNKVQpnRr05QYOK/u3EVi6UUHYn32 YUYmkdOYXPIqp/106SW3U/BTSiPBK6lJuskejwwhSdC418oIU7OAT0UEFJnLef0DVWZG nNbdLjfMoov3OYuIrB4p3WWXDHfgLCNKc6Hz30fAa855s1m5QeUxCfMUgrlAvGX7wNqn Td+g== X-Gm-Message-State: AIkVDXJVpv4F1WV5r6qFxDioNECFwn+N6RlRwsfF/9xGCgJrRuECEgY8pGT3l/9dUhqs3ALezRIDEn6LUsPgdQ== X-Received: by 
10.157.40.121 with SMTP id h54mr1612460otd.179.1483730476417; Fri, 06 Jan 2017 11:21:16 -0800 (PST) MIME-Version: 1.0 References: <838tqpecaq.fsf@gnu.org> <83bmvkcjez.fsf@gnu.org> In-Reply-To: From: Philipp Stephani Date: Fri, 06 Jan 2017 19:21:05 +0000 Message-ID: Subject: Re: bug#25366: 26.0.50; [:blank:] character class should match all Unicode horizontal whitespace To: Eli Zaretskii Content-Type: multipart/alternative; boundary=001a1142d768fd46f0054571eb0f X-Spam-Score: 0.7 (/) X-Debbugs-Envelope-To: 25366-done Cc: 25366-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.7 (/) --001a1142d768fd46f0054571eb0f Content-Type: text/plain; charset=UTF-8 Philipp Stephani schrieb am Fr., 6. Jan. 2017 um 20:10 Uhr: > Eli Zaretskii schrieb am Fr., 6. Jan. 2017 um 16:11 Uhr: > > > From: Philipp Stephani > > Date: Fri, 06 Jan 2017 15:00:22 +0000 > > Cc: 25366@debbugs.gnu.org > > > > > http://www.unicode.org/reports/tr18/tr18-19.html#Compatibility_Properties > > > > Patches to that effect are welcome. > > > > Here's a patch. > > Thanks. A few minor comments below. > > > +/* Return true if C is a horizontal whitespace character, as defined > > + by http://www.unicode.org/reports/tr18/tr18-19.html#blank. */ > > +bool > > +blankp (int c) > > +{ > > + if (c == '\t') > > + return true; > > Why does this test explicitly only for a TAB? What about SPC, for > example? > > > Because TAB is the only character that is blank, but doesn't have the > general category Zs. > I've now also included space and added a comment. The risk that the > general category of space will ever be changed seems very small. 
> > > > > --- a/doc/lispref/searching.texi > > +++ b/doc/lispref/searching.texi > > @@ -553,7 +553,10 @@ Char Classes > > (@pxref{Character Properties}) indicates they are alphabetic > > characters. > > @item [:blank:] > > -This matches space and tab only. > > +This matches horizontal whitespace, as defined by Unicode Technical > > +Standard #18. In particular, it matches tabs and characters whose > > +Unicode @samp{general-category} property (@pxref{Character > > +Properties}) indicates they are spacing separators. > > Similarly here: I find the lack of reference to a space potentially > confusing. > > > Added. > > > > > +** The regular expression character class [:blank:] now matches > > +Unicode horizontal whitespace as defined in > > +http://www.unicode.org/reports/tr18/tr18-19.html#blank. > > The reference to a particular version of UTS#18 might become obsolete > when a new version is released. So I suggest to provide a general > reference to the report and its section, not an exact URL. > > > Done. > Pushed to master as 512e9886be. --001a1142d768fd46f0054571eb0f Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable


--001a1142d768fd46f0054571eb0f-- From unknown Sat Aug 16 10:47:36 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Sat, 04 Feb 2017 12:24:04 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator
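
Editorial sketch of the check discussed in this thread: TAB is tested
explicitly because it is blank under UTS #18 without having general
category Zs, SPC is tested explicitly as the reply describes, and every
other character is decided by its general category. This is not the code
pushed in 512e9886be. The names blankp_sketch and is_space_separator are
hypothetical, and the hard-coded Zs table is an assumption that stands in
for a real general-category lookup.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a general-category lookup: return true if
   C has Unicode general category Zs (space separator).  The table is
   hard-coded for illustration only; a real implementation would
   consult the Unicode character database instead.  */
static bool
is_space_separator (int c)
{
  static const int zs[] = {
    0x0020, 0x00A0, 0x1680, 0x202F, 0x205F, 0x3000
  };

  /* U+2000..U+200A (EN QUAD through HAIR SPACE) are all Zs.  */
  if (c >= 0x2000 && c <= 0x200A)
    return true;

  for (size_t i = 0; i < sizeof zs / sizeof zs[0]; i++)
    if (c == zs[i])
      return true;
  return false;
}

/* Sketch: return true if C is horizontal whitespace in the UTS #18
   sense, i.e. TAB or a character of general category Zs.  */
bool
blankp_sketch (int c)
{
  /* TAB is blank, but its general category is Cc, not Zs.  */
  if (c == '\t')
    return true;

  /* SPC is already Zs; the explicit test only makes the intent
     visible to the reader.  */
  if (c == ' ')
    return true;

  return is_space_separator (c);
}
```

Under these assumptions, blankp_sketch accepts TAB, SPC, NO-BREAK SPACE
(U+00A0) and IDEOGRAPHIC SPACE (U+3000), and rejects newline and
ZERO WIDTH SPACE (U+200B), whose general category is Cf rather than Zs.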