From: Juri Linkov
To: bug-gnu-emacs@gnu.org
Subject: Lax char-fold search
Organization: LINKOV.NET
Date: Thu, 27 Jun 2019 00:12:08 +0300
Message-ID: <87a7e4rq1z.fsf@mail.linkov.net>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (x86_64-pc-linux-gnu)

Before finishing bug#35689, I'd like to implement the argument LAX of
char-fold-to-regexp, so e.g.
during isearch when point is at the beginning of the word with
ligature “ﬁx”, typing three letters ‘f’, ‘i’, ‘x’ will keep point
matching on the same word:

[char-fold-to-regexp-lax.patch (text/x-diff, inline)]

diff --git a/lisp/char-fold.el b/lisp/char-fold.el
index 7223ecf738..9d3ea17b41 100644
--- a/lisp/char-fold.el
+++ b/lisp/char-fold.el
@@ -148,12 +148,18 @@ char-fold--make-space-string
                          (make-list n (or (aref char-fold-table ?\s) " ")))))
 
 ;;;###autoload
-(defun char-fold-to-regexp (string &optional _lax from)
+(defun char-fold-to-regexp (string &optional lax from)
   "Return a regexp matching anything that char-folds into STRING.
 Any character in STRING that has an entry in
 `char-fold-table' is replaced with that entry (which is a
 regexp) and other characters are `regexp-quote'd.
 
+When LAX is non-nil, then the final character also matches ligatures
+partially, for instance, the search string \"f\" will match \"ﬁ\",
+so when typing the search string in isearch while the cursor is on
+a ligature, the search won't try to immediately advance to the next
+complete match, but will stay on the partially matched ligature.
+
 If the resulting regexp would be too long for Emacs to handle,
 just return the result of calling `regexp-quote' on STRING.
 
@@ -183,7 +189,11 @@ char-fold-to-regexp
                                ;; Long string.  The regexp would probably be too long.
                                (alist (unless (> end 50)
                                         (aref multi-char-table c))))
-                          (push (let ((matched-entries nil)
+                          (push (if (and lax alist (= (1+ i) end))
+                                    (concat "\\(?:" regexp "\\|"
+                                            (mapconcat (lambda (entry)
+                                                         (cdr entry)) alist "\\|") "\\)")
+                                  (let ((matched-entries nil)
                                       (max-length 0))
                                   (dolist (entry alist)
                                     (let* ((suffix (car entry))
@@ -212,7 +222,7 @@ char-fold-to-regexp
                                                       (concat suffix-regexp
                                                               (char-fold-to-regexp subs nil length))))
                                                   `((0 . ,regexp) . ,matched-entries) "\\|")
-                                          "\\)"))))
+                                          "\\)")))))
                                 out))))
          (setq i (1+ i)))
        (when (> spaces 0)
diff --git a/test/lisp/char-fold-tests.el b/test/lisp/char-fold-tests.el
index 3fde312a13..e9dfd2b733 100644
--- a/test/lisp/char-fold-tests.el
+++ b/test/lisp/char-fold-tests.el
@@ -82,6 +82,14 @@ char-fold--test-search-with-contents
         (set-char-table-extra-slot char-fold-table 0 multi)
         (char-fold--test-match-exactly (car it) (cdr it)))))
 
+(ert-deftest char-fold--test-multi-lax ()
+  (dolist (it '(("f" . "ﬁ") ("f" . "ﬀ")))
+    (with-temp-buffer
+      (insert (cdr it))
+      (goto-char (point-min))
+      (should (search-forward-regexp
+               (char-fold-to-regexp (car it) 'lax) nil 'noerror)))))
+
 (ert-deftest char-fold--test-fold-to-regexp ()
   (let ((char-fold-table (make-char-table 'char-fold-table))
         (multi (make-char-table 'char-fold-table)))


From: Juri Linkov
To: control@debbugs.gnu.org
Subject: Re: bug#36398: Lax char-fold search
Organization: LINKOV.NET
References: <87a7e4rq1z.fsf@mail.linkov.net>
Date: Thu, 04 Jul 2019 23:49:45 +0300
In-Reply-To: <87a7e4rq1z.fsf@mail.linkov.net> (Juri Linkov's message of
 "Thu, 27 Jun 2019 00:12:08 +0300")
Message-ID: <877e8xy17a.fsf@mail.linkov.net>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (x86_64-pc-linux-gnu)

severity 36398 wishlist
tags 36398 + patch
close 36398 27.0.50


To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Fri, 02 Aug 2019 11:24:04 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator