From unknown Mon Sep 08 01:50:27 2025
Subject: bug#39483: 27.0.60; ispell ignores syntax/category tables word boundaries
Resent-From: "Paul W. Rankin"
Resent-Date: Fri, 07 Feb 2020 15:46:01 +0000
X-GNU-PR-Message: report 39483
X-GNU-PR-Package: emacs
To: 39483@debbugs.gnu.org
X-Debbugs-Original-To: bug-gnu-emacs@gnu.org
User-agent: mu4e 1.2.0; emacs 27.0.60
From: "Paul W. Rankin"
Date: Sat, 08 Feb 2020 01:44:52 +1000
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8

Hello,

It appears that the function `ispell-get-word' makes its own judgements
on word boundaries, ignoring the buffer's syntax tables and character
categories. This becomes a problem with using `electric-quote-mode' and
ispell, because contractions are parsed as separate words. e.g. Calling
`ispell-word' for "doesn’t" returns:

    T is correct

To reproduce:

1. emacs -Q
2. (in *scratch*) M-x text-mode RET
3. enter text "doesn’t" (i.e. "doesn" C-x 8 ] "t")
4. M-: (modify-syntax-entry ?’ "w")
5. M-: (modify-category-entry ?’ ?^)
6. M-$ | ispell-word

Expected results:

Given the above syntax and category tables, M-f | forward-word and M-b |
backward-word now consider "doesn’t" as a single word, and so it should
be passed to the `ispell-program-name' and produce the same result
as when checked on the command line:

    % echo "doesn’t" | aspell -a
    @(#) International Ispell Version 3.1.20 (but really Aspell 0.60.8)
    *

    % echo "doesn’t" | enchant-2 -a
    @(#) International Ispell Version 3.1.20 (but really Enchant 2.2.7)
    *

Actual results:

The word "doesn’t" is parsed as "t":

    T is correct

Attempts at workarounds:

I've tried altering slot 3 of the corresponding `ispell-dictionary-base-alist'
entries from "[']" to "['’]" to no avail.

Setup:

GNU Emacs 27.0.60 (build 2, x86_64-apple-darwin19.3.0, NS appkit-1894.30
Version 10.15.3 (Build 19D76)) of 2020-02-05

-- 
Paul W. Rankin
https://www.paulwrankin.com

From unknown Mon Sep 08 01:50:27 2025
Subject: bug#39483: 27.0.60; ispell ignores syntax/category tables word boundaries
Resent-From: Eli Zaretskii
Resent-Date: Fri, 07 Feb 2020 18:25:01 +0000
X-GNU-PR-Message: followup 39483
X-GNU-PR-Package: emacs
To: "Paul W.
 Rankin"
Cc: 39483@debbugs.gnu.org
Date: Fri, 07 Feb 2020 20:23:33 +0200
Message-Id: <83r1z6dzoa.fsf@gnu.org>
From: Eli Zaretskii
In-reply-to: (hello@paulwrankin.com)
MIME-version: 1.0
Content-type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

> From: "Paul W. Rankin"
> Date: Sat, 08 Feb 2020 01:44:52 +1000
>
> It appears that the function `ispell-get-word' makes its own judgements
> on word boundaries, ignoring the buffer's syntax tables and character
> categories.

That is true. And I don't really see how it can be any different,
since ispell.el must have the same notion of a word as the underlying
dictionary, otherwise you will have false positives and/or false
negatives, right? ispell.el looks up the word characters and non-word
characters in its database, and the doc string of
ispell-dictionary-base-alist explains how.

> This becomes a problem with using `electric-quote-mode' and
> ispell, because contractions are parsed as separate words. e.g. Calling
> ispell word for "doesn’t" returns:
>
> T is correct
>
> To reproduce:
>
> 1. emacs -Q
> 2. (in *scratch*) M-x text-mode RET
> 3. enter text "doesn’t" (i.e. "doesn" C-x 8 ] "t")
> 4. M-: (modify-syntax-entry ?’ "w")
> 5. M-: (modify-category-entry ?’ ?^)
> 6. M-$ | ispell-word

The buffer syntax table has no effect on ispell.el, and shouldn't have
any effect on it.

> Attempts at workarounds:
>
> I've tried altering slot 3 of the corresponding `ispell-dictionary-base-alist'
> entries from "[']" to "['’]" to no avail.

That's the right direction, but you didn't follow it far enough.
First, ispell-dictionary-base-alist is the default value, and is used
to produce ispell-dictionary-alist, which is one you should change
(alternatively, customize ispell-local-dictionary-alist). More
importantly, the definitions of each dictionary include more than just
one character set: there are 3 character sets there and one parameter
for encoding the string passed to the spell-checker, and you should be
sure to set them all as appropriate for the dictionary you use.

My suggestion is to step with Edebug through ispell-get-word and see
why it doesn't consider "doesn’t" as a single word in your case.

> Setup:
>
> GNU Emacs 27.0.60 (build 2, x86_64-apple-darwin19.3.0, NS appkit-1894.30
> Version 10.15.3 (Build 19D76)) of 2020-02-05

This omits crucial information, like the dictionary in use and the
locale-dependent settings that affect encoding.

(In any case, I don't think this list is the right place for discussing
this issue.)

From unknown Mon Sep 08 01:50:27 2025
Subject: bug#39483: 27.0.60; ispell ignores syntax/category tables word boundaries
Resent-From: "Paul W. Rankin"
Resent-Date: Sat, 08 Feb 2020 05:48:01 +0000
X-GNU-PR-Message: followup 39483
X-GNU-PR-Package: emacs
To: Eli Zaretskii
Cc: 39483@debbugs.gnu.org
References: <83r1z6dzoa.fsf@gnu.org>
User-agent: mu4e 1.2.0; emacs 27.0.60
From: "Paul W.
 Rankin"
In-reply-to: <83r1z6dzoa.fsf@gnu.org>
Date: Sat, 08 Feb 2020 15:47:27 +1000
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8

On Sat, Feb 08 2020, Eli Zaretskii wrote:

>> From: "Paul W. Rankin"
>> Attempts at workarounds:
>>
>> I've tried altering slot 3 of the corresponding `ispell-dictionary-base-alist'
>> entries from "[']" to "['’]" to no avail.
>
> That's the right direction, but you didn't follow it far enough.
> First, ispell-dictionary-base-alist is the default value, and is used
> to produce ispell-dictionary-alist, which is one you should change
> (alternatively, customize ispell-local-dictionary-alist).

Thanks, that got it. I'd discussed this on #emacs IRC and the consensus
was to report. Led astray!!

From unknown Mon Sep 08 01:50:27 2025
Subject: bug#39483: 27.0.60; ispell ignores syntax/category tables word boundaries
Resent-From: Eli Zaretskii
Resent-Date: Sat, 08 Feb 2020 08:19:01 +0000
X-GNU-PR-Message: followup 39483
X-GNU-PR-Package: emacs
To: "Paul W. Rankin"
Cc: 39483@debbugs.gnu.org
Date: Sat, 08 Feb 2020 10:18:20 +0200
Message-Id: <83h801eblf.fsf@gnu.org>
From: Eli Zaretskii
In-reply-to: (hello@paulwrankin.com)
References: <83r1z6dzoa.fsf@gnu.org>
MIME-version: 1.0
Content-type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

> From: "Paul W. Rankin"
> Cc: 39483@debbugs.gnu.org
> Date: Sat, 08 Feb 2020 15:47:27 +1000
>
> On Sat, Feb 08 2020, Eli Zaretskii wrote:
>
> >> From: "Paul W. Rankin"
> >> Attempts at workarounds:
> >>
> >> I've tried altering slot 3 of the corresponding `ispell-dictionary-base-alist'
> >> entries from "[']" to "['’]" to no avail.
> >
> > That's the right direction, but you didn't follow it far enough.
> > First, ispell-dictionary-base-alist is the default value, and is used
> > to produce ispell-dictionary-alist, which is one you should change
> > (alternatively, customize ispell-local-dictionary-alist).
>
> Thanks, that got it.

I'd be interested to see your solution in full, for the record.

Thanks.

From unknown Mon Sep 08 01:50:27 2025
Subject: bug#39483: 27.0.60; ispell ignores syntax/category tables word boundaries
Resent-From: "Paul W. Rankin"
Resent-Date: Sat, 08 Feb 2020 09:29:02 +0000
X-GNU-PR-Message: followup 39483
X-GNU-PR-Package: emacs
To: Eli Zaretskii
Cc: 39483@debbugs.gnu.org
References: <83r1z6dzoa.fsf@gnu.org> <83h801eblf.fsf@gnu.org>
User-agent: mu4e 1.2.0; emacs 27.0.60
From: "Paul W.
 Rankin"
In-reply-to: <83h801eblf.fsf@gnu.org>
Date: Sat, 08 Feb 2020 19:28:40 +1000
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8

On Sat, Feb 08 2020, Eli Zaretskii wrote:

>> >> From: "Paul W. Rankin"
>> >> Attempts at workarounds:
>> >>
>> >> I've tried altering slot 3 of the corresponding `ispell-dictionary-base-alist'
>> >> entries from "[']" to "['’]" to no avail.
>> >
>> > That's the right direction, but you didn't follow it far enough.
>> > First, ispell-dictionary-base-alist is the default value, and is used
>> > to produce ispell-dictionary-alist, which is one you should change
>> > (alternatively, customize ispell-local-dictionary-alist).
>>
>> Thanks, that got it.
>
> I'd be interested to see your solution in full, for the record.

I went down the wrong path with syntax tables when I saw M-f/M-b was
stepping through the word like

    doesn|’|t|

so I figured it was about word boundaries. Searching through the manual
I couldn't find anything in "(emacs) Quotation Marks" or "(emacs)
Spelling" but found the references to syntax tables regarding word
boundaries in "(elisp) Word Motion".

As it turns out it was just a case of customising
ispell-local-dictionary-alist and adding both a default and "en_US"
entry with OTHERCHARS regexp as "['’]" pretty much exactly as the
docstring on ispell-dictionary-alist says.

From unknown Mon Sep 08 01:50:27 2025
MIME-Version: 1.0
X-Mailer: MIME-tools 5.505 (Entity 5.505)
From: help-debbugs@gnu.org (GNU bug Tracking System)
To: "Paul W. Rankin"
Subject: bug#39483: closed (Re: bug#39483: 27.0.60; ispell ignores syntax/category tables word boundaries)
References: <838slde6la.fsf@gnu.org>
X-Gnu-PR-Message: they-closed 39483
X-Gnu-PR-Package: emacs
Reply-To: 39483@debbugs.gnu.org
Date: Sat, 08 Feb 2020 10:07:02 +0000
Content-Type: multipart/mixed; boundary="----------=_1581156422-16319-1"

This is a multi-part message in MIME format...

------------=_1581156422-16319-1
Content-Disposition: inline
Content-Type: text/plain; charset="utf-8"

Your bug report

#39483: 27.0.60; ispell ignores syntax/category tables word boundaries

which was filed against the emacs package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 39483@debbugs.gnu.org.

-- 
39483: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=39483
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems

------------=_1581156422-16319-1
Content-Type: message/rfc822
Content-Disposition: inline
Content-Transfer-Encoding: 7bit

Date: Sat, 08 Feb 2020 12:06:25 +0200
Message-Id: <838slde6la.fsf@gnu.org>
From: Eli Zaretskii
To: "Paul W. Rankin"
In-reply-to: (hello@paulwrankin.com)
Subject: Re: bug#39483: 27.0.60; ispell ignores syntax/category tables word boundaries
References: <83r1z6dzoa.fsf@gnu.org> <83h801eblf.fsf@gnu.org>
MIME-version: 1.0
Content-type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Cc: 39483-done@debbugs.gnu.org

> From: "Paul W. Rankin"
> Cc: 39483@debbugs.gnu.org
> Date: Sat, 08 Feb 2020 19:28:40 +1000
>
> As it turns out it was just a case of customising
> ispell-local-dictionary-alist and adding both a default and "en_US"
> entry with OTHERCHARS regexp as "['’]" pretty much exactly as the
> docstring on ispell-dictionary-alist says.

OK, thanks. With that, I'm closing the bug report.
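
For the record, the customisation described above (a default entry and
an "en_US" entry in ispell-local-dictionary-alist, each with OTHERCHARS
set to "['’]") might look roughly like the sketch below. The exact
entries were not posted in this thread, so everything except the
OTHERCHARS regexp is an assumption: the CASECHARS and NOT-CASECHARS
regexps follow the stock default entry in ispell-dictionary-base-alist,
and the ISPELL-ARGS values and the utf-8 coding system are guesses that
should be adjusted to the dictionary and spell checker actually in use.

```elisp
;; Sketch only, under the assumptions stated above.
;; Entry layout, per the ispell-dictionary-alist docstring:
;;   (DICTIONARY-NAME CASECHARS NOT-CASECHARS OTHERCHARS
;;    MANY-OTHERCHARS-P ISPELL-ARGS EXTENDED-CHARACTER-MODE CHARACTER-SET)
(setq ispell-local-dictionary-alist
      '(;; Default dictionary (DICTIONARY-NAME is nil).
        ;; OTHERCHARS accepts both ' and ’ inside a word.
        (nil "[[:alpha:]]" "[^[:alpha:]]" "['’]" nil nil nil utf-8)
        ;; Explicit "en_US" entry; the ("-d" "en_US") arguments are
        ;; an assumption, not taken from the thread.
        ("en_US" "[[:alpha:]]" "[^[:alpha:]]" "['’]" nil
         ("-d" "en_US") nil utf-8)))
```

ispell.el merges these entries into ispell-dictionary-alist when it
sets up the spell checker, so the safest place for such a form is
probably the init file, before ispell is first used in the session. The
buffer's syntax and category tables do not need to be touched.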

------------=_1581156422-16319-1--