From unknown Mon Aug 18 14:19:41 2025
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Mailer: MIME-tools 5.509 (Entity 5.509)
Content-Type: text/plain; charset=utf-8
From: bug#23019 <23019@debbugs.gnu.org>
To: bug#23019 <23019@debbugs.gnu.org>
Subject: Status: parse-partial-sexp doesn't output the full state needed
 for its continuance.
Reply-To: bug#23019 <23019@debbugs.gnu.org>
Date: Mon, 18 Aug 2025 21:19:41 +0000

retitle 23019 parse-partial-sexp doesn't output the full state needed for its continuance.
reassign 23019 emacs
submitter 23019 Alan Mackenzie <acm@muc.de>
severity 23019 normal

thanks


From debbugs-submit-bounces@debbugs.gnu.org Tue Mar 15 05:11:31 2016
Received: (at submit) by debbugs.gnu.org; 15 Mar 2016 09:11:31 +0000
Received: from localhost ([127.0.0.1]:48515 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1afl0l-0003AG-5z
	for submit@debbugs.gnu.org; Tue, 15 Mar 2016 05:11:31 -0400
Received: from eggs.gnu.org ([208.118.235.92]:46028)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <acm@muc.de>) id 1afl0j-00039y-PG
 for submit@debbugs.gnu.org; Tue, 15 Mar 2016 05:11:29 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <acm@muc.de>) id 1afl0d-0001Kr-SO
 for submit@debbugs.gnu.org; Tue, 15 Mar 2016 05:11:24 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled
 version=3.3.2
Received: from lists.gnu.org ([2001:4830:134:3::11]:56957)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <acm@muc.de>)
 id 1afl0d-0001Kn-Oq
 for submit@debbugs.gnu.org; Tue, 15 Mar 2016 05:11:23 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:36979)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from <acm@muc.de>)
 id 1afl0Y-0006nC-PY
 for bug-gnu-emacs@gnu.org; Tue, 15 Mar 2016 05:11:23 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <acm@muc.de>) id 1afl0V-0001Hl-Bl
 for bug-gnu-emacs@gnu.org; Tue, 15 Mar 2016 05:11:18 -0400
Received: from mail.muc.de ([193.149.48.3]:51332)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <acm@muc.de>)
 id 1afl0V-0001HW-2w
 for bug-gnu-emacs@gnu.org; Tue, 15 Mar 2016 05:11:15 -0400
Received: (qmail 8762 invoked by uid 3782); 15 Mar 2016 09:11:13 -0000
Received: from acm.muc.de (p548A54E8.dip0.t-ipconnect.de [84.138.84.232]) by
 colin.muc.de (tmda-ofmipd) with ESMTP;
 Tue, 15 Mar 2016 10:11:11 +0100
Received: (qmail 3166 invoked by uid 1000); 15 Mar 2016 09:13:55 -0000
Date: Tue, 15 Mar 2016 09:13:55 +0000
To: bug-gnu-emacs@gnu.org
Subject: parse-partial-sexp doesn't output the full state needed for its
 continuance.
Message-ID: <20160315091355.GA2263@acm.fritz.box>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Delivery-Agent: TMDA/1.1.12 (Macallan)
From: Alan Mackenzie <acm@muc.de>
X-Primary-Address: acm@muc.de
X-detected-operating-system: by eggs.gnu.org: FreeBSD 9.x
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -4.4 (----)
X-Debbugs-Envelope-To: submit
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit@debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request@debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
X-Spam-Score: -4.4 (----)

Hello, Emacs.

When parse-partial-sexp finishes a parse, it fails to record whether or
not its end point is just after the first character of a two character
comment starter or ender.  When the resulting state is used as an
argument to resume the parse, p-p-s will be unaware that the comment has
started or ended and produce false results.
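
A minimal sketch of the failure, assuming a toy syntax table with C-style
/* ... */ comments.  The function name demo-ppss-split and the sample text
are made up for illustration.  It compares a single parse with a parse that
is interrupted between the "/" and the "*" and then resumed from the
returned state:

```elisp
(defun demo-ppss-split ()
  "Return (ONE-SHOT . RESUMED), the values of (nth 4 STATE) before \"b\"."
  (let ((table (make-syntax-table)))
    ;; Assumption: C-like two-character comment delimiters.
    (modify-syntax-entry ?/ ". 14" table)
    (modify-syntax-entry ?* ". 23" table)
    (with-temp-buffer
      (set-syntax-table table)
      (insert "a /* b */ c")
      (let* ((split 4)                  ; between "/" and "*"
             (probe 6)                  ; just before "b", inside the comment
             (one-shot (parse-partial-sexp (point-min) probe))
             (first (parse-partial-sexp (point-min) split))
             ;; Resume from SPLIT, passing the earlier state as OLDSTATE.
             (resumed (parse-partial-sexp split probe nil nil first)))
        (cons (nth 4 one-shot) (nth 4 resumed))))))
```

The one-shot parse reports being inside a comment.  If the resumed parse
does not, the state returned at the split point was missing the information
described above; whether it does depends on the Emacs version in use.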

Proposed solution: Add an extra element to the parser state, recording the
syntax of the last character passed over before the end of the parse.
This would be used by parse-partial-sexp to initialise its parse.

Also: the existing element 9 (the list of currently open parens) and the
new element should be explicitly documented in the Elisp manual, together
with a statement that there may be further elements in the parse state
used internally by parse-partial-sexp (for future expansion).
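
For illustration, element 9 can already be read with nth.  A small sketch,
assuming the standard syntax table of a temporary buffer (demo-open-parens
is a hypothetical name):

```elisp
(defun demo-open-parens ()
  "Return element 9 of the parser state at the end of \"(a (b c\"."
  (with-temp-buffer
    (insert "(a (b c")
    ;; Both parens are still open at point-max, so element 9 should
    ;; list their buffer positions, outermost first.
    (nth 9 (parse-partial-sexp (point-min) (point-max)))))
```

Code that relies on this element today is depending on undocumented
behaviour, which is the reason for asking that it be documented.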

-- 
Alan Mackenzie (Nuremberg, Germany).


From debbugs-submit-bounces@debbugs.gnu.org Tue Mar 15 05:34:01 2016
Received: (at submit) by debbugs.gnu.org; 15 Mar 2016 09:34:01 +0000
Received: from localhost ([127.0.0.1]:48522 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1aflMX-0003hO-3T
	for submit@debbugs.gnu.org; Tue, 15 Mar 2016 05:34:01 -0400
Received: from eggs.gnu.org ([208.118.235.92]:52454)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <andreas.roehler@easy-emacs.de>) id 1aflMV-0003h6-E8
 for submit@debbugs.gnu.org; Tue, 15 Mar 2016 05:33:59 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <andreas.roehler@easy-emacs.de>) id 1aflMK-0007uv-Gw
 for submit@debbugs.gnu.org; Tue, 15 Mar 2016 05:33:54 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled
 version=3.3.2
Received: from lists.gnu.org ([2001:4830:134:3::11]:35898)
 by eggs.gnu.org with esmtp (Exim 4.71)
 (envelope-from <andreas.roehler@easy-emacs.de>) id 1aflMK-0007uq-Db
 for submit@debbugs.gnu.org; Tue, 15 Mar 2016 05:33:48 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:43384)
 by lists.gnu.org with esmtp (Exim 4.71)
 (envelope-from <andreas.roehler@easy-emacs.de>) id 1aflMJ-0002KO-ER
 for bug-gnu-emacs@gnu.org; Tue, 15 Mar 2016 05:33:48 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <andreas.roehler@easy-emacs.de>) id 1aflME-0007se-Em
 for bug-gnu-emacs@gnu.org; Tue, 15 Mar 2016 05:33:47 -0400
Received: from mout.kundenserver.de ([212.227.126.134]:61274)
 by eggs.gnu.org with esmtp (Exim 4.71)
 (envelope-from <andreas.roehler@easy-emacs.de>) id 1aflME-0007sV-4w
 for bug-gnu-emacs@gnu.org; Tue, 15 Mar 2016 05:33:42 -0400
Received: from [192.168.178.35] ([77.3.14.174]) by mrelayeu.kundenserver.de
 (mreue004) with ESMTPSA (Nemesis) id 0MSFaB-1aGiDf0xxr-00TYEI; Tue, 15 Mar
 2016 10:33:40 +0100
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
To: bug-gnu-emacs@gnu.org
References: <20160315091355.GA2263@acm.fritz.box>
From: =?UTF-8?Q?Andreas_R=c3=b6hler?= <andreas.roehler@easy-emacs.de>
Message-ID: <56E7D74C.4070805@easy-emacs.de>
Date: Tue, 15 Mar 2016 10:35:08 +0100
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:38.0) Gecko/20100101
 Icedove/38.5.0
MIME-Version: 1.0
In-Reply-To: <20160315091355.GA2263@acm.fritz.box>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -5.0 (-----)
X-Debbugs-Envelope-To: submit
Cc: Alan Mackenzie <acm@muc.de>
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit@debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request@debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
X-Spam-Score: -5.0 (-----)


On 15.03.2016 10:13, Alan Mackenzie wrote:
> Hello, Emacs.
>
> When parse-partial-sexp finishes a parse, it fails to record whether or
> not its end point is just after the first character of a two character
> comment starter or ender.  When the resulting state is used as an
> argument to resume the parse, p-p-s will be unaware that the comment has
> started or ended and produce false results.
>
> Proposed solution: Add an extra element to the parser state, recording the
> syntax of the last character passed over before the end of the parse.
> This would be used by parse-partial-sexp to initialise its parse.
>
> Also: the existing element 9 (the list of currently open parens) and the
> new element should be explicitly documented in the Elisp manual, together
> with a statement that there may be further elements in the parse state
> used internally by parse-partial-sexp (for future expansion).
>

Hi Alan,

a comment start might be composed not just by two characters, but by 
three or more. What then?

Cheers,

Andreas


From debbugs-submit-bounces@debbugs.gnu.org Tue Mar 15 06:12:55 2016
Received: (at submit) by debbugs.gnu.org; 15 Mar 2016 10:12:55 +0000
Received: from localhost ([127.0.0.1]:48544 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1aflyA-0004b1-Og
	for submit@debbugs.gnu.org; Tue, 15 Mar 2016 06:12:54 -0400
Received: from eggs.gnu.org ([208.118.235.92]:37699)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <acm@muc.de>) id 1afly9-0004ak-3A
 for submit@debbugs.gnu.org; Tue, 15 Mar 2016 06:12:53 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <acm@muc.de>) id 1afly3-0008DK-00
 for submit@debbugs.gnu.org; Tue, 15 Mar 2016 06:12:47 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled
 version=3.3.2
Received: from lists.gnu.org ([2001:4830:134:3::11]:47043)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <acm@muc.de>)
 id 1afly2-0008DG-TQ
 for submit@debbugs.gnu.org; Tue, 15 Mar 2016 06:12:46 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:56881)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from <acm@muc.de>)
 id 1afly1-0006bo-Vd
 for bug-gnu-emacs@gnu.org; Tue, 15 Mar 2016 06:12:46 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <acm@muc.de>) id 1aflxy-00088F-LC
 for bug-gnu-emacs@gnu.org; Tue, 15 Mar 2016 06:12:45 -0400
Received: from mail.muc.de ([193.149.48.3]:20852)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <acm@muc.de>)
 id 1aflxy-00087x-CR
 for bug-gnu-emacs@gnu.org; Tue, 15 Mar 2016 06:12:42 -0400
Received: (qmail 21107 invoked by uid 3782); 15 Mar 2016 10:12:41 -0000
Received: from acm.muc.de (p548A54E8.dip0.t-ipconnect.de [84.138.84.232]) by
 colin.muc.de (tmda-ofmipd) with ESMTP;
 Tue, 15 Mar 2016 11:12:38 +0100
Received: (qmail 3502 invoked by uid 1000); 15 Mar 2016 10:15:21 -0000
Date: Tue, 15 Mar 2016 10:15:21 +0000
To: Andreas =?iso-8859-1?Q?R=F6hler?= <andreas.roehler@easy-emacs.de>
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
Message-ID: <20160315101521.GB2263@acm.fritz.box>
References: <20160315091355.GA2263@acm.fritz.box>
 <56E7D74C.4070805@easy-emacs.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <56E7D74C.4070805@easy-emacs.de>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Delivery-Agent: TMDA/1.1.12 (Macallan)
From: Alan Mackenzie <acm@muc.de>
X-Primary-Address: acm@muc.de
X-detected-operating-system: by eggs.gnu.org: FreeBSD 9.x
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -4.4 (----)
X-Debbugs-Envelope-To: submit
Cc: bug-gnu-emacs@gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit@debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request@debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
X-Spam-Score: -4.4 (----)

Hello, Andreas.

On Tue, Mar 15, 2016 at 10:35:08AM +0100, Andreas Röhler wrote:


> On 15.03.2016 10:13, Alan Mackenzie wrote:
> > Hello, Emacs.

> > When parse-partial-sexp finishes a parse, it fails to record whether or
> > not its end point is just after the first character of a two character
> > comment starter or ender.  When the resulting state is used as an
> > argument to resume the parse, p-p-s will be unaware that the comment has
> > started or ended and produce false results.

> > Proposed solution: Add an extra element to the parser state, recording the
> > syntax of the last character passed over before the end of the parse.
> > This would be used by parse-partial-sexp to initialise its parse.

> > Also: the existing element 9 (the list of currently open parens) and the
> > new element should be explicitly documented in the Elisp manual, together
> > with a statement that there may be further elements in the parse state
> > used internally by parse-partial-sexp (for future expansion).


> a comment start might be composed not just by two characters, but by 
> three or more. What then?

We'd have to start thinking about extending parse-partial-sexp, or
invent some workaround.  Maybe.  There must be some languages (?html)
where this is the case.  What is done in these?

> Cheers,

> Andreas

-- 
Alan Mackenzie (Nuremberg, Germany).


From debbugs-submit-bounces@debbugs.gnu.org Tue Mar 15 09:37:42 2016
Received: (at submit) by debbugs.gnu.org; 15 Mar 2016 13:37:42 +0000
Received: from localhost ([127.0.0.1]:48656 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1afpAM-0002p1-Fv
	for submit@debbugs.gnu.org; Tue, 15 Mar 2016 09:37:42 -0400
Received: from eggs.gnu.org ([208.118.235.92]:42899)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <andreas.roehler@easy-emacs.de>) id 1afpAK-0002oo-FD
 for submit@debbugs.gnu.org; Tue, 15 Mar 2016 09:37:40 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <andreas.roehler@easy-emacs.de>) id 1afpAE-0005nO-91
 for submit@debbugs.gnu.org; Tue, 15 Mar 2016 09:37:35 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled
 version=3.3.2
Received: from lists.gnu.org ([2001:4830:134:3::11]:37388)
 by eggs.gnu.org with esmtp (Exim 4.71)
 (envelope-from <andreas.roehler@easy-emacs.de>) id 1afpAE-0005nJ-5z
 for submit@debbugs.gnu.org; Tue, 15 Mar 2016 09:37:34 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:33863)
 by lists.gnu.org with esmtp (Exim 4.71)
 (envelope-from <andreas.roehler@easy-emacs.de>) id 1afpAD-00064o-4y
 for bug-gnu-emacs@gnu.org; Tue, 15 Mar 2016 09:37:34 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <andreas.roehler@easy-emacs.de>) id 1afpA8-0005m9-4q
 for bug-gnu-emacs@gnu.org; Tue, 15 Mar 2016 09:37:33 -0400
Received: from mout.kundenserver.de ([212.227.17.13]:57292)
 by eggs.gnu.org with esmtp (Exim 4.71)
 (envelope-from <andreas.roehler@easy-emacs.de>) id 1afpA7-0005lv-Rs
 for bug-gnu-emacs@gnu.org; Tue, 15 Mar 2016 09:37:28 -0400
Received: from [192.168.178.35] ([77.3.14.174]) by mrelayeu.kundenserver.de
 (mreue104) with ESMTPSA (Nemesis) id 0MJU4Z-1aeSeT0aEN-0033by; Tue, 15 Mar
 2016 14:37:26 +0100
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
To: Alan Mackenzie <acm@muc.de>
References: <20160315091355.GA2263@acm.fritz.box>
 <56E7D74C.4070805@easy-emacs.de> <20160315101521.GB2263@acm.fritz.box>
From: =?UTF-8?Q?Andreas_R=c3=b6hler?= <andreas.roehler@easy-emacs.de>
Message-ID: <56E8106E.7020402@easy-emacs.de>
Date: Tue, 15 Mar 2016 14:38:54 +0100
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:38.0) Gecko/20100101
 Icedove/38.5.0
MIME-Version: 1.0
In-Reply-To: <20160315101521.GB2263@acm.fritz.box>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 8bit
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -5.0 (-----)
X-Debbugs-Envelope-To: submit
Cc: bug-gnu-emacs@gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit@debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request@debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
X-Spam-Score: -5.0 (-----)


On 15.03.2016 11:15, Alan Mackenzie wrote:
> Hello, Andreas.
>
> On Tue, Mar 15, 2016 at 10:35:08AM +0100, Andreas Röhler wrote:
>
>
>> On 15.03.2016 10:13, Alan Mackenzie wrote:
>>> Hello, Emacs.
>>> When parse-partial-sexp finishes a parse, it fails to record whether or
>>> not its end point is just after the first character of a two character
>>> comment starter or ender.  When the resulting state is used as an
>>> argument to resume the parse, p-p-s will be unaware that the comment has
>>> started or ended and produce false results.
>>> Proposed solution: Add an extra element to the parser state, recording the
>>> syntax of the last character passed over before the end of the parse.
>>> This would be used by parse-partial-sexp to initialise its parse.
>>> Also: the existing element 9 (the list of currently open parens) and the
>>> new element should be explicitly documented in the Elisp manual, together
>>> with a statement that there may be further elements in the parse state
>>> used internally by parse-partial-sexp (for future expansion).
>
>> a comment start might be composed not just by two characters, but by
>> three or more. What then?
> We'd have to start thinking about extending parse-partial-sexp, or
> invent some workaround.  Maybe.  There must be some languages (?html)
> where this is the case.  What is done in these?

Could you send me this example use case (or more)? I couldn't find the one
already given, sorry.

I addressed this issue in my generic beg-end.el:

https://github.com/andreas-roehler/werkstatt/blob/master/subroutines/beg-end.el

If the beg-end forms use a start string, check whether the character at
point occurs in that string.
Then check whether the character before it precedes it in the string, and so on.


From debbugs-submit-bounces@debbugs.gnu.org Thu Mar 17 08:58:36 2016
Received: (at 23019) by debbugs.gnu.org; 17 Mar 2016 12:58:36 +0000
Received: from localhost ([127.0.0.1]:50735 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1agXVc-0003zl-5U
	for submit@debbugs.gnu.org; Thu, 17 Mar 2016 08:58:36 -0400
Received: from ironport2-out.teksavvy.com ([206.248.154.181]:25472)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <monnier@iro.umontreal.ca>) id 1agXVZ-0003zX-C1
 for 23019@debbugs.gnu.org; Thu, 17 Mar 2016 08:58:34 -0400
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: A0A3FgA731xV/6jw92hcgxCEAoVVwwsEAgKBPDwRAQEBAQEBAYEKQQWDXQEBAwFWIwULCw4mEhQYDSQuiAkIzyMBAQEHAgEfizqFBQeELQWQNKRQI4FmVYFZIoJ4AQEB
X-IPAS-Result: A0A3FgA731xV/6jw92hcgxCEAoVVwwsEAgKBPDwRAQEBAQEBAYEKQQWDXQEBAwFWIwULCw4mEhQYDSQuiAkIzyMBAQEHAgEfizqFBQeELQWQNKRQI4FmVYFZIoJ4AQEB
X-IronPort-AV: E=Sophos;i="5.13,465,1427774400"; d="scan'208";a="196475014"
Received: from 104-247-240-168.cpe.teksavvy.com (HELO pastel.home)
 ([104.247.240.168])
 by ironport2-out.teksavvy.com with ESMTP; 17 Mar 2016 08:58:27 -0400
Received: by pastel.home (Postfix, from userid 20848)
 id 59CAD6405A; Thu, 17 Mar 2016 08:58:27 -0400 (EDT)
From: Stefan Monnier <monnier@iro.umontreal.ca>
To: Alan Mackenzie <acm@muc.de>
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
Message-ID: <jwvvb4lclp0.fsf-monnier+emacsbug@gnu.org>
References: <20160315091355.GA2263@acm.fritz.box>
Date: Thu, 17 Mar 2016 08:58:27 -0400
In-Reply-To: <20160315091355.GA2263@acm.fritz.box> (Alan Mackenzie's message
 of "Tue, 15 Mar 2016 09:13:55 +0000")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
X-Spam-Score: 0.3 (/)
X-Debbugs-Envelope-To: 23019
Cc: 23019@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit@debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request@debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
X-Spam-Score: 0.3 (/)

> Proposed solution: Add an extra element to the parser state, recording the
> syntax of the last character passed over before the end of the parse.
> This would be used by parse-partial-sexp to initialise its parse.

Another option is to record "the start of current element" (in case we
were in the middle of an element).  This could potentially reuse (nth
5 ppss) by generalizing it, or it could use a new entry.

The choice probably doesn't matter much and will probably be more
a question of "what's easier to implement".

> Also: the existing element 9 (the list of currently open parens) and the
> new element should be explicitly documented in the Elisp manual, together
> with a statement that there may be further elements in the parse state
> used internally by parse-partial-sexp (for future expansion).

Indeed.

Andreas Röhler added:
> a comment start might be composed not just by two characters, but by three
> or more. What then?

Andreas, I suggest that you go back and take a closer look at
parse-partial-sexp, syntax-ppss, and syntax-tables in general because
lately you've made several comments like the one here which show you're
just not familiar with the topic at all.  Syntax tables do not support
comment markers longer than 2 characters (currently).  Emacs supports
those via the `syntax-table' text-property only (which typically marks
the first char of each "long comment starter" as being "the comment
starter").
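
A rough sketch of that text-property technique, assuming an HTML-like
"<!-- ... -->" comment in a temporary buffer with the standard syntax
table.  The name demo-long-comment and the hard-coded positions are
illustrative only; a real mode would normally apply the properties from
its syntax-propertize-function:

```elisp
(defun demo-long-comment ()
  "Return (IN-COMMENT AFTER-COMMENT) for \"x <!-- y --> z\"."
  (with-temp-buffer
    (insert "x <!-- y --> z")
    ;; Make the parser consult `syntax-table' text properties.
    (setq-local parse-sexp-lookup-properties t)
    ;; Mark the first char of "<!--" as a comment starter ...
    (put-text-property 3 4 'syntax-table (string-to-syntax "<"))
    ;; ... and the last char of "-->" as a comment ender.
    (put-text-property 12 13 'syntax-table (string-to-syntax ">"))
    (list (nth 4 (parse-partial-sexp (point-min) 8))     ; before "y"
          (nth 4 (parse-partial-sexp (point-min) 14))))) ; before "z"
```

Because only single characters carry comment syntax here, the parser never
sees a delimiter longer than one character.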


        Stefan


From debbugs-submit-bounces@debbugs.gnu.org Thu Mar 17 17:46:56 2016
Received: (at 23019) by debbugs.gnu.org; 17 Mar 2016 21:46:56 +0000
Received: from localhost ([127.0.0.1]:51512 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1agfkt-0006Wr-G0
	for submit@debbugs.gnu.org; Thu, 17 Mar 2016 17:46:56 -0400
Received: from mail.muc.de ([193.149.48.3]:25683)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <acm@muc.de>) id 1agfkr-0006Wj-GC
 for 23019@debbugs.gnu.org; Thu, 17 Mar 2016 17:46:54 -0400
Received: (qmail 43228 invoked by uid 3782); 17 Mar 2016 21:46:52 -0000
Received: from acm.muc.de (p548A5932.dip0.t-ipconnect.de [84.138.89.50]) by
 colin.muc.de (tmda-ofmipd) with ESMTP;
 Thu, 17 Mar 2016 22:46:47 +0100
Received: (qmail 26919 invoked by uid 1000); 17 Mar 2016 21:49:34 -0000
Date: Thu, 17 Mar 2016 21:49:34 +0000
To: Stefan Monnier <monnier@iro.umontreal.ca>
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
Message-ID: <20160317214934.GB9038@acm.fritz.box>
References: <20160315091355.GA2263@acm.fritz.box>
 <jwvvb4lclp0.fsf-monnier+emacsbug@gnu.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <jwvvb4lclp0.fsf-monnier+emacsbug@gnu.org>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Delivery-Agent: TMDA/1.1.12 (Macallan)
From: Alan Mackenzie <acm@muc.de>
X-Primary-Address: acm@muc.de
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 23019
Cc: 23019@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit@debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request@debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
X-Spam-Score: -0.0 (/)

On Thu, Mar 17, 2016 at 08:58:27AM -0400, Stefan Monnier wrote:
> > Proposed solution: Add an extra element to the parser state, recording the
> > syntax of the last character passed over before the end of the parse.
> > This would be used by parse-partial-sexp to initialise its parse.

> Another option is to record "the start of current element" (in case we
> were in the middle of an element).  This could potentially reuse (nth
> 5 ppss) by generalizing it, or it could use a new entry.

> The choice probably doesn't matter much and will probably be more
> a question of "what's easier to implement".

> > Also: the existing element 9 (the list of currently open parens) and the
> > new element should be explicitly documented in the Elisp manual, together
> > with a statement that there may be further elements in the parse state
> > used internally by parse-partial-sexp (for future expansion).

> Indeed.

OK, I've got a patch ready.  It's bigger than anticipated, purely
because it also does some refactoring.  It actually adds two elements to
the parser state, and I believe that makes the parser state complete.

Here's the patch:


Enhance parse-partial-sexp correctly to handle two character comment delimiters

Do this by adding two new fields to the parser state: the syntax of the last
character scanned, and the last end of comment scanned.  This should make the
parser state complete.
Also document element 9 of the parser state.  Also refactor the code a bit.

* src/syntax.c (struct lisp_parse_state): Add two new fields.
(internalize_parse_state): New function, extracted from scan_sexps_forward.
(back_comment): Call internalize_parse_state.
(forw_comment): Return the syntax of the last character scanned to the caller.
(Fforward_comment, scan_lists): New dummy variables, passed to forw_comment.
(scan_sexps_forward): Remove a redundant state parameter.  Access all `state'
information via the address parameter `state'.  Remove the code which converts
from external to internal form of `state'.  Access buffer contents only from
`from' onwards.  Reformulate code at the top of the main loop correctly to
recognize comment openers when starting in the middle of one.  Call
forw_comment with extra argument (for return of final syntax value).
(Fparse_partial_sexp): Document elements 9, 10, 11 of the parser state in the
doc string.  Clarify the doc string in general.  Call
internalize_parse_state.  Take account of the new elements when consing up the
output parser state.

* doc/lispref/syntax.texi (Parser State): Document element 9 and the new
elements 10 and 11.  Minor wording corrections (remove reference to "trivial
cases").
(Low-Level Parsing): Minor corrections.


diff --git a/doc/lispref/syntax.texi b/doc/lispref/syntax.texi
index d5a7eba..67a00d7 100644
--- a/doc/lispref/syntax.texi
+++ b/doc/lispref/syntax.texi
@@ -791,10 +791,10 @@ Parser State
 @subsection Parser State
 @cindex parser state
 
-  A @dfn{parser state} is a list of ten elements describing the state
-of the syntactic parser, after it parses the text between a specified
-starting point and a specified end point in the buffer.  Parsing
-functions such as @code{syntax-ppss}
+  A @dfn{parser state} is a list of (currently) twelve elements
+describing the state of the syntactic parser, after it parses the text
+between a specified starting point and a specified end point in the
+buffer.  Parsing functions such as @code{syntax-ppss}
 @ifnottex
 (@pxref{Position Parse})
 @end ifnottex
@@ -851,15 +851,21 @@ Parser State
 this element is @code{nil}.
 
 @item
-Internal data for continuing the parsing.  The meaning of this
-data is subject to change; it is used if you pass this list
-as the @var{state} argument to another call.
+The list of the positions of the currently open parentheses, starting
+with the outermost.
+
+@item
+The @var{syntax-code} (@pxref{Syntax Table Internals}) of the last
+buffer position scanned, or @code{nil} if no scanning has happened.
+
+@item
+The position after the previous end of comment, or @code{nil} if the
+scanning has not passed a comment end.
 @end enumerate
 
   Elements 1, 2, and 6 are ignored in a state which you pass as an
-argument to continue parsing, and elements 8 and 9 are used only in
-trivial cases.  Those elements are mainly used internally by the
-parser code.
+argument to continue parsing.  Elements 9 to 11 are mainly used
+internally by the parser code.
 
   One additional piece of useful information is available from a
 parser state using this function:
@@ -898,11 +904,11 @@ Low-Level Parsing
 
 If the fourth argument @var{stop-before} is non-@code{nil}, parsing
 stops when it comes to any character that starts a sexp.  If
-@var{stop-comment} is non-@code{nil}, parsing stops when it comes to the
-start of an unnested comment.  If @var{stop-comment} is the symbol
+@var{stop-comment} is non-@code{nil}, parsing stops after the start of
+an unnested comment.  If @var{stop-comment} is the symbol
 @code{syntax-table}, parsing stops after the start of an unnested
-comment or a string, or the end of an unnested comment or a string,
-whichever comes first.
+comment or a string, or after the end of an unnested comment or a
+string, whichever comes first.
 
 If @var{state} is @code{nil}, @var{start} is assumed to be at the top
 level of parenthesis structure, such as the beginning of a function
diff --git a/src/syntax.c b/src/syntax.c
index 249d0d5..11b1ff0 100644
--- a/src/syntax.c
+++ b/src/syntax.c
@@ -153,6 +153,9 @@ struct lisp_parse_state
     ptrdiff_t comstr_start;  /* Position of last comment/string starter.  */
     Lisp_Object levelstarts; /* Char numbers of starts-of-expression
 				of levels (starting from outermost).  */
+    int prev_syntax; /* Syntax of previous character scanned, or Smax. */
+    ptrdiff_t prev_comment_end; /* Position after end of last closed
+                                   comment, or -1. */
   };
 
 /* These variables are a cache for finding the start of a defun.
@@ -176,7 +179,8 @@ static Lisp_Object skip_syntaxes (bool, Lisp_Object, Lisp_Object);
 static Lisp_Object scan_lists (EMACS_INT, EMACS_INT, EMACS_INT, bool);
 static void scan_sexps_forward (struct lisp_parse_state *,
                                 ptrdiff_t, ptrdiff_t, ptrdiff_t, EMACS_INT,
-                                bool, Lisp_Object, int);
+                                bool, int);
+static void internalize_parse_state (Lisp_Object, struct lisp_parse_state *);
 static bool in_classes (int, Lisp_Object);
 static void parse_sexp_propertize (ptrdiff_t charpos);
 
@@ -911,10 +915,11 @@ back_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
 	}
       do
 	{
+          internalize_parse_state (Qnil, &state);
 	  scan_sexps_forward (&state,
 			      defun_start, defun_start_byte,
 			      comment_end, TYPE_MINIMUM (EMACS_INT),
-			      0, Qnil, 0);
+			      0, 0);
 	  defun_start = comment_end;
 	  if (!adjusted)
 	    {
@@ -2314,7 +2319,9 @@ in_classes (int c, Lisp_Object iso_classes)
    into *CHARPOS_PTR and the corresponding bytepos into *BYTEPOS_PTR.
    Else, return false and store the charpos STOP into *CHARPOS_PTR, the
    corresponding bytepos into *BYTEPOS_PTR and the current nesting
-   (as defined for state.incomment) in *INCOMMENT_PTR.
+   (as defined for state->incomment) in *INCOMMENT_PTR.  The
+   SYNTAX_WITH_FLAGS of the last character scanned in the comment is
+   stored into *last_syntax_ptr.
 
    The comment end is the last character of the comment rather than the
    character just after the comment.
@@ -2326,7 +2333,7 @@ static bool
 forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
 	      EMACS_INT nesting, int style, int prev_syntax,
 	      ptrdiff_t *charpos_ptr, ptrdiff_t *bytepos_ptr,
-	      EMACS_INT *incomment_ptr)
+	      EMACS_INT *incomment_ptr, int *last_syntax_ptr)
 {
   register int c, c1;
   register enum syntaxcode code;
@@ -2346,6 +2353,7 @@ forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
 	  *incomment_ptr = nesting;
 	  *charpos_ptr = from;
 	  *bytepos_ptr = from_byte;
+          *last_syntax_ptr = syntax;
 	  return 0;
 	}
       c = FETCH_CHAR_AS_MULTIBYTE (from_byte);
@@ -2415,6 +2423,7 @@ forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
     }
   *charpos_ptr = from;
   *bytepos_ptr = from_byte;
+  *last_syntax_ptr = syntax;
   return 1;
 }
 
@@ -2436,6 +2445,7 @@ between them, return t; otherwise return nil.  */)
   EMACS_INT count1;
   ptrdiff_t out_charpos, out_bytepos;
   EMACS_INT dummy;
+  int dummy2;
 
   CHECK_NUMBER (count);
   count1 = XINT (count);
@@ -2499,7 +2509,7 @@ between them, return t; otherwise return nil.  */)
 	}
       /* We're at the start of a comment.  */
       found = forw_comment (from, from_byte, stop, comnested, comstyle, 0,
-			    &out_charpos, &out_bytepos, &dummy);
+			    &out_charpos, &out_bytepos, &dummy, &dummy2);
       from = out_charpos; from_byte = out_bytepos;
       if (!found)
 	{
@@ -2659,6 +2669,7 @@ scan_lists (EMACS_INT from, EMACS_INT count, EMACS_INT depth, bool sexpflag)
   ptrdiff_t from_byte;
   ptrdiff_t out_bytepos, out_charpos;
   EMACS_INT dummy;
+  int dummy2;
   bool multibyte_symbol_p = sexpflag && multibyte_syntax_as_symbol;
 
   if (depth > 0) min_depth = 0;
@@ -2755,7 +2766,8 @@ scan_lists (EMACS_INT from, EMACS_INT count, EMACS_INT depth, bool sexpflag)
 	      UPDATE_SYNTAX_TABLE_FORWARD (from);
 	      found = forw_comment (from, from_byte, stop,
 				    comnested, comstyle, 0,
-				    &out_charpos, &out_bytepos, &dummy);
+				    &out_charpos, &out_bytepos, &dummy,
+                                    &dummy2);
 	      from = out_charpos, from_byte = out_bytepos;
 	      if (!found)
 		{
@@ -3119,7 +3131,7 @@ the prefix syntax flag (p).  */)
 }
 
 /* Parse forward from FROM / FROM_BYTE to END,
-   assuming that FROM has state OLDSTATE (nil means FROM is start of function),
+   assuming that FROM has state STATE (nil means FROM is start of function),
    and return a description of the state of the parse at END.
    If STOPBEFORE, stop at the start of an atom.
    If COMMENTSTOP is 1, stop at the start of a comment.
@@ -3127,12 +3139,11 @@ the prefix syntax flag (p).  */)
    after the beginning of a string, or after the end of a string.  */
 
 static void
-scan_sexps_forward (struct lisp_parse_state *stateptr,
+scan_sexps_forward (struct lisp_parse_state *state,
 		    ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t end,
 		    EMACS_INT targetdepth, bool stopbefore,
-		    Lisp_Object oldstate, int commentstop)
+		    int commentstop)
 {
-  struct lisp_parse_state state;
   enum syntaxcode code;
   int c1;
   bool comnested;
@@ -3148,7 +3159,7 @@ scan_sexps_forward (struct lisp_parse_state *stateptr,
   Lisp_Object tem;
   ptrdiff_t prev_from;		/* Keep one character before FROM.  */
   ptrdiff_t prev_from_byte;
-  int prev_from_syntax;
+  int prev_from_syntax, prev_prev_from_syntax;
   bool boundary_stop = commentstop == -1;
   bool nofence;
   bool found;
@@ -3165,6 +3176,7 @@ scan_sexps_forward (struct lisp_parse_state *stateptr,
 do { prev_from = from;				\
      prev_from_byte = from_byte; 		\
      temp = FETCH_CHAR_AS_MULTIBYTE (prev_from_byte);	\
+     prev_prev_from_syntax = prev_from_syntax;  \
      prev_from_syntax = SYNTAX_WITH_FLAGS (temp); \
      INC_BOTH (from, from_byte);		\
      if (from < end)				\
@@ -3174,88 +3186,38 @@ do { prev_from = from;				\
   immediate_quit = 1;
   QUIT;
 
-  if (NILP (oldstate))
-    {
-      depth = 0;
-      state.instring = -1;
-      state.incomment = 0;
-      state.comstyle = 0;	/* comment style a by default.  */
-      state.comstr_start = -1;	/* no comment/string seen.  */
-    }
-  else
-    {
-      tem = Fcar (oldstate);
-      if (!NILP (tem))
-	depth = XINT (tem);
-      else
-	depth = 0;
-
-      oldstate = Fcdr (oldstate);
-      oldstate = Fcdr (oldstate);
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      /* Check whether we are inside string_fence-style string: */
-      state.instring = (!NILP (tem)
-			? (CHARACTERP (tem) ? XFASTINT (tem) : ST_STRING_STYLE)
-			: -1);
-
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      state.incomment = (!NILP (tem)
-			 ? (INTEGERP (tem) ? XINT (tem) : -1)
-			 : 0);
-
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      start_quoted = !NILP (tem);
+  depth = state->depth;
+  start_quoted = state->quoted;
+  prev_prev_from_syntax = Smax;
+  prev_from_syntax = state->prev_syntax;
 
-      /* if the eighth element of the list is nil, we are in comment
-	 style a.  If it is non-nil, we are in comment style b */
-      oldstate = Fcdr (oldstate);
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      state.comstyle = (NILP (tem)
-			? 0
-			: (RANGED_INTEGERP (0, tem, ST_COMMENT_STYLE)
-			   ? XINT (tem)
-			   : ST_COMMENT_STYLE));
-
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      state.comstr_start =
-	RANGED_INTEGERP (PTRDIFF_MIN, tem, PTRDIFF_MAX) ? XINT (tem) : -1;
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      while (!NILP (tem))		/* >= second enclosing sexps.  */
-	{
-	  Lisp_Object temhd = Fcar (tem);
-	  if (RANGED_INTEGERP (PTRDIFF_MIN, temhd, PTRDIFF_MAX))
-	    curlevel->last = XINT (temhd);
-	  if (++curlevel == endlevel)
-	    curlevel--; /* error ("Nesting too deep for parser"); */
-	  curlevel->prev = -1;
-	  curlevel->last = -1;
-	  tem = Fcdr (tem);
-	}
+  tem = state->levelstarts;
+  while (!NILP (tem))		/* >= second enclosing sexps.  */
+    {
+      Lisp_Object temhd = Fcar (tem);
+      if (RANGED_INTEGERP (PTRDIFF_MIN, temhd, PTRDIFF_MAX))
+        curlevel->last = XINT (temhd);
+      if (++curlevel == endlevel)
+        curlevel--; /* error ("Nesting too deep for parser"); */
+      curlevel->prev = -1;
+      curlevel->last = -1;
+      tem = Fcdr (tem);
     }
-  state.quoted = 0;
-  mindepth = depth;
-
   curlevel->prev = -1;
   curlevel->last = -1;
 
-  SETUP_SYNTAX_TABLE (prev_from, 1);
-  temp = FETCH_CHAR (prev_from_byte);
-  prev_from_syntax = SYNTAX_WITH_FLAGS (temp);
-  UPDATE_SYNTAX_TABLE_FORWARD (from);
+  state->quoted = 0;
+  mindepth = depth;
+
+  SETUP_SYNTAX_TABLE (from, 1);
 
   /* Enter the loop at a place appropriate for initial state.  */
 
-  if (state.incomment)
+  if (state->incomment)
     goto startincomment;
-  if (state.instring >= 0)
+  if (state->instring >= 0)
     {
-      nofence = state.instring != ST_STRING_STYLE;
+      nofence = state->instring != ST_STRING_STYLE;
       if (start_quoted)
 	goto startquotedinstring;
       goto startinstring;
@@ -3266,10 +3228,10 @@ do { prev_from = from;				\
   while (from < end)
     {
       int syntax;
-      INC_FROM;
-      code = prev_from_syntax & 0xff;
 
       if (from < end
+          && (state->prev_comment_end == -1
+              || prev_from >= state->prev_comment_end)
 	  && SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
 	  && (c1 = FETCH_CHAR (from_byte),
 	      syntax = SYNTAX_WITH_FLAGS (c1),
@@ -3280,32 +3242,37 @@ do { prev_from = from;				\
 	  /* Record the comment style we have entered so that only
 	     the comment-end sequence of the same style actually
 	     terminates the comment section.  */
-	  state.comstyle
+	  state->comstyle
 	    = SYNTAX_FLAGS_COMMENT_STYLE (syntax, prev_from_syntax);
 	  comnested = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax)
 		       | SYNTAX_FLAGS_COMMENT_NESTED (syntax));
-	  state.incomment = comnested ? 1 : -1;
-	  state.comstr_start = prev_from;
+	  state->incomment = comnested ? 1 : -1;
+	  state->comstr_start = prev_from;
 	  INC_FROM;
 	  code = Scomment;
 	}
-      else if (code == Scomment_fence)
-	{
-	  /* Record the comment style we have entered so that only
-	     the comment-end sequence of the same style actually
-	     terminates the comment section.  */
-	  state.comstyle = ST_COMMENT_STYLE;
-	  state.incomment = -1;
-	  state.comstr_start = prev_from;
-	  code = Scomment;
-	}
-      else if (code == Scomment)
-	{
-	  state.comstyle = SYNTAX_FLAGS_COMMENT_STYLE (prev_from_syntax, 0);
-	  state.incomment = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) ?
-			     1 : -1);
-	  state.comstr_start = prev_from;
-	}
+      else
+        {
+          INC_FROM;
+          code = prev_from_syntax & 0xff;
+          if (code == Scomment_fence)
+            {
+              /* Record the comment style we have entered so that only
+                 the comment-end sequence of the same style actually
+                 terminates the comment section.  */
+              state->comstyle = ST_COMMENT_STYLE;
+              state->incomment = -1;
+              state->comstr_start = prev_from;
+              code = Scomment;
+            }
+          else if (code == Scomment)
+            {
+              state->comstyle = SYNTAX_FLAGS_COMMENT_STYLE (prev_from_syntax, 0);
+              state->incomment = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) ?
+                                 1 : -1);
+              state->comstr_start = prev_from;
+            }
+        }
 
       if (SYNTAX_FLAGS_PREFIX (prev_from_syntax))
 	continue;
@@ -3357,18 +3324,21 @@ do { prev_from = from;				\
 	     middle of it.  We don't want to do that if we're just at the
 	     beginning of the comment (think of (*) ... (*)).  */
 	  found = forw_comment (from, from_byte, end,
-				state.incomment, state.comstyle,
-				(from == BEGV || from < state.comstr_start + 3)
+				state->incomment, state->comstyle,
+				(from == BEGV || from < state->comstr_start + 3)
 				? 0 : prev_from_syntax,
-				&out_charpos, &out_bytepos, &state.incomment);
+				&out_charpos, &out_bytepos, &state->incomment,
+                                &prev_from_syntax);
 	  from = out_charpos; from_byte = out_bytepos;
-	  /* Beware!  prev_from and friends are invalid now.
-	     Luckily, the `done' doesn't use them and the INC_FROM
-	     sets them to a sane value without looking at them. */
+	  /* Beware!  prev_from and friends (except prev_from_syntax)
+	     are invalid now.  Luckily, the `done' doesn't use them
+	     and the INC_FROM sets them to a sane value without
+	     looking at them. */
 	  if (!found) goto done;
 	  INC_FROM;
-	  state.incomment = 0;
-	  state.comstyle = 0;	/* reset the comment style */
+	  state->incomment = 0;
+	  state->comstyle = 0;	/* reset the comment style */
+          state->prev_comment_end = from;
 	  if (boundary_stop) goto done;
 	  break;
 
@@ -3396,16 +3366,16 @@ do { prev_from = from;				\
 
 	case Sstring:
 	case Sstring_fence:
-	  state.comstr_start = from - 1;
+	  state->comstr_start = from - 1;
 	  if (stopbefore) goto stop;  /* this arg means stop at sexp start */
 	  curlevel->last = prev_from;
-	  state.instring = (code == Sstring
+	  state->instring = (code == Sstring
 			    ? (FETCH_CHAR_AS_MULTIBYTE (prev_from_byte))
 			    : ST_STRING_STYLE);
 	  if (boundary_stop) goto done;
 	startinstring:
 	  {
-	    nofence = state.instring != ST_STRING_STYLE;
+	    nofence = state->instring != ST_STRING_STYLE;
 
 	    while (1)
 	      {
@@ -3419,7 +3389,7 @@ do { prev_from = from;				\
 		/* Check C_CODE here so that if the char has
 		   a syntax-table property which says it is NOT
 		   a string character, it does not end the string.  */
-		if (nofence && c == state.instring && c_code == Sstring)
+		if (nofence && c == state->instring && c_code == Sstring)
 		  break;
 
 		switch (c_code)
@@ -3442,7 +3412,7 @@ do { prev_from = from;				\
 	      }
 	  }
 	string_end:
-	  state.instring = -1;
+	  state->instring = -1;
 	  curlevel->prev = curlevel->last;
 	  INC_FROM;
 	  if (boundary_stop) goto done;
@@ -3461,25 +3431,99 @@ do { prev_from = from;				\
  stop:   /* Here if stopping before start of sexp. */
   from = prev_from;    /* We have just fetched the char that starts it; */
   from_byte = prev_from_byte;
+  prev_from_syntax = prev_prev_from_syntax;
   goto done; /* but return the position before it. */
 
  endquoted:
-  state.quoted = 1;
+  state->quoted = 1;
  done:
-  state.depth = depth;
-  state.mindepth = mindepth;
-  state.thislevelstart = curlevel->prev;
-  state.prevlevelstart
+  state->depth = depth;
+  state->mindepth = mindepth;
+  state->thislevelstart = curlevel->prev;
+  state->prevlevelstart
     = (curlevel == levelstart) ? -1 : (curlevel - 1)->last;
-  state.location = from;
-  state.location_byte = from_byte;
-  state.levelstarts = Qnil;
+  state->location = from;
+  state->location_byte = from_byte;
+  state->levelstarts = Qnil;
   while (curlevel > levelstart)
-    state.levelstarts = Fcons (make_number ((--curlevel)->last),
-			       state.levelstarts);
+    state->levelstarts = Fcons (make_number ((--curlevel)->last),
+                                state->levelstarts);
+  state->prev_syntax = prev_from_syntax;
   immediate_quit = 0;
+}
+
+/* Convert a (lisp) parse state to the internal form used in
+   scan_sexps_forward.  */
+static void
+internalize_parse_state (Lisp_Object external, struct lisp_parse_state *state)
+{
+  Lisp_Object tem;
+
+  if (NILP (external))
+    {
+      state->depth = 0;
+      state->instring = -1;
+      state->incomment = 0;
+      state->quoted = 0;
+      state->comstyle = 0;	/* comment style a by default.  */
+      state->comstr_start = -1;	/* no comment/string seen.  */
+      state->levelstarts = Qnil;
+      state->prev_syntax = Smax;
+      state->prev_comment_end = -1;
+    }
+  else
+    {
+      tem = Fcar (external);
+      if (!NILP (tem))
+	state->depth = XINT (tem);
+      else
+	state->depth = 0;
+
+      external = Fcdr (external);
+      external = Fcdr (external);
+      external = Fcdr (external);
+      tem = Fcar (external);
+      /* Check whether we are inside string_fence-style string: */
+      state->instring = (!NILP (tem)
+                         ? (CHARACTERP (tem) ? XFASTINT (tem) : ST_STRING_STYLE)
+                         : -1);
+
+      external = Fcdr (external);
+      tem = Fcar (external);
+      state->incomment = (!NILP (tem)
+                          ? (INTEGERP (tem) ? XINT (tem) : -1)
+                          : 0);
 
-  *stateptr = state;
+      external = Fcdr (external);
+      tem = Fcar (external);
+      state->quoted = !NILP (tem);
+
+      /* if the eighth element of the list is nil, we are in comment
+	 style a.  If it is non-nil, we are in comment style b */
+      external = Fcdr (external);
+      external = Fcdr (external);
+      tem = Fcar (external);
+      state->comstyle = (NILP (tem)
+                         ? 0
+                         : (RANGED_INTEGERP (0, tem, ST_COMMENT_STYLE)
+                            ? XINT (tem)
+                            : ST_COMMENT_STYLE));
+
+      external = Fcdr (external);
+      tem = Fcar (external);
+      state->comstr_start =
+	RANGED_INTEGERP (PTRDIFF_MIN, tem, PTRDIFF_MAX) ? XINT (tem) : -1;
+      external = Fcdr (external);
+      tem = Fcar (external);
+      state->levelstarts = tem;
+
+      external = Fcdr (external);
+      tem = Fcar (external);
+      state->prev_syntax = NILP (tem) ? Smax : XINT (tem);
+      external = Fcdr (external);
+      tem = Fcar (external);
+      state->prev_comment_end = NILP (tem) ? -1 : XINT (tem);
+    }
 }
 
 DEFUN ("parse-partial-sexp", Fparse_partial_sexp, Sparse_partial_sexp, 2, 6, 0,
@@ -3488,6 +3532,7 @@ Parsing stops at TO or when certain criteria are met;
  point is set to where parsing stops.
 If fifth arg OLDSTATE is omitted or nil,
  parsing assumes that FROM is the beginning of a function.
+
 Value is a list of elements describing final state of parsing:
  0. depth in parens.
  1. character address of start of innermost containing list; nil if none.
@@ -3501,16 +3546,20 @@ Value is a list of elements describing final state of parsing:
  6. the minimum paren-depth encountered during this scan.
  7. style of comment, if any.
  8. character address of start of comment or string; nil if not in one.
- 9. Intermediate data for continuation of parsing (subject to change).
+ 9. List of positions of currently open parens, outermost first.
+10. Syntax of last character scanned, or nil if no scanning has happened.
+11. Position after end of previous comment scanned, or nil.
+12..... Possible further internal information used by `parse-partial-sexp'.
+
 If third arg TARGETDEPTH is non-nil, parsing stops if the depth
 in parentheses becomes equal to TARGETDEPTH.
-Fourth arg STOPBEFORE non-nil means stop when come to
+Fourth arg STOPBEFORE non-nil means stop when we come to
  any character that starts a sexp.
 Fifth arg OLDSTATE is a list like what this function returns.
  It is used to initialize the state of the parse.  Elements number 1, 2, 6
  are ignored.
-Sixth arg COMMENTSTOP non-nil means stop at the start of a comment.
- If it is symbol `syntax-table', stop after the start of a comment or a
+Sixth arg COMMENTSTOP non-nil means stop after the start of a comment.
+ If it is the symbol `syntax-table', stop after the start of a comment or a
  string, or after end of a comment or a string.  */)
   (Lisp_Object from, Lisp_Object to, Lisp_Object targetdepth,
    Lisp_Object stopbefore, Lisp_Object oldstate, Lisp_Object commentstop)
@@ -3527,15 +3576,17 @@ Sixth arg COMMENTSTOP non-nil means stop at the start of a comment.
     target = TYPE_MINIMUM (EMACS_INT);	/* We won't reach this depth.  */
 
   validate_region (&from, &to);
+  internalize_parse_state (oldstate, &state);
   scan_sexps_forward (&state, XINT (from), CHAR_TO_BYTE (XINT (from)),
 		      XINT (to),
-		      target, !NILP (stopbefore), oldstate,
+		      target, !NILP (stopbefore),
 		      (NILP (commentstop)
 		       ? 0 : (EQ (commentstop, Qsyntax_table) ? -1 : 1)));
 
   SET_PT_BOTH (state.location, state.location_byte);
 
-  return Fcons (make_number (state.depth),
+  return
+    Fcons (make_number (state.depth),
 	   Fcons (state.prevlevelstart < 0
 		  ? Qnil : make_number (state.prevlevelstart),
 	     Fcons (state.thislevelstart < 0
@@ -3553,11 +3604,18 @@ Sixth arg COMMENTSTOP non-nil means stop at the start of a comment.
 				  ? Qsyntax_table
 				  : make_number (state.comstyle))
 			       : Qnil),
-			      Fcons (((state.incomment
-				       || (state.instring >= 0))
-				      ? make_number (state.comstr_start)
-				      : Qnil),
-				     Fcons (state.levelstarts, Qnil))))))))));
+		         Fcons (((state.incomment
+                                  || (state.instring >= 0))
+                                 ? make_number (state.comstr_start)
+                                 : Qnil),
+			   Fcons (state.levelstarts,
+                             Fcons (state.prev_syntax == Smax
+                                    ? Qnil
+                                    : make_number (state.prev_syntax),
+                               Fcons (state.prev_comment_end == -1
+                                      ? Qnil
+                                      : make_number (state.prev_comment_end),
+                                      Qnil))))))))))));
 }
 
 void


>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).


From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 18 00:49:18 2016
Received: (at 23019) by debbugs.gnu.org; 18 Mar 2016 04:49:18 +0000
Received: from localhost ([127.0.0.1]:51652 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1agmLd-0001jw-OV
	for submit@debbugs.gnu.org; Fri, 18 Mar 2016 00:49:18 -0400
Received: from pruche.dit.umontreal.ca ([132.204.246.22]:42879)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <monnier@iro.umontreal.ca>) id 1agmLZ-0001jm-U6
 for 23019@debbugs.gnu.org; Fri, 18 Mar 2016 00:49:15 -0400
Received: from ceviche.home (lechon.iro.umontreal.ca [132.204.27.242])
 by pruche.dit.umontreal.ca (8.14.7/8.14.1) with ESMTP id u2I4n86m012383;
 Fri, 18 Mar 2016 00:49:10 -0400
Received: by ceviche.home (Postfix, from userid 20848)
 id E0120661AA; Fri, 18 Mar 2016 00:49:07 -0400 (EDT)
From: Stefan Monnier <monnier@IRO.UMontreal.CA>
To: Alan Mackenzie <acm@muc.de>
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
Message-ID: <jwvmvpwcsn1.fsf-monnier+emacsbugs@gnu.org>
References: <20160315091355.GA2263@acm.fritz.box>
 <jwvvb4lclp0.fsf-monnier+emacsbug@gnu.org>
 <20160317214934.GB9038@acm.fritz.box>
Date: Fri, 18 Mar 2016 00:49:07 -0400
In-Reply-To: <20160317214934.GB9038@acm.fritz.box> (Alan Mackenzie's message
 of "Thu, 17 Mar 2016 21:49:34 +0000")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-NAI-Spam-Flag: NO
X-NAI-Spam-Level: 
X-NAI-Spam-Threshold: 5
X-NAI-Spam-Score: 0.2
X-NAI-Spam-Rules: 2 Rules triggered
	GEN_SPAM_FEATRE=0.2, RV5613=0
X-NAI-Spam-Version: 2.3.0.9418 : core <5613> : inlines <4527> : streams
 <1604724> : uri <2168729>
X-Spam-Score: -1.3 (-)
X-Debbugs-Envelope-To: 23019
Cc: 23019@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit@debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request@debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
X-Spam-Score: -1.3 (-)

> Do this by adding two new fields to the parser state: the syntax of the last
> character scanned, and the last end of comment scanned.  This should make the
> parser state complete.

Thanks.  I like the "syntax of the last character scanned", but I don't
understand the reasoning behind "last end of comment scanned".  Why is
this relevant?  Is it in case the "last character scanned" was a "slash
ending a comment" so as to avoid treating "*/*" as both a comment closer and
a subsequent opener?

If so, I'm not sure I like it.  It sounds to me like there's a chance
it's actually incomplete (e.g. it doesn't address the similar problem
when the "last character scanned" is an end of a string which also
happens to be a valid first-char of a comment-starter), and even if it
isn't, it "feels ad-hoc" to me.

Would it be difficult to do the following instead:
- get rid of element 11.
- change element 10 so that it's nil if the last char was an "end of
  something".  Put another way, element 10 should be non-nil only if
  the "next lexeme" might start on that previous character.
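
The alternative above could be sketched roughly as follows.  Again this
is a toy in C under stated assumptions, not Emacs code: carry_state,
carry_scan and carry_state_equal are hypothetical names, strings are
delimited by '"' with no escapes, and the single `carry' field stands in
for element 10.  It is cleared whenever the last character finished a
lexeme, so no separate "end of comment" position is needed.

```c
#include <assert.h>

/* Toy parser state; `carry' is the previous character only if the next
   lexeme (or a comment closer) might start on it, otherwise 0.  */
struct carry_state
{
  int in_comment;  /* nonzero while inside a comment */
  int in_string;   /* nonzero while inside a string */
  char carry;      /* see above */
  int openers;     /* comment openers recognized so far */
};

/* Scan BUF[FROM..END) without ever reading before FROM.  */
static void
carry_scan (const char *buf, long from, long end, struct carry_state *st)
{
  assert (0 <= from && from <= end);
  for (long pos = from; pos < end; pos++)
    {
      char cur = buf[pos];

      if (st->in_comment)
        {
          if (st->carry == '*' && cur == '/')
            {
              st->in_comment = 0;
              st->carry = 0;    /* end of something: carry nothing */
            }
          else
            st->carry = cur;
        }
      else if (st->in_string)
        {
          if (cur == '"')
            st->in_string = 0;
          st->carry = 0;        /* nothing inside or ending a string
                                   can start a comment */
        }
      else if (st->carry == '/' && cur == '*')
        {
          st->in_comment = 1;
          st->carry = 0;        /* the opener's '*' cannot start a closer */
          st->openers++;
        }
      else if (cur == '"')
        {
          st->in_string = 1;
          st->carry = 0;
        }
      else
        st->carry = cur;
    }
}

/* Return nonzero if the two states agree in every field.  */
static int
carry_state_equal (const struct carry_state *a, const struct carry_state *b)
{
  return (a->in_comment == b->in_comment && a->in_string == b->in_string
          && a->carry == b->carry && a->openers == b->openers);
}
```

Because the scan depends only on the state and on the text from FROM
onwards, stopping at any position and resuming from the saved state gives
the same result as one uninterrupted scan, which is the sense in which
the state would be "complete" in this toy.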

I also have a side question: IIUC your patch makes the 5th element
redundant (can be replaced with a test whether "last char syntax" was
"escape"), is that right?

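As a rough illustration of that side question, the sketch below derives a
"quoted" flag from the carried syntax instead of storing it separately.
It is a toy in C, and the enum values and function names are
hypothetical (they are not Emacs's enum syntaxcode).  It also assumes the
carried value is cleared once the escaped character has been consumed;
without that, an escaped backslash would wrongly look like a pending
escape.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical syntax classes for this sketch only.  */
enum toy_syntax { TOY_NONE, TOY_WORD, TOY_ESCAPE };

static enum toy_syntax
toy_classify (char c)
{
  return c == '\\' ? TOY_ESCAPE : TOY_WORD;
}

/* Scan BUF[0..LEN) and return the carried syntax: TOY_ESCAPE if the scan
   stopped right after an unconsumed escape character, TOY_NONE if the
   last character completed something (here: it was itself escaped) or
   nothing was scanned, otherwise the class of the last character.  */
static enum toy_syntax
toy_scan_carry (const char *buf, size_t len)
{
  enum toy_syntax carry = TOY_NONE;

  assert (buf != NULL);
  for (size_t i = 0; i < len; i++)
    {
      if (carry == TOY_ESCAPE)
        carry = TOY_NONE;       /* BUF[i] is escaped: the pair is done */
      else
        carry = toy_classify (buf[i]);
    }
  return carry;
}

/* The "quoted" flag, computed from the carry rather than stored.  */
static int
toy_quoted (enum toy_syntax carry)
{
  return carry == TOY_ESCAPE;
}
```

In this toy, a scan that stops after a lone backslash reports quoted,
while a scan that stops after two backslashes does not.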

        Stefan


> Also document element 9 of the parser state.  Also refactor the code a bit.

> * src/syntax.c (struct lisp_parse_state): Add two new fields.
> (internalize_parse_state): New function, extracted from scan_sexps_forward.
> (back_comment): Call internalize_parse_state.
> (forw_comment): Return the syntax of the last character scanned to the caller.
> (Fforward_comment, scan_lists): New dummy variables, passed to forw_comment.
> (scan_sexps_forward): Remove a redundant state parameter.  Access all `state'
> information via the address parameter `state'.  Remove the code which converts
> from external to internal form of `state'.  Access buffer contents only from
> `from' onwards.  Reformulate code at the top of the main loop correctly to
> recognize comment openers when starting in the middle of one.  Call
> forw_comment with extra argument (for return of final syntax value).
> (Fparse_partial_sexp): Document elements 9, 10, 11 of the parser state in the
> doc string.  Clarify the doc string in general.  Call
> internalize_parse_state.  Take account of the new elements when consing up the
> output parser state.

> * doc/lispref/syntax.texi: (Parser State): Document element 9 and the new
> elements 10 and 11.  Minor wording corrections (remove reference to "trivial
> cases").
> (Low Level Parsing): Minor corrections


> diff --git a/doc/lispref/syntax.texi b/doc/lispref/syntax.texi
> index d5a7eba..67a00d7 100644
> --- a/doc/lispref/syntax.texi
> +++ b/doc/lispref/syntax.texi
> @@ -791,10 +791,10 @@ Parser State
>  @subsection Parser State
>  @cindex parser state
 
> -  A @dfn{parser state} is a list of ten elements describing the state
> -of the syntactic parser, after it parses the text between a specified
> -starting point and a specified end point in the buffer.  Parsing
> -functions such as @code{syntax-ppss}
> +  A @dfn{parser state} is a list of (currently) twelve elements
> +describing the state of the syntactic parser, after it parses the text
> +between a specified starting point and a specified end point in the
> +buffer.  Parsing functions such as @code{syntax-ppss}
>  @ifnottex
>  (@pxref{Position Parse})
>  @end ifnottex
> @@ -851,15 +851,21 @@ Parser State
>  this element is @code{nil}.
 
>  @item
> -Internal data for continuing the parsing.  The meaning of this
> -data is subject to change; it is used if you pass this list
> -as the @var{state} argument to another call.
> +The list of the positions of the currently open parentheses, starting
> +with the outermost.
> +
> +@item
> +The @var{syntax-code} (@pxref{Syntax Table Internals}) of the last
> +buffer position scanned, or @code{nil} if no scanning has happened.
> +
> +@item
> +The position after the previous end of comment, or @code{nil} if the
> +scanning has not passed a comment end.
>  @end enumerate
 
>    Elements 1, 2, and 6 are ignored in a state which you pass as an
> -argument to continue parsing, and elements 8 and 9 are used only in
> -trivial cases.  Those elements are mainly used internally by the
> -parser code.
> +argument to continue parsing.  Elements 9 to 11 are mainly used
> +internally by the parser code.
 
>    One additional piece of useful information is available from a
>  parser state using this function:
> @@ -898,11 +904,11 @@ Low-Level Parsing
 
>  If the fourth argument @var{stop-before} is non-@code{nil}, parsing
>  stops when it comes to any character that starts a sexp.  If
> -@var{stop-comment} is non-@code{nil}, parsing stops when it comes to the
> -start of an unnested comment.  If @var{stop-comment} is the symbol
> +@var{stop-comment} is non-@code{nil}, parsing stops after the start of
> +an unnested comment.  If @var{stop-comment} is the symbol
>  @code{syntax-table}, parsing stops after the start of an unnested
> -comment or a string, or the end of an unnested comment or a string,
> -whichever comes first.
> +comment or a string, or after the end of an unnested comment or a
> +string, whichever comes first.
 
>  If @var{state} is @code{nil}, @var{start} is assumed to be at the top
>  level of parenthesis structure, such as the beginning of a function
> diff --git a/src/syntax.c b/src/syntax.c
> index 249d0d5..11b1ff0 100644
> --- a/src/syntax.c
> +++ b/src/syntax.c
> @@ -153,6 +153,9 @@ struct lisp_parse_state
>      ptrdiff_t comstr_start;  /* Position of last comment/string starter.  */
>      Lisp_Object levelstarts; /* Char numbers of starts-of-expression
>  				of levels (starting from outermost).  */
> +    int prev_syntax; /* Syntax of previous character scanned, or Smax. */
> +    ptrdiff_t prev_comment_end; /* Position after end of last closed
> +                                   comment, or -1. */
>    };
>  
>  /* These variables are a cache for finding the start of a defun.
> @@ -176,7 +179,8 @@ static Lisp_Object skip_syntaxes (bool, Lisp_Object, Lisp_Object);
>  static Lisp_Object scan_lists (EMACS_INT, EMACS_INT, EMACS_INT, bool);
>  static void scan_sexps_forward (struct lisp_parse_state *,
>                                  ptrdiff_t, ptrdiff_t, ptrdiff_t, EMACS_INT,
> -                                bool, Lisp_Object, int);
> +                                bool, int);
> +static void internalize_parse_state (Lisp_Object, struct lisp_parse_state *);
>  static bool in_classes (int, Lisp_Object);
>  static void parse_sexp_propertize (ptrdiff_t charpos);
 
> @@ -911,10 +915,11 @@ back_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
>  	}
>        do
>  	{
> +          internalize_parse_state (Qnil, &state);
>  	  scan_sexps_forward (&state,
>  			      defun_start, defun_start_byte,
>  			      comment_end, TYPE_MINIMUM (EMACS_INT),
> -			      0, Qnil, 0);
> +			      0, 0);
>  	  defun_start = comment_end;
>  	  if (!adjusted)
>  	    {
> @@ -2314,7 +2319,9 @@ in_classes (int c, Lisp_Object iso_classes)
>     into *CHARPOS_PTR and the corresponding bytepos into *BYTEPOS_PTR.
>     Else, return false and store the charpos STOP into *CHARPOS_PTR, the
>     corresponding bytepos into *BYTEPOS_PTR and the current nesting
> -   (as defined for state.incomment) in *INCOMMENT_PTR.
> +   (as defined for state->incomment) in *INCOMMENT_PTR.  The
> +   SYNTAX_WITH_FLAGS of the last character scanned in the comment is
> +   stored into *last_syntax_ptr.
 
>     The comment end is the last character of the comment rather than the
>     character just after the comment.
> @@ -2326,7 +2333,7 @@ static bool
>  forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
>  	      EMACS_INT nesting, int style, int prev_syntax,
>  	      ptrdiff_t *charpos_ptr, ptrdiff_t *bytepos_ptr,
> -	      EMACS_INT *incomment_ptr)
> +	      EMACS_INT *incomment_ptr, int *last_syntax_ptr)
>  {
>    register int c, c1;
>    register enum syntaxcode code;
> @@ -2346,6 +2353,7 @@ forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
>  	  *incomment_ptr = nesting;
>  	  *charpos_ptr = from;
>  	  *bytepos_ptr = from_byte;
> +          *last_syntax_ptr = syntax;
>  	  return 0;
>  	}
>        c = FETCH_CHAR_AS_MULTIBYTE (from_byte);
> @@ -2415,6 +2423,7 @@ forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
>      }
>    *charpos_ptr = from;
>    *bytepos_ptr = from_byte;
> +  *last_syntax_ptr = syntax;
>    return 1;
>  }
 
> @@ -2436,6 +2445,7 @@ between them, return t; otherwise return nil.  */)
>    EMACS_INT count1;
>    ptrdiff_t out_charpos, out_bytepos;
>    EMACS_INT dummy;
> +  int dummy2;
 
>    CHECK_NUMBER (count);
>    count1 = XINT (count);
> @@ -2499,7 +2509,7 @@ between them, return t; otherwise return nil.  */)
>  	}
>        /* We're at the start of a comment.  */
>        found = forw_comment (from, from_byte, stop, comnested, comstyle, 0,
> -			    &out_charpos, &out_bytepos, &dummy);
> +			    &out_charpos, &out_bytepos, &dummy, &dummy2);
>        from = out_charpos; from_byte = out_bytepos;
>        if (!found)
>  	{
> @@ -2659,6 +2669,7 @@ scan_lists (EMACS_INT from, EMACS_INT count, EMACS_INT depth, bool sexpflag)
>    ptrdiff_t from_byte;
>    ptrdiff_t out_bytepos, out_charpos;
>    EMACS_INT dummy;
> +  int dummy2;
>    bool multibyte_symbol_p = sexpflag && multibyte_syntax_as_symbol;
 
>    if (depth > 0) min_depth = 0;
> @@ -2755,7 +2766,8 @@ scan_lists (EMACS_INT from, EMACS_INT count, EMACS_INT depth, bool sexpflag)
>  	      UPDATE_SYNTAX_TABLE_FORWARD (from);
>  	      found = forw_comment (from, from_byte, stop,
>  				    comnested, comstyle, 0,
> -				    &out_charpos, &out_bytepos, &dummy);
> +				    &out_charpos, &out_bytepos, &dummy,
> +                                    &dummy2);
>  	      from = out_charpos, from_byte = out_bytepos;
>  	      if (!found)
>  		{
> @@ -3119,7 +3131,7 @@ the prefix syntax flag (p).  */)
>  }
>  
>  /* Parse forward from FROM / FROM_BYTE to END,
> -   assuming that FROM has state OLDSTATE (nil means FROM is start of function),
> +   assuming that FROM has state STATE (nil means FROM is start of function),
>     and return a description of the state of the parse at END.
>     If STOPBEFORE, stop at the start of an atom.
>     If COMMENTSTOP is 1, stop at the start of a comment.
> @@ -3127,12 +3139,11 @@ the prefix syntax flag (p).  */)
>     after the beginning of a string, or after the end of a string.  */
 
>  static void
> -scan_sexps_forward (struct lisp_parse_state *stateptr,
> +scan_sexps_forward (struct lisp_parse_state *state,
>  		    ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t end,
>  		    EMACS_INT targetdepth, bool stopbefore,
> -		    Lisp_Object oldstate, int commentstop)
> +		    int commentstop)
>  {
> -  struct lisp_parse_state state;
>    enum syntaxcode code;
>    int c1;
>    bool comnested;
> @@ -3148,7 +3159,7 @@ scan_sexps_forward (struct lisp_parse_state *stateptr,
>    Lisp_Object tem;
>    ptrdiff_t prev_from;		/* Keep one character before FROM.  */
>    ptrdiff_t prev_from_byte;
> -  int prev_from_syntax;
> +  int prev_from_syntax, prev_prev_from_syntax;
>    bool boundary_stop = commentstop == -1;
>    bool nofence;
>    bool found;
> @@ -3165,6 +3176,7 @@ scan_sexps_forward (struct lisp_parse_state *stateptr,
>  do { prev_from = from;				\
>       prev_from_byte = from_byte; 		\
>       temp = FETCH_CHAR_AS_MULTIBYTE (prev_from_byte);	\
> +     prev_prev_from_syntax = prev_from_syntax;  \
>       prev_from_syntax = SYNTAX_WITH_FLAGS (temp); \
>       INC_BOTH (from, from_byte);		\
>       if (from < end)				\
> @@ -3174,88 +3186,38 @@ do { prev_from = from;				\
>    immediate_quit = 1;
>    QUIT;
 
> -  if (NILP (oldstate))
> -    {
> -      depth = 0;
> -      state.instring = -1;
> -      state.incomment = 0;
> -      state.comstyle = 0;	/* comment style a by default.  */
> -      state.comstr_start = -1;	/* no comment/string seen.  */
> -    }
> -  else
> -    {
> -      tem = Fcar (oldstate);
> -      if (!NILP (tem))
> -	depth = XINT (tem);
> -      else
> -	depth = 0;
> -
> -      oldstate = Fcdr (oldstate);
> -      oldstate = Fcdr (oldstate);
> -      oldstate = Fcdr (oldstate);
> -      tem = Fcar (oldstate);
> -      /* Check whether we are inside string_fence-style string: */
> -      state.instring = (!NILP (tem)
> -			? (CHARACTERP (tem) ? XFASTINT (tem) : ST_STRING_STYLE)
> -			: -1);
> -
> -      oldstate = Fcdr (oldstate);
> -      tem = Fcar (oldstate);
> -      state.incomment = (!NILP (tem)
> -			 ? (INTEGERP (tem) ? XINT (tem) : -1)
> -			 : 0);
> -
> -      oldstate = Fcdr (oldstate);
> -      tem = Fcar (oldstate);
> -      start_quoted = !NILP (tem);
> +  depth = state->depth;
> +  start_quoted = state->quoted;
> +  prev_prev_from_syntax = Smax;
> +  prev_from_syntax = state->prev_syntax;
 
> -      /* if the eighth element of the list is nil, we are in comment
> -	 style a.  If it is non-nil, we are in comment style b */
> -      oldstate = Fcdr (oldstate);
> -      oldstate = Fcdr (oldstate);
> -      tem = Fcar (oldstate);
> -      state.comstyle = (NILP (tem)
> -			? 0
> -			: (RANGED_INTEGERP (0, tem, ST_COMMENT_STYLE)
> -			   ? XINT (tem)
> -			   : ST_COMMENT_STYLE));
> -
> -      oldstate = Fcdr (oldstate);
> -      tem = Fcar (oldstate);
> -      state.comstr_start =
> -	RANGED_INTEGERP (PTRDIFF_MIN, tem, PTRDIFF_MAX) ? XINT (tem) : -1;
> -      oldstate = Fcdr (oldstate);
> -      tem = Fcar (oldstate);
> -      while (!NILP (tem))		/* >= second enclosing sexps.  */
> -	{
> -	  Lisp_Object temhd = Fcar (tem);
> -	  if (RANGED_INTEGERP (PTRDIFF_MIN, temhd, PTRDIFF_MAX))
> -	    curlevel->last = XINT (temhd);
> -	  if (++curlevel == endlevel)
> -	    curlevel--; /* error ("Nesting too deep for parser"); */
> -	  curlevel->prev = -1;
> -	  curlevel->last = -1;
> -	  tem = Fcdr (tem);
> -	}
> +  tem = state->levelstarts;
> +  while (!NILP (tem))		/* >= second enclosing sexps.  */
> +    {
> +      Lisp_Object temhd = Fcar (tem);
> +      if (RANGED_INTEGERP (PTRDIFF_MIN, temhd, PTRDIFF_MAX))
> +        curlevel->last = XINT (temhd);
> +      if (++curlevel == endlevel)
> +        curlevel--; /* error ("Nesting too deep for parser"); */
> +      curlevel->prev = -1;
> +      curlevel->last = -1;
> +      tem = Fcdr (tem);
>      }
> -  state.quoted = 0;
> -  mindepth = depth;
> -
>    curlevel->prev = -1;
>    curlevel->last = -1;
 
> -  SETUP_SYNTAX_TABLE (prev_from, 1);
> -  temp = FETCH_CHAR (prev_from_byte);
> -  prev_from_syntax = SYNTAX_WITH_FLAGS (temp);
> -  UPDATE_SYNTAX_TABLE_FORWARD (from);
> +  state->quoted = 0;
> +  mindepth = depth;
> +
> +  SETUP_SYNTAX_TABLE (from, 1);
 
>    /* Enter the loop at a place appropriate for initial state.  */
 
> -  if (state.incomment)
> +  if (state->incomment)
>      goto startincomment;
> -  if (state.instring >= 0)
> +  if (state->instring >= 0)
>      {
> -      nofence = state.instring != ST_STRING_STYLE;
> +      nofence = state->instring != ST_STRING_STYLE;
>        if (start_quoted)
>  	goto startquotedinstring;
>        goto startinstring;
> @@ -3266,10 +3228,10 @@ do { prev_from = from;				\
>    while (from < end)
>      {
>        int syntax;
> -      INC_FROM;
> -      code = prev_from_syntax & 0xff;
 
>        if (from < end
> +          && (state->prev_comment_end == -1
> +              || prev_from >= state->prev_comment_end)
>  	  && SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
>  	  && (c1 = FETCH_CHAR (from_byte),
>  	      syntax = SYNTAX_WITH_FLAGS (c1),
> @@ -3280,32 +3242,37 @@ do { prev_from = from;				\
>  	  /* Record the comment style we have entered so that only
>  	     the comment-end sequence of the same style actually
>  	     terminates the comment section.  */
> -	  state.comstyle
> +	  state->comstyle
>  	    = SYNTAX_FLAGS_COMMENT_STYLE (syntax, prev_from_syntax);
>  	  comnested = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax)
>  		       | SYNTAX_FLAGS_COMMENT_NESTED (syntax));
> -	  state.incomment = comnested ? 1 : -1;
> -	  state.comstr_start = prev_from;
> +	  state->incomment = comnested ? 1 : -1;
> +	  state->comstr_start = prev_from;
>  	  INC_FROM;
>  	  code = Scomment;
>  	}
> -      else if (code == Scomment_fence)
> -	{
> -	  /* Record the comment style we have entered so that only
> -	     the comment-end sequence of the same style actually
> -	     terminates the comment section.  */
> -	  state.comstyle = ST_COMMENT_STYLE;
> -	  state.incomment = -1;
> -	  state.comstr_start = prev_from;
> -	  code = Scomment;
> -	}
> -      else if (code == Scomment)
> -	{
> -	  state.comstyle = SYNTAX_FLAGS_COMMENT_STYLE (prev_from_syntax, 0);
> -	  state.incomment = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) ?
> -			     1 : -1);
> -	  state.comstr_start = prev_from;
> -	}
> +      else
> +        {
> +          INC_FROM;
> +          code = prev_from_syntax & 0xff;
> +          if (code == Scomment_fence)
> +            {
> +              /* Record the comment style we have entered so that only
> +                 the comment-end sequence of the same style actually
> +                 terminates the comment section.  */
> +              state->comstyle = ST_COMMENT_STYLE;
> +              state->incomment = -1;
> +              state->comstr_start = prev_from;
> +              code = Scomment;
> +            }
> +          else if (code == Scomment)
> +            {
> +              state->comstyle = SYNTAX_FLAGS_COMMENT_STYLE (prev_from_syntax, 0);
> +              state->incomment = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) ?
> +                                 1 : -1);
> +              state->comstr_start = prev_from;
> +            }
> +        }
 
>        if (SYNTAX_FLAGS_PREFIX (prev_from_syntax))
>  	continue;
> @@ -3357,18 +3324,21 @@ do { prev_from = from;				\
>  	     middle of it.  We don't want to do that if we're just at the
>  	     beginning of the comment (think of (*) ... (*)).  */
>  	  found = forw_comment (from, from_byte, end,
> -				state.incomment, state.comstyle,
> -				(from == BEGV || from < state.comstr_start + 3)
> +				state->incomment, state->comstyle,
> +				(from == BEGV || from < state->comstr_start + 3)
>  				? 0 : prev_from_syntax,
> -				&out_charpos, &out_bytepos, &state.incomment);
> +				&out_charpos, &out_bytepos, &state->incomment,
> +                                &prev_from_syntax);
>  	  from = out_charpos; from_byte = out_bytepos;
> -	  /* Beware!  prev_from and friends are invalid now.
> -	     Luckily, the `done' doesn't use them and the INC_FROM
> -	     sets them to a sane value without looking at them. */
> +	  /* Beware!  prev_from and friends (except prev_from_syntax)
> +	     are invalid now.  Luckily, the `done' doesn't use them
> +	     and the INC_FROM sets them to a sane value without
> +	     looking at them. */
>  	  if (!found) goto done;
>  	  INC_FROM;
> -	  state.incomment = 0;
> -	  state.comstyle = 0;	/* reset the comment style */
> +	  state->incomment = 0;
> +	  state->comstyle = 0;	/* reset the comment style */
> +          state->prev_comment_end = from;
>  	  if (boundary_stop) goto done;
>  	  break;
 
> @@ -3396,16 +3366,16 @@ do { prev_from = from;				\
 
>  	case Sstring:
>  	case Sstring_fence:
> -	  state.comstr_start = from - 1;
> +	  state->comstr_start = from - 1;
>  	  if (stopbefore) goto stop;  /* this arg means stop at sexp start */
>  	  curlevel->last = prev_from;
> -	  state.instring = (code == Sstring
> +	  state->instring = (code == Sstring
>  			    ? (FETCH_CHAR_AS_MULTIBYTE (prev_from_byte))
>  			    : ST_STRING_STYLE);
>  	  if (boundary_stop) goto done;
>  	startinstring:
>  	  {
> -	    nofence = state.instring != ST_STRING_STYLE;
> +	    nofence = state->instring != ST_STRING_STYLE;
 
>  	    while (1)
>  	      {
> @@ -3419,7 +3389,7 @@ do { prev_from = from;				\
>  		/* Check C_CODE here so that if the char has
>  		   a syntax-table property which says it is NOT
>  		   a string character, it does not end the string.  */
> -		if (nofence && c == state.instring && c_code == Sstring)
> +		if (nofence && c == state->instring && c_code == Sstring)
>  		  break;
 
>  		switch (c_code)
> @@ -3442,7 +3412,7 @@ do { prev_from = from;				\
>  	      }
>  	  }
>  	string_end:
> -	  state.instring = -1;
> +	  state->instring = -1;
>  	  curlevel->prev = curlevel->last;
>  	  INC_FROM;
>  	  if (boundary_stop) goto done;
> @@ -3461,25 +3431,99 @@ do { prev_from = from;				\
>   stop:   /* Here if stopping before start of sexp. */
>    from = prev_from;    /* We have just fetched the char that starts it; */
>    from_byte = prev_from_byte;
> +  prev_from_syntax = prev_prev_from_syntax;
>    goto done; /* but return the position before it. */
 
>   endquoted:
> -  state.quoted = 1;
> +  state->quoted = 1;
>   done:
> -  state.depth = depth;
> -  state.mindepth = mindepth;
> -  state.thislevelstart = curlevel->prev;
> -  state.prevlevelstart
> +  state->depth = depth;
> +  state->mindepth = mindepth;
> +  state->thislevelstart = curlevel->prev;
> +  state->prevlevelstart
>      = (curlevel == levelstart) ? -1 : (curlevel - 1)->last;
> -  state.location = from;
> -  state.location_byte = from_byte;
> -  state.levelstarts = Qnil;
> +  state->location = from;
> +  state->location_byte = from_byte;
> +  state->levelstarts = Qnil;
>    while (curlevel > levelstart)
> -    state.levelstarts = Fcons (make_number ((--curlevel)->last),
> -			       state.levelstarts);
> +    state->levelstarts = Fcons (make_number ((--curlevel)->last),
> +                                state->levelstarts);
> +  state->prev_syntax = prev_from_syntax;
>    immediate_quit = 0;
> +}
> +
> +/* Convert a (lisp) parse state to the internal form used in
> +   scan_sexps_forward.  */
> +static void
> +internalize_parse_state (Lisp_Object external, struct lisp_parse_state *state)
> +{
> +  Lisp_Object tem;
> +
> +  if (NILP (external))
> +    {
> +      state->depth = 0;
> +      state->instring = -1;
> +      state->incomment = 0;
> +      state->quoted = 0;
> +      state->comstyle = 0;	/* comment style a by default.  */
> +      state->comstr_start = -1;	/* no comment/string seen.  */
> +      state->levelstarts = Qnil;
> +      state->prev_syntax = Smax;
> +      state->prev_comment_end = -1;
> +    }
> +  else
> +    {
> +      tem = Fcar (external);
> +      if (!NILP (tem))
> +	state->depth = XINT (tem);
> +      else
> +	state->depth = 0;
> +
> +      external = Fcdr (external);
> +      external = Fcdr (external);
> +      external = Fcdr (external);
> +      tem = Fcar (external);
> +      /* Check whether we are inside string_fence-style string: */
> +      state->instring = (!NILP (tem)
> +                         ? (CHARACTERP (tem) ? XFASTINT (tem) : ST_STRING_STYLE)
> +                         : -1);
> +
> +      external = Fcdr (external);
> +      tem = Fcar (external);
> +      state->incomment = (!NILP (tem)
> +                          ? (INTEGERP (tem) ? XINT (tem) : -1)
> +                          : 0);
 
> -  *stateptr = state;
> +      external = Fcdr (external);
> +      tem = Fcar (external);
> +      state->quoted = !NILP (tem);
> +
> +      /* if the eighth element of the list is nil, we are in comment
> +	 style a.  If it is non-nil, we are in comment style b */
> +      external = Fcdr (external);
> +      external = Fcdr (external);
> +      tem = Fcar (external);
> +      state->comstyle = (NILP (tem)
> +                         ? 0
> +                         : (RANGED_INTEGERP (0, tem, ST_COMMENT_STYLE)
> +                            ? XINT (tem)
> +                            : ST_COMMENT_STYLE));
> +
> +      external = Fcdr (external);
> +      tem = Fcar (external);
> +      state->comstr_start =
> +	RANGED_INTEGERP (PTRDIFF_MIN, tem, PTRDIFF_MAX) ? XINT (tem) : -1;
> +      external = Fcdr (external);
> +      tem = Fcar (external);
> +      state->levelstarts = tem;
> +
> +      external = Fcdr (external);
> +      tem = Fcar (external);
> +      state->prev_syntax = NILP (tem) ? Smax : XINT (tem);
> +      external = Fcdr (external);
> +      tem = Fcar (external);
> +      state->prev_comment_end = NILP (tem) ? -1 : XINT (tem);
> +    }
>  }
 
>  DEFUN ("parse-partial-sexp", Fparse_partial_sexp, Sparse_partial_sexp, 2, 6, 0,
> @@ -3488,6 +3532,7 @@ Parsing stops at TO or when certain criteria are met;
>   point is set to where parsing stops.
>  If fifth arg OLDSTATE is omitted or nil,
>   parsing assumes that FROM is the beginning of a function.
> +
>  Value is a list of elements describing final state of parsing:
>   0. depth in parens.
>   1. character address of start of innermost containing list; nil if none.
> @@ -3501,16 +3546,20 @@ Value is a list of elements describing final state of parsing:
>   6. the minimum paren-depth encountered during this scan.
>   7. style of comment, if any.
>   8. character address of start of comment or string; nil if not in one.
> - 9. Intermediate data for continuation of parsing (subject to change).
> + 9. List of positions of currently open parens, outermost first.
> +10. Syntax of last character scanned, or nil if no scanning has happened.
> +11. Position after end of previous comment scanned, or nil.
> +12..... Possible further internal information used by `parse-partial-sexp'.
> +
>  If third arg TARGETDEPTH is non-nil, parsing stops if the depth
>  in parentheses becomes equal to TARGETDEPTH.
> -Fourth arg STOPBEFORE non-nil means stop when come to
> +Fourth arg STOPBEFORE non-nil means stop when we come to
>   any character that starts a sexp.
>  Fifth arg OLDSTATE is a list like what this function returns.
>   It is used to initialize the state of the parse.  Elements number 1, 2, 6
>   are ignored.
> -Sixth arg COMMENTSTOP non-nil means stop at the start of a comment.
> - If it is symbol `syntax-table', stop after the start of a comment or a
> +Sixth arg COMMENTSTOP non-nil means stop after the start of a comment.
> + If it is the symbol `syntax-table', stop after the start of a comment or a
>   string, or after end of a comment or a string.  */)
>    (Lisp_Object from, Lisp_Object to, Lisp_Object targetdepth,
>     Lisp_Object stopbefore, Lisp_Object oldstate, Lisp_Object commentstop)
> @@ -3527,15 +3576,17 @@ Sixth arg COMMENTSTOP non-nil means stop at the start of a comment.
>      target = TYPE_MINIMUM (EMACS_INT);	/* We won't reach this depth.  */
 
>    validate_region (&from, &to);
> +  internalize_parse_state (oldstate, &state);
>    scan_sexps_forward (&state, XINT (from), CHAR_TO_BYTE (XINT (from)),
>  		      XINT (to),
> -		      target, !NILP (stopbefore), oldstate,
> +		      target, !NILP (stopbefore),
>  		      (NILP (commentstop)
>  		       ? 0 : (EQ (commentstop, Qsyntax_table) ? -1 : 1)));
 
>    SET_PT_BOTH (state.location, state.location_byte);
 
> -  return Fcons (make_number (state.depth),
> +  return
> +    Fcons (make_number (state.depth),
>  	   Fcons (state.prevlevelstart < 0
>  		  ? Qnil : make_number (state.prevlevelstart),
>  	     Fcons (state.thislevelstart < 0
> @@ -3553,11 +3604,18 @@ Sixth arg COMMENTSTOP non-nil means stop at the start of a comment.
>  				  ? Qsyntax_table
>  				  : make_number (state.comstyle))
>  			       : Qnil),
> -			      Fcons (((state.incomment
> -				       || (state.instring >= 0))
> -				      ? make_number (state.comstr_start)
> -				      : Qnil),
> -				     Fcons (state.levelstarts, Qnil))))))))));
> +		         Fcons (((state.incomment
> +                                  || (state.instring >= 0))
> +                                 ? make_number (state.comstr_start)
> +                                 : Qnil),
> +			   Fcons (state.levelstarts,
> +                             Fcons (state.prev_syntax == Smax
> +                                    ? Qnil
> +                                    : make_number (state.prev_syntax),
> +                               Fcons (state.prev_comment_end == -1
> +                                      ? Qnil
> +                                      : make_number (state.prev_comment_end),
> +                                      Qnil))))))))))));
>  }
>  
>  void


>> Stefan

> -- 
> Alan Mackenzie (Nuremberg, Germany).


From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 18 11:09:13 2016
Received: (at 23019) by debbugs.gnu.org; 18 Mar 2016 15:09:13 +0000
Received: from localhost ([127.0.0.1]:52783 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1agw1Z-0003Jy-0L
	for submit@debbugs.gnu.org; Fri, 18 Mar 2016 11:09:13 -0400
Received: from mail.muc.de ([193.149.48.3]:48652)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <acm@muc.de>) id 1agw1X-0003Jq-RT
 for 23019@debbugs.gnu.org; Fri, 18 Mar 2016 11:09:12 -0400
Received: (qmail 21732 invoked by uid 3782); 18 Mar 2016 15:09:09 -0000
Received: from acm.muc.de (p548A53B1.dip0.t-ipconnect.de [84.138.83.177]) by
 colin.muc.de (tmda-ofmipd) with ESMTP;
 Fri, 18 Mar 2016 16:09:07 +0100
Received: (qmail 9531 invoked by uid 1000); 18 Mar 2016 15:11:55 -0000
Date: Fri, 18 Mar 2016 15:11:55 +0000
To: Stefan Monnier <monnier@IRO.UMontreal.CA>
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
Message-ID: <20160318151154.GA9433@acm.fritz.box>
References: <20160315091355.GA2263@acm.fritz.box>
 <jwvvb4lclp0.fsf-monnier+emacsbug@gnu.org>
 <20160317214934.GB9038@acm.fritz.box>
 <jwvmvpwcsn1.fsf-monnier+emacsbugs@gnu.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <jwvmvpwcsn1.fsf-monnier+emacsbugs@gnu.org>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Delivery-Agent: TMDA/1.1.12 (Macallan)
From: Alan Mackenzie <acm@muc.de>
X-Primary-Address: acm@muc.de
X-Spam-Score: -0.0 (/)
X-Debbugs-Envelope-To: 23019
Cc: 23019@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit@debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request@debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
X-Spam-Score: -0.0 (/)

Hello, Stefan.

On Fri, Mar 18, 2016 at 12:49:07AM -0400, Stefan Monnier wrote:
> > Do this by adding two new fields to the parser state: the syntax of the last
> > character scanned, and the last end of comment scanned.  This should make the
> > parser state complete.

> Thanks.  I like the "syntax of the last character scanned", but I don't
> understand the reasoning behind "last end of comment scanned".  Why is
> this relevant?  Is it in case the "last character scanned" was a "slash
> ending a comment" so as to avoid treating "*/*" as both a comment closer and
> a subsequent opener?

That's exactly the reason.
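
To make the "*/*" case concrete, here is a minimal sketch in C.  It is a
toy scanner written for illustration, not the code in syntax.c: the names
toy_state, toy_scan and toy_ends_in_comment are hypothetical, and
characters stand in for syntax codes.  It assumes C-style "/*" and "*/"
delimiters only.  The extra test in toy_scan mirrors the condition the
patch adds before SYNTAX_FLAGS_COMSTART_FIRST.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy state; only loosely modelled on the patch's
   prev_syntax and prev_comment_end fields.  */
struct toy_state
{
  bool in_comment;
  char prev_char;             /* Previous character, or 0 for "none".  */
  ptrdiff_t prev_comment_end; /* Position after last comment end, or -1.  */
};

/* Scan TEXT from FROM up to (not including) TO, updating *ST.
   If HONOR_END, refuse to start a comment on a '/' that has just
   closed one, as the patch's extra condition does.  */
static void
toy_scan (const char *text, ptrdiff_t from, ptrdiff_t to,
          struct toy_state *st, bool honor_end)
{
  for (ptrdiff_t i = from; i < to; i++)
    {
      char c = text[i];
      if (st->in_comment)
        {
          if (st->prev_char == '*' && c == '/')
            {
              st->in_comment = false;
              st->prev_comment_end = i + 1;
            }
        }
      else if (st->prev_char == '/' && c == '*'
               && (!honor_end
                   || st->prev_comment_end == -1
                   || i - 1 >= st->prev_comment_end))
        {
          st->in_comment = true;
          c = 0;   /* The '*' of the opener must not also close it.  */
        }
      st->prev_char = c;
    }
}

/* Scan TEXT in two steps, split at SPLIT, and report whether the
   scan ends inside a comment.  */
static bool
toy_ends_in_comment (const char *text, ptrdiff_t split, bool honor_end)
{
  ptrdiff_t len = (ptrdiff_t) strlen (text);
  struct toy_state st = { false, 0, -1 };

  assert (0 <= split && split <= len);
  toy_scan (text, 0, split, &st, honor_end);
  toy_scan (text, split, len, &st, honor_end);
  return st.in_comment;
}
```

With the text "/* a */* b" split just after the closing "/", a scan that
only remembers the previous character takes the "/" and the following "*"
as a new comment opener.  A scan that also remembers where the last
comment ended does not.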

> If so, I'm not sure I like it.

I don't really like it either.

> It sounds to me like there's a chance it's actually incomplete (e.g.
> it doesn't address the similar problem when the "last character
> scanned" is an end of a string which also happens to be a valid
> first-char of a comment-starter), and even if it isn't, it "feels
> ad-hoc" to me.

Now even I wouldn't have come up with that end-of-string scenario.  ;-)
Such a scenario is presumably one reason why, in scan_sexps_forward,
two-character comment delimiters are handled before strings.

> Would it be difficult to do the following instead:
> - get rid of element 11.

Done.

> - change element 10 so it's nil if the last char was an "end of
>   something".  Another way to look at it, is that the element 10 should
>   only be non-nil if the "next lexeme" might start on that
>   previous character.

I've tried this, and it's somewhat ugly.  Setting the "previous_syntax"
to nil is also needed for the asterisk in "/*".  The nil would appear to
mean "the syntactic value of the last character has already been used
up".  So the "previous_syntax" is nil in the most interesting cases.  It
also feels somewhat ad-hoc.

How about this idea: element 10 will record the syntax of the previous
character ONLY when it is potentially the first character of a
two-character comment delimiter; otherwise it'll be nil.  At least that's
being honest about what the thing's being used for.
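
As a rough sketch of that rule in C, the exported value could be computed
as below.  Everything here is an assumption made for illustration:
TOY_COMSTART_FIRST and TOY_NIL are hypothetical stand-ins, not Emacs's
real syntax flags or Lisp nil, and the used_up argument is one guess at
how "already used up" might be modelled.

```c
#include <stdbool.h>

/* Hypothetical toy encoding, not Emacs's real syntax flags.  */
enum
{
  TOY_COMSTART_FIRST = 1 << 16, /* May begin a two-character opener.  */
  TOY_NIL = -1                  /* Stands in for Lisp nil.  */
};

/* Return the value to export as element 10: PREV_SYNTAX only if the
   previous character could still begin a two-character comment
   delimiter, TOY_NIL otherwise.  USED_UP says the character has
   already been consumed, e.g. as the end of a comment.  */
static int
toy_element_10 (int prev_syntax, bool used_up)
{
  if (used_up || !(prev_syntax & TOY_COMSTART_FIRST))
    return TOY_NIL;
  return prev_syntax;
}
```

Under this rule the "/" that closes a comment yields nil, so a resumed
scan cannot pair it with a following "*".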

> I also have a side question: IIUC your patch makes the 5th element
> redundant (can be replaced with a test whether "last char syntax" was
> "escape"), is that right?

It would appear to be, yes.  We really can't get rid of element 5,
though, because there will surely be code out there that uses it.  But
if I change element 10 as outlined above, element 5 will no longer be
redundant.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).


From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 18 11:19:35 2016
Received: (at 23019) by debbugs.gnu.org; 18 Mar 2016 15:19:35 +0000
Received: from localhost ([127.0.0.1]:52791 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1agwBb-0003ZA-Cq
	for submit@debbugs.gnu.org; Fri, 18 Mar 2016 11:19:35 -0400
Received: from mail.muc.de ([193.149.48.3]:25701)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <acm@muc.de>) id 1agwBZ-0003Z2-5Y
 for 23019@debbugs.gnu.org; Fri, 18 Mar 2016 11:19:33 -0400
Received: (qmail 24108 invoked by uid 3782); 18 Mar 2016 15:19:32 -0000
Received: from acm.muc.de (p548A53B1.dip0.t-ipconnect.de [84.138.83.177]) by
 colin.muc.de (tmda-ofmipd) with ESMTP;
 Fri, 18 Mar 2016 16:19:31 +0100
Received: (qmail 9578 invoked by uid 1000); 18 Mar 2016 15:22:18 -0000
Date: Fri, 18 Mar 2016 15:22:18 +0000
To: Stefan Monnier <monnier@IRO.UMontreal.CA>
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
Message-ID: <20160318152218.GA9552@acm.fritz.box>
References: <20160315091355.GA2263@acm.fritz.box>
 <jwvvb4lclp0.fsf-monnier+emacsbug@gnu.org>
 <20160317214934.GB9038@acm.fritz.box>
 <jwvmvpwcsn1.fsf-monnier+emacsbugs@gnu.org>
 <20160318151154.GA9433@acm.fritz.box>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160318151154.GA9433@acm.fritz.box>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Delivery-Agent: TMDA/1.1.12 (Macallan)
From: Alan Mackenzie <acm@muc.de>
X-Primary-Address: acm@muc.de
X-Spam-Score: -0.0 (/)
X-Debbugs-Envelope-To: 23019
Cc: 23019@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit@debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request@debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
X-Spam-Score: -0.0 (/)

Hello again, Stefan.

On Fri, Mar 18, 2016 at 03:11:55PM +0000, Alan Mackenzie wrote:

> Done.

> > - change element 10 so it's nil if the last char was an "end of
> >   something".  Another way to look at it, is that the element 10 should
> >   only be non-nil if the "next lexeme" might start on that
> >   previous character.

[ .... ]

> How about this idea: element 10 will record the syntax of the previous
> character ONLY when it is potentially the first character of a
> two-character comment delimiter; otherwise it'll be nil.  At least that's
> being honest about what the thing's being used for.

That's exactly what you suggested.  Apologies for not reading your post
a bit more carefully.  I think we're agreed, then.  I'll implement it.

> >         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).


From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 18 12:23:10 2016
Received: (at 23019) by debbugs.gnu.org; 18 Mar 2016 16:23:10 +0000
Received: from localhost ([127.0.0.1]:52848 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1agxB8-00056u-73
	for submit@debbugs.gnu.org; Fri, 18 Mar 2016 12:23:10 -0400
Received: from ironport2-out.teksavvy.com ([206.248.154.181]:23809)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <monnier@iro.umontreal.ca>) id 1agxB6-00056f-Hy
 for 23019@debbugs.gnu.org; Fri, 18 Mar 2016 12:23:08 -0400
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: A0A+FgA731xV/xSQs2tcgxCEAoVVwwsEAgKBPD0QAQEBAQEBAYEKQQWDXQEBAwFWIwULCw4mEhQYDSSINwjPIwEBAQEGAQEBAR6LOoUFB4QtBbUEI4FmJBwVgVkigngBAQE
X-IPAS-Result: A0A+FgA731xV/xSQs2tcgxCEAoVVwwsEAgKBPD0QAQEBAQEBAYEKQQWDXQEBAwFWIwULCw4mEhQYDSSINwjPIwEBAQEGAQEBAR6LOoUFB4QtBbUEI4FmJBwVgVkigngBAQE
X-IronPort-AV: E=Sophos;i="5.13,465,1427774400"; d="scan'208";a="196675682"
Received: from 107-179-144-20.cpe.teksavvy.com (HELO pastel.home)
 ([107.179.144.20])
 by ironport2-out.teksavvy.com with ESMTP; 18 Mar 2016 12:23:03 -0400
Received: by pastel.home (Postfix, from userid 20848)
 id E37965FE67; Fri, 18 Mar 2016 12:23:02 -0400 (EDT)
From: Stefan Monnier <monnier@iro.umontreal.ca>
To: Alan Mackenzie <acm@muc.de>
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
Message-ID: <jwv8u1f930c.fsf-monnier+emacsbugs@gnu.org>
References: <20160315091355.GA2263@acm.fritz.box>
 <jwvvb4lclp0.fsf-monnier+emacsbug@gnu.org>
 <20160317214934.GB9038@acm.fritz.box>
 <jwvmvpwcsn1.fsf-monnier+emacsbugs@gnu.org>
 <20160318151154.GA9433@acm.fritz.box>
Date: Fri, 18 Mar 2016 12:23:02 -0400
In-Reply-To: <20160318151154.GA9433@acm.fritz.box> (Alan Mackenzie's message
 of "Fri, 18 Mar 2016 15:11:55 +0000")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: 0.3 (/)
X-Debbugs-Envelope-To: 23019
Cc: 23019@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit@debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request@debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
X-Spam-Score: 0.3 (/)

>> It sounds to me like there's a chance it's actually incomplete (e.g.
>> it doesn't address the similar problem when the "last character
>> scanned" is an end of a string which also happens to be a valid
>> first-char of a comment-starter), and even if it isn't, it "feels
>> ad-hoc" to me.
> Now even I wouldn't have come up with that end-of-string scenario.  ;-)

I don't work in embedded systems, but Coq/Agda's total functions force
you to consider all possible cases.

> Such a scenario is presumably one reason why, in scan_sexps_forward, two
> character comment delimiters are handled before strings.

It doesn't handle the exact same situation, but it's closely related
indeed.

>> - change element 10 so it's nil if the last char was an "end of
>> something".  Another way to look at it, is that the element 10 should
>> only be non-nil if the "next lexeme" might start on that
>> previous character.

> I've tried this, and it's somewhat ugly.  Setting the "previous_syntax"
> to nil is also needed for the asterisk in "/*".  The nil would appear to
> mean "the syntactic value of the last character has already been used
> up".  So the "previous_syntax" is nil in the most interesting cases.  It
> also feels somewhat ad-hoc.

> How about this idea: element 10 will record the syntax of the previous
> character ONLY when it is potentially the first character of a two
> character comment delimiter, otherwise it'll be nil.  At least that's
> being honest about what the thing's being used for.

IIUC the only difference between what I (think I) suggested and what
you're proposing is that you want to return nil for the "prev is
backslash" whereas I was suggesting to return non-nil in that case.
[ AFAIK the only two-char elements we handle so far are the comment
delimiters and the backslash escapes.  ]
Do I understand this right?

> It would appear to be, yes.  We really can't get rid of element 5,
> though, because there will surely be code out there that uses it.  But
> if I change element 10 as outlined above, element 5 will no longer be
> redundant.

I'd even be tempted to re-use element 5, although it might
conceivably break some code out there.

But even if we don't re-use element 5, I would actually much prefer to
render element 5 redundant.


        Stefan


From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 18 12:27:50 2016
Received: (at 23019) by debbugs.gnu.org; 18 Mar 2016 16:27:50 +0000
Received: from localhost ([127.0.0.1]:52852 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1agxFd-0005D8-QE
	for submit@debbugs.gnu.org; Fri, 18 Mar 2016 12:27:49 -0400
Received: from ironport2-out.teksavvy.com ([206.248.154.181]:34765)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <monnier@iro.umontreal.ca>) id 1agxFb-0005Cv-Gy
 for 23019@debbugs.gnu.org; Fri, 18 Mar 2016 12:27:47 -0400
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: A0A+FgA731xV/xSQs2tcgxCEAoVVwwsEAgKBPD0QAQEBAQEBAYEKQQWDXQEBBFYjEAsOJhIUGA0kiD/PIwEBAQEGAQEBAR6LOoUFB4QtBbUEI4I7gVkigngBAQE
X-IPAS-Result: A0A+FgA731xV/xSQs2tcgxCEAoVVwwsEAgKBPD0QAQEBAQEBAYEKQQWDXQEBBFYjEAsOJhIUGA0kiD/PIwEBAQEGAQEBAR6LOoUFB4QtBbUEI4I7gVkigngBAQE
X-IronPort-AV: E=Sophos;i="5.13,465,1427774400"; d="scan'208";a="196676235"
Received: from 107-179-144-20.cpe.teksavvy.com (HELO pastel.home)
 ([107.179.144.20])
 by ironport2-out.teksavvy.com with ESMTP; 18 Mar 2016 12:27:36 -0400
Received: by pastel.home (Postfix, from userid 20848)
 id C76E85FE67; Fri, 18 Mar 2016 12:27:36 -0400 (EDT)
From: Stefan Monnier <monnier@iro.umontreal.ca>
To: Alan Mackenzie <acm@muc.de>
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
Message-ID: <jwv37rn92lg.fsf-monnier+emacsbugs@gnu.org>
References: <20160315091355.GA2263@acm.fritz.box>
 <jwvvb4lclp0.fsf-monnier+emacsbug@gnu.org>
 <20160317214934.GB9038@acm.fritz.box>
Date: Fri, 18 Mar 2016 12:27:36 -0400
In-Reply-To: <20160317214934.GB9038@acm.fritz.box> (Alan Mackenzie's message
 of "Thu, 17 Mar 2016 21:49:34 +0000")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: 0.3 (/)
X-Debbugs-Envelope-To: 23019
Cc: 23019@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit@debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request@debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
X-Spam-Score: 0.3 (/)

> (scan_sexps_forward): Remove a redundant state parameter.  Access all `state'
> information via the address parameter `state'.

Have you taken a look at the performance impact of this part of the change?
I don't expect it will make much difference, but I'm actually wondering
whether it makes things slower or faster.


        Stefan


From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 18 14:23:07 2016
Received: (at 23019) by debbugs.gnu.org; 18 Mar 2016 18:23:07 +0000
Received: from localhost ([127.0.0.1]:52943 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1agz3C-00082q-6S
	for submit@debbugs.gnu.org; Fri, 18 Mar 2016 14:23:07 -0400
Received: from mail.muc.de ([193.149.48.3]:10776)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <acm@muc.de>) id 1agz39-00082h-Oe
 for 23019@debbugs.gnu.org; Fri, 18 Mar 2016 14:23:05 -0400
Received: (qmail 68212 invoked by uid 3782); 18 Mar 2016 18:23:02 -0000
Received: from acm.muc.de (p548A53B1.dip0.t-ipconnect.de [84.138.83.177]) by
 colin.muc.de (tmda-ofmipd) with ESMTP;
 Fri, 18 Mar 2016 19:23:01 +0100
Received: (qmail 11281 invoked by uid 1000); 18 Mar 2016 18:25:47 -0000
Date: Fri, 18 Mar 2016 18:25:47 +0000
To: Stefan Monnier <monnier@iro.umontreal.ca>
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
Message-ID: <20160318182547.GB9433@acm.fritz.box>
References: <20160315091355.GA2263@acm.fritz.box>
 <jwvvb4lclp0.fsf-monnier+emacsbug@gnu.org>
 <20160317214934.GB9038@acm.fritz.box>
 <jwvmvpwcsn1.fsf-monnier+emacsbugs@gnu.org>
 <20160318151154.GA9433@acm.fritz.box>
 <jwv8u1f930c.fsf-monnier+emacsbugs@gnu.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <jwv8u1f930c.fsf-monnier+emacsbugs@gnu.org>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Delivery-Agent: TMDA/1.1.12 (Macallan)
From: Alan Mackenzie <acm@muc.de>
X-Primary-Address: acm@muc.de
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 23019
Cc: 23019@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit@debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request@debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
X-Spam-Score: -0.0 (/)

Hello, Stefan.

On Fri, Mar 18, 2016 at 12:23:02PM -0400, Stefan Monnier wrote:

> >> - change element 10 so it's nil if the last char was an "end of
> >> something".  Another way to look at it, is that the element 10 should
> >> only be non-nil if the "next lexeme" might start on that
> >> previous character.

> > I've tried this, and it's somewhat ugly.  Setting the "previous_syntax"
> > to nil is also needed for the asterisk in "/*".  The nil would appear to
> > mean "the syntactic value of the last character has already been used
> > up".  So the "previous_syntax" is nil in the most interesting cases.  It
> > also feels somewhat ad-hoc.

> > How about this idea: element 10 will record the syntax of the previous
> > character ONLY when it is potentially the first character of a two
> > character comment delimiter, otherwise it'll be nil.  At least that's
> > being honest about what the thing's being used for.

> IIUC the only difference between what I (think I) suggested and what
> you're proposing is that you want to return nil for the "prev is
> backslash" whereas I was suggesting to return non-nil in that case.
> [ AFAIK the only two-char elements we handle so far are the comment
> delimiters and the backslash escapes.  ]

We also have Scharquote, which scan_sexps_forward handles identically to
Sescape.
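
As a rough standalone sketch of the rule under discussion (an
illustration, not the code in the patch below): the previous
character's syntax is kept only when it carries a "first char of a
comment starter/ender" flag, or when the scan stopped just after an
escape or char-quote character.  The names prefixed toy_/TOY_ are
hypothetical.  The bit positions 16 and 18 are an assumption inferred
from the 0x50000 mask in the patch, and TOY_NO_SYNTAX stands in for
Smax (nil on the Lisp side).

```c
#include <stdbool.h>

/* Hypothetical stand-in for "no syntax recorded".  */
enum { TOY_NO_SYNTAX = -1 };

/* Assumed flag bits; together they give the 0x50000 mask.  */
#define TOY_COMSTART_FIRST (1 << 16)
#define TOY_COMEND_FIRST   (1 << 18)

/* True if FLAGS marks a possible first char of a comment delimiter.  */
bool
toy_comstartend_first (int flags)
{
  return (flags & (TOY_COMSTART_FIRST | TOY_COMEND_FIRST)) != 0;
}

/* Value to record for the previous character: its syntax when it may
   be the first half of a two character construct, TOY_NO_SYNTAX
   otherwise.  QUOTED means the scan stopped right after an escape or
   char-quote character.  */
int
toy_recorded_syntax (int prev_syntax, bool quoted)
{
  return (toy_comstartend_first (prev_syntax) || quoted)
         ? prev_syntax : TOY_NO_SYNTAX;
}
```

A plain word constituent would therefore yield TOY_NO_SYNTAX, while a
'/' flagged as a comment-start first char, or a trailing backslash,
would be passed through unchanged.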

> Do I understand this right?

Yes, but I've no strong feelings on the matter.

> > It would appear to be, yes.  We really can't get rid of element 5,
> > though, because there will surely be code out there that uses it.  But
> > if I change element 10 as outlined above, element 5 will no longer be
> > redundant.

> I'd even be tempted to re-use element 5, although it might
> conceivably break some code out there.

I have bad feelings about that.  Is it really worth the risk, just to
save one cons cell on a list of which not many instances exist at any
time?

> But even if we don't re-use element 5, I would actually much prefer to
> render element 5 redundant.

OK.  Here's an updated patch which does just that.  Comments would be
welcome.

>         Stefan


Amend parse-partial-sexp correctly to handle two character comment delimiters

Do this by adding a new field to the parser state: the syntax of the last
character scanned, should that be the first char of a (potential) two char
construct, nil otherwise.
This should make the parser state complete.
Also document element 9 of the parser state.  Also refactor the code a bit.

* src/syntax.c (struct lisp_parse_state): Add a new field.
(SYNTAX_FLAGS_COMSTARTEND_FIRST): New function.
(internalize_parse_state): New function, extracted from scan_sexps_forward.
(back_comment): Call internalize_parse_state.
(forw_comment): Return the syntax of the last character scanned to the caller.
(Fforward_comment, scan_lists): New dummy variables, passed to forw_comment.
(scan_sexps_forward): Remove a redundant state parameter.  Access all `state'
information via the address parameter `state'.  Remove the code which converts
from external to internal form of `state'.  Access buffer contents only from
`from' onwards.  Reformulate code at the top of the main loop correctly to
recognize comment openers when starting in the middle of one.  Call
forw_comment with extra argument (for return of final syntax value).
(Fparse_partial_sexp): Document elements 9, 10 of the parser state in the
doc string.  Clarify the doc string in general.  Call
internalize_parse_state.  Take account of the new elements when consing up the
output parser state.

* doc/lispref/syntax.texi (Parser State): Document element 9 and the new
element 10.  Minor wording corrections (remove reference to "trivial cases").
(Low Level Parsing): Minor corrections.


diff --git a/doc/lispref/syntax.texi b/doc/lispref/syntax.texi
index d5a7eba..f81c164 100644
--- a/doc/lispref/syntax.texi
+++ b/doc/lispref/syntax.texi
@@ -791,10 +791,10 @@ Parser State
 @subsection Parser State
 @cindex parser state
 
-  A @dfn{parser state} is a list of ten elements describing the state
-of the syntactic parser, after it parses the text between a specified
-starting point and a specified end point in the buffer.  Parsing
-functions such as @code{syntax-ppss}
+  A @dfn{parser state} is a list of (currently) eleven elements
+describing the state of the syntactic parser, after it parses the text
+between a specified starting point and a specified end point in the
+buffer.  Parsing functions such as @code{syntax-ppss}
 @ifnottex
 (@pxref{Position Parse})
 @end ifnottex
@@ -851,15 +851,20 @@ Parser State
 this element is @code{nil}.
 
 @item
-Internal data for continuing the parsing.  The meaning of this
-data is subject to change; it is used if you pass this list
-as the @var{state} argument to another call.
+The list of the positions of the currently open parentheses, starting
+with the outermost.
+
+@item
+When the last buffer position scanned was the (potential) first
+character of a two character construct (comment delimiter or
+escaped/char-quoted character pair), the @var{syntax-code}
+(@pxref{Syntax Table Internals}) of that position.  Otherwise
+@code{nil}.
 @end enumerate
 
   Elements 1, 2, and 6 are ignored in a state which you pass as an
-argument to continue parsing, and elements 8 and 9 are used only in
-trivial cases.  Those elements are mainly used internally by the
-parser code.
+argument to continue parsing.  Elements 9 and 10 are mainly used
+internally by the parser code.
 
   One additional piece of useful information is available from a
 parser state using this function:
@@ -898,11 +903,11 @@ Low-Level Parsing
 
 If the fourth argument @var{stop-before} is non-@code{nil}, parsing
 stops when it comes to any character that starts a sexp.  If
-@var{stop-comment} is non-@code{nil}, parsing stops when it comes to the
-start of an unnested comment.  If @var{stop-comment} is the symbol
+@var{stop-comment} is non-@code{nil}, parsing stops after the start of
+an unnested comment.  If @var{stop-comment} is the symbol
 @code{syntax-table}, parsing stops after the start of an unnested
-comment or a string, or the end of an unnested comment or a string,
-whichever comes first.
+comment or a string, or after the end of an unnested comment or a
+string, whichever comes first.
 
 If @var{state} is @code{nil}, @var{start} is assumed to be at the top
 level of parenthesis structure, such as the beginning of a function
diff --git a/src/syntax.c b/src/syntax.c
index 249d0d5..e6a1942 100644
--- a/src/syntax.c
+++ b/src/syntax.c
@@ -81,6 +81,11 @@ SYNTAX_FLAGS_COMEND_SECOND (int flags)
   return (flags >> 19) & 1;
 }
 static bool
+SYNTAX_FLAGS_COMSTARTEND_FIRST (int flags)
+{
+  return (flags & 0x50000) != 0;
+}
+static bool
 SYNTAX_FLAGS_PREFIX (int flags)
 {
   return (flags >> 20) & 1;
@@ -153,6 +158,10 @@ struct lisp_parse_state
     ptrdiff_t comstr_start;  /* Position of last comment/string starter.  */
     Lisp_Object levelstarts; /* Char numbers of starts-of-expression
 				of levels (starting from outermost).  */
+    int prev_syntax; /* Syntax of previous position scanned, when
+                        that position (potentially) holds the first char
+                        of a 2-char construct, i.e. comment delimiter
+                        or Sescape, etc.  Smax otherwise. */
   };
 
 /* These variables are a cache for finding the start of a defun.
@@ -176,7 +185,8 @@ static Lisp_Object skip_syntaxes (bool, Lisp_Object, Lisp_Object);
 static Lisp_Object scan_lists (EMACS_INT, EMACS_INT, EMACS_INT, bool);
 static void scan_sexps_forward (struct lisp_parse_state *,
                                 ptrdiff_t, ptrdiff_t, ptrdiff_t, EMACS_INT,
-                                bool, Lisp_Object, int);
+                                bool, int);
+static void internalize_parse_state (Lisp_Object, struct lisp_parse_state *);
 static bool in_classes (int, Lisp_Object);
 static void parse_sexp_propertize (ptrdiff_t charpos);
 
@@ -911,10 +921,11 @@ back_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
 	}
       do
 	{
+          internalize_parse_state (Qnil, &state);
 	  scan_sexps_forward (&state,
 			      defun_start, defun_start_byte,
 			      comment_end, TYPE_MINIMUM (EMACS_INT),
-			      0, Qnil, 0);
+			      0, 0);
 	  defun_start = comment_end;
 	  if (!adjusted)
 	    {
@@ -2314,7 +2325,9 @@ in_classes (int c, Lisp_Object iso_classes)
    into *CHARPOS_PTR and the corresponding bytepos into *BYTEPOS_PTR.
    Else, return false and store the charpos STOP into *CHARPOS_PTR, the
    corresponding bytepos into *BYTEPOS_PTR and the current nesting
-   (as defined for state.incomment) in *INCOMMENT_PTR.
+   (as defined for state->incomment) in *INCOMMENT_PTR.  The
+   SYNTAX_WITH_FLAGS of the last character scanned in the comment is
+   stored into *last_syntax_ptr.
 
    The comment end is the last character of the comment rather than the
    character just after the comment.
@@ -2326,7 +2339,7 @@ static bool
 forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
 	      EMACS_INT nesting, int style, int prev_syntax,
 	      ptrdiff_t *charpos_ptr, ptrdiff_t *bytepos_ptr,
-	      EMACS_INT *incomment_ptr)
+	      EMACS_INT *incomment_ptr, int *last_syntax_ptr)
 {
   register int c, c1;
   register enum syntaxcode code;
@@ -2346,6 +2359,7 @@ forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
 	  *incomment_ptr = nesting;
 	  *charpos_ptr = from;
 	  *bytepos_ptr = from_byte;
+          *last_syntax_ptr = syntax;
 	  return 0;
 	}
       c = FETCH_CHAR_AS_MULTIBYTE (from_byte);
@@ -2415,6 +2429,7 @@ forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
     }
   *charpos_ptr = from;
   *bytepos_ptr = from_byte;
+  *last_syntax_ptr = syntax;
   return 1;
 }
 
@@ -2436,6 +2451,7 @@ between them, return t; otherwise return nil.  */)
   EMACS_INT count1;
   ptrdiff_t out_charpos, out_bytepos;
   EMACS_INT dummy;
+  int dummy2;
 
   CHECK_NUMBER (count);
   count1 = XINT (count);
@@ -2499,7 +2515,7 @@ between them, return t; otherwise return nil.  */)
 	}
       /* We're at the start of a comment.  */
       found = forw_comment (from, from_byte, stop, comnested, comstyle, 0,
-			    &out_charpos, &out_bytepos, &dummy);
+			    &out_charpos, &out_bytepos, &dummy, &dummy2);
       from = out_charpos; from_byte = out_bytepos;
       if (!found)
 	{
@@ -2659,6 +2675,7 @@ scan_lists (EMACS_INT from, EMACS_INT count, EMACS_INT depth, bool sexpflag)
   ptrdiff_t from_byte;
   ptrdiff_t out_bytepos, out_charpos;
   EMACS_INT dummy;
+  int dummy2;
   bool multibyte_symbol_p = sexpflag && multibyte_syntax_as_symbol;
 
   if (depth > 0) min_depth = 0;
@@ -2755,7 +2772,8 @@ scan_lists (EMACS_INT from, EMACS_INT count, EMACS_INT depth, bool sexpflag)
 	      UPDATE_SYNTAX_TABLE_FORWARD (from);
 	      found = forw_comment (from, from_byte, stop,
 				    comnested, comstyle, 0,
-				    &out_charpos, &out_bytepos, &dummy);
+				    &out_charpos, &out_bytepos, &dummy,
+                                    &dummy2);
 	      from = out_charpos, from_byte = out_bytepos;
 	      if (!found)
 		{
@@ -3119,7 +3137,7 @@ the prefix syntax flag (p).  */)
 }
 
 /* Parse forward from FROM / FROM_BYTE to END,
-   assuming that FROM has state OLDSTATE (nil means FROM is start of function),
+   assuming that FROM has state STATE,
    and return a description of the state of the parse at END.
    If STOPBEFORE, stop at the start of an atom.
    If COMMENTSTOP is 1, stop at the start of a comment.
@@ -3127,12 +3145,11 @@ the prefix syntax flag (p).  */)
    after the beginning of a string, or after the end of a string.  */
 
 static void
-scan_sexps_forward (struct lisp_parse_state *stateptr,
+scan_sexps_forward (struct lisp_parse_state *state,
 		    ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t end,
 		    EMACS_INT targetdepth, bool stopbefore,
-		    Lisp_Object oldstate, int commentstop)
+		    int commentstop)
 {
-  struct lisp_parse_state state;
   enum syntaxcode code;
   int c1;
   bool comnested;
@@ -3148,7 +3165,7 @@ scan_sexps_forward (struct lisp_parse_state *stateptr,
   Lisp_Object tem;
   ptrdiff_t prev_from;		/* Keep one character before FROM.  */
   ptrdiff_t prev_from_byte;
-  int prev_from_syntax;
+  int prev_from_syntax, prev_prev_from_syntax;
   bool boundary_stop = commentstop == -1;
   bool nofence;
   bool found;
@@ -3165,6 +3182,7 @@ scan_sexps_forward (struct lisp_parse_state *stateptr,
 do { prev_from = from;				\
      prev_from_byte = from_byte; 		\
      temp = FETCH_CHAR_AS_MULTIBYTE (prev_from_byte);	\
+     prev_prev_from_syntax = prev_from_syntax;  \
      prev_from_syntax = SYNTAX_WITH_FLAGS (temp); \
      INC_BOTH (from, from_byte);		\
      if (from < end)				\
@@ -3174,88 +3192,38 @@ do { prev_from = from;				\
   immediate_quit = 1;
   QUIT;
 
-  if (NILP (oldstate))
-    {
-      depth = 0;
-      state.instring = -1;
-      state.incomment = 0;
-      state.comstyle = 0;	/* comment style a by default.  */
-      state.comstr_start = -1;	/* no comment/string seen.  */
-    }
-  else
-    {
-      tem = Fcar (oldstate);
-      if (!NILP (tem))
-	depth = XINT (tem);
-      else
-	depth = 0;
-
-      oldstate = Fcdr (oldstate);
-      oldstate = Fcdr (oldstate);
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      /* Check whether we are inside string_fence-style string: */
-      state.instring = (!NILP (tem)
-			? (CHARACTERP (tem) ? XFASTINT (tem) : ST_STRING_STYLE)
-			: -1);
+  depth = state->depth;
+  start_quoted = state->quoted;
+  prev_prev_from_syntax = Smax;
+  prev_from_syntax = state->prev_syntax;
 
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      state.incomment = (!NILP (tem)
-			 ? (INTEGERP (tem) ? XINT (tem) : -1)
-			 : 0);
-
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      start_quoted = !NILP (tem);
-
-      /* if the eighth element of the list is nil, we are in comment
-	 style a.  If it is non-nil, we are in comment style b */
-      oldstate = Fcdr (oldstate);
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      state.comstyle = (NILP (tem)
-			? 0
-			: (RANGED_INTEGERP (0, tem, ST_COMMENT_STYLE)
-			   ? XINT (tem)
-			   : ST_COMMENT_STYLE));
-
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      state.comstr_start =
-	RANGED_INTEGERP (PTRDIFF_MIN, tem, PTRDIFF_MAX) ? XINT (tem) : -1;
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      while (!NILP (tem))		/* >= second enclosing sexps.  */
-	{
-	  Lisp_Object temhd = Fcar (tem);
-	  if (RANGED_INTEGERP (PTRDIFF_MIN, temhd, PTRDIFF_MAX))
-	    curlevel->last = XINT (temhd);
-	  if (++curlevel == endlevel)
-	    curlevel--; /* error ("Nesting too deep for parser"); */
-	  curlevel->prev = -1;
-	  curlevel->last = -1;
-	  tem = Fcdr (tem);
-	}
+  tem = state->levelstarts;
+  while (!NILP (tem))		/* >= second enclosing sexps.  */
+    {
+      Lisp_Object temhd = Fcar (tem);
+      if (RANGED_INTEGERP (PTRDIFF_MIN, temhd, PTRDIFF_MAX))
+        curlevel->last = XINT (temhd);
+      if (++curlevel == endlevel)
+        curlevel--; /* error ("Nesting too deep for parser"); */
+      curlevel->prev = -1;
+      curlevel->last = -1;
+      tem = Fcdr (tem);
     }
-  state.quoted = 0;
-  mindepth = depth;
-
   curlevel->prev = -1;
   curlevel->last = -1;
 
-  SETUP_SYNTAX_TABLE (prev_from, 1);
-  temp = FETCH_CHAR (prev_from_byte);
-  prev_from_syntax = SYNTAX_WITH_FLAGS (temp);
-  UPDATE_SYNTAX_TABLE_FORWARD (from);
+  state->quoted = 0;
+  mindepth = depth;
+
+  SETUP_SYNTAX_TABLE (from, 1);
 
   /* Enter the loop at a place appropriate for initial state.  */
 
-  if (state.incomment)
+  if (state->incomment)
     goto startincomment;
-  if (state.instring >= 0)
+  if (state->instring >= 0)
     {
-      nofence = state.instring != ST_STRING_STYLE;
+      nofence = state->instring != ST_STRING_STYLE;
       if (start_quoted)
 	goto startquotedinstring;
       goto startinstring;
@@ -3266,11 +3234,8 @@ do { prev_from = from;				\
   while (from < end)
     {
       int syntax;
-      INC_FROM;
-      code = prev_from_syntax & 0xff;
 
-      if (from < end
-	  && SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
+      if (SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
 	  && (c1 = FETCH_CHAR (from_byte),
 	      syntax = SYNTAX_WITH_FLAGS (c1),
 	      SYNTAX_FLAGS_COMSTART_SECOND (syntax)))
@@ -3280,32 +3245,39 @@ do { prev_from = from;				\
 	  /* Record the comment style we have entered so that only
 	     the comment-end sequence of the same style actually
 	     terminates the comment section.  */
-	  state.comstyle
+	  state->comstyle
 	    = SYNTAX_FLAGS_COMMENT_STYLE (syntax, prev_from_syntax);
 	  comnested = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax)
 		       | SYNTAX_FLAGS_COMMENT_NESTED (syntax));
-	  state.incomment = comnested ? 1 : -1;
-	  state.comstr_start = prev_from;
+	  state->incomment = comnested ? 1 : -1;
+	  state->comstr_start = prev_from;
 	  INC_FROM;
+          prev_from_syntax = Smax; /* the syntax has already been
+                                      "used up". */
 	  code = Scomment;
 	}
-      else if (code == Scomment_fence)
-	{
-	  /* Record the comment style we have entered so that only
-	     the comment-end sequence of the same style actually
-	     terminates the comment section.  */
-	  state.comstyle = ST_COMMENT_STYLE;
-	  state.incomment = -1;
-	  state.comstr_start = prev_from;
-	  code = Scomment;
-	}
-      else if (code == Scomment)
-	{
-	  state.comstyle = SYNTAX_FLAGS_COMMENT_STYLE (prev_from_syntax, 0);
-	  state.incomment = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) ?
-			     1 : -1);
-	  state.comstr_start = prev_from;
-	}
+      else
+        {
+          INC_FROM;
+          code = prev_from_syntax & 0xff;
+          if (code == Scomment_fence)
+            {
+              /* Record the comment style we have entered so that only
+                 the comment-end sequence of the same style actually
+                 terminates the comment section.  */
+              state->comstyle = ST_COMMENT_STYLE;
+              state->incomment = -1;
+              state->comstr_start = prev_from;
+              code = Scomment;
+            }
+          else if (code == Scomment)
+            {
+              state->comstyle = SYNTAX_FLAGS_COMMENT_STYLE (prev_from_syntax, 0);
+              state->incomment = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) ?
+                                 1 : -1);
+              state->comstr_start = prev_from;
+            }
+        }
 
       if (SYNTAX_FLAGS_PREFIX (prev_from_syntax))
 	continue;
@@ -3350,25 +3322,28 @@ do { prev_from = from;				\
 
 	case Scomment_fence: /* Can't happen because it's handled above.  */
 	case Scomment:
-	  if (commentstop || boundary_stop) goto done;
+          if (commentstop || boundary_stop) goto done;
 	startincomment:
 	  /* The (from == BEGV) test was to enter the loop in the middle so
 	     that we find a 2-char comment ender even if we start in the
 	     middle of it.  We don't want to do that if we're just at the
 	     beginning of the comment (think of (*) ... (*)).  */
 	  found = forw_comment (from, from_byte, end,
-				state.incomment, state.comstyle,
-				(from == BEGV || from < state.comstr_start + 3)
-				? 0 : prev_from_syntax,
-				&out_charpos, &out_bytepos, &state.incomment);
+				state->incomment, state->comstyle,
+				from == BEGV ? 0 : prev_from_syntax,
+				&out_charpos, &out_bytepos, &state->incomment,
+                                &prev_from_syntax);
 	  from = out_charpos; from_byte = out_bytepos;
-	  /* Beware!  prev_from and friends are invalid now.
-	     Luckily, the `done' doesn't use them and the INC_FROM
-	     sets them to a sane value without looking at them. */
+	  /* Beware!  prev_from and friends (except prev_from_syntax)
+	     are invalid now.  Luckily, the `done' doesn't use them
+	     and the INC_FROM sets them to a sane value without
+	     looking at them. */
 	  if (!found) goto done;
 	  INC_FROM;
-	  state.incomment = 0;
-	  state.comstyle = 0;	/* reset the comment style */
+	  state->incomment = 0;
+	  state->comstyle = 0;	/* reset the comment style */
+          prev_from_syntax = Smax; /* Ensure "*|*" can't open a spurious new
+                                      comment. */
 	  if (boundary_stop) goto done;
 	  break;
 
@@ -3396,16 +3371,16 @@ do { prev_from = from;				\
 
 	case Sstring:
 	case Sstring_fence:
-	  state.comstr_start = from - 1;
+	  state->comstr_start = from - 1;
 	  if (stopbefore) goto stop;  /* this arg means stop at sexp start */
 	  curlevel->last = prev_from;
-	  state.instring = (code == Sstring
+	  state->instring = (code == Sstring
 			    ? (FETCH_CHAR_AS_MULTIBYTE (prev_from_byte))
 			    : ST_STRING_STYLE);
 	  if (boundary_stop) goto done;
 	startinstring:
 	  {
-	    nofence = state.instring != ST_STRING_STYLE;
+	    nofence = state->instring != ST_STRING_STYLE;
 
 	    while (1)
 	      {
@@ -3419,7 +3394,7 @@ do { prev_from = from;				\
 		/* Check C_CODE here so that if the char has
 		   a syntax-table property which says it is NOT
 		   a string character, it does not end the string.  */
-		if (nofence && c == state.instring && c_code == Sstring)
+		if (nofence && c == state->instring && c_code == Sstring)
 		  break;
 
 		switch (c_code)
@@ -3442,7 +3417,7 @@ do { prev_from = from;				\
 	      }
 	  }
 	string_end:
-	  state.instring = -1;
+	  state->instring = -1;
 	  curlevel->prev = curlevel->last;
 	  INC_FROM;
 	  if (boundary_stop) goto done;
@@ -3461,25 +3436,96 @@ do { prev_from = from;				\
  stop:   /* Here if stopping before start of sexp. */
   from = prev_from;    /* We have just fetched the char that starts it; */
   from_byte = prev_from_byte;
+  prev_from_syntax = prev_prev_from_syntax;
   goto done; /* but return the position before it. */
 
  endquoted:
-  state.quoted = 1;
+  state->quoted = 1;
  done:
-  state.depth = depth;
-  state.mindepth = mindepth;
-  state.thislevelstart = curlevel->prev;
-  state.prevlevelstart
+  state->depth = depth;
+  state->mindepth = mindepth;
+  state->thislevelstart = curlevel->prev;
+  state->prevlevelstart
     = (curlevel == levelstart) ? -1 : (curlevel - 1)->last;
-  state.location = from;
-  state.location_byte = from_byte;
-  state.levelstarts = Qnil;
+  state->location = from;
+  state->location_byte = from_byte;
+  state->levelstarts = Qnil;
   while (curlevel > levelstart)
-    state.levelstarts = Fcons (make_number ((--curlevel)->last),
-			       state.levelstarts);
+    state->levelstarts = Fcons (make_number ((--curlevel)->last),
+                                state->levelstarts);
+  state->prev_syntax = (SYNTAX_FLAGS_COMSTARTEND_FIRST (prev_from_syntax)
+                        || state->quoted) ? prev_from_syntax : Smax;
   immediate_quit = 0;
+}
+
+/* Convert a (lisp) parse state to the internal form used in
+   scan_sexps_forward.  */
+static void
+internalize_parse_state (Lisp_Object external, struct lisp_parse_state *state)
+{
+  Lisp_Object tem;
+
+  if (NILP (external))
+    {
+      state->depth = 0;
+      state->instring = -1;
+      state->incomment = 0;
+      state->quoted = 0;
+      state->comstyle = 0;	/* comment style a by default.  */
+      state->comstr_start = -1;	/* no comment/string seen.  */
+      state->levelstarts = Qnil;
+      state->prev_syntax = Smax;
+    }
+  else
+    {
+      tem = Fcar (external);
+      if (!NILP (tem))
+	state->depth = XINT (tem);
+      else
+	state->depth = 0;
+
+      external = Fcdr (external);
+      external = Fcdr (external);
+      external = Fcdr (external);
+      tem = Fcar (external);
+      /* Check whether we are inside string_fence-style string: */
+      state->instring = (!NILP (tem)
+                         ? (CHARACTERP (tem) ? XFASTINT (tem) : ST_STRING_STYLE)
+                         : -1);
+
+      external = Fcdr (external);
+      tem = Fcar (external);
+      state->incomment = (!NILP (tem)
+                          ? (INTEGERP (tem) ? XINT (tem) : -1)
+                          : 0);
+
+      external = Fcdr (external);
+      tem = Fcar (external);
+      state->quoted = !NILP (tem);
 
-  *stateptr = state;
+      /* If the eighth element of the list is nil, we are in comment
+	 style a.  If it is non-nil, we are in comment style b.  */
+      external = Fcdr (external);
+      external = Fcdr (external);
+      tem = Fcar (external);
+      state->comstyle = (NILP (tem)
+                         ? 0
+                         : (RANGED_INTEGERP (0, tem, ST_COMMENT_STYLE)
+                            ? XINT (tem)
+                            : ST_COMMENT_STYLE));
+
+      external = Fcdr (external);
+      tem = Fcar (external);
+      state->comstr_start =
+	RANGED_INTEGERP (PTRDIFF_MIN, tem, PTRDIFF_MAX) ? XINT (tem) : -1;
+      external = Fcdr (external);
+      tem = Fcar (external);
+      state->levelstarts = tem;
+
+      external = Fcdr (external);
+      tem = Fcar (external);
+      state->prev_syntax = NILP (tem) ? Smax : XINT (tem);
+    }
 }
 
 DEFUN ("parse-partial-sexp", Fparse_partial_sexp, Sparse_partial_sexp, 2, 6, 0,
@@ -3488,6 +3534,7 @@ Parsing stops at TO or when certain criteria are met;
  point is set to where parsing stops.
 If fifth arg OLDSTATE is omitted or nil,
  parsing assumes that FROM is the beginning of a function.
+
 Value is a list of elements describing final state of parsing:
  0. depth in parens.
  1. character address of start of innermost containing list; nil if none.
@@ -3501,16 +3548,22 @@ Value is a list of elements describing final state of parsing:
  6. the minimum paren-depth encountered during this scan.
  7. style of comment, if any.
  8. character address of start of comment or string; nil if not in one.
- 9. Intermediate data for continuation of parsing (subject to change).
+ 9. List of positions of currently open parens, outermost first.
+10. When the last position scanned holds the first character of a
+    (potential) two character construct, the syntax of that position,
+    otherwise nil.  That construct can be a two character comment
+    delimiter or an Escaped or Char-quoted character.
+11..... Possible further internal information used by `parse-partial-sexp'.
+
 If third arg TARGETDEPTH is non-nil, parsing stops if the depth
 in parentheses becomes equal to TARGETDEPTH.
-Fourth arg STOPBEFORE non-nil means stop when come to
+Fourth arg STOPBEFORE non-nil means stop when we come to
  any character that starts a sexp.
 Fifth arg OLDSTATE is a list like what this function returns.
  It is used to initialize the state of the parse.  Elements number 1, 2, 6
  are ignored.
-Sixth arg COMMENTSTOP non-nil means stop at the start of a comment.
- If it is symbol `syntax-table', stop after the start of a comment or a
+Sixth arg COMMENTSTOP non-nil means stop after the start of a comment.
+ If it is the symbol `syntax-table', stop after the start of a comment or a
  string, or after end of a comment or a string.  */)
   (Lisp_Object from, Lisp_Object to, Lisp_Object targetdepth,
    Lisp_Object stopbefore, Lisp_Object oldstate, Lisp_Object commentstop)
@@ -3527,15 +3580,17 @@ Sixth arg COMMENTSTOP non-nil means stop at the start of a comment.
     target = TYPE_MINIMUM (EMACS_INT);	/* We won't reach this depth.  */
 
   validate_region (&from, &to);
+  internalize_parse_state (oldstate, &state);
   scan_sexps_forward (&state, XINT (from), CHAR_TO_BYTE (XINT (from)),
 		      XINT (to),
-		      target, !NILP (stopbefore), oldstate,
+		      target, !NILP (stopbefore),
 		      (NILP (commentstop)
 		       ? 0 : (EQ (commentstop, Qsyntax_table) ? -1 : 1)));
 
   SET_PT_BOTH (state.location, state.location_byte);
 
-  return Fcons (make_number (state.depth),
+  return
+    Fcons (make_number (state.depth),
 	   Fcons (state.prevlevelstart < 0
 		  ? Qnil : make_number (state.prevlevelstart),
 	     Fcons (state.thislevelstart < 0
@@ -3553,11 +3608,15 @@ Sixth arg COMMENTSTOP non-nil means stop at the start of a comment.
 				  ? Qsyntax_table
 				  : make_number (state.comstyle))
 			       : Qnil),
-			      Fcons (((state.incomment
-				       || (state.instring >= 0))
-				      ? make_number (state.comstr_start)
-				      : Qnil),
-				     Fcons (state.levelstarts, Qnil))))))))));
+		         Fcons (((state.incomment
+                                  || (state.instring >= 0))
+                                 ? make_number (state.comstr_start)
+                                 : Qnil),
+			   Fcons (state.levelstarts,
+                             Fcons (state.prev_syntax == Smax
+                                    ? Qnil
+                                    : make_number (state.prev_syntax),
+                                Qnil)))))))))));
 }
 
 void


-- 
Alan Mackenzie (Nuremberg, Germany).


From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 18 15:13:51 2016
Received: (at 23019) by debbugs.gnu.org; 18 Mar 2016 19:13:51 +0000
Received: from localhost ([127.0.0.1]:52954 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1agzqJ-0000mU-8I
	for submit@debbugs.gnu.org; Fri, 18 Mar 2016 15:13:51 -0400
Received: from mail.muc.de ([193.149.48.3]:19511)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <acm@muc.de>) id 1agzqH-0000mM-CF
 for 23019@debbugs.gnu.org; Fri, 18 Mar 2016 15:13:49 -0400
Received: (qmail 84555 invoked by uid 3782); 18 Mar 2016 19:13:48 -0000
Received: from acm.muc.de (p548A53B1.dip0.t-ipconnect.de [84.138.83.177]) by
 colin.muc.de (tmda-ofmipd) with ESMTP;
 Fri, 18 Mar 2016 20:13:47 +0100
Received: (qmail 11916 invoked by uid 1000); 18 Mar 2016 19:16:33 -0000
Date: Fri, 18 Mar 2016 19:16:33 +0000
To: Stefan Monnier <monnier@iro.umontreal.ca>
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
Message-ID: <20160318191633.GC9433@acm.fritz.box>
References: <20160315091355.GA2263@acm.fritz.box>
 <jwvvb4lclp0.fsf-monnier+emacsbug@gnu.org>
 <20160317214934.GB9038@acm.fritz.box>
 <jwv37rn92lg.fsf-monnier+emacsbugs@gnu.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <jwv37rn92lg.fsf-monnier+emacsbugs@gnu.org>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Delivery-Agent: TMDA/1.1.12 (Macallan)
From: Alan Mackenzie <acm@muc.de>
X-Primary-Address: acm@muc.de
X-Spam-Score: -0.0 (/)
X-Debbugs-Envelope-To: 23019
Cc: 23019@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit@debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request@debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
X-Spam-Score: -0.0 (/)

Hello, Stefan.

On Fri, Mar 18, 2016 at 12:27:36PM -0400, Stefan Monnier wrote:
> > (scan_sexps_forward): Remove a redundant state parameter.  Access all `state'
> > information via the address parameter `state'.

> Have you taken a look at the performance impact of this part of the change?
> I don't expect it will make much difference, but I'm actually wondering
> whether it makes things slower or faster.

I didn't give all that much thought to it.  With a "local" state,
state.field will be addressed as a constant offset from the stack frame
base register.  With a "remote" state, state->field will be addressed as
a constant offset from some address register.  Provided the processor
has enough registers available, it shouldn't make a difference.  But on
an architecture with a restricted set of registers (?old 80x86), it might
make things slower if an address register needs to be repeatedly loaded,
or even repeatedly stacked around function calls.
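
To make the comparison concrete, here is a minimal sketch of the two access
patterns.  The names (toy_state, scan_local, scan_remote) are hypothetical
stand-ins, not the code in syntax.c, and which form ends up faster depends
on the compiler and the target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct lisp_parse_state; not the real layout.  */
struct toy_state
{
  int depth;
  int mindepth;
};

/* "Local" style: work on a copy in this function's stack frame
   (state.field), then copy the whole struct back at the end.  */
void
scan_local (struct toy_state *stateptr, const char *text)
{
  assert (stateptr != NULL && text != NULL);
  struct toy_state state = *stateptr;

  for (; *text; text++)
    {
      if (*text == '(')
        state.depth++;
      else if (*text == ')')
        {
          state.depth--;
          if (state.depth < state.mindepth)
            state.mindepth = state.depth;
        }
    }
  *stateptr = state;
}

/* "Remote" style: update the caller's struct through the pointer
   (state->field); no final copy is needed.  */
void
scan_remote (struct toy_state *state, const char *text)
{
  assert (state != NULL && text != NULL);

  for (; *text; text++)
    {
      if (*text == '(')
        state->depth++;
      else if (*text == ')')
        {
          state->depth--;
          if (state->depth < state->mindepth)
            state->mindepth = state->depth;
        }
    }
}
```

Both functions leave the same values in the caller's struct; only the
addressing of the fields inside the loop differs.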

I'm going to try timing it both ways:  (parse-partial-sexp (point-min)
(point-max)) on xdisp.c (what else?):

Code with "->": 0.03793740272521973 seconds.
Code with "." : 0.03828787803649902 seconds.

So, at least on my machine, the "indirect" version is faster, by around
1%.  Not a great difference, but I'm surprised by the way it went.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).


From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 18 15:36:50 2016
Received: (at 23019) by debbugs.gnu.org; 18 Mar 2016 19:36:50 +0000
Received: from localhost ([127.0.0.1]:52959 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1ah0CY-0001JU-61
	for submit@debbugs.gnu.org; Fri, 18 Mar 2016 15:36:50 -0400
Received: from ironport2-out.teksavvy.com ([206.248.154.181]:50229)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <monnier@iro.umontreal.ca>) id 1ah0CW-0001JG-EH
 for 23019@debbugs.gnu.org; Fri, 18 Mar 2016 15:36:49 -0400
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: A0A+FgA731xV/xSQs2tcgxCEAoVVwwsEAgKBPD0QAQEBAQEBAYEKQQWDXQEBAwFWIwULCw4mEhQYDSSINwjPIwEBAQEGAQEBAR6LOoUFB4QtBbUEI4I7gVkigngBAQE
X-IPAS-Result: A0A+FgA731xV/xSQs2tcgxCEAoVVwwsEAgKBPD0QAQEBAQEBAYEKQQWDXQEBAwFWIwULCw4mEhQYDSSINwjPIwEBAQEGAQEBAR6LOoUFB4QtBbUEI4I7gVkigngBAQE
X-IronPort-AV: E=Sophos;i="5.13,465,1427774400"; d="scan'208";a="196706370"
Received: from 107-179-144-20.cpe.teksavvy.com (HELO pastel.home)
 ([107.179.144.20])
 by ironport2-out.teksavvy.com with ESMTP; 18 Mar 2016 15:36:41 -0400
Received: by pastel.home (Postfix, from userid 20848)
 id 066055FE67; Fri, 18 Mar 2016 15:36:41 -0400 (EDT)
From: Stefan Monnier <monnier@iro.umontreal.ca>
To: Alan Mackenzie <acm@muc.de>
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
Message-ID: <jwvoaab60mh.fsf-monnier+emacsbugs@gnu.org>
References: <20160315091355.GA2263@acm.fritz.box>
 <jwvvb4lclp0.fsf-monnier+emacsbug@gnu.org>
 <20160317214934.GB9038@acm.fritz.box>
 <jwvmvpwcsn1.fsf-monnier+emacsbugs@gnu.org>
 <20160318151154.GA9433@acm.fritz.box>
 <jwv8u1f930c.fsf-monnier+emacsbugs@gnu.org>
 <20160318182547.GB9433@acm.fritz.box>
Date: Fri, 18 Mar 2016 15:36:40 -0400
In-Reply-To: <20160318182547.GB9433@acm.fritz.box> (Alan Mackenzie's message
 of "Fri, 18 Mar 2016 18:25:47 +0000")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: 0.3 (/)
X-Debbugs-Envelope-To: 23019
Cc: 23019@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit@debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request@debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
X-Spam-Score: 0.3 (/)

> We also have Scharquote, which scan_sexps_forward handles identically to
> Sescape.

Yes, it's two syntax codes which are 100% equivalent.  An accident of
history I guess.

> I have bad feelings about that.  Is it really worth the risk, just to
> save one cons cell on a list that not that many instances of exist at
> any time?

As you know, I like to take short term risks for long term benefits.

>> But even if we don't re-use element 5, I would actually much prefer to
>> render element 5 redundant.
> OK.  Here's an updated patch which does just that.  Comments would be
> welcome.

I'll take a closer look later, thanks.


        Stefan


From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 18 15:40:41 2016
Received: (at 23019) by debbugs.gnu.org; 18 Mar 2016 19:40:41 +0000
Received: from localhost ([127.0.0.1]:52963 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1ah0GH-0001Ox-M9
	for submit@debbugs.gnu.org; Fri, 18 Mar 2016 15:40:41 -0400
Received: from ironport2-out.teksavvy.com ([206.248.154.181]:51628)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <monnier@iro.umontreal.ca>) id 1ah0GF-0001Ok-R9
 for 23019@debbugs.gnu.org; Fri, 18 Mar 2016 15:40:40 -0400
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: A0A3FgA731xV/xSQs2tcgxCEAoVVwwsEAgKBPDwRAQEBAQEBAYEKQQWDXQEBAwEnLyMFCwsOBCISFBgNEBSINwjPIwEBAQEGAQEBAR6LOoUFB4QtBbUEI4I7gVkigngBAQE
X-IPAS-Result: A0A3FgA731xV/xSQs2tcgxCEAoVVwwsEAgKBPDwRAQEBAQEBAYEKQQWDXQEBAwEnLyMFCwsOBCISFBgNEBSINwjPIwEBAQEGAQEBAR6LOoUFB4QtBbUEI4I7gVkigngBAQE
X-IronPort-AV: E=Sophos;i="5.13,465,1427774400"; d="scan'208";a="196706977"
Received: from 107-179-144-20.cpe.teksavvy.com (HELO pastel.home)
 ([107.179.144.20])
 by ironport2-out.teksavvy.com with ESMTP; 18 Mar 2016 15:40:34 -0400
Received: by pastel.home (Postfix, from userid 20848)
 id 3BC065FE67; Fri, 18 Mar 2016 15:40:34 -0400 (EDT)
From: Stefan Monnier <monnier@iro.umontreal.ca>
To: Alan Mackenzie <acm@muc.de>
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
Message-ID: <jwvio0j60if.fsf-monnier+emacsbugs@gnu.org>
References: <20160315091355.GA2263@acm.fritz.box>
 <jwvvb4lclp0.fsf-monnier+emacsbug@gnu.org>
 <20160317214934.GB9038@acm.fritz.box>
 <jwv37rn92lg.fsf-monnier+emacsbugs@gnu.org>
 <20160318191633.GC9433@acm.fritz.box>
Date: Fri, 18 Mar 2016 15:40:34 -0400
In-Reply-To: <20160318191633.GC9433@acm.fritz.box> (Alan Mackenzie's message
 of "Fri, 18 Mar 2016 19:16:33 +0000")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: 0.3 (/)
X-Debbugs-Envelope-To: 23019
Cc: 23019@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit@debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request@debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
X-Spam-Score: 0.3 (/)

> I didn't give all that much thought to it.  With a "local" state,
> state.field will be addressed as a constant offset from the stack frame
> base register.  With a "remote" state, state->field will be addressed as
> a constant offset from some address register.  Provided the processor
> has enough registers available, it shouldn't make a difference.  But on
> an architecture with a restricted set of registers (?old 80x86), it might
> make things slower if an address register needs to be repeatedly loaded,
> or even repeatedly stacked around function calls.

That was my first reaction as well.  But my other self was telling me "I
can't say why, but my gut feeling says that this code is 'cleaner'
and should hence be easier to optimize".

> So, at least on my machine, the "indirect" version is faster, by
> around 1%.  Not a great difference, but I'm surprised by the way
> it went.

Thanks for the test.  As expected, it's a wash, but it's good to confirm
that the cleaner version is at least no slower.


        Stefan


From debbugs-submit-bounces@debbugs.gnu.org Sat Mar 19 13:03:42 2016
Received: (at 23019) by debbugs.gnu.org; 19 Mar 2016 17:03:42 +0000
Received: from localhost ([127.0.0.1]:53871 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1ahKHu-00008C-59
	for submit@debbugs.gnu.org; Sat, 19 Mar 2016 13:03:42 -0400
Received: from mail.muc.de ([193.149.48.3]:61345)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <acm@muc.de>) id 1ahKHs-000083-EW
 for 23019@debbugs.gnu.org; Sat, 19 Mar 2016 13:03:40 -0400
Received: (qmail 14430 invoked by uid 3782); 19 Mar 2016 17:03:39 -0000
Received: from acm.muc.de (p548A5545.dip0.t-ipconnect.de [84.138.85.69]) by
 colin.muc.de (tmda-ofmipd) with ESMTP;
 Sat, 19 Mar 2016 18:03:37 +0100
Received: (qmail 5044 invoked by uid 1000); 19 Mar 2016 17:06:24 -0000
Date: Sat, 19 Mar 2016 17:06:24 +0000
To: Stefan Monnier <monnier@iro.umontreal.ca>
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
Message-ID: <20160319170624.GC2644@acm.fritz.box>
References: <20160315091355.GA2263@acm.fritz.box>
 <jwvvb4lclp0.fsf-monnier+emacsbug@gnu.org>
 <20160317214934.GB9038@acm.fritz.box>
 <jwvmvpwcsn1.fsf-monnier+emacsbugs@gnu.org>
 <20160318151154.GA9433@acm.fritz.box>
 <jwv8u1f930c.fsf-monnier+emacsbugs@gnu.org>
 <20160318182547.GB9433@acm.fritz.box>
 <jwvoaab60mh.fsf-monnier+emacsbugs@gnu.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <jwvoaab60mh.fsf-monnier+emacsbugs@gnu.org>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Delivery-Agent: TMDA/1.1.12 (Macallan)
From: Alan Mackenzie <acm@muc.de>
X-Primary-Address: acm@muc.de
X-Spam-Score: -0.0 (/)
X-Debbugs-Envelope-To: 23019
Cc: 23019@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit@debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request@debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
X-Spam-Score: -0.0 (/)

Hello, Stefan.

On Fri, Mar 18, 2016 at 03:36:40PM -0400, Stefan Monnier wrote:

> > OK.  Here's an updated patch which does just that.  Comments would be
> > welcome.

> I'll take a closer look later, thanks.

I found some problems at ends of comments.  The upshot is that
forw_comment must inform scan_sexps_forward, on a failed search, whether
the last character it scanned is still "syntactically live", or whether
that last character's syntax was "used up" in closing or opening a
comment.  On a successful search, that character's syntax is always
"used up" in closing the comment.
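
As a rough illustration of that contract, here is a toy scanner for a
non-nested C-style comment.  The function toy_forw_comment and its
parameters are hypothetical; the real forw_comment in syntax.c has a
different signature and consults the syntax table rather than comparing
literal characters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only.  Scan buf[from..to) for the closer "*" "/".
   Return true if it is found, with *STOP just past it.  On failure
   *STOP is TO, and *LAST_LIVE says whether the last character scanned
   could still be the first half of a closer that continues past TO.  */
bool
toy_forw_comment (const char *buf, size_t from, size_t to,
                  size_t *stop, bool *last_live)
{
  assert (buf != NULL && stop != NULL && last_live != NULL && from <= to);

  for (size_t i = from; i + 1 < to; i++)
    if (buf[i] == '*' && buf[i + 1] == '/')
      {
        *stop = i + 2;
        *last_live = false;	/* The '/' was used up closing the comment.  */
        return true;
      }

  *stop = to;
  /* A trailing '*' has not been consumed by anything yet.  */
  *last_live = to > from && buf[to - 1] == '*';
  return false;
}
```

A caller resuming the scan at TO would only need to remember the previous
character's syntax when *LAST_LIVE is true.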

Would you like to see the patch again, or should I just commit it?

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).


From debbugs-submit-bounces@debbugs.gnu.org Sat Mar 19 21:30:38 2016
Received: (at 23019) by debbugs.gnu.org; 20 Mar 2016 01:30:38 +0000
Received: from localhost ([127.0.0.1]:54040 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1ahSCT-0001e9-ST
	for submit@debbugs.gnu.org; Sat, 19 Mar 2016 21:30:38 -0400
Received: from chene.dit.umontreal.ca ([132.204.246.20]:37431)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <monnier@iro.umontreal.ca>) id 1ahSCR-0001dz-AW
 for 23019@debbugs.gnu.org; Sat, 19 Mar 2016 21:30:36 -0400
Received: from fmsmemgm.homelinux.net (lechon.iro.umontreal.ca
 [132.204.27.242])
 by chene.dit.umontreal.ca (8.14.1/8.14.1) with ESMTP id u2K1V0QX010508;
 Sat, 19 Mar 2016 21:31:01 -0400
Received: by fmsmemgm.homelinux.net (Postfix, from userid 20848)
 id 23452AE665; Sat, 19 Mar 2016 21:30:32 -0400 (EDT)
From: Stefan Monnier <monnier@IRO.UMontreal.CA>
To: Alan Mackenzie <acm@muc.de>
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
Message-ID: <jwvbn6a0we3.fsf-monnier+emacsbugs@gnu.org>
References: <20160315091355.GA2263@acm.fritz.box>
 <jwvvb4lclp0.fsf-monnier+emacsbug@gnu.org>
 <20160317214934.GB9038@acm.fritz.box>
 <jwvmvpwcsn1.fsf-monnier+emacsbugs@gnu.org>
 <20160318151154.GA9433@acm.fritz.box>
 <jwv8u1f930c.fsf-monnier+emacsbugs@gnu.org>
 <20160318182547.GB9433@acm.fritz.box>
 <jwvoaab60mh.fsf-monnier+emacsbugs@gnu.org>
 <20160319170624.GC2644@acm.fritz.box>
Date: Sat, 19 Mar 2016 21:30:32 -0400
In-Reply-To: <20160319170624.GC2644@acm.fritz.box> (Alan Mackenzie's message
 of "Sat, 19 Mar 2016 17:06:24 +0000")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-NAI-Spam-Flag: NO
X-NAI-Spam-Level: 
X-NAI-Spam-Threshold: 5
X-NAI-Spam-Score: 0.2
X-NAI-Spam-Rules: 2 Rules triggered
	GEN_SPAM_FEATRE=0.2, RV5615=0
X-NAI-Spam-Version: 2.3.0.9418 : core <5615> : inlines <4535> : streams
 <1605715> : uri <2170232>
X-Spam-Score: -1.3 (-)
X-Debbugs-Envelope-To: 23019
Cc: 23019@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit@debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request@debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
X-Spam-Score: -1.3 (-)

> Would you like to see the patch again, or should I just commit it?

I'd like to hear what John thinks about the idea of re-using "nth 5"
instead of adding a new entry, but other than that, I think it's OK
to commit, thanks.


        Stefan


From debbugs-submit-bounces@debbugs.gnu.org Sun Mar 20 09:38:50 2016
Received: (at 23019-done) by debbugs.gnu.org; 20 Mar 2016 13:38:50 +0000
Received: from localhost ([127.0.0.1]:54297 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1ahdZC-0001Wq-6o
	for submit@debbugs.gnu.org; Sun, 20 Mar 2016 09:38:50 -0400
Received: from mail.muc.de ([193.149.48.3]:10130)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <acm@muc.de>) id 1ahdZA-0001Wh-Da
 for 23019-done@debbugs.gnu.org; Sun, 20 Mar 2016 09:38:48 -0400
Received: (qmail 23619 invoked by uid 3782); 20 Mar 2016 13:38:47 -0000
Received: from acm.muc.de (p5B146DE7.dip0.t-ipconnect.de [91.20.109.231]) by
 colin.muc.de (tmda-ofmipd) with ESMTP;
 Sun, 20 Mar 2016 14:38:46 +0100
Received: (qmail 3746 invoked by uid 1000); 20 Mar 2016 13:41:34 -0000
Date: Sun, 20 Mar 2016 13:41:34 +0000
To: 23019-done@debbugs.gnu.org
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
Message-ID: <20160320134134.GA3603@acm.fritz.box>
References: <20160315091355.GA2263@acm.fritz.box>
 <jwvvb4lclp0.fsf-monnier+emacsbug@gnu.org>
 <20160317214934.GB9038@acm.fritz.box>
 <jwvmvpwcsn1.fsf-monnier+emacsbugs@gnu.org>
 <20160318151154.GA9433@acm.fritz.box>
 <jwv8u1f930c.fsf-monnier+emacsbugs@gnu.org>
 <20160318182547.GB9433@acm.fritz.box>
 <jwvoaab60mh.fsf-monnier+emacsbugs@gnu.org>
 <20160319170624.GC2644@acm.fritz.box>
 <jwvbn6a0we3.fsf-monnier+emacsbugs@gnu.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <jwvbn6a0we3.fsf-monnier+emacsbugs@gnu.org>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Delivery-Agent: TMDA/1.1.12 (Macallan)
From: Alan Mackenzie <acm@muc.de>
X-Primary-Address: acm@muc.de
X-Spam-Score: -0.0 (/)
X-Debbugs-Envelope-To: 23019-done
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit@debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request@debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request@debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
X-Spam-Score: -0.0 (/)

Bug fixed.

On Sat, Mar 19, 2016 at 09:30:32PM -0400, Stefan Monnier wrote:

-- 
Alan Mackenzie (Nuremberg, Germany).


From debbugs-submit-bounces@debbugs.gnu.org Sun Apr 03 18:53:33 2016
Received: (at 23019) by debbugs.gnu.org; 3 Apr 2016 22:53:33 +0000
Received: from localhost ([127.0.0.1]:50396 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1amqth-0004QT-LM
	for submit@debbugs.gnu.org; Sun, 03 Apr 2016 18:53:33 -0400
Received: from mail-oi0-f49.google.com ([209.85.218.49]:35020)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <jwiegley@gmail.com>) id 1amqtf-0004QF-5I
 for 23019@debbugs.gnu.org; Sun, 03 Apr 2016 18:53:31 -0400
Received: by mail-oi0-f49.google.com with SMTP id p188so145115771oih.2
 for <23019@debbugs.gnu.org>; Sun, 03 Apr 2016 15:53:31 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=from:to:cc:subject:in-reply-to:date:message-id:references
 :user-agent:mime-version;
 bh=ixXGX4/AV+yNnE9dKhFehN5LVS/GFLn4UijDr8MVL90=;
 b=kQdjXSyd9K9/11rSEPpW8rTb/UB7nBkk8UpViulaIOX9TRPq4dw3zwwoTI78Xh7a5+
 cJYjf5ks4rQjP1FEEz2NdYNwtaU+fbEJT1zp8JGqH1FighQ0yjwsCHMK3kzp/EZNWRPc
 plY5DE5Gk6TXfWwdeCQPPrqmeRelqqBj39HM7z7KcXoX3xCZaecAb8oYddLuYKNQCRbU
 SEE+6i+WCsrJGlSRNDBhg77crb3ZBiO9oSAqhO//qP45p3JqaxxKBGDJEAb0NnI3ropu
 Fkj1Sa2F55OoAxIwPOhjtzx9PAg0s7nXpQ5cV/zJ4OSVbq4fmyVDxnSgYQaUIGzJHB8C
 dWOg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:from:to:cc:subject:in-reply-to:date:message-id
 :references:user-agent:mime-version;
 bh=ixXGX4/AV+yNnE9dKhFehN5LVS/GFLn4UijDr8MVL90=;
 b=RI7r95VlZK6QZen7pL2IbZp2N0hPggEKfs8gQF4+k3gM7ZgogidKymeEJCM0eagGjz
 IBrSPzJ1ggEAd2EiMhbaTUl6rP3H0kwteh1j7b1IepQUOSPC0hINQKZ08KZ1xRjruuD2
 q+ZcnKVtGmMI0Mc+OIl01HqN1Suf/Q6gs0mj46O+1puDseF7vomIOFdgtgiSQBVst0WW
 ulWA3UORB/c6ZZWko7txD8VxPQusVPJLcQgvQQL6OnzF58eZJUR+qU6gGRIqxcqyc41i
 TkkZl2sN3yrCqsIVo4kDUpbEOGQLwVkz7vmd/HzgfE6eSneqE3eY9z3kvuoVpmtPDABa
 gs5g==
X-Gm-Message-State: AD7BkJIH/lxnVpc91QUJU464+KtAgP8RKB/x0h37O9Nh7qbXqSVfM/0losSVvIDzOydRjw==
X-Received: by 10.157.14.7 with SMTP id c7mr1927381otc.106.1459724005617;
 Sun, 03 Apr 2016 15:53:25 -0700 (PDT)
Received: from Vulcan.local (76-234-68-79.lightspeed.frokca.sbcglobal.net.
 [76.234.68.79])
 by smtp.gmail.com with ESMTPSA id yn3sm7642058obc.27.2016.04.03.15.53.24
 (version=TLS1 cipher=AES128-SHA bits=128/128);
 Sun, 03 Apr 2016 15:53:24 -0700 (PDT)
From: John Wiegley <jwiegley@gmail.com>
X-Google-Original-From: "John Wiegley" <johnw@gnu.org>
Received: by Vulcan.local (Postfix, from userid 501)
 id AEB6D13DAE07E; Sun,  3 Apr 2016 15:53:23 -0700 (PDT)
To: Stefan Monnier <monnier@IRO.UMontreal.CA>
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
In-Reply-To: <jwvbn6a0we3.fsf-monnier+emacsbugs@gnu.org> (Stefan Monnier's
 message of "Sat, 19 Mar 2016 21:30:32 -0400")
Date: Sun, 03 Apr 2016 15:53:02 -0700
Message-ID: <m2r3emgv8x.fsf@newartisans.com>
References: <20160315091355.GA2263@acm.fritz.box>
 <jwvvb4lclp0.fsf-monnier+emacsbug@gnu.org>
 <20160317214934.GB9038@acm.fritz.box>
 <jwvmvpwcsn1.fsf-monnier+emacsbugs@gnu.org>
 <20160318151154.GA9433@acm.fritz.box>
 <jwv8u1f930c.fsf-monnier+emacsbugs@gnu.org>
 <20160318182547.GB9433@acm.fritz.box>
 <jwvoaab60mh.fsf-monnier+emacsbugs@gnu.org>
 <20160319170624.GC2644@acm.fritz.box>
 <jwvbn6a0we3.fsf-monnier+emacsbugs@gnu.org>
User-Agent: Gnus/5.130014 (Ma Gnus v0.14) Emacs/25.1.50 (darwin)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 23019
Cc: Alan Mackenzie <acm@muc.de>, 23019@debbugs.gnu.org

>>>>> Stefan Monnier <monnier@IRO.UMontreal.CA> writes:

> I'd like to hear what John thinks about the idea of re-using "nth 5" instead
> of adding a new entry, but other than that, I think it's OK to commit,
> thanks.

How long has this stuff been out in the field?  Do you think it's well known
enough that anyone is depending on the earlier behavior of the nth 5 value?  I
have a feeling it's OK to re-use it.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2


From debbugs-submit-bounces@debbugs.gnu.org Mon Apr 04 08:16:01 2016
Received: (at 23019) by debbugs.gnu.org; 4 Apr 2016 12:16:01 +0000
Received: from localhost ([127.0.0.1]:50679 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1an3QH-0007a5-4A
	for submit@debbugs.gnu.org; Mon, 04 Apr 2016 08:16:01 -0400
Received: from ironport2-out.teksavvy.com ([206.248.154.181]:12491)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <monnier@iro.umontreal.ca>) id 1an3QD-0007Zq-M2
 for 23019@debbugs.gnu.org; Mon, 04 Apr 2016 08:15:59 -0400
Received: from 69-165-138-79.dsl.teksavvy.com (HELO pastel.home)
 ([69.165.138.79])
 by ironport2-out.teksavvy.com with ESMTP; 04 Apr 2016 08:15:52 -0400
Received: by pastel.home (Postfix, from userid 20848)
 id 06ADD6226D; Mon,  4 Apr 2016 08:15:52 -0400 (EDT)
From: Stefan Monnier <monnier@iro.umontreal.ca>
To: John Wiegley <jwiegley@gmail.com>
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
Message-ID: <jwvoa9p1t35.fsf-monnier+emacsbugs@gnu.org>
References: <20160315091355.GA2263@acm.fritz.box>
 <jwvvb4lclp0.fsf-monnier+emacsbug@gnu.org>
 <20160317214934.GB9038@acm.fritz.box>
 <jwvmvpwcsn1.fsf-monnier+emacsbugs@gnu.org>
 <20160318151154.GA9433@acm.fritz.box>
 <jwv8u1f930c.fsf-monnier+emacsbugs@gnu.org>
 <20160318182547.GB9433@acm.fritz.box>
 <jwvoaab60mh.fsf-monnier+emacsbugs@gnu.org>
 <20160319170624.GC2644@acm.fritz.box>
 <jwvbn6a0we3.fsf-monnier+emacsbugs@gnu.org>
 <m2r3emgv8x.fsf@newartisans.com>
Date: Mon, 04 Apr 2016 08:15:52 -0400
In-Reply-To: <m2r3emgv8x.fsf@newartisans.com> (John Wiegley's message of "Sun, 
 03 Apr 2016 15:53:02 -0700")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: 0.3 (/)
X-Debbugs-Envelope-To: 23019
Cc: Alan Mackenzie <acm@muc.de>, 23019@debbugs.gnu.org

>> I'd like to hear what John thinks about the idea of re-using "nth 5" instead
>> of adding a new entry, but other than that, I think it's OK to commit,
>> thanks.
> How long has this stuff been out in the field?

Many, many years.

> Do you think it's well known enough that anyone is depending on the
> earlier behavior of the nth 5 value?

There are definitely packages which use the (nth 5 ..) value returned
from parse-partial-sexp.  E.g. cperl-mode does:

			    state (parse-partial-sexp pre-B p))
		      (or (nth 3 state)
			  (nth 4 state)
			  (nth 5 state)
			  (error "`%s' inside `%s' BLOCK" A if-string))
as well as

	     (let ((pps (parse-partial-sexp (point) found)))
	       (or (nth 3 pps) (nth 4 pps) (nth 5 pps)))))

and verilog-mode does:

		 (setq state (parse-partial-sexp (point) end-mod-point 0 t nil))
		 (or (> (car state) 0)	; in parens
		     (nth 5 state)		; comment
		     ))

sh-script also uses it, along with perl-mode.

> I have a feeling it's OK to re-use it.

That's also my feeling.  All the uses I've found would be unaffected
(e.g. because they're in modes where there are no 2-char comment
markers, so there is really no change in behavior; or because it's only
used at positions which can't be in the middle of a 2-char comment
marker).
It's a "natural extension" of the previous meaning of "nth 5".

But admittedly, it's hard/impossible to find all uses, so I can't claim
with confidence that it won't break some code somewhere.


        Stefan


From debbugs-submit-bounces@debbugs.gnu.org Tue Apr 05 08:51:16 2016
Received: (at 23019) by debbugs.gnu.org; 5 Apr 2016 12:51:16 +0000
Received: from localhost ([127.0.0.1]:51793 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1anQRw-00010C-6H
	for submit@debbugs.gnu.org; Tue, 05 Apr 2016 08:51:16 -0400
Received: from mail.muc.de ([193.149.48.3]:62402)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <acm@muc.de>) id 1anQRs-000102-SQ
 for 23019@debbugs.gnu.org; Tue, 05 Apr 2016 08:51:14 -0400
Received: (qmail 63018 invoked by uid 3782); 5 Apr 2016 12:51:10 -0000
Received: from acm.muc.de (p548A5A8B.dip0.t-ipconnect.de [84.138.90.139]) by
 colin.muc.de (tmda-ofmipd) with ESMTP;
 Tue, 05 Apr 2016 14:51:08 +0200
Received: (qmail 4502 invoked by uid 1000); 5 Apr 2016 12:54:09 -0000
Date: Tue, 5 Apr 2016 12:54:09 +0000
To: John Wiegley <jwiegley@gmail.com>
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
Message-ID: <20160405125409.GB3463@acm.fritz.box>
References: <jwvvb4lclp0.fsf-monnier+emacsbug@gnu.org>
 <20160317214934.GB9038@acm.fritz.box>
 <jwvmvpwcsn1.fsf-monnier+emacsbugs@gnu.org>
 <20160318151154.GA9433@acm.fritz.box>
 <jwv8u1f930c.fsf-monnier+emacsbugs@gnu.org>
 <20160318182547.GB9433@acm.fritz.box>
 <jwvoaab60mh.fsf-monnier+emacsbugs@gnu.org>
 <20160319170624.GC2644@acm.fritz.box>
 <jwvbn6a0we3.fsf-monnier+emacsbugs@gnu.org>
 <m2r3emgv8x.fsf@newartisans.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <m2r3emgv8x.fsf@newartisans.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Delivery-Agent: TMDA/1.1.12 (Macallan)
From: Alan Mackenzie <acm@muc.de>
X-Primary-Address: acm@muc.de
X-Spam-Score: -1.0 (-)
X-Debbugs-Envelope-To: 23019
Cc: 23019@debbugs.gnu.org, Stefan Monnier <monnier@IRO.UMontreal.CA>

Hello, John.

On Sun, Apr 03, 2016 at 03:53:02PM -0700, John Wiegley wrote:
> >>>>> Stefan Monnier <monnier@IRO.UMontreal.CA> writes:

> > I'd like to hear what John thinks about the idea of re-using "nth 5" instead
> > of adding a new entry, but other than that, I think it's OK to commit,
> > thanks.

> How long has this stuff been out in the field?  Do you think it's well known
> enough that anyone is depending on the earlier behavior of the nth 5 value?  I
> have a feeling it's OK to re-use it.

My feeling is that it would be better not to change the definition of
element 5 (the "nth 5" value), but it's not a strong feeling.

One concern I have is that there is code out there which compensates for
the previous inadequate behaviour (I know there is in CC Mode), and it
may be more difficult to switch off this compensation if there isn't an
easy way to distinguish new from old, such as (> (length state) 10).
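
A minimal sketch of the kind of run-time test meant here, assuming the
new information were appended as an extra element so that the state
grows beyond ten elements.  The name my-pps-state-extended-p and the
two result symbols are hypothetical, for illustration only; they are
not taken from CC Mode or from Emacs.

```elisp
;; Hypothetical helper (not part of Emacs or CC Mode).
;; Assumption: an old-style state has at most ten elements (0..9),
;; and a new-style state has at least one more.
(defun my-pps-state-extended-p (state)
  "Return non-nil if STATE has more than ten elements."
  (> (length state) 10))

;; Usage: decide at run time whether a workaround is still needed.
(let ((state (with-temp-buffer
               (insert "(foo")
               (parse-partial-sexp (point-min) (point-max)))))
  (if (my-pps-state-extended-p state)
      'no-workaround-needed
    'apply-workaround))
```

A test of this shape needs no set-up step, since it inspects whatever
state value happens to be at hand.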

> -- 
> John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
> http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2

-- 
Alan Mackenzie (Nuremberg, Germany).


From debbugs-submit-bounces@debbugs.gnu.org Tue Apr 05 09:50:48 2016
Received: (at 23019) by debbugs.gnu.org; 5 Apr 2016 13:50:48 +0000
Received: from localhost ([127.0.0.1]:51812 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1anRNY-0002Pr-Ig
	for submit@debbugs.gnu.org; Tue, 05 Apr 2016 09:50:48 -0400
Received: from ironport2-out.teksavvy.com ([206.248.154.181]:56207)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <monnier@iro.umontreal.ca>) id 1anRNX-0002Pb-FJ
 for 23019@debbugs.gnu.org; Tue, 05 Apr 2016 09:50:48 -0400
Received: from 69-165-138-79.dsl.teksavvy.com (HELO pastel.home)
 ([69.165.138.79])
 by ironport2-out.teksavvy.com with ESMTP; 05 Apr 2016 09:50:41 -0400
Received: by pastel.home (Postfix, from userid 20848)
 id 12D196225E; Tue,  5 Apr 2016 09:50:41 -0400 (EDT)
From: Stefan Monnier <monnier@iro.umontreal.ca>
To: Alan Mackenzie <acm@muc.de>
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
Message-ID: <jwvoa9oxj74.fsf-monnier+Inbox@gnu.org>
References: <jwvvb4lclp0.fsf-monnier+emacsbug@gnu.org>
 <20160317214934.GB9038@acm.fritz.box>
 <jwvmvpwcsn1.fsf-monnier+emacsbugs@gnu.org>
 <20160318151154.GA9433@acm.fritz.box>
 <jwv8u1f930c.fsf-monnier+emacsbugs@gnu.org>
 <20160318182547.GB9433@acm.fritz.box>
 <jwvoaab60mh.fsf-monnier+emacsbugs@gnu.org>
 <20160319170624.GC2644@acm.fritz.box>
 <jwvbn6a0we3.fsf-monnier+emacsbugs@gnu.org>
 <m2r3emgv8x.fsf@newartisans.com> <20160405125409.GB3463@acm.fritz.box>
Date: Tue, 05 Apr 2016 09:50:41 -0400
In-Reply-To: <20160405125409.GB3463@acm.fritz.box> (Alan Mackenzie's message
 of "Tue, 5 Apr 2016 12:54:09 +0000")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: 1.8 (+)
X-Debbugs-Envelope-To: 23019
Cc: John Wiegley <jwiegley@gmail.com>, 23019@debbugs.gnu.org
X-Spam-Score: 0.3 (/)

> One concern I have is that there is code out there which compensates for
> the previous inadequate behaviour (I know there is in CC Mode), and it
> may be more difficult to switch off this compensation if there isn't an
> easy way to distinguish new from old, such as (> (length state) 10).

I'd be very surprised if other packages went to that trouble, but if
needed you can still distinguish the new from the old with something like:

   (defconst pps-is-new
     (let ((st (make-syntax-table)))
       (modify-syntax-entry ?/ ". 14" st)
       (with-temp-buffer
         (with-syntax-table st
           (insert "/")
           (nth 5 (parse-partial-sexp (point-min) (point-max)))))))


-- Stefan


From debbugs-submit-bounces@debbugs.gnu.org Tue Apr 05 10:41:57 2016
Received: (at 23019) by debbugs.gnu.org; 5 Apr 2016 14:41:57 +0000
Received: from localhost ([127.0.0.1]:52428 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1anSB3-0003tO-Bg
	for submit@debbugs.gnu.org; Tue, 05 Apr 2016 10:41:57 -0400
Received: from mail.muc.de ([193.149.48.3]:63008)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <acm@muc.de>) id 1anSB1-0003tF-HB
 for 23019@debbugs.gnu.org; Tue, 05 Apr 2016 10:41:56 -0400
Received: (qmail 86609 invoked by uid 3782); 5 Apr 2016 14:41:53 -0000
Received: from acm.muc.de (p548A5A8B.dip0.t-ipconnect.de [84.138.90.139]) by
 colin.muc.de (tmda-ofmipd) with ESMTP;
 Tue, 05 Apr 2016 16:41:52 +0200
Received: (qmail 4863 invoked by uid 1000); 5 Apr 2016 14:44:53 -0000
Date: Tue, 5 Apr 2016 14:44:53 +0000
To: Stefan Monnier <monnier@iro.umontreal.ca>
Subject: Re: bug#23019: parse-partial-sexp doesn't output the full state
 needed for its continuance.
Message-ID: <20160405144453.GC3463@acm.fritz.box>
References: <jwvmvpwcsn1.fsf-monnier+emacsbugs@gnu.org>
 <20160318151154.GA9433@acm.fritz.box>
 <jwv8u1f930c.fsf-monnier+emacsbugs@gnu.org>
 <20160318182547.GB9433@acm.fritz.box>
 <jwvoaab60mh.fsf-monnier+emacsbugs@gnu.org>
 <20160319170624.GC2644@acm.fritz.box>
 <jwvbn6a0we3.fsf-monnier+emacsbugs@gnu.org>
 <m2r3emgv8x.fsf@newartisans.com>
 <20160405125409.GB3463@acm.fritz.box>
 <jwvoa9oxj74.fsf-monnier+Inbox@gnu.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <jwvoa9oxj74.fsf-monnier+Inbox@gnu.org>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Delivery-Agent: TMDA/1.1.12 (Macallan)
From: Alan Mackenzie <acm@muc.de>
X-Primary-Address: acm@muc.de
X-Spam-Score: -1.0 (-)
X-Debbugs-Envelope-To: 23019
Cc: John Wiegley <jwiegley@gmail.com>, 23019@debbugs.gnu.org

Hello, Stefan.

On Tue, Apr 05, 2016 at 09:50:41AM -0400, Stefan Monnier wrote:
> > One concern I have is that there is code out there which compensates for
> > the previous inadequate behaviour (I know there is in CC Mode), and it
> > may be more difficult to switch off this compensation if there isn't an
> > easy way to distinguish new from old, such as (> (length state) 10).

> I'd be very surprised if other packages went to that trouble, but if
> needed you can still distinguish the new from the old with something like:

>    (defconst pps-is-new
>      (let ((st (make-syntax-table)))
>        (modify-syntax-entry ?/ ". 14" st)
>        (with-temp-buffer
>          (with-syntax-table st
>            (insert "/")
>            (nth 5 (parse-partial-sexp (point-min) (point-max)))))))

It can certainly be done, yes, but that way it can only really be done
at setup time, whereas (> (length state) 10) could be done more or
less at any time.

It was just a small point, really.

> -- Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).


From unknown Mon Aug 18 14:19:41 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request <help-debbugs@gnu.org>
Subject: Internal Control
Message-Id: bug archived.
Date: Wed, 04 May 2016 11:24:04 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator