GNU bug report logs - #13541
24.2.92; awk-mode: wrong font locking regexp literals
Reported by: Leo Liu <sdl.web <at> gmail.com>
Date: Thu, 24 Jan 2013 11:44:02 UTC
Severity: minor
Found in version 24.2.92
Done: Alan Mackenzie <acm <at> muc.de>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 13541 in the body.
You can then email your comments to 13541 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-cc-mode <at> gnu.org, bug-gnu-emacs <at> gnu.org: bug#13541; Package emacs. (Thu, 24 Jan 2013 11:44:02 GMT)
Acknowledgement sent to Leo Liu <sdl.web <at> gmail.com>: New bug report received and forwarded. Copy sent to bug-cc-mode <at> gnu.org, bug-gnu-emacs <at> gnu.org. (Thu, 24 Jan 2013 11:44:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
In an awk buffer having the following text:
#--BEGIN--
NF { /xyz/ }
NF {
/xyz/
}
#--END--
I have the second regexp properly font-locked but not the first one.
(tested in GNU Emacs 24.2.92.1 of 2013-01-13)
[awk-mode-bug.png (image/png, inline)]
Leo
Information forwarded to bug-gnu-emacs <at> gnu.org, bug-cc-mode <at> gnu.org: bug#13541; Package emacs,cc-mode. (Thu, 24 Jan 2013 18:29:01 GMT)
Message #8 received at 13541 <at> debbugs.gnu.org (full text, mbox):
Leo Liu wrote:
> In an awk buffer having the following text:
>
> #--BEGIN--
> NF { /xyz/ }
>
> NF {
> /xyz/
> }
> #--END--
>
> I have the second regexp properly font-locked but not the first one.
Do you have an example of an actual useful awk script showing the issue,
because this one seems like a pointless no-op?
Information forwarded to bug-gnu-emacs <at> gnu.org, bug-cc-mode <at> gnu.org: bug#13541; Package emacs,cc-mode. (Thu, 24 Jan 2013 22:23:03 GMT)
Message #11 received at 13541 <at> debbugs.gnu.org (full text, mbox):
Hi, Glenn,
On Thu, Jan 24, 2013 at 01:28:33PM -0500, Glenn Morris wrote:
> Leo Liu wrote:
> > In an awk buffer having the following text:
> > #--BEGIN--
> > NF { /xyz/ }
> > NF {
> > /xyz/
> > }
> > #--END--
> > I have the second regexp properly font-locked but not the first one.
> Do you have an example of an actual useful awk script showing the issue,
> because this one seems like a pointless no-op?
This is a real bug, perhaps not a difficult one. "/regexp/" is an
expression with value 1 iff the current input line matches the regexp.
So a line like
NF { print /xyz/ }
is perfectly legitimate, printing 1 if there's an "xyz" on the line.
I'm looking at this bug at the moment.
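As a quick check of the point above, here is a minimal sketch, assuming a POSIX shell with some awk on the PATH. The function name regexp_value_demo and the sample input lines are made up for illustration:

```shell
# Hypothetical helper: feed three sample lines to awk and print the
# value of the bare regexp expression /xyz/ for each non-empty line.
regexp_value_demo() {
    printf '%s\n' 'has xyz here' 'no match' '' |
        awk 'NF { print /xyz/ }'
}

regexp_value_demo   # prints 1 for the first line, 0 for the second
```

The empty third line has NF == 0, so the rule does not fire for it.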
--
Alan Mackenzie (Nuremberg, Germany).
Information forwarded to bug-gnu-emacs <at> gnu.org, bug-cc-mode <at> gnu.org: bug#13541; Package emacs,cc-mode. (Fri, 25 Jan 2013 01:21:02 GMT)
Message #14 received at 13541 <at> debbugs.gnu.org (full text, mbox):
On 2013-01-25 06:16 +0800, Alan Mackenzie wrote:
> This is a real bug, perhaps not a difficult one. "/regexp/" is an
> expression with value 1 iff the current input line matches the regexp.
> So a line like
>
> NF { print /xyz/ }
>
> is perfectly legitimate, printing 1 if there's an "xyz" on the line.
>
> I'm looking at this bug at the moment.
Thanks to all for chiming in.
Alan, I also have what seems to be another buglet, this one about indentation.
Every line after a pattern-action pair like the following one (where the
action is omitted) is indented to column 4, i.e. the mode doesn't recognise
that a newline terminates a pattern.
$0 == "Emacs"
    |
    all following lines indented here
(this might be a regression, I seem to recall reporting something along
these lines some while ago.)
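For reference, a small sketch of why the newline matters, assuming a POSIX shell and awk. The function name pattern_only_demo and the input lines are hypothetical:

```shell
# A pattern with no action ends at the newline and uses the default
# action (print the record); the END rule on the next line is separate.
pattern_only_demo() {
    printf '%s\n' 'Emacs' 'vi' 'Emacs' |
        awk '$0 == "Emacs"
             END { print NR " lines read" }'
}

pattern_only_demo   # prints "Emacs" twice, then "3 lines read"
```

So awk itself treats the line after the pattern as a new rule, which is why it should not be indented as a continuation.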
Leo
Information forwarded to bug-gnu-emacs <at> gnu.org, bug-cc-mode <at> gnu.org: bug#13541; Package emacs,cc-mode. (Fri, 25 Jan 2013 01:34:02 GMT)
Message #17 received at 13541 <at> debbugs.gnu.org (full text, mbox):
Leo Liu wrote:
> (this might be a regression, I seem to recall reporting something along
> these lines some while ago.)
No, it is the never addressed
http://debbugs.gnu.org/12274
Information forwarded to bug-gnu-emacs <at> gnu.org, bug-cc-mode <at> gnu.org: bug#13541; Package emacs,cc-mode. (Fri, 25 Jan 2013 01:45:02 GMT)
Message #20 received at 13541 <at> debbugs.gnu.org (full text, mbox):
> Leo Liu wrote:
>
>> (this might be a regression,
PS henceforth it is prohibited to use the word "regression" except in
the form "this is a regression against Emacs XX.YY, where it works as
desired".
Information forwarded to bug-gnu-emacs <at> gnu.org, bug-cc-mode <at> gnu.org: bug#13541; Package emacs,cc-mode. (Fri, 25 Jan 2013 17:58:02 GMT)
Message #23 received at 13541 <at> debbugs.gnu.org (full text, mbox):
Hi, Leo.
On Thu, Jan 24, 2013 at 07:43:06PM +0800, Leo Liu wrote:
> In an awk buffer having the following text:
> #--BEGIN--
> NF { /xyz/ }
> NF {
> /xyz/
> }
> #--END--
> I have the second regexp properly font-locked but not the first one.
Yes.
Could you please try out, fairly thoroughly, the following patch, and let
me know how it goes. It aims to fontify a /regexp/ wherever one might
occur.
=== modified file 'lisp/progmodes/cc-awk.el'
*** lisp/progmodes/cc-awk.el 2013-01-01 09:11:05 +0000
--- lisp/progmodes/cc-awk.el 2013-01-25 17:47:38 +0000
***************
*** 211,217 ****
;; division sign.
(defconst c-awk-neutral-re
; "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)+") ; changed, 2003/6/7
! "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)")
;; A "neutral" char(pair). Doesn't change the "state" of a subsequent /.
;; This is space/tab, braces, an auto-increment/decrement operator or an
;; escaped character. Or one of the (invalid) characters @ or `. But NOT an
--- 211,217 ----
;; division sign.
(defconst c-awk-neutral-re
; "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)+") ; changed, 2003/6/7
! "\\([}@` \t]\\|\\+\\+\\|--\\|\\\\\\(.\\|[\n\r]\\)\\)")
;; A "neutral" char(pair). Doesn't change the "state" of a subsequent /.
;; This is space/tab, braces, an auto-increment/decrement operator or an
;; escaped character. Or one of the (invalid) characters @ or `. But NOT an
***************
*** 231,237 ****
;; will only work when there won't be a preceding " or / before the sought /
;; to foul things up.
(defconst c-awk-non-arith-op-bra-re
! "[[\(&=:!><,?;'~|]")
;; Matches an opening BRAcket, round or square, or any operator character
;; apart from +,-,/,*,%. For the purpose at hand (detecting a / which is a
;; regexp bracket) these arith ops are unnecessary and a pain, because of "++"
--- 231,237 ----
;; will only work when there won't be a preceding " or / before the sought /
;; to foul things up.
(defconst c-awk-non-arith-op-bra-re
! "[[\({&=:!><,?;'~|]")
;; Matches an opening BRAcket, round or square, or any operator character
;; apart from +,-,/,*,%. For the purpose at hand (detecting a / which is a
;; regexp bracket) these arith ops are unnecessary and a pain, because of "++"
***************
*** 242,247 ****
--- 242,257 ----
;; bracket, in a context where an immediate / would be a division sign. This
;; will only work when there won't be a preceding " or / before the sought /
;; to foul things up.
+ (defconst c-awk-pre-exp-alphanum-kwd-re
+ (concat "\\(^\\|[^_\n\r]\\)\\<"
+ (regexp-opt '("print" "return" "case") t)
+ "\\>\\([^_\n\r]\\|$\\)"))
+ ;; Matches all AWK keywords which can precede expressions (including
+ ;; /regexp/).
+ (defconst c-awk-kwd-regexp-sign-re
+ (concat c-awk-pre-exp-alphanum-kwd-re c-awk-neutrals*-re "/"))
+ ;; Matches a piece of AWK buffer ending in <kwd> /, where <kwd> is a keyword
+ ;; which can precede an expression.
;; REGEXPS USED FOR FINDING THE POSITION OF A "virtual semicolon"
(defconst c-awk-_-harmless-nonws-char-re "[^#/\"\\\\\n\r \t]")
***************
*** 721,729 ****
(goto-char anchor)
;; Analyze the line to find out what the / is.
(if (if anchor-state-/div
! (not (search-forward-regexp c-awk-regexp-sign-re (1+ /point) t))
! (search-forward-regexp c-awk-div-sign-re (1+ /point) t))
! ;; A division sign.
(progn (goto-char (1+ /point)) nil)
;; A regexp opener
;; Jump over the regexp innards, setting the match data.
--- 731,740 ----
(goto-char anchor)
;; Analyze the line to find out what the / is.
(if (if anchor-state-/div
! (not (search-forward-regexp c-awk-regexp-sign-re (1+ /point) t))
! (and (not (search-forward-regexp c-awk-kwd-regexp-sign-re (1+ /point) t))
! (search-forward-regexp c-awk-div-sign-re (1+ /point) t)))
! ;; A division sign.
(progn (goto-char (1+ /point)) nil)
;; A regexp opener
;; Jump over the regexp innards, setting the match data.
> Leo
--
Alan Mackenzie (Nuremberg, Germany).
Information forwarded to bug-gnu-emacs <at> gnu.org, bug-cc-mode <at> gnu.org: bug#13541; Package emacs,cc-mode. (Fri, 25 Jan 2013 21:34:02 GMT)
Message #26 received at 13541 <at> debbugs.gnu.org (full text, mbox):
If a nasty and cruel bug appears just before the release, is that
"regression to the mean"?
--
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
Use Ekiga or an ordinary phone call
Information forwarded to bug-gnu-emacs <at> gnu.org, bug-cc-mode <at> gnu.org: bug#13541; Package emacs,cc-mode. (Sat, 26 Jan 2013 11:16:02 GMT)
Message #29 received at 13541 <at> debbugs.gnu.org (full text, mbox):
On 2013-01-26 01:50 +0800, Alan Mackenzie wrote:
> Could you please try out, fairly thoroughly, the following patch, and let
> me know how it goes. It aims to fontify a /regexp/ wherever one might
> occur.
The second regexp is not font-locked in this case:
/a/ { print /abc/ }
Leo
Information forwarded to bug-gnu-emacs <at> gnu.org, bug-cc-mode <at> gnu.org: bug#13541; Package emacs,cc-mode. (Sun, 27 Jan 2013 19:07:01 GMT)
Message #32 received at 13541 <at> debbugs.gnu.org (full text, mbox):
Hi, Leo.
On Sat, Jan 26, 2013 at 07:14:49PM +0800, Leo Liu wrote:
> On 2013-01-26 01:50 +0800, Alan Mackenzie wrote:
> > Could you please try out, fairly thoroughly, the following patch, and let
> > me know how it goes. It aims to fontify a /regexp/ wherever one might
> > occur.
> The second regexp is not font-locked in this case:
> /a/ { print /abc/ }
Yes, thanks for spotting this. The situation was more complicated than I
thought. I think this replacement patch fixes that case (together with a
few others). Would you try it out again, please.
=== modified file 'lisp/progmodes/cc-awk.el'
*** lisp/progmodes/cc-awk.el 2013-01-01 09:11:05 +0000
--- lisp/progmodes/cc-awk.el 2013-01-27 18:23:59 +0000
***************
*** 127,148 ****
;; escaped EOL.
;; REGEXPS FOR "HARMLESS" STRINGS/LINES.
- (defconst c-awk-harmless-char-re "[^_#/\"\\\\\n\r]")
- ;; Matches any character but a _, #, /, ", \, or newline. N.B. _" starts a
- ;; localization string in gawk 3.1
(defconst c-awk-harmless-_ "_\\([^\"]\\|\\'\\)")
;; Matches an underline NOT followed by ".
(defconst c-awk-harmless-string*-re
(concat "\\(" c-awk-harmless-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;; Matches a (possibly empty) sequence of chars without unescaped /, ", \,
! ;; #, or newlines.
(defconst c-awk-harmless-string*-here-re
(concat "\\=" c-awk-harmless-string*-re))
! ;; Matches the (possibly empty) sequence of chars without unescaped /, ", \,
! ;; at point.
(defconst c-awk-harmless-line-re
! (concat c-awk-harmless-string*-re
! "\\(" c-awk-comment-without-nl "\\)?" c-awk-nl-or-eob))
;; Matches (the tail of) an AWK \"logical\" line not containing an unescaped
;; " or /. "logical" means "possibly containing escaped newlines". A comment
;; is matched as part of the line even if it contains a " or a /. The End of
--- 127,155 ----
;; escaped EOL.
;; REGEXPS FOR "HARMLESS" STRINGS/LINES.
(defconst c-awk-harmless-_ "_\\([^\"]\\|\\'\\)")
;; Matches an underline NOT followed by ".
+ (defconst c-awk-harmless-char-re "[^_#/\"{}();\\\\\n\r]")
;; Matches any character not significant in the state machine applying
+ ;; syntax-table properties to "s and /s.
(defconst c-awk-harmless-string*-re
(concat "\\(" c-awk-harmless-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;; Matches a (possibly empty) sequence of characters insignificant in the
! ;; state machine applying syntax-table properties to "s and /s.
(defconst c-awk-harmless-string*-here-re
(concat "\\=" c-awk-harmless-string*-re))
! ;; Matches the (possibly empty) sequence of "insignificant" chars at point.
!
! (defconst c-awk-harmless-line-char-re "[^_#/\"\\\\\n\r]")
! ;; Matches any character but a _, #, /, ", \, or newline. N.B. _" starts a
! ;; localisation string in gawk 3.1
! (defconst c-awk-harmless-line-string*-re
! (concat "\\(" c-awk-harmless-line-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;; Matches a (possibly empty) sequence of chars without unescaped /, ", \,
! ;; #, or newlines.
(defconst c-awk-harmless-line-re
! (concat c-awk-harmless-line-string*-re
! "\\(" c-awk-comment-without-nl "\\)?" c-awk-nl-or-eob))
;; Matches (the tail of) an AWK \"logical\" line not containing an unescaped
;; " or /. "logical" means "possibly containing escaped newlines". A comment
;; is matched as part of the line even if it contains a " or a /. The End of
***************
*** 211,217 ****
;; division sign.
(defconst c-awk-neutral-re
; "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)+") ; changed, 2003/6/7
! "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)")
;; A "neutral" char(pair). Doesn't change the "state" of a subsequent /.
;; This is space/tab, braces, an auto-increment/decrement operator or an
;; escaped character. Or one of the (invalid) characters @ or `. But NOT an
--- 218,224 ----
;; division sign.
(defconst c-awk-neutral-re
; "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)+") ; changed, 2003/6/7
! "\\([}@` \t]\\|\\+\\+\\|--\\|\\\\\\(.\\|[\n\r]\\)\\)")
;; A "neutral" char(pair). Doesn't change the "state" of a subsequent /.
;; This is space/tab, braces, an auto-increment/decrement operator or an
;; escaped character. Or one of the (invalid) characters @ or `. But NOT an
***************
*** 231,238 ****
;; will only work when there won't be a preceding " or / before the sought /
;; to foul things up.
(defconst c-awk-non-arith-op-bra-re
! "[[\(&=:!><,?;'~|]")
! ;; Matches an opening BRAcket, round or square, or any operator character
;; apart from +,-,/,*,%. For the purpose at hand (detecting a / which is a
;; regexp bracket) these arith ops are unnecessary and a pain, because of "++"
;; and "--".
--- 238,245 ----
;; will only work when there won't be a preceding " or / before the sought /
;; to foul things up.
(defconst c-awk-non-arith-op-bra-re
! "[[\({&=:!><,?;'~|]")
! ;; Matches an openeing BRAcket ,round or square, or any operator character
;; apart from +,-,/,*,%. For the purpose at hand (detecting a / which is a
;; regexp bracket) these arith ops are unnecessary and a pain, because of "++"
;; and "--".
***************
*** 242,247 ****
--- 249,264 ----
;; bracket, in a context where an immediate / would be a division sign. This
;; will only work when there won't be a preceding " or / before the sought /
;; to foul things up.
+ (defconst c-awk-pre-exp-alphanum-kwd-re
+ (concat "\\(^\\|[^_\n\r]\\)\\<"
+ (regexp-opt '("print" "return" "case") t)
+ "\\>\\([^_\n\r]\\|$\\)"))
+ ;; Matches all AWK keywords which can precede expressions (including
+ ;; /regexp/).
+ (defconst c-awk-kwd-regexp-sign-re
+ (concat c-awk-pre-exp-alphanum-kwd-re c-awk-neutrals*-re "/"))
+ ;; Matches a piece of AWK buffer ending in <kwd> /, where <kwd> is a keyword
+ ;; which can precede an expression.
;; REGEXPS USED FOR FINDING THE POSITION OF A "virtual semicolon"
(defconst c-awk-_-harmless-nonws-char-re "[^#/\"\\\\\n\r \t]")
***************
*** 721,729 ****
(goto-char anchor)
;; Analyze the line to find out what the / is.
(if (if anchor-state-/div
! (not (search-forward-regexp c-awk-regexp-sign-re (1+ /point) t))
! (search-forward-regexp c-awk-div-sign-re (1+ /point) t))
! ;; A division sign.
(progn (goto-char (1+ /point)) nil)
;; A regexp opener
;; Jump over the regexp innards, setting the match data.
--- 738,747 ----
(goto-char anchor)
;; Analyze the line to find out what the / is.
(if (if anchor-state-/div
! (not (search-forward-regexp c-awk-regexp-sign-re (1+ /point) t))
! (and (not (search-forward-regexp c-awk-kwd-regexp-sign-re (1+ /point) t))
! (search-forward-regexp c-awk-div-sign-re (1+ /point) t)))
! ;; A division sign.
(progn (goto-char (1+ /point)) nil)
;; A regexp opener
;; Jump over the regexp innards, setting the match data.
***************
*** 776,787 ****
(< (point) lim))
(setq anchor (point))
(search-forward-regexp c-awk-harmless-string*-here-re nil t)
! ;; We are now looking at either a " or a /.
! ;; Do our thing on the string, regexp or division sign.
(setq anchor-state-/div
! (if (looking-at "_?\"")
! (c-awk-syntax-tablify-string)
! (c-awk-syntax-tablify-/ anchor anchor-state-/div))))
nil))
;; ACM, 2002/07/21: Thoughts: We need an AWK Mode after-change function to set
--- 794,813 ----
(< (point) lim))
(setq anchor (point))
(search-forward-regexp c-awk-harmless-string*-here-re nil t)
! ;; We are now looking at either a " or a / or a brace/paren/semicolon.
! ;; Do our thing on the string, regexp or division sign or update our state.
(setq anchor-state-/div
! (cond
! ((looking-at "_?\"")
! (c-awk-syntax-tablify-string))
! ((eq (char-after) ?/)
! (c-awk-syntax-tablify-/ anchor anchor-state-/div))
! ((memq (char-after) '(?{ ?} ?\( ?\;))
! (forward-char)
! nil)
! (t ; ?\)
! (forward-char)
! t))))
nil))
;; ACM, 2002/07/21: Thoughts: We need an AWK Mode after-change function to set
> Leo
--
Alan Mackenzie (Nuremberg, Germany).
Information forwarded to bug-gnu-emacs <at> gnu.org, bug-cc-mode <at> gnu.org: bug#13541; Package emacs,cc-mode. (Mon, 28 Jan 2013 01:13:02 GMT)
Message #35 received at 13541 <at> debbugs.gnu.org (full text, mbox):
On 2013-01-28 02:59 +0800, Alan Mackenzie wrote:
> Yes, thanks for spotting this. The situation was more complicated than I
> thought. I think this replacement patch fixes that case (together with a
> few others). Would you try it out again, please.
Still fails with:
/a/ { (print /abc/) }
or
/a/ { p /abc/ } # incorrect awk so not sure a bug or feature
Leo
Information forwarded to bug-gnu-emacs <at> gnu.org, bug-cc-mode <at> gnu.org: bug#13541; Package emacs,cc-mode. (Mon, 28 Jan 2013 11:22:01 GMT)
Message #38 received at 13541 <at> debbugs.gnu.org (full text, mbox):
Hi, Leo.
On Mon, Jan 28, 2013 at 09:12:01AM +0800, Leo Liu wrote:
> On 2013-01-28 02:59 +0800, Alan Mackenzie wrote:
> > Yes, thanks for spotting this. The situation was more complicated than I
> > thought. I think this replacement patch fixes that case (together with a
> > few others). Would you try it out again, please.
> Still fails with:
> /a/ { (print /abc/) }
Whoops! There's a slight glitch in one of the regexps in cc-awk.el. If
there were a space before "print", it would be "all right". I've sent a
corrected patch below.
> or
> /a/ { p /abc/ } # incorrect awk so not sure a bug or feature
That "/abc/" is two division signs with a variable between them. :-)
Compare your text with this:
BEGIN { a = 1 }
/a/ { print a /a/ a }
At the moment, after an alphanumeric token, /regexp/ is only a regexp
when the token is one of the keywords ("print" "case" "return"). There
might be more such keywords (I've not found any). In a way, "printf"
could be one too, except its first argument is always the format string,
so that wouldn't be useful.
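To see the two readings side by side, here is a small sketch, assuming a POSIX shell and an awk that reads a / directly after a variable as division. The function names are hypothetical, and a is set to 2 rather than 1 so that the two results differ:

```shell
# After a variable, "/a/" is read as two division signs: a / a / a.
division_demo() {
    printf '%s\n' 'a line' |
        awk 'BEGIN { a = 2 } /a/ { print a /a/ a }'
}

# After the keyword "print", /a/ is a regexp matched against the line.
regexp_demo() {
    printf '%s\n' 'a line' |
        awk '/a/ { print /a/ }'
}

division_demo   # prints 0.5, i.e. 2 / 2 / 2
regexp_demo     # prints 1, because the line matches /a/
```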
Here's the amended patch:
=== modified file 'lisp/progmodes/cc-awk.el'
*** lisp/progmodes/cc-awk.el 2013-01-01 09:11:05 +0000
--- lisp/progmodes/cc-awk.el 2013-01-28 10:57:52 +0000
***************
*** 127,148 ****
;; escaped EOL.
;; REGEXPS FOR "HARMLESS" STRINGS/LINES.
- (defconst c-awk-harmless-char-re "[^_#/\"\\\\\n\r]")
- ;; Matches any character but a _, #, /, ", \, or newline. N.B. _" starts a
- ;; localization string in gawk 3.1
(defconst c-awk-harmless-_ "_\\([^\"]\\|\\'\\)")
;; Matches an underline NOT followed by ".
(defconst c-awk-harmless-string*-re
(concat "\\(" c-awk-harmless-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;; Matches a (possibly empty) sequence of chars without unescaped /, ", \,
! ;; #, or newlines.
(defconst c-awk-harmless-string*-here-re
(concat "\\=" c-awk-harmless-string*-re))
! ;; Matches the (possibly empty) sequence of chars without unescaped /, ", \,
! ;; at point.
(defconst c-awk-harmless-line-re
! (concat c-awk-harmless-string*-re
! "\\(" c-awk-comment-without-nl "\\)?" c-awk-nl-or-eob))
;; Matches (the tail of) an AWK \"logical\" line not containing an unescaped
;; " or /. "logical" means "possibly containing escaped newlines". A comment
;; is matched as part of the line even if it contains a " or a /. The End of
--- 127,155 ----
;; escaped EOL.
;; REGEXPS FOR "HARMLESS" STRINGS/LINES.
(defconst c-awk-harmless-_ "_\\([^\"]\\|\\'\\)")
;; Matches an underline NOT followed by ".
+ (defconst c-awk-harmless-char-re "[^_#/\"{}();\\\\\n\r]")
;; Matches any character not significant in the state machine applying
+ ;; syntax-table properties to "s and /s.
(defconst c-awk-harmless-string*-re
(concat "\\(" c-awk-harmless-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;; Matches a (possibly empty) sequence of characters insignificant in the
! ;; state machine applying syntax-table properties to "s and /s.
(defconst c-awk-harmless-string*-here-re
(concat "\\=" c-awk-harmless-string*-re))
! ;; Matches the (possibly empty) sequence of "insignificant" chars at point.
!
! (defconst c-awk-harmless-line-char-re "[^_#/\"\\\\\n\r]")
! ;; Matches any character but a _, #, /, ", \, or newline. N.B. _" starts a
! ;; localisation string in gawk 3.1
! (defconst c-awk-harmless-line-string*-re
! (concat "\\(" c-awk-harmless-line-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;; Matches a (possibly empty) sequence of chars without unescaped /, ", \,
! ;; #, or newlines.
(defconst c-awk-harmless-line-re
! (concat c-awk-harmless-line-string*-re
! "\\(" c-awk-comment-without-nl "\\)?" c-awk-nl-or-eob))
;; Matches (the tail of) an AWK \"logical\" line not containing an unescaped
;; " or /. "logical" means "possibly containing escaped newlines". A comment
;; is matched as part of the line even if it contains a " or a /. The End of
***************
*** 211,217 ****
;; division sign.
(defconst c-awk-neutral-re
; "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)+") ; changed, 2003/6/7
! "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)")
;; A "neutral" char(pair). Doesn't change the "state" of a subsequent /.
;; This is space/tab, braces, an auto-increment/decrement operator or an
;; escaped character. Or one of the (invalid) characters @ or `. But NOT an
--- 218,224 ----
;; division sign.
(defconst c-awk-neutral-re
; "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)+") ; changed, 2003/6/7
! "\\([}@` \t]\\|\\+\\+\\|--\\|\\\\\\(.\\|[\n\r]\\)\\)")
;; A "neutral" char(pair). Doesn't change the "state" of a subsequent /.
;; This is space/tab, braces, an auto-increment/decrement operator or an
;; escaped character. Or one of the (invalid) characters @ or `. But NOT an
***************
*** 231,238 ****
;; will only work when there won't be a preceding " or / before the sought /
;; to foul things up.
(defconst c-awk-non-arith-op-bra-re
! "[[\(&=:!><,?;'~|]")
! ;; Matches an opening BRAcket, round or square, or any operator character
;; apart from +,-,/,*,%. For the purpose at hand (detecting a / which is a
;; regexp bracket) these arith ops are unnecessary and a pain, because of "++"
;; and "--".
--- 238,245 ----
;; will only work when there won't be a preceding " or / before the sought /
;; to foul things up.
(defconst c-awk-non-arith-op-bra-re
! "[[\({&=:!><,?;'~|]")
! ;; Matches an openeing BRAcket ,round or square, or any operator character
;; apart from +,-,/,*,%. For the purpose at hand (detecting a / which is a
;; regexp bracket) these arith ops are unnecessary and a pain, because of "++"
;; and "--".
***************
*** 242,247 ****
--- 249,264 ----
;; bracket, in a context where an immediate / would be a division sign. This
;; will only work when there won't be a preceding " or / before the sought /
;; to foul things up.
+ (defconst c-awk-pre-exp-alphanum-kwd-re
+ (concat "\\(^\\|\\=\\|[^_\n\r]\\)\\<"
+ (regexp-opt '("print" "return" "case") t)
+ "\\>\\([^_\n\r]\\|$\\)"))
+ ;; Matches all AWK keywords which can precede expressions (including
+ ;; /regexp/).
+ (defconst c-awk-kwd-regexp-sign-re
+ (concat c-awk-pre-exp-alphanum-kwd-re c-awk-neutrals*-re "/"))
+ ;; Matches a piece of AWK buffer ending in <kwd> /, where <kwd> is a keyword
+ ;; which can precede an expression.
;; REGEXPS USED FOR FINDING THE POSITION OF A "virtual semicolon"
(defconst c-awk-_-harmless-nonws-char-re "[^#/\"\\\\\n\r \t]")
***************
*** 721,729 ****
(goto-char anchor)
;; Analyze the line to find out what the / is.
(if (if anchor-state-/div
! (not (search-forward-regexp c-awk-regexp-sign-re (1+ /point) t))
! (search-forward-regexp c-awk-div-sign-re (1+ /point) t))
! ;; A division sign.
(progn (goto-char (1+ /point)) nil)
;; A regexp opener
;; Jump over the regexp innards, setting the match data.
--- 738,747 ----
(goto-char anchor)
;; Analyze the line to find out what the / is.
(if (if anchor-state-/div
! (not (search-forward-regexp c-awk-regexp-sign-re (1+ /point) t))
! (and (not (search-forward-regexp c-awk-kwd-regexp-sign-re (1+ /point) t))
! (search-forward-regexp c-awk-div-sign-re (1+ /point) t)))
! ;; A division sign.
(progn (goto-char (1+ /point)) nil)
;; A regexp opener
;; Jump over the regexp innards, setting the match data.
***************
*** 776,787 ****
(< (point) lim))
(setq anchor (point))
(search-forward-regexp c-awk-harmless-string*-here-re nil t)
! ;; We are now looking at either a " or a /.
! ;; Do our thing on the string, regexp or division sign.
(setq anchor-state-/div
! (if (looking-at "_?\"")
! (c-awk-syntax-tablify-string)
! (c-awk-syntax-tablify-/ anchor anchor-state-/div))))
nil))
;; ACM, 2002/07/21: Thoughts: We need an AWK Mode after-change function to set
--- 794,813 ----
(< (point) lim))
(setq anchor (point))
(search-forward-regexp c-awk-harmless-string*-here-re nil t)
! ;; We are now looking at either a " or a / or a brace/paren/semicolon.
! ;; Do our thing on the string, regexp or division sign or update our state.
(setq anchor-state-/div
! (cond
! ((looking-at "_?\"")
! (c-awk-syntax-tablify-string))
! ((eq (char-after) ?/)
! (c-awk-syntax-tablify-/ anchor anchor-state-/div))
! ((memq (char-after) '(?{ ?} ?\( ?\;))
! (forward-char)
! nil)
! (t ; ?\)
! (forward-char)
! t))))
nil))
;; ACM, 2002/07/21: Thoughts: We need an AWK Mode after-change function to set
> Leo
--
Alan Mackenzie (Nuremberg, Germany).
Information forwarded to bug-gnu-emacs <at> gnu.org, bug-cc-mode <at> gnu.org: bug#13541; Package emacs,cc-mode. (Mon, 28 Jan 2013 12:12:01 GMT)
Message #41 received at 13541 <at> debbugs.gnu.org (full text, mbox):
On 2013-01-28 19:14 +0800, Alan Mackenzie wrote:
> Whoops! There's a slight glitch in one of the regexps in cc-awk.el. If
> there were a space before "print", it would be "all right". I've sent a
> corrected patch below.
OK, I have no further complaints ;)
Leo
Reply sent to Alan Mackenzie <acm <at> muc.de>: You have taken responsibility. (Tue, 29 Jan 2013 21:07:01 GMT)
Notification sent to Leo Liu <sdl.web <at> gmail.com>: bug acknowledged by developer. (Tue, 29 Jan 2013 21:07:02 GMT)
Message #46 received at 13541-done <at> debbugs.gnu.org (full text, mbox):
Bug fixed.
--
Alan Mackenzie (Nuremberg, Germany).
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 27 Feb 2013 12:24:02 GMT)
This bug report was last modified 12 years and 178 days ago.