GNU bug report logs - #20629
25.0.50; Regression: TAGS broken, can't find anything in C++ files.

Package: emacs;

Reported by: "Jan D." <jan.h.d <at> swipnet.se>

Date: Fri, 22 May 2015 05:59:02 UTC

Severity: normal

Found in version 25.0.50

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 20629 in the body.
You can then email your comments to 20629 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Fri, 22 May 2015 05:59:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to "Jan D." <jan.h.d <at> swipnet.se>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 22 May 2015 05:59:03 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: "Jan D." <jan.h.d <at> swipnet.se>
To: bug-gnu-emacs <at> gnu.org
Subject: 25.0.50; Regression: TAGS broken, can't find anything in C++ files.
Date: Fri, 22 May 2015 07:57:38 +0200

Hello.

Create this file as x.cc
-----------------------------------------------------------------------
class XX
{
public:
    int foo();
    void bar();
};

int
XX::foo()
{
    return 1;
}

void
XX::bar()
{
    foo();
}

int
main(int argc, char *argv[])
{
    XX xx;
    xx.bar();
    return 0;
}

-----------------------------------------------------------------------

Run etags on it:

% etags x.cc

In emacs, load the TAGS file, put the cursor on bar in xx.bar in main 
and press ESC .
Emacs says: "No known definitions for: bar".
Put cursor on foo in foo(); in XX::bar(), press ESC .
Emacs says: "No known definitions for: foo".

If you do the same thing in 24.5, it works as expected, i.e. the cursor
jumps to the respective member definition.
The TAGS files produced by trunk etags and 24.5 etags are exactly the same.

This regression makes the etags feature totally useless for C++.

     Jan D.




In GNU Emacs 25.0.50.1 (x86_64-apple-darwin14.3.0, NS appkit-1347.57 
Version 10.10.3 (Build 14D136))
 of 2015-05-22 on <anon>
Windowing system distributor `Apple', version 10.3.1347
Configured using:
 `configure --enable-checking --verbose --with-ns --without-x CFLAGS=-g'

Configured features:
ACL LIBXML2 ZLIB

Important settings:
  value of $LC_COLLATE: C
  value of $LANG: sv_SE.UTF-8
  locale-coding-system: utf-8-unix

Major mode: C++/l

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  abbrev-mode: t

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Starting a new list of tags tables
user-error: No known definitions for: bar
user-error: No known definitions for: foo

Load-path shadows:
None found.

Features:
(shadow sort gnus-util mail-extr emacsbug message dired format-spec
rfc822 mml mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231
mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums
mm-util help-fns mail-prsvr mail-utils etags thingatpt xref cl-seq ring
eieio byte-opt gv bytecomp byte-compile cl-extra seq cconv eieio-core
cl-loaddefs pcase cl-lib cc-mode cc-fonts easymenu cc-guess cc-menus
cc-cmds cc-styles cc-align cc-engine cc-vars cc-defs time-date mule-util
tooltip eldoc electric uniquify ediff-hook vc-hooks lisp-float-type
mwheel ns-win term/common-win tool-bar dnd fontset image regexp-opt
fringe tabulated-list newcomment elisp-mode lisp-mode prog-mode register
page menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock
font-lock syntax facemenu font-core frame cl-generic cham georgian
utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european
ethiopic indian cyrillic chinese case-table epa-hook jka-cmpr-hook help
simple abbrev minibuffer cl-preloaded nadvice loaddefs button faces
cus-face macroexp files text-properties overlay sha1 md5 base64 format
env code-pages mule custom widget hashtable-print-readable backquote
cocoa ns multi-tty make-network-process emacs)

Memory information:
((conses 16 107206 5916)
 (symbols 48 21504 1)
 (miscs 40 49 143)
 (strings 32 21487 4677)
 (string-bytes 1 738790)
 (vectors 16 15321)
 (vector-slots 8 438795 3143)
 (floats 8 151 24)
 (intervals 56 245 4)
 (buffers 976 13))

Added indication that bug 20629 blocks 19759 Request was from Glenn Morris <rgm <at> gnu.org> to control <at> debbugs.gnu.org. (Fri, 22 May 2015 16:07:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 23 May 2015 11:56:03 GMT) Full text and rfc822 format available.

Message #10 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Jan Djärv <jan.h.d <at> swipnet.se>
To: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Sat, 23 May 2015 13:54:55 +0200

This is the bad commit:

commit c7d601adefe130b773c1622a5aa8722d80709c1c
Author: Dmitry Gutov <dgutov <at> yandex.ru>
Date:   Sun May 10 00:36:46 2015 +0300

    Remove tag-symbol-match-p from etags-xref-find-definitions-tag-order

    * lisp/progmodes/etags.el (etags-xref-find-definitions-tag-order):
    Remove tag-symbol-match-p from the default value
    (http://lists.gnu.org/archive/html/emacs-devel/2015-05/msg00292.html).

The URL given in the description doesn't actually say why this change was 
made.  It just asks if anyone has objections.
So if there is no justification for this change, I will revert it soonish.
C++ support in Emacs is kind of a big deal.

	Jan D.


Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 23 May 2015 12:05:02 GMT) Full text and rfc822 format available.

Message #13 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Jan Djärv <jan.h.d <at> swipnet.se>, 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Sat, 23 May 2015 15:04:23 +0300

On 05/23/2015 02:54 PM, Jan Djärv wrote:

> The URL given in the description doesn't actually say why this change
> was made.  It just asks if anyone has objections.
> So if there is no justification for this change, I will revert it soonish.
> C++ support in Emacs is kind of a big deal.

Maybe the URL doesn't, but it links to a thread.

See the bottom of this message: 
http://lists.gnu.org/archive/html/emacs-devel/2015-05/msg00286.html

Reply sent to Jan Djärv <jan.h.d <at> swipnet.se>:
You have taken responsibility. (Sat, 23 May 2015 12:16:03 GMT) Full text and rfc822 format available.

Notification sent to "Jan D." <jan.h.d <at> swipnet.se>:
bug acknowledged by developer. (Sat, 23 May 2015 12:16:04 GMT) Full text and rfc822 format available.

Message #18 received at 20629-done <at> debbugs.gnu.org (full text, mbox):

From: Jan Djärv <jan.h.d <at> swipnet.se>
To: Dmitry Gutov <dgutov <at> yandex.ru>, 20629-done <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Sat, 23 May 2015 14:15:44 +0200

On 2015-05-23 14:04, Dmitry Gutov wrote:
> On 05/23/2015 02:54 PM, Jan Djärv wrote:
>
>> The URL given in the description doesn't actually say why this change
>> was made.  It just asks if anyone has objections.
>> So if there is no justification for this change, I will revert it soonish.
>> C++ support in Emacs is kind of a big deal.
>
> Maybe the URL doesn't, but it links to a thread.
>
> See the bottom of this message:
> http://lists.gnu.org/archive/html/emacs-devel/2015-05/msg00286.html

There are just theoretical ramblings about precision without any bug or
user-visible impact that the change solves.  Definitely not anything that
motivates breaking C++ support.
I reverted it.

	Jan D.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 23 May 2015 12:20:03 GMT) Full text and rfc822 format available.

Message #21 received at 20629-done <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Jan Djärv <jan.h.d <at> swipnet.se>, 20629-done <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Sat, 23 May 2015 15:18:53 +0300

On 05/23/2015 03:15 PM, Jan Djärv wrote:

> There are just theoretical ramblings about precision without any bug or
> user-visible impact that the change solves.  Definitely not anything that
> motivates breaking C++ support.
> I reverted it.

Theoretical? It's got an easy practical example.

Someone should look into making etags C++ support work with the notions 
of explicit and implicit tags. As it is, the indexer is broken, judging 
by your experience.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 23 May 2015 12:30:05 GMT) Full text and rfc822 format available.

Message #24 received at 20629-done <at> debbugs.gnu.org (full text, mbox):

From: Jan Djärv <jan.h.d <at> swipnet.se>
To: Dmitry Gutov <dgutov <at> yandex.ru>, 20629-done <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Sat, 23 May 2015 14:28:46 +0200

On 2015-05-23 14:18, Dmitry Gutov wrote:
> On 05/23/2015 03:15 PM, Jan Djärv wrote:
>
>> There are just theoretical ramblings about precision without any bug or
>> user-visible impact that the change solves.  Definitely not anything that
>> motivates breaking C++ support.
>> I reverted it.
>
> Theoretical? It's got an easy practical example.

It has an example of how etags has behaved since forever.  That's nothing new.

>
> Someone should look into making etags C++ support work with the notions of
> explicit and implicit tags. As it is, the indexer is broken, judging by your
> experience.

Perhaps, but until someone does, there is no need to break things that work.

	Jan D.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 23 May 2015 12:41:03 GMT) Full text and rfc822 format available.

Message #27 received at 20629-done <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Jan Djärv <jan.h.d <at> swipnet.se>, 20629-done <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Sat, 23 May 2015 15:39:59 +0300

On 05/23/2015 03:28 PM, Jan Djärv wrote:

> It has an example of how etags has behaved since forever.  That's nothing
> new.

Being decades-old doesn't mean it's not a bug.

> Perhaps, but until someone does, there is no need to break things that
> work.

Someone already has: using 'ctags -e' (version 5.9~svn20110310, as 
distributed by Ubuntu), your scenario works even when tag-symbol-match-p 
is not in etags-xref-find-definitions-tag-order.

Here's its output:

x.cc,96
class XX^?XX^A1,0
XX::foo()^?foo^A9,58
XX::bar()^?bar^A15,92
main(int argc, char *argv[])^?main^A21,122

And here's etags' output:

x.cc,55
class XX^?1,0
XX::foo(^?9,58
XX::bar(^?15,92
main(^?21,122
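
For readers unfamiliar with the TAGS format: each entry line has the shape
"pattern DEL [name SOH] line,byte-offset", where DEL (0x7F) and SOH (0x01)
are control bytes and the bracketed explicit name is optional.  The ctags
entries above carry an explicit name (e.g. foo), while the etags entries
carry only the pattern text (e.g. XX::foo followed by an open paren).  A
minimal Python sketch of splitting such a line follows; parse_tag_line is a
hypothetical helper written for illustration, not part of Emacs or ctags.

```python
DEL = "\x7f"  # ends the source-text pattern
SOH = "\x01"  # ends the optional explicit tag name


def parse_tag_line(line):
    """Split one TAGS entry into (pattern, explicit_name, line_no, offset).

    Hypothetical helper for illustration; not an Emacs or ctags API.
    """
    pattern, rest = line.split(DEL, 1)
    explicit_name = None
    if SOH in rest:
        explicit_name, rest = rest.split(SOH, 1)
    line_no, offset = rest.split(",", 1)
    # Either number may be omitted in a TAGS file, so tolerate empty fields.
    return (pattern, explicit_name,
            int(line_no) if line_no else None,
            int(offset) if offset else None)


# Entries for XX::foo as they appear in the two listings.
ctags_entry = "XX::foo()" + DEL + "foo" + SOH + "9,58"
etags_entry = "XX::foo(" + DEL + "9,58"

print(parse_tag_line(ctags_entry))  # explicit name present
print(parse_tag_line(etags_entry))  # explicit name absent (None)
```

With only the pattern available, a lookup for foo has to be derived from the
pattern text itself, which is where the matching predicates in etags.el come
into play.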

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 23 May 2015 13:51:03 GMT) Full text and rfc822 format available.

Message #30 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Jan Djärv <jan.h.d <at> swipnet.se>
Cc: jan.h.d <at> swipnet.se, 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Sat, 23 May 2015 16:50:11 +0300

> Date: Sat, 23 May 2015 14:15:44 +0200
> From: Jan Djärv <jan.h.d <at> swipnet.se>
> 
> On 2015-05-23 14:04, Dmitry Gutov wrote:
> > On 05/23/2015 02:54 PM, Jan Djärv wrote:
> >
> >> The URL given in the description doesn't actually say why this change
> >> was made.  It just asks if anyone has objections.
> >> So if there is no justification for this change, I will revert it soonish.
> >> C++ support in Emacs is kind of a big deal.
> >
> > Maybe the URL doesn't, but it links to a thread.
> >
> > See the bottom of this message:
> > http://lists.gnu.org/archive/html/emacs-devel/2015-05/msg00286.html
> 
> There are just theoretical ramblings about precision without any bug or
> user-visible impact that the change solves.  Definitely not anything that
> motivates breaking C++ support.
> I reverted it.

I reverted the revert.

Sorry, but please don't be so hasty in reverting other people's work.
At the very least, wait for the discussion about it to run its
course.

We should try to fix bugs without re-introducing previously solved
ones.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 23 May 2015 13:53:01 GMT) Full text and rfc822 format available.

Message #33 received at 20629-done <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Jan Djärv <jan.h.d <at> swipnet.se>
Cc: 20629-done <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Sat, 23 May 2015 16:51:45 +0300

> Date: Sat, 23 May 2015 14:28:46 +0200
> From: Jan Djärv <jan.h.d <at> swipnet.se>
> 
> Perhaps, but until someone does, there is no need to break things that work.

No one has knowingly broken C++.  This is development, where breakage
should be expected.  If someone does production work with development
snapshots, they should be ready for this.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 23 May 2015 14:47:02 GMT) Full text and rfc822 format available.

Message #36 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: jan.h.d <at> swipnet.se
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Sat, 23 May 2015 17:46:18 +0300

> Date: Sat, 23 May 2015 16:50:11 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 20629 <at> debbugs.gnu.org
> 
> We should try to fix bugs without re-introducing previously solved
> ones.

Does the patch below give good results in real-life C++ usage?

Please also consider whether this change could cause trouble in other
C++ use cases.  (I've run the modified version on the etags test
suite, and didn't spot any problems in the differences with the
previous results, but I don't consider myself an expert on C++
syntax.)

Thanks.

diff --git a/lib-src/etags.c b/lib-src/etags.c
index 28729da..cb96f06 100644
--- a/lib-src/etags.c
+++ b/lib-src/etags.c
@@ -3681,7 +3681,29 @@ enum,		0,			st_C_enum
 	  switch (fvdef)
 	    {
 	    case flistseen:
-	      make_C_tag (true);    /* a function */
+	      if ((c_ext & C_PLPL) != 0)
+		{
+		  /* Tag C++ member function names, excluding the class and
+		     namespace instances, if any.  */
+		  char *colon_colon = strstr (token_name.buffer, "::");
+		  char *colon_colon2 =
+		    colon_colon
+		    ? strstr (colon_colon + 2, "::")
+		    : NULL;
+
+		  if (colon_colon2 != NULL)
+		    colon_colon = colon_colon2;
+
+		  if (colon_colon != NULL)
+		    {
+		      memmove (token_name.buffer, colon_colon + 2,
+			       strlen (colon_colon + 2) + 1);
+		      token_name.len = strlen (token_name.buffer);
+		    }
+		  make_C_tag (true); /* a member function */
+		}
+	      else
+		make_C_tag (true);    /* a function */
 	      /* FALLTHRU */
 	    case fignore:
 	      fvdef = fvnone;
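
The string handling in the patch above can be restated compactly.  The
sketch below mirrors it in Python rather than C (a deliberate swap for
brevity); strip_qualifiers is a hypothetical name, and the real code edits
token_name.buffer in place.  It also makes one property of the patch
visible: only the first two "::" separators are examined, so a name nested
three levels deep keeps part of its qualifier.

```python
def strip_qualifiers(name):
    """Mirror of the patch's logic: drop up to two leading 'X::' qualifiers.

    Hypothetical illustration; the patch itself works on a C buffer.
    """
    first = name.find("::")
    if first == -1:
        return name  # plain function, nothing to strip
    # Like strstr (colon_colon + 2, "::"): look for one more separator.
    second = name.find("::", first + 2)
    cut = second if second != -1 else first
    return name[cut + 2:]


for qualified in ("main", "XX::foo", "NS::XX::foo", "A::B::C::foo"):
    print(qualified, "->", strip_qualifiers(qualified))
```

Whether names nested more than two levels deep matter in practice is one of
the real-life C++ questions raised above.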

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 23 May 2015 15:57:02 GMT) Full text and rfc822 format available.

Message #39 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: jan.h.d <at> swipnet.se, Francesco Potortì <pot <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Sat, 23 May 2015 18:56:04 +0300

> Date: Sat, 23 May 2015 17:46:18 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 20629 <at> debbugs.gnu.org
> 
> > Date: Sat, 23 May 2015 16:50:11 +0300
> > From: Eli Zaretskii <eliz <at> gnu.org>
> > Cc: 20629 <at> debbugs.gnu.org
> > 
> > We should try to fix bugs without re-introducing previously solved
> > ones.
> 
> Does the patch below give good results in real-life C++ usage?
> 
> Please also consider whether this change could cause trouble in other
> C++ use cases.  (I've run the modified version on the etags test
> suite, and didn't spot any problems in the differences with the
> previous results, but I don't consider myself an expert on C++
> syntax.)

I see that etags deliberately produces explicitly named tags of the
form CLASS::MEMBER, whenever it sees a declaration of MEMBER inside a
class declaration of CLASS.  Why is that useful?  It is another
instance that defeats the change which removed tag-symbol-match-p from
the "order" functions used by etags.el when invoked from xref.  Does
anyone see a problem with removing this feature from etags?

Did not alter fixed versions and reopened. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 23 May 2015 16:18:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Mon, 25 May 2015 15:17:01 GMT) Full text and rfc822 format available.

Message #44 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: jan.h.d <at> swipnet.se
Cc: pot <at> gnu.org, 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Mon, 25 May 2015 18:15:33 +0300

> Date: Sat, 23 May 2015 18:56:04 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 20629 <at> debbugs.gnu.org
> 
> > Does the patch below give good results in real-life C++ usage?
> > 
> > Please also consider whether this change could cause trouble in other
> > C++ use cases.  (I've run the modified version on the etags test
> > suite, and didn't spot any problems in the differences with the
> > previous results, but I don't consider myself an expert on C++
> > syntax.)
> 
> I see that etags deliberately produces explicitly named tags of the
> form CLASS::MEMBER, whenever it sees a declaration of MEMBER inside a
> class declaration of CLASS.  Why is that useful?  It is another
> instance that defeats the change which removed tag-symbol-match-p from
> the "order" functions used by etags.el when invoked from xref.  Does
> anyone see a problem with removing this feature from etags?

I've attempted to fix this and other underlying problems by suitable
changes in etags.c in commit 9c66c5a.  The feature whereby etags
qualifies class members by their class names in TAGS is now optional,
off by default, which creates tag names that are more accurate, and
xref should now work much better with C-like object-oriented
languages.

Please give it a try, including in real-life use cases.  I'm not yet
closing the bug on account of possible complications.

Thanks.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Mon, 25 May 2015 21:18:02 GMT) Full text and rfc822 format available.

Message #47 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>, jan.h.d <at> swipnet.se
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Tue, 26 May 2015 00:17:41 +0300

On 05/25/2015 06:15 PM, Eli Zaretskii wrote:

> I've attempted to fix this and other underlying problems by suitable
> changes in etags.c in commit 9c66c5a.  The feature whereby etags
> qualifies class members by their class names in TAGS is now optional,
> off by default, which creates tag names that are more accurate, and
> xref should now work much better with C-like object-oriented
> languages.

I think it's unfortunate that we can't keep the precision, and at the 
same time have tags-completion-table return the qualified names (try C-u 
M-. TAB).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Tue, 26 May 2015 02:37:02 GMT) Full text and rfc822 format available.

Message #50 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: jan.h.d <at> swipnet.se, 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Tue, 26 May 2015 05:35:48 +0300

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Tue, 26 May 2015 00:17:41 +0300
> 
> I think it's unfortunate that we can't keep the precision, and at the 
> same time have tags-completion-table return the qualified names (try C-u 
> M-. TAB).

Given the structure of TAGS and the way xref picks up the symbol at
point, what else can we do?  Can you suggest how this could work
better even in principle?  Does any other version of ctags produce
better results with the same structure of TAGS?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Tue, 26 May 2015 10:17:01 GMT) Full text and rfc822 format available.

Message #53 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Tue, 26 May 2015 13:16:15 +0300

On 05/26/2015 05:35 AM, Eli Zaretskii wrote:

> Does any other version of ctags produce
> better results with the same structure of TAGS?

No, 'ctags -e' gives pretty much the same output that 'etags' does now. 
So it's definitely acceptable.

> Given the structure of TAGS and the way xref picks up the symbol at
> point, what else can we do?  Can you suggest how this could work
> better even in principle?

I'm not sure.

One direction would be to add `:' to NONAM, so that a method name would 
implicitly match a qualified tag as well. Not sure if it will be a 
problem in some languages (but in, say, Elisp `:' can be a part of an 
identifier).

Another - to make etags-tags-completion-table include both the pattern 
and the explicit tagname in the returned obarray.
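
To make the first direction concrete, here is a simplified Python model of
implicit name matching.  It is a sketch under stated assumptions, not the
code in etags.el or etags.c: the delimiter set NONAM is assumed to be space,
tab and the characters ()=,; and implicit_name_match is a hypothetical
function.  It shows why bar does not implicitly match a pattern ending in
XX::bar and an open paren, and how treating ':' as a delimiter would change
that.

```python
NONAM = " \t()=,;"  # assumed delimiter set, for illustration only


def implicit_name_match(pattern, name, nonam=NONAM):
    """Return True if NAME is the implicit tag name of PATTERN.

    PATTERN is the entry text before DEL.  Simplified, hypothetical model.
    """
    if not name or any(c in nonam for c in name):
        return False  # a name containing delimiters cannot be implicit
    # Allow a single trailing delimiter after the name, e.g. "foo(".
    if pattern and pattern[-1] in nonam:
        pattern = pattern[:-1]
    if not pattern.endswith(name):
        return False
    start = len(pattern) - len(name)
    # The name must start the pattern or follow a delimiter.
    return start == 0 or pattern[start - 1] in nonam


print(implicit_name_match("XX::bar(", "XX::bar"))         # qualified name
print(implicit_name_match("XX::bar(", "bar"))             # ':' precedes bar
print(implicit_name_match("XX::bar(", "bar", NONAM + ":"))  # ':' as delimiter
```

The flip side is visible in the same model: once ':' counts as a delimiter,
a name that itself contains ':' can no longer match implicitly.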

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Tue, 26 May 2015 15:08:01 GMT) Full text and rfc822 format available.

Message #56 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Tue, 26 May 2015 18:06:50 +0300

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Tue, 26 May 2015 13:16:15 +0300
> 
> One direction would be to add `:' to NONAM, so that a method name would 
> implicitly match a qualified tag as well. Not sure if it will be a 
> problem in some languages (but in, say, Elisp `:' can be a part of an 
> identifier).

That'd mean either some very invasive change in the insane state
machine that runs C_entries, or, more likely, throwing it away and
re-writing it in a very different way.  I don't volunteer.

> Another - to make etags-tags-completion-table include both the pattern 
> and the explicit tagname in the returned obarray.

Yes, I thought about this as well.  I think this is our best bet, and
shouldn't be too hard, as we already do similar things elsewhere.
Patches from completion experts are welcome.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Tue, 26 May 2015 19:01:03 GMT) Full text and rfc822 format available.

Message #59 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Tue, 26 May 2015 22:00:23 +0300

On 05/26/2015 06:06 PM, Eli Zaretskii wrote:

> That'd mean either some very invasive change in the insane state
> machine that runs C_entries, or, more likely, throwing it away and
> re-writing it in a very different way.  I don't volunteer.

What if we only make that change in tag-implicit-name-match-p, then? But 
the concern about false positives still applies.

> Yes, I thought about this as well.  I think this is our best bet, and
> shouldn't be too hard, as we already do similar things elsewhere.

Example?

> Patches from completion experts are welcome.

Not an expert, but this should do it. I wonder if we'll get many junk 
completions this way, in certain situations:

diff --git a/lisp/progmodes/etags.el b/lisp/progmodes/etags.el
index 9ff164e..f026560 100644
--- a/lisp/progmodes/etags.el
+++ b/lisp/progmodes/etags.el
@@ -1276,13 +1276,16 @@ buffer-local values of tags table format variables."
 \\([-a-zA-Z0-9_+*$?:]+\\)[^-a-zA-Z0-9_+*$?:\177]*\\)\177\
 \\(\\([^\n\001]+\\)\001\\)?\\([0-9]+\\)?,\\([0-9]+\\)?\n"
 	      nil t)
-	(intern	(prog1 (if (match-beginning 5)
-			   ;; There is an explicit tag name.
-			   (buffer-substring (match-beginning 5) (match-end 5))
-			 ;; No explicit tag name.  Best guess.
-			 (buffer-substring (match-beginning 3) (match-end 3)))
-		  (progress-reporter-update progress-reporter (point)))
-		table)))
+        ;; Implicit tag name.
+        (intern
+         (buffer-substring (match-beginning 3) (match-end 3))
+         table)
+        (when (match-beginning 5)
+          (intern
+           ;; There is an explicit tag name.
+           (buffer-substring (match-beginning 5) (match-end 5))
+           table))
+        (progress-reporter-update progress-reporter (point))))
     table))

 (defun etags-snarf-tag (&optional use-explicit) ; Doc string?
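
The effect of the patch above on the completion table can be modeled outside
Emacs.  The Python sketch below is an approximation: the regular expression
reuses the name-character class visible in the diff context but is
simplified, and completion_names is a hypothetical function.  It records the
implicit name (the last run of name characters before DEL) for every entry
and adds the explicit name when one is present.

```python
import re

NAME_CHARS = r"-a-zA-Z0-9_+*$?:"  # character class taken from the diff context

# Simplified entry matcher: implicit name, optional explicit name, position.
ENTRY = re.compile(
    r"([" + NAME_CHARS + r"]+)[^" + NAME_CHARS + r"\x7f]*\x7f"
    r"(?:([^\n\x01]+)\x01)?([0-9]+)?,([0-9]+)?$"
)


def completion_names(tags_lines):
    """Collect implicit and explicit tag names, as the patch interns both."""
    names = set()
    for line in tags_lines:
        match = ENTRY.search(line)
        if match is None:
            continue  # e.g. the "file,size" header line has no DEL
        names.add(match.group(1))      # implicit name from the pattern
        if match.group(2):
            names.add(match.group(2))  # explicit name, when present
    return names


lines = [
    "x.cc,96",
    "XX::foo()\x7ffoo\x019,58",
    "XX::bar()\x7fbar\x0115,92",
]
print(sorted(completion_names(lines)))
```

For ctags-style entries like these, both the qualified and the unqualified
name end up as completion candidates.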

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Tue, 26 May 2015 19:25:02 GMT) Full text and rfc822 format available.

Message #62 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Tue, 26 May 2015 22:23:56 +0300

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Tue, 26 May 2015 22:00:23 +0300
> 
> On 05/26/2015 06:06 PM, Eli Zaretskii wrote:
> 
> > That'd mean either some very invasive change in the insane state
> > machine that runs C_entries, or, more likely, throwing it away and
> > re-writing it in a very different way.  I don't volunteer.
> 
> What if we only make that change in tag-implicit-name-match-p, then?

I don't see how that could be possible: tag-implicit-name-match-p is
language-agnostic.  You'd need to make it language-aware before it
could do such stuff for languages that need it.

> > Yes, I thought about this as well.  I think this is our best bet, and
> > shouldn't be too hard, as we already do similar things elsewhere.
> 
> Example?

It slips my mind for a moment, but there's some command whose
completion shows <f> for functions, <v> for variables, etc.

And it's not the only one, I'm quite sure I saw longer text after each
candidate, perhaps somewhere in 'company'?

> > Patches from completion experts are welcome.
> 
> Not an expert, but this should do it. I wonder if we'll get many junk 
> completions this way, in certain situations:

What kind of junk?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Tue, 26 May 2015 21:02:02 GMT) Full text and rfc822 format available.

Message #65 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Tue, 26 May 2015 17:01:36 -0400

>> > Patches from completion experts are welcome.
>> Not an expert, but this should do it. I wonder if we'll get many junk 
>> completions this way, in certain situations:
> What kind of junk?

BTW, it might be worthwhile to try and replace the obarray with
a function which directly searches the corresponding tags buffers.
Searching those buffers might not be significantly slower than searching
the obarray, with the advantage that we avoid the "building the
completion table" step.


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Tue, 26 May 2015 23:57:02 GMT) Full text and rfc822 format available.

Message #68 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Wed, 27 May 2015 02:56:02 +0300

On 05/26/2015 10:23 PM, Eli Zaretskii wrote:

> I don't see how that could be possible: tag-implicit-name-match-p is
> language-agnostic.  You'd need to make it language-aware before it
> could do such stuff for languages that need it.

Well, by including ()=,; in that constant, it already makes certain 
assumptions that aren't necessarily true (for instance, `=' can be, and 
often is, a part of a method name in Ruby). Adding a colon would be 
another one of those.

Not that I necessarily advocate for it, mind you.

> It slips my mind for a moment, but there's some command whose
> completion shows <f> for functions, <v> for variables, etc.

elisp-completion-at-point can do that.

> And it's not the only one, I'm quite sure I saw longer text after each
> candidate, perhaps somewhere in 'company'?

That's possible. But either way, these are annotations for existing 
completions. What the suggested patch would do, though, is add new 
completions.

> What kind of junk?

Among patterns for tags with explicit names, there could be some odd 
ones. We never showed them before, I think, anywhere.

You should try the patch and see how it goes.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Wed, 27 May 2015 14:29:02 GMT) Full text and rfc822 format available.

Message #71 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Wed, 27 May 2015 17:28:01 +0300

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Wed, 27 May 2015 02:56:02 +0300
> 
>     I don't see how that could be possible: tag-implicit-name-match-p is
>     language-agnostic.  You'd need to make it language-aware before it
>     could do such stuff for languages that need it.
> 
> Well, by including ()=,; in that constant, it already makes certain
> assumptions that aren't necessarily true (for instance, `=' can be, and
> often is, a part of a method name in Ruby). Adding a colon would be
> another one of those.

That's not the same situation: [()=,;] are used only if there's no
explicit tag name; explicit tag names are used without any processing,
and the language-specific parsing in etags.c is expected to extract
the tag name according to the language-specific rules.  The idea
behind tag-implicit-name-match-p is an observation that in many
practical cases [()=,;] delimit the tag name, and when they do,
etags.c could refrain from putting an explicit tag name in TAGS.  IOW,
this is just an optimization, meant to keep TAGS smaller.
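
For readers following along, here is a rough Python sketch of the kind of
check described above. It is not the code in etags.c or etags.el: the
function name implicit_name and the exact delimiter set NONAM (the
[()=,;] characters plus whitespace) are assumptions made for illustration.

```python
# Characters assumed to delimit an implicit tag name: the "[()=,;]" set
# from the discussion, plus whitespace.  (Illustrative assumption.)
NONAM = set(" \t\n\r\f()=,;")


def implicit_name(pattern: str, name: str) -> bool:
    """Return True if NAME could be recovered from PATTERN alone,
    so that no explicit tag name would need to be stored."""
    # The name itself must not contain a delimiter.
    if not name or any(ch in NONAM for ch in name):
        return False
    # The name must end the pattern, or be followed by exactly one
    # more character, which must then be a delimiter.
    if pattern.endswith(name):
        start = len(pattern) - len(name)
        after_ok = True
    elif pattern[:-1].endswith(name):
        start = len(pattern) - 1 - len(name)
        after_ok = pattern[-1] in NONAM
    else:
        return False
    # The character before the name, if any, must be a delimiter.
    before_ok = start == 0 or pattern[start - 1] in NONAM
    return before_ok and after_ok
```

Under these assumptions "main" is implicit in the pattern "int main(",
while "foo" is not implicit in "XX::foo(" because the preceding ':' is
not a delimiter, so such an entry would need an explicit name.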

By contrast, what you are suggesting (AFAIU) is to process an explicit
tag name, such as "foo::bar::baz", to deduce that it matches "baz".

Or maybe I don't understand the suggestion, since you were talking
about tag-implicit-name-match-p, which doesn't look at the explicit
tag name at all, and the explicit tag name is the root cause here.

> You should try the patch and see how it goes.

I will, thanks.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Wed, 27 May 2015 15:29:02 GMT) Full text and rfc822 format available.

Message #74 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Wed, 27 May 2015 18:28:10 +0300

On 05/27/2015 05:28 PM, Eli Zaretskii wrote:

> That's not the same situation: [()=,;] are used only if there's no
> explicit tag name;

tag-implicit-name-match-p is used either way.

> The idea
> behind tag-implicit-name-match-p is an observation that in many
> practical cases [()=,;] delimit the tag name, and when they do,
> etags.c could refrain from putting an explicit tag name in TAGS.  IOW,
> this is just an optimization, meant to keep TAGS smaller.

That was my understanding as well. However, whether or not explicit tag 
names are included has little effect on my alternative suggestion.

> By contrast, what you are suggesting (AFAIU) is to process an explicit
> tag name, such as "foo::bar::baz", to deduce that it matches "baz".

No, to process patterns. I don't think we've ever had qualified explicit 
tag names, have we?

> Or maybe I don't understand the suggestion, since you were talking
> about tag-implicit-name-match-p, which doesn't look at the explicit
> tag name at all, and the explicit tag name is the root cause here.

Running 'etags -Q', and updating tag-implicit-name-match-p to also 
include : in NONAM should both show us the qualified names in the 
completion table, as well as match the unqualified names when asked for tags.

>> You should try the patch and see how it goes.
>
> I will, thanks.

Let us continue this discussion when there's some feedback on it.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Wed, 27 May 2015 15:48:02 GMT) Full text and rfc822 format available.

Message #77 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Wed, 27 May 2015 18:46:54 +0300

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Wed, 27 May 2015 18:28:10 +0300
> 
> On 05/27/2015 05:28 PM, Eli Zaretskii wrote:
> 
> > That's not the same situation: [()=,;] are used only if there's no
> > explicit tag name;
> 
> tag-implicit-name-match-p is used either way.

Maybe I'm confused, but what about tag-exact-match-p?

> > By contrast, what you are suggesting (AFAIU) is to process an explicit
> > tag name, such as "foo::bar::baz", to deduce that it matches "baz".
> 
> No, to process patterns. I don't think we've ever had qualified explicit 
> tag names, have we?

Yes, we did.  That's what the -Q switch controls.

> > Or maybe I don't understand the suggestion, since you were talking
> > about tag-implicit-name-match-p, which doesn't look at the explicit
> > tag name at all, and the explicit tag name is the root cause here.
> 
> Running 'etags -Q', and updating tag-implicit-name-match-p to also 
> include : in NONAM should both show us the qualified names in the 
> completion table, as well as match the unqualified names when asked for tags.

I guess I really don't understand your suggestion, then.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Wed, 27 May 2015 15:55:03 GMT) Full text and rfc822 format available.

Message #80 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Wed, 27 May 2015 18:54:19 +0300

On 05/27/2015 06:46 PM, Eli Zaretskii wrote:

> Maybe I'm confused, but what about tag-exact-match-p?

We collect matches that satisfy either tag-exact-match-p, or 
tag-implicit-name-match-p.

> Yes, we did.  That's what the -Q switch controls.

Okay, but we match those tags with tag-implicit-name-match-p, don't we?

>> Running 'etags -Q', and updating tag-implicit-name-match-p to also
>> include : in NONAM should both show us the qualified names in the
>> completion table, as well as match the unqualified names when asked for tags.
>
> I guess I really don't understand your suggestion, then.

The result would be that the completion table only returns qualified 
method names, but xref-find-definitions (or find-tag) also shows matches 
for unqualified method names.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Wed, 27 May 2015 16:25:03 GMT) Full text and rfc822 format available.

Message #83 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Wed, 27 May 2015 19:23:53 +0300

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Wed, 27 May 2015 18:54:19 +0300
> 
> On 05/27/2015 06:46 PM, Eli Zaretskii wrote:
> 
> > Maybe I'm confused, but what about tag-exact-match-p?
> 
> We collect matches that satisfy either tag-exact-match-p, or 
> tag-implicit-name-match-p.

Yes, but they look at different parts of the tag's record.

> > Yes, we did.  That's what the -Q switch controls.
> 
> Okay, but we match those tags with tag-implicit-name-match-p, don't we?

No, we match them with tag-exact-match-p, AFAICS.

> The result would be that the completion table only returns qualified 
> method names, but xref-find-definitions (or find-tag) also shows matches 
> for unqualified method names.

That'd be great.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Wed, 27 May 2015 23:51:02 GMT) Full text and rfc822 format available.

Message #86 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Thu, 28 May 2015 02:50:23 +0300

On 05/27/2015 07:23 PM, Eli Zaretskii wrote:

> Yes, but they look at different parts of the tag's record.

We check every tag against both functions, that's the point. 
tag-implicit-name-match-p is more flexible.

> No, we match them with tag-exact-match-p, AFAICS.

No. Unless there's an explicit tag name, this function will return nil. 
Check out the first comment in its implementation.

>> The result would be that the completion table only returns qualified
>> method names, but xref-find-definitions (or find-tag) also shows matches
>> for unqualified method names.
>
> That'd be great.

We'd get it at the cost of precision, though. In this case, foo:bar (a 
valid Elisp name) would become an implicit match for "bar".
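
The precision loss can be seen with a small Python sketch. It is only an
illustration of the idea, not the etags.el implementation; the helper
is_delimited_suffix and the two delimiter strings are hypothetical.

```python
def is_delimited_suffix(text: str, name: str, delimiters: str) -> bool:
    """True if TEXT ends with NAME and NAME is preceded by one of
    DELIMITERS (or starts the text)."""
    if not text.endswith(name):
        return False
    start = len(text) - len(name)
    return start == 0 or text[start - 1] in delimiters


# Delimiters as discussed in this thread, and the same set with ':' added.
CURRENT = " \t()=,;"
WITH_COLON = CURRENT + ":"

# A C++ method becomes reachable by its unqualified name ...
cxx_before = is_delimited_suffix("XX::bar", "bar", CURRENT)
cxx_after = is_delimited_suffix("XX::bar", "bar", WITH_COLON)
# ... but so does an unrelated Elisp symbol that merely contains a colon.
elisp_after = is_delimited_suffix("foo:bar", "bar", WITH_COLON)
```

Here cxx_before is false and cxx_after is true, which is the desired
effect, but elisp_after is true as well, which is the false positive
mentioned above.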

Anyway, this thread of thought is probably not worth pursuing: we'd want 
to be compatible with 'ctags -e' (not least because it supports more 
languages), and there's no option to generate qualified C++ method names 
there.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Thu, 28 May 2015 02:51:02 GMT) Full text and rfc822 format available.

Message #89 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Thu, 28 May 2015 05:50:19 +0300

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Thu, 28 May 2015 02:50:23 +0300
> 
> On 05/27/2015 07:23 PM, Eli Zaretskii wrote:
> 
> > Yes, but they look at different parts of the tag's record.
> 
> We check every tag against both functions, that's the point. 
> tag-implicit-name-match-p is more flexible.
> 
> > No, we match them with tag-exact-match-p, AFAICS.
> 
> No. Unless there's an explicit tag name, this function will return nil. 
> Check out the first comment in its implementation.

I _was_ talking about explicit tag names.

> Anyway, this thread of thought is probably not worth pursuing: we'd want 
> to be compatible with 'ctags -e' (not least because it supports more 
> languages), and there's no option to generate qualified C++ method names 
> there.

Exuberant ctags does have such an option: --extra=+q.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Thu, 28 May 2015 10:23:01 GMT) Full text and rfc822 format available.

Message #92 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Thu, 28 May 2015 13:22:01 +0300

On 05/28/2015 05:50 AM, Eli Zaretskii wrote:

> I _was_ talking about explicit tag names.

AFAICS, 'etags -Q' doesn't generate explicit tag names for C++ (for 
cases we're currently discussing). Only patterns, to be matched implicitly.

In any case, tag-exact-match-p is designed to be inflexible, so it's not 
something we should change.

> Exuberant ctags does have such an option: --extra=+q.

This brings us to the third option. Here's what the 'ctags -e 
--extra=+q' output looks like:

x.cc,210
class XXXX1,0
class YYYY8,54
XX::foo()foo16,98
XX::foo()XX::foo16,98
XX::bar()bar22,132
XX::bar()XX::bar22,132
YY::bar()bar28,163
YY::bar()YY::bar28,163
main(int argc, char *argv[])main34,193
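
Note that in the listing above the field separators are not visible: in
a TAGS file the pattern is followed by a DEL byte (\177), and an explicit
tag name, when present, is followed by an SOH byte (\001) before the
line number and character offset. The following Python sketch parses one
such entry under that assumption; TagEntry and parse_entry are names
invented for this example.

```python
import re
from typing import NamedTuple, Optional


class TagEntry(NamedTuple):
    pattern: str            # text copied from the source line
    name: Optional[str]     # explicit tag name, or None if implicit
    line: Optional[int]     # line number, if recorded
    offset: Optional[int]   # character offset, if recorded


# pattern DEL [name SOH] [line] "," [offset]
ENTRY_RE = re.compile(
    r"^(?P<pattern>[^\x7f]*)\x7f"
    r"(?:(?P<name>[^\x01]*)\x01)?"
    r"(?P<line>\d+)?,(?P<offset>\d+)?$"
)


def parse_entry(raw: str) -> TagEntry:
    """Split one TAGS entry line into its fields."""
    m = ENTRY_RE.match(raw)
    if m is None:
        raise ValueError(f"not a tag entry: {raw!r}")
    line = int(m["line"]) if m["line"] else None
    offset = int(m["offset"]) if m["offset"] else None
    return TagEntry(m["pattern"], m["name"], line, offset)


qualified = parse_entry("XX::foo()\x7fXX::foo\x0116,98")
unqualified = parse_entry("XX::foo()\x7ffoo\x0116,98")
```

With two entries per method, as in the listing, both the qualified name
and the bare name are available as explicit tag names for the same
location.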

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Thu, 28 May 2015 11:36:02 GMT) Full text and rfc822 format available.

Message #95 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Francesco Potortì <pot <at> gnu.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#20629: 25.0.50;	Regression: TAGS broken,
 can't find anything in C++ files.
Date: Thu, 28 May 2015 13:35:47 +0200

Two points that may be useful for clarifying things.


Explicit vs implicit tag:

As far as etags.c is concerned, there is no *logical* difference between
an explicit tag and an implicit tag.  Both are tags and should be viewed
and interpreted as such.  The fact that a tag is explicit or implicit is
*only* an optimization, intended to reduce the size of the TAGS file and
the time needed to load it from disk.  There should be *no* difference
between the treatment of implicit and explicit tags when parsing TAGS
file entries.  Given that in the 15+ years since implicit tags were
introduced the trade-offs between disk space and CPU time have changed,
it could maybe make sense to remove the implicit tag concept altogether,
and only have explicit tags, should this make things easier.


Tagged vs non-tagged entries:

An entry is tagged only when necessary, that is, when it would be
ambiguous or difficult to match without a tag.  Again, this is only an
optimization, but this one has logical consequences.  For example, for a
function declaration it can be useful to make it clear what is the
identifier to be matched, so there is a tag.  In class-based programs,
like C++, it can be useful to provide a fully-qualified name for an
identifier, so there is a class::id tag.  Here again, it may make sense
to tag all entries, if it makes TAGS parsing easier or more accurate.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Thu, 28 May 2015 11:47:02 GMT) Full text and rfc822 format available.

Message #98 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Francesco Potortì <pot <at> gnu.org>,
 Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Thu, 28 May 2015 14:46:13 +0300

On 05/28/2015 02:35 PM, Francesco Potortì wrote:
> Given that in the 15+ years since implicit tags were
> introduced the trade-offs between disk space and CPU time have changed,
> it could maybe make sense to remove the implicit tag concept altogether,
> and only have explicit tags, should this make things easier.

Maybe so, but:

> In class-based programs,
> like C++, it can be useful to provide a fully-qualified name for an
> identifier, so there is a class::id tag.  Here again, it may make sense
> to tag all entries, if it makes TAGS parsing easier or more accurate.

The question at hand is how Emacs should go from a non-qualified tag 
name (because it's a method call in the buffer, and we don't know which 
class the object belongs to) to the tag location. Either we use some 
implicit matching, or each method tag should have two entries: a 
qualified one, and a non-qualified one.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Thu, 28 May 2015 11:55:02 GMT) Full text and rfc822 format available.

Message #101 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Stefan Monnier <monnier <at> IRO.UMontreal.CA>, Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Thu, 28 May 2015 14:54:49 +0300

On 05/27/2015 12:01 AM, Stefan Monnier wrote:

> BTW, it might be worthwhile to try and replace the obarray with
> a function which directly searches the corresponding tags buffers.
> Searching those buffers might not be significantly slower than searching
> the obarray, with the advantage that we avoid the "building the
> completion table" step.

Having the table always up-to-date would be nice. But here are some 
numbers from a rough patch.

The project is of moderate size: Linux kernel.

TAGS is 159097 lines long.

Pre-built tags-completion-table:

Build it -> 1.34 seconds
(all-completions "" (tags-completion-table)) after that -> 0.02 seconds

Dynamic completion table:

(all-completions "" (tags-completion-table)) -> 0.78 seconds
                 ^-- same with any longer prefix, in this implementation

Patch:

diff --git a/lisp/progmodes/etags.el b/lisp/progmodes/etags.el
index 9ff164e..19de126 100644
--- a/lisp/progmodes/etags.el
+++ b/lisp/progmodes/etags.el
@@ -753,31 +753,18 @@ Assumes the tags table is the current buffer."
      (setq tags-included-tables (funcall tags-included-tables-function))))
 
 (defun tags-completion-table ()
-  "Build `tags-completion-table' on demand.
+  "Return tags completion table.
 The tags included in the completion table are those in the current
 tags table and its (recursively) included tags tables."
-  (or tags-completion-table
-      ;; No cached value for this buffer.
-      (condition-case ()
-	  (let (current-table combined-table)
-	    (message "Making tags completion table for %s..." buffer-file-name)
-	    (save-excursion
-	      ;; Iterate over the current list of tags tables.
-	      (while (visit-tags-table-buffer (and combined-table t))
-		;; Find possible completions in this table.
-		(setq current-table (funcall tags-completion-table-function))
-		;; Merge this buffer's completions into the combined table.
-		(if combined-table
-		    (mapatoms
-		     (lambda (sym) (intern (symbol-name sym) combined-table))
-		     current-table)
-		  (setq combined-table current-table))))
-	    (message "Making tags completion table for %s...done"
-		     buffer-file-name)
-	    ;; Cache the result in a buffer-local variable.
-	    (setq tags-completion-table combined-table))
-	(quit (message "Tags completion table construction aborted.")
-	      (setq tags-completion-table nil)))))
+  (completion-table-with-cache
+   (lambda (_string)
+     (let (cont tables)
+       (save-excursion
+         ;; Iterate over the current list of tags tables.
+         (while (visit-tags-table-buffer (or cont (progn (setq cont t) nil)))
+           ;; Find possible completions in this table.
+           (push (funcall tags-completion-table-function) tables)))
+       (nreverse (apply #'nconc tables))))))

 ;;;###autoload
 (defun tags-lazy-completion-table ()
@@ -1256,11 +1243,7 @@ buffer-local values of tags table format variables."


 (defun etags-tags-completion-table () ; Doc string?
-  (let ((table (make-vector 511 0))
-	(progress-reporter
-	 (make-progress-reporter
-	  (format "Making tags completion table for %s..." buffer-file-name)
-	  (point-min) (point-max))))
+  (let ((table nil))
     (save-excursion
       (goto-char (point-min))
       ;; This monster regexp matches an etags tag line.
@@ -1276,12 +1259,11 @@ buffer-local values of tags table format variables."
 \\([-a-zA-Z0-9_+*$?:]+\\)[^-a-zA-Z0-9_+*$?:\177]*\\)\177\
 \\(\\([^\n\001]+\\)\001\\)?\\([0-9]+\\)?,\\([0-9]+\\)?\n"
 	      nil t)
-	(intern	(prog1 (if (match-beginning 5)
+	(push	(prog1 (if (match-beginning 5)
 			   ;; There is an explicit tag name.
 			   (buffer-substring (match-beginning 5) (match-end 5))
 			 ;; No explicit tag name.  Best guess.
-			 (buffer-substring (match-beginning 3) (match-end 3)))
-		  (progress-reporter-update progress-reporter (point)))
+			 (buffer-substring (match-beginning 3) (match-end 3))))
 		table)))
     table))

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Thu, 28 May 2015 12:17:02 GMT) Full text and rfc822 format available.

Message #104 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Francesco Potortì <pot <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken,
 can't find anything in C++ files.
Date: Thu, 28 May 2015 14:16:21 +0200

>> In class-based programs,
>> like C++, it can be useful to provide a fully-qualified name for an
>> identifier, so there is a class::id tag.  Here again, it may make sense
>> to tag all entries, if it makes TAGS parsing easier or more accurate.
>
>The question at hand is how Emacs should go from a non-qualified tag 
>name (because it's a method call in the buffer, and we don't know which 
>class the object belongs to) to the tag location. Either we use some 
>implicit matching, or each method tag should have two entries: a 
>qualified one, and a non-qualified one.

Well, I'd say, when matching NAME:

first, match against a tag NAME
second, if appropriate, match against a tag ::NAME
third, regex match against a tag .*::NAME$
fourth, match against the entry, without looking at the tag

I would have said that it is already implemented like this, but in fact
I cared about etags.c, not much about etags.el.

As I said previously, whether the tag is implicit or explicit makes no
*logical* difference, but it can impact performance.
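
To make the proposed order concrete, here is a small Python sketch of a
tiered lookup. It only illustrates the four steps listed above and is
not how etags.el works; the lookup function and the (pattern, tag) pair
representation are assumptions made for this example.

```python
import re
from typing import List, Optional, Tuple

Entry = Tuple[str, Optional[str]]  # (pattern, explicit tag or None)


def lookup(name: str, entries: List[Entry]) -> List[Entry]:
    """Return the entries found by the first tier that matches anything."""
    qualified = re.compile(r".*::" + re.escape(name))
    tiers = [
        # first: the tag is exactly NAME
        lambda pat, tag: tag == name,
        # second: the tag is ::NAME
        lambda pat, tag: tag == "::" + name,
        # third: the tag matches .*::NAME$
        lambda pat, tag: tag is not None and qualified.fullmatch(tag) is not None,
        # fourth: NAME occurs anywhere in the entry text (last resort)
        lambda pat, tag: name in pat,
    ]
    for tier in tiers:
        hits = [(pat, tag) for pat, tag in entries if tier(pat, tag)]
        if hits:
            return hits
    return []


entries = [
    ("XX::bar()", "XX::bar"),
    ("YY::bar()", "YY::bar"),
    ("int main(", "main"),
    ("void rebar(", None),
]
bar_hits = lookup("bar", entries)
```

In this sketch bar_hits contains the XX::bar and YY::bar entries from the
third tier. The fourth tier is never reached for "bar"; if it were, it
would also report "void rebar(", which shows why matching against the
entry text alone is only a last resort.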

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Thu, 28 May 2015 13:00:04 GMT) Full text and rfc822 format available.

Message #107 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Stefan Monnier <monnier <at> IRO.UMontreal.CA>, Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Thu, 28 May 2015 15:59:20 +0300

And here's an attempt to simplify the regexp and use the input string.

It brings us down to 200ms in the best case (completions for "get_"), 
and that can be improved further, but the worst case (completions for 
"") gets considerably worse: 3 seconds.

diff --git a/lisp/progmodes/etags.el b/lisp/progmodes/etags.el
index 9ff164e..230fffa 100644
--- a/lisp/progmodes/etags.el
+++ b/lisp/progmodes/etags.el
@@ -753,31 +753,18 @@ Assumes the tags table is the current buffer."
      (setq tags-included-tables (funcall tags-included-tables-function))))
 
 (defun tags-completion-table ()
-  "Build `tags-completion-table' on demand.
+  "Return tags completion table.
 The tags included in the completion table are those in the current
 tags table and its (recursively) included tags tables."
-  (or tags-completion-table
-      ;; No cached value for this buffer.
-      (condition-case ()
-	  (let (current-table combined-table)
-	    (message "Making tags completion table for %s..." buffer-file-name)
-	    (save-excursion
-	      ;; Iterate over the current list of tags tables.
-	      (while (visit-tags-table-buffer (and combined-table t))
-		;; Find possible completions in this table.
-		(setq current-table (funcall tags-completion-table-function))
-		;; Merge this buffer's completions into the combined table.
-		(if combined-table
-		    (mapatoms
-		     (lambda (sym) (intern (symbol-name sym) combined-table))
-		     current-table)
-		  (setq combined-table current-table))))
-	    (message "Making tags completion table for %s...done"
-		     buffer-file-name)
-	    ;; Cache the result in a buffer-local variable.
-	    (setq tags-completion-table combined-table))
-	(quit (message "Tags completion table construction aborted.")
-	      (setq tags-completion-table nil)))))
+  (completion-table-with-cache
+   (lambda (string)
+     (let (cont tables)
+       (save-excursion
+         ;; Iterate over the current list of tags tables.
+         (while (visit-tags-table-buffer (or cont (progn (setq cont t) nil)))
+           ;; Find possible completions in this table.
+           (push (funcall tags-completion-table-function string) tables)))
+       (nreverse (apply #'nconc tables))))))

 ;;;###autoload
 (defun tags-lazy-completion-table ()
@@ -1218,7 +1205,7 @@ buffer-local values of tags table format variables."
       (mapc (lambda (elt) (set (make-local-variable (car elt)) (cdr elt)))
 	     '((file-of-tag-function . etags-file-of-tag)
 	       (tags-table-files-function . etags-tags-table-files)
-	       (tags-completion-table-function . etags-tags-completion-table)
+	       (tags-completion-table-function . etags-tags-completions)
 	       (snarf-tag-function . etags-snarf-tag)
 	       (goto-tag-location-function . etags-goto-tag-location)
 	       (find-tag-regexp-search-function . re-search-forward)
@@ -1255,12 +1242,9 @@ buffer-local values of tags table format variables."
 	(expand-file-name str (file-truename default-directory))))))


-(defun etags-tags-completion-table () ; Doc string?
-  (let ((table (make-vector 511 0))
-	(progress-reporter
-	 (make-progress-reporter
-	  (format "Making tags completion table for %s..." buffer-file-name)
-	  (point-min) (point-max))))
+(defun etags-tags-completions (string) ; Doc string?
+  (let ((table nil)
+        (re (format "[\n \t()=,;\177]%s" (regexp-quote string))))
     (save-excursion
       (goto-char (point-min))
       ;; This monster regexp matches an etags tag line.
@@ -1271,18 +1255,16 @@ buffer-local values of tags table format variables."
       ;;   \5 is the explicitly-specified tag name.
       ;;   \6 is the line to start searching at;
       ;;   \7 is the char to start searching at.
-      (while (re-search-forward
-	      "^\\(\\([^\177]+[^-a-zA-Z0-9_+*$:\177]+\\)?\
-\\([-a-zA-Z0-9_+*$?:]+\\)[^-a-zA-Z0-9_+*$?:\177]*\\)\177\
-\\(\\([^\n\001]+\\)\001\\)?\\([0-9]+\\)?,\\([0-9]+\\)?\n"
-	      nil t)
-	(intern	(prog1 (if (match-beginning 5)
-			   ;; There is an explicit tag name.
-			   (buffer-substring (match-beginning 5) (match-end 5))
-			 ;; No explicit tag name.  Best guess.
-			 (buffer-substring (match-beginning 3) (match-end 3)))
-		  (progress-reporter-update progress-reporter (point)))
-		table)))
+      (while (re-search-forward re nil t)
+        (save-excursion
+          (goto-char (match-beginning 0))
+          (let ((match-re (if (eq (char-after) ?\177)
+                              ;; Explicit tag name.
+                              "\177\\([^\001]+\\)\001"
+                            ;; Implicit tag name.
+                            "[\n \t()=,;]\\([^\177 \t()=,;]+\\)\177")))
+            (when (looking-at match-re)
+              (push (match-string 1) table))))))
     table))

 (defun etags-snarf-tag (&optional use-explicit) ; Doc string?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Thu, 28 May 2015 13:02:02 GMT) Full text and rfc822 format available.

Message #110 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Francesco Potortì <pot <at> gnu.org>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Thu, 28 May 2015 16:00:54 +0300

On 05/28/2015 03:16 PM, Francesco Potortì wrote:

> second, if appropriate, match against a tag ::NAME
> third, regex match against a tag .*::NAME$

Why can we use colons? That implies some sort of knowledge about C++, 
whereas until now etags.el has remained language-agnostic.

> fourth, match against the entry, without looking at the tag

That brings us false positives.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Thu, 28 May 2015 13:13:02 GMT) Full text and rfc822 format available.

Message #113 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Francesco Potortì <pot <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken,
 can't find anything in C++ files.
Date: Thu, 28 May 2015 15:12:38 +0200

Francesco Potortì wrote:
>> second, if appropriate, match against a tag ::NAME
>> third, regex match against a tag .*::NAME$
>
>Why can we use colons? That implies some sort of knowledge about C++, 
>whereas until now etags.el has remained language-agnostic.

Mh.  I had taken it for granted that each major-mode in fact added
something to the list of functions called when looking for a tag.
Doesn't it work that way?  If not, couldn't it be done for languages
where the language-agnostic behaviour of etags.el is not satisfactory?

>> fourth, match against the entry, without looking at the tag
>
>That brings us false positives.

Sure, and in fact I was suggesting it only as a last resort :)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Thu, 28 May 2015 14:58:02 GMT) Full text and rfc822 format available.

Message #116 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Thu, 28 May 2015 17:56:41 +0300

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Thu, 28 May 2015 13:22:01 +0300
> 
> On 05/28/2015 05:50 AM, Eli Zaretskii wrote:
> 
> > I _was_ talking about explicit tag names.
> 
> AFAICS, 'etags -Q' doesn't generate explicit tag names for C++ (for 
> cases we're currently discussing).

Yes, it does.  Try running it on test/etags/cp-src/c.C, for example.

> Only patterns, to be matched implicitly.

Whether "etags -Q" generates explicit tag names or not is orthogonal
to whether it qualifies class members.  The decision depends on the
text surrounding the pattern.

> > Exuberant ctags does have such an option: --extra=+q.
> 
> This brings us to the third option. Here's what the 'ctags -e 
> --extra=+q' output looks like:
> 
> x.cc,210
> class XX^?XX^A1,0
> class YY^?YY^A8,54
> XX::foo()^?foo^A16,98
> XX::foo()^?XX::foo^A16,98
> XX::bar()^?bar^A22,132
> XX::bar()^?XX::bar^A22,132
> YY::bar()^?bar^A28,163
> YY::bar()^?YY::bar^A28,163
> main(int argc, char *argv[])^?main^A34,193

If you mean that producing two entries instead of one under -Q will
produce better results both with xref-find-definitions and with
completion, making the above happen in etags is an easy change, I
think.
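
For readers unfamiliar with the file layout, here is a minimal Python sketch of how one TAGS entry splits into its fields. In the TAGS format the pattern is followed by a DEL character (0x7f), then an optional explicit tag name terminated by SOH (0x01), then the line number, a comma, and the byte offset. Neither control character is printable. The function name is hypothetical, and the sketch assumes both numbers are present, which the format does not strictly require.

```python
def parse_tags_line(line):
    """Split one TAGS entry into (pattern, explicit_name, line_no, byte_offset).

    explicit_name is None when the entry carries only a pattern, in which
    case the reader has to derive the tag name from the pattern itself.
    """
    pattern, delete, rest = line.partition("\x7f")
    if not delete:
        raise ValueError("not a tag entry: %r" % line)
    name, soh, position = rest.partition("\x01")
    if not soh:
        # No SOH: there is no explicit name, only the position.
        name, position = None, rest
    line_no, _, offset = position.partition(",")
    return pattern, name, int(line_no), int(offset)
```

An entry with an explicit name and one without differ only in the part between DEL and SOH, which is why the two kinds can coexist in the same file.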

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Thu, 28 May 2015 15:05:02 GMT) Full text and rfc822 format available.

Message #119 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Francesco Potortì <pot <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Thu, 28 May 2015 18:04:08 +0300

> Date: Thu, 28 May 2015 15:12:38 +0200
> From: Francesco Potortì <pot <at> gnu.org>
> Cc: Eli Zaretskii <eliz <at> gnu.org>, 20629 <at> debbugs.gnu.org
> 
> Francesco Potortì wrote:
> >> second, if appropriate, match against a tag ::NAME
> >> third, regex match against a tag .*::NAME$
> >
> >Why can we use colons? That implies some sort of knowledge about C++, 
> >whereas until now etags.el has remained language-agnostic.
> 
> Mh.  I had taken it for granted that each major-mode in fact added
> something to the list of functions called when looking for a tag.
> Doesn't it work that way?

No.

In addition, doing so would not work if I tried to look up a symbol in
language A from a buffer whose major mode is tailored to language B.

> If not, couldn't it be done for languages where the
> language-agnostic behaviour of etags.el is not satisfactory?

etags.el relies on etags.c to know languages well enough to do that
part of the job.  I think we should keep this separation of
responsibilities.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Thu, 28 May 2015 15:15:03 GMT) Full text and rfc822 format available.

Message #122 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Francesco Potortì <pot <at> gnu.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken,
 can't find anything in C++ files.
Date: Thu, 28 May 2015 17:14:26 +0200

Eli Zaretskii:
>> Francesco Potortì wrote:
>> >> second, if appropriate, match against a tag ::NAME
>> >> third, regex match against a tag .*::NAME$
>> >
>> >Why can we use colons? That implies some sort of knowledge about C++, 
>> >whereas until now etags.el has remained language-agnostic.
>> 
>> Mh.  I had taken it for granted that each major-mode in fact added
>> something to the list of functions called when looking for a tag.
>> Doesn't it work that way?
>
>No.
>
>In addition, doing so would not work if I tried to look up a symbol in
>language A from a buffer whose major mode is tailored to language B.
>
>> If not, couldn't it be done for languages where the
>> language-agnostic behaviour of etags.el is not satisfactory?
>
>etags.el relies on etags.c to know languages well enough to do that
>part of the job.  I think we should keep this separation of
>responsibilities.

I see.  Given these constraints, I see no other way than augmenting the
TAGS format to include an arbitrary number of tags per entry...

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Thu, 28 May 2015 15:30:05 GMT) Full text and rfc822 format available.

Message #125 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Francesco Potortì <pot <at> gnu.org>
To: dgutov <at> yandex.ru, 20629 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken,
 can't find anything in C++ files.
Date: Thu, 28 May 2015 17:29:39 +0200

>Eli Zaretskii:
>>> Francesco Potortì wrote:
>>> >> second, if appropriate, match against a tag ::NAME
>>> >> third, regex match against a tag .*::NAME$
>>> >
>>> >Why can we use colons? That implies some sort of knowledge about C++, 
>>> >whereas until now etags.el has remained language-agnostic.
>>> 
>>> Mh.  I had taken it for granted that each major-mode in fact added
>>> something to the list of functions called when looking for a tag.
>>> Doesn't it work that way?
>>
>>No.
>>
>>In addition, doing so would not work if I tried to look up a symbol in
>>language A from a buffer whose major mode is tailored to language B.
>>
>>> If not, couldn't it be done for languages where the
>>> language-agnostic behaviour of etags.el is not satisfactory?
>>
>>etags.el relies on etags.c to know languages well enough to do that
>>part of the job.  I think we should keep this separation of
>>responsibilities.
>
>I see.  Given these constraints, I see no other way than augmenting the
>TAGS format to include an arbitrary number of tags per entry...

Answering myself: yes, Dmitry's suggestion would not even need
changing the TAGS format.  For class-based languages, in addition to the
currently generated entry which contains a fully-qualified tag, generate
an additional entry containing an unqualified tag (which most of the
time will be an implicit tag).
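
A minimal Python sketch of the idea, under stated assumptions: the function name is hypothetical, the qualifier separator is taken to be "::" as in C++, and both entries are written with explicit tag names for simplicity, whereas the message above expects the unqualified tag to be implicit most of the time. It is not taken from etags.c.

```python
def entries_for_member(pattern, qualified_name, line_no, offset, sep="::"):
    """Return the TAGS lines for one definition.

    A qualified name yields two entries (qualified, then unqualified);
    an unqualified name yields just one.
    """
    def entry(name):
        # pattern DEL name SOH line,offset
        return "%s\x7f%s\x01%d,%d" % (pattern, name, line_no, offset)

    lines = [entry(qualified_name)]
    unqualified = qualified_name.rpartition(sep)[2]
    if unqualified != qualified_name:
        lines.append(entry(unqualified))
    return lines
```

Both entries point at the same pattern and position, so the existing file format needs no change; the cost is the extra line per class member.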

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Thu, 28 May 2015 15:33:03 GMT) Full text and rfc822 format available.

Message #128 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Thu, 28 May 2015 18:32:46 +0300

On 05/28/2015 05:56 PM, Eli Zaretskii wrote:

> Yes, it does.  Try running it on test/etags/cp-src/c.C, for example.

Okay, sometimes it does. Point is, tags-explicit-match-p won't apply to 
the many entries where it doesn't.

> Whether "etags -Q" generates explicit tag names or not is orthogonal
> to whether it qualifies class members.  The decision depends on the
> text surrounding the pattern.

Yes, okay.

> If you mean that producing two entries instead of one under -Q will
> produce better results both with xref-find-definitions and with
> completion...

It should, though at the cost of larger file size (and completion table 
size, relative to the proposal where we don't include the unqualified 
tags in it).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Thu, 28 May 2015 16:36:03 GMT) Full text and rfc822 format available.

Message #131 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Thu, 28 May 2015 19:34:58 +0300

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Thu, 28 May 2015 18:32:46 +0300
> 
> > If you mean that producing two entries instead of one under -Q will
> > produce better results both with xref-find-definitions and with
> > completion...
> 
> It should, though at the cost of larger file size (and completion table 
> size, relative to the proposal where we don't include the unqualified 
> tags in it).

But having just qualified tags is bad for accuracy, right?

So do we have a decision here?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Fri, 29 May 2015 00:11:03 GMT) Full text and rfc822 format available.

Message #134 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Fri, 29 May 2015 03:09:54 +0300

On 05/28/2015 07:34 PM, Eli Zaretskii wrote:

> But having just qualified tags is bad for accuracy, right?

Maybe. Depends on things we would add to the Lisp code.

> So do we have a decision here?

If you want my opinion (please keep in mind: not an etags user), 
following in Exuberant Ctags's footsteps sounds best. Nobody ever got 
fired for doing that.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Fri, 29 May 2015 06:49:02 GMT) Full text and rfc822 format available.

Message #137 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Glenn Morris <rgm <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Fri, 29 May 2015 02:48:57 -0400

Dmitry Gutov wrote:

> If you want my opinion (please keep in mind: not an etags user),
> following in Exuberant Ctags's footsteps sounds best.

I'm not one either, but I've been meaning to ask: why is etags in Emacs?
It does a generic job that isn't specific to Emacs, and other programs
that do this exist.

https://github.com/fishman/ctags seems active and has an Emacs developer
(Masatake YAMATO) as a contributor.

The question was asked before:
http://lists.gnu.org/archive/html/emacs-devel/2007-01/msg00075.html

It's my (superficial) impression that etags hasn't progressed much since
then. The majority of the changes seem to have been generic code-cleanup
stuff.

Is it that etags recognizes Emacs-specific C code that ctags does not?

My only motivation for asking is that it's good to reduce the number of
things that need to be maintained in Emacs, where possible.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Fri, 29 May 2015 08:10:03 GMT) Full text and rfc822 format available.

Message #140 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Glenn Morris <rgm <at> gnu.org>,
 Francesco Potortì <pot <at> gnu.org>,
 Richard Stallman <rms <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Fri, 29 May 2015 11:09:08 +0300

> From: Glenn Morris <rgm <at> gnu.org>
> Cc: Eli Zaretskii <eliz <at> gnu.org>,  20629 <at> debbugs.gnu.org
> Date: Fri, 29 May 2015 02:48:57 -0400
> 
> Dmitry Gutov wrote:
> 
> > If you want my opinion (please keep in mind: not an etags user),
> > following in Exuberant Ctags's footsteps sounds best.
> 
> I'm not one either, but I've been meaning to ask: why is etags in Emacs?

The answer to that is lost in history (for me).  Perhaps Richard and
Francesco (cc'ed) will remember.

But since it is here, it is, IMO, a Good Thing, because we can easily
affect its operation where it's important to us.  Especially lately,
when the front-end was changed, and the new one has different
expectations.

> It's my (superficial) impression that etags hasn't progressed much since
> then. The majority of the changes seem to have been generic code-cleanup
> stuff.

That's not true, there were a couple of non-trivial changes lately
that are not cleanups, and I think there will be one more soon.  This
thread discusses some of them, the other one is discussed here:

  http://lists.gnu.org/archive/html/emacs-devel/2015-05/msg00291.html

> Is it that etags recognizes Emacs-specific C code that ctags does not?

Which ctags do you allude to here?  There are quite a few of them out
there.

> My only motivation for asking is that it's good to reduce the number of
> things that need to be maintained in Emacs, where possible.

I don't think we should remove this one, no.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Fri, 29 May 2015 08:13:02 GMT) Full text and rfc822 format available.

Message #143 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Fri, 29 May 2015 11:12:31 +0300

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Fri, 29 May 2015 03:09:54 +0300
> 
> On 05/28/2015 07:34 PM, Eli Zaretskii wrote:
> 
> > But having just qualified tags is bad for accuracy, right?
> 
> Maybe. Depends on things we would add to the Lisp code.

Can you elaborate?  Is there a way to get the same accuracy and
completion without having both qualified and unqualified tags?

> > So do we have a decision here?
> 
> If you want my opinion (please keep in mind: not an etags user), 
> following in Exuberant Ctags's footsteps sounds best. Nobody ever got 
> fired for doing that.

Yes, but I think if we change etags to create duplicate tags, we
should have this feature opt-out, unlike Exuberant, otherwise TAGS
created by default will be deficient with xref.  Do you agree?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Fri, 29 May 2015 09:28:02 GMT) Full text and rfc822 format available.

Message #146 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Glenn Morris <rgm <at> gnu.org>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Fri, 29 May 2015 12:27:26 +0300

On 05/29/2015 09:48 AM, Glenn Morris wrote:

> https://github.com/fishman/ctags seems active and has an Emacs developer
> (Masatake YAMATO) as a contributor.

It indeed seems to be progressing nicely (fixed a Ruby bug I've reported 
just recently), but this fork is not packaged by any distribution yet, 
AFAIK.

So that's an argument toward keeping etags, at least for the time being.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Fri, 29 May 2015 12:36:03 GMT) Full text and rfc822 format available.

Message #149 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Francesco Potortì <Potorti <at> isti.cnr.it>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Glenn Morris <rgm <at> gnu.org>, 20629 <at> debbugs.gnu.org,
 Richard Stallman <rms <at> gnu.org>, dgutov <at> yandex.ru
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken,
 can't find anything in C++ files.
Date: Fri, 29 May 2015 14:34:55 +0200

Eli Zaretskii:
>> From: Glenn Morris <rgm <at> gnu.org>
>> Cc: Eli Zaretskii <eliz <at> gnu.org>,  20629 <at> debbugs.gnu.org
>> Date: Fri, 29 May 2015 02:48:57 -0400
>> 
>> Dmitry Gutov wrote:
>> 
>> > If you want my opinion (please keep in mind: not an etags user),
>> > following in Exuberant Ctags's footsteps sounds best.
>> 
>> I'm not one either, but I've been meaning to ask: why is etags in Emacs?
>
>The answer to that is lost in history (for me).  Perhaps Richard and
>Francesco (cc'ed) will remember.

When Etags was written, the only alternative was the traditional Unix
Ctags, to which Etags was an improvement.  Since Etags is able to
produce traditional Ctags-style files, you can look at the macro CTAGS
in etags.c to spot the differences.  This is a historical summary:

 * 1983 Ctags originally by Ken Arnold.
 * 1984 Fortran added by Jim Kleckner.
 * 1984 Ed Pelegri-Llopart added C typedefs.
 * 1985 Emacs TAGS format by Richard Stallman.
 * 1989 Sam Kendall added C++.
 * 1992 Joseph B. Wells improved C and C++ parsing.
 * 1993 Francesco Potortì reorganized C and C++.
 * 1994 Line-by-line regexp tags by Tom Tromey.
 * 2001 Nested classes by Francesco Potortì (concept by Mykola Dzyuba).
 * 2002 #line directives by Francesco Potortì.

/* Define CTAGS to make the program "ctags" compatible with the usual one.
 Leave it undefined to make the program "etags", which makes emacs-style
 tag tables and tags typedefs, #defines and struct/union/enum by default. */

>But since it is here, it is, IMO, a Good Thing, because we can easily
>affect its operation where it's important to us.  Especially lately,
>when the front-end was changed, and the new one has different
>expectations.

Yes.  This is important.  Obviously, this could be done in any similar
program having an --emacs option (see for example ls --dired).

>> It's my (superficial) impression that etags hasn't progressed much since
>> then. The majority of the changes seem to have been generic code-cleanup
>> stuff.
>
>That's not true, there were a couple of non-trivial changes lately
>that are not cleanups, and I think there will be one more soon.  This
>thread discusses some of them, the other one is discussed here:
>
>  http://lists.gnu.org/archive/html/emacs-devel/2015-05/msg00291.html

There was significant bug squashing, along with tagging improvements
and language-support features, at least until 2004.  Very little after
that time on my part.

>> Is it that etags recognizes Emacs-specific C code that ctags does not?
>
>Which ctags do you allude to here?  There are quite a few of them out
>there.
>
>> My only motivation for asking is that it's good to reduce the number of
>> things that need to be maintained in Emacs, where possible.
>
>I don't think we should remove this one, no.

This is from an old mail, referring to around 2004:

>In fact, some years ago I ran an in-depth comparison between etags and
>Exuberant Ctags, with mixed results.  Neither clearly excelled over the
>other.  I missed only two features in etags: the ability to read
>directories (with optional recursion) and the ability to generate the
>new tag types introduced by ctags.

On the other hand, Ex-C is much more customisable on the command line
and has much clearer code (even if I don't know whether this in fact
translates to easier code management).  At that time, I even had an
email exchange with the Ex-C authors to try to merge the code bases, but
this went nowhere for lack of time.

So Etags was not bad at all some ten years ago.  I don't know if Ex-C or
others have significantly progressed in the meantime.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Fri, 29 May 2015 14:07:02 GMT) Full text and rfc822 format available.

Message #152 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Fri, 29 May 2015 17:05:53 +0300

On 05/29/2015 11:12 AM, Eli Zaretskii wrote:

>>> But having just qualified tags is bad for accuracy, right?
>>
>> Maybe. Depends on things we would add to the Lisp code.
>
> Can you elaborate?  Is there a way to get the same accuracy and
> completion without having both qualified and unqualified tags?

There'll have to be some compromise, but not necessarily in accuracy. 
The present default behavior is accurate enough, and by that I mean the 
user can navigate to a method call, press M-., and see all definitions 
of the methods with that name, without extra junk.

What we don't have by default is completion of, and navigation to,
qualified method names. That, by itself, is a relatively advanced
feature (the user needs to know to press C-u and then either press TAB
and look for qualified names, or type one out).

That can be mitigated by parsing implicit tag names out of patterns,
however they also don't always contain qualified names (which was my 
misunderstanding: they do in the toy example provided by Jan). So, 
having qualified names in tag completion reliably is out of the 
question, unless etags uses them in tag names.

And then we'd have to solve the question of how to get the unqualified 
names in both completion and navigation (continued below (*)).

> Yes, but I think if we change etags to create duplicate tags, we
> should have this feature opt-out, unlike Exuberant, otherwise TAGS
> created by default will be deficient with xref.  Do you agree?

I'd say no. First, there's value in simply being compatible.

Second, as the ctags man page warns, including both qualified and 
unqualified names in separate entries, "could potentially more than 
double the size of the tag file". Which increases the time it takes to 
load one, and might (if we make more progress on Stefan's suggestion not 
to pre-build tags completion table) also make completion slower, in 
projects of certain size.

(*) However, I don't really understand this choice:

"""
The actual form of the qualified tag depends upon the language from 
which the tag was derived (using a form that is most natural for how 
qualified calls are specified in the language). For C++, it is in the 
form "class::member"; for Eiffel and Java, it is in the form "class.member".
"""

If we posit that in each interesting language a qualified tag is of the 
form CONTEXT-CHAR-NAME, standardizing on CHAR would allow us to extract 
both qualified and unqualified tag names from a single entry, at a small 
cost in readability for users where the language traditionally uses a 
different separator than the one picked by etags.

For better uniqueness, I'd choose two of them: # before instance 
methods, and . before class (or static) methods. This notation is fairly 
popular and is used in Javadocs, as well as in different comment formats 
Ruby uses.
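
To make the proposal concrete, here is a minimal Python sketch of recovering both names from a single entry. It assumes the two separators suggested above ("#" before instance methods, "." before class or static methods) and that neither character occurs inside a name segment; the function name is hypothetical.

```python
def split_qualified(tag):
    """Split 'Context#name' or 'Context.name' into (context, kind, name).

    Returns (None, None, tag) when the tag carries no qualifier.
    """
    # The member separator is the right-most '#' or '.' in the tag.
    cut = max(tag.rfind("#"), tag.rfind("."))
    if cut < 0:
        return None, None, tag
    kind = "instance" if tag[cut] == "#" else "class"
    return tag[:cut], kind, tag[cut + 1:]
```

With a fixed separator the front end can offer the unqualified name for navigation and the full tag for completion without needing a second entry in the file.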

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Fri, 29 May 2015 16:53:02 GMT) Full text and rfc822 format available.

Message #155 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Fri, 29 May 2015 12:51:44 -0400

> unqualified names in separate entries, "could potentially more than double
> the size of the tag file". Which increases the time it takes to load one,
> and might (if we make more progress on Stefan's suggestion not to pre-build
> tags completion table) also make completion slower, in projects of
> certain size.

FWIW, doubling the size of the TAGS file will also double the size of
the obarray and hence increase the completion time similarly regardless
of whether we keep using an obarray or if we switch to searching the
TAGS buffers.

Yet another alternative is to build a trie, which would speed up
prefix (and partial) completion (but not substring completion).


        Stefan
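
A minimal Python sketch of the trie alternative, for illustration only. The class and method names are hypothetical and nothing here reflects existing Emacs code. Prefix completion walks down one path and then enumerates the subtree; substring completion gains nothing, because a substring can start at any node.

```python
class TagTrie:
    """Character trie over tag names, supporting prefix completion."""

    def __init__(self):
        self.children = {}
        self.terminal = False  # True if a tag name ends at this node

    def insert(self, name):
        node = self
        for ch in name:
            node = node.children.setdefault(ch, TagTrie())
        node.terminal = True

    def complete(self, prefix):
        """Return all stored names starting with `prefix`, sorted."""
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.terminal:
                results.append(text)
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        return sorted(results)
```

The lookup cost depends on the length of the prefix and the number of matches, not on the total number of tags.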

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Fri, 29 May 2015 17:13:02 GMT) Full text and rfc822 format available.

Message #158 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Fri, 29 May 2015 20:12:34 +0300

On 05/29/2015 07:51 PM, Stefan Monnier wrote:

> FWIW, doubling the size of the TAGS file will also double the size of
> the obarray and hence increase the completion time similarly regardless
> of whether we keep using an obarray or if we switch to searching the
> TAGS buffers.

Completion using obarray is currently an order of magnitude (or two) 
faster than the proposed patch that just uses search. Doubling that 
won't be a problem.

And the size of obarray may grow more than twice as big, or only a 
little, depending on what we're comparing to. The latter - if we're 
dealing with a C++ codebase with a lot of methods with the same name 
(and comparing against the same codebase where all names are qualified).

In any case, we can choose which entries to include in the obarray, in 
Lisp. If we don't want the unqualified names in completions (or the 
qualified ones, for some reason), we won't put them in the obarray, but 
we'll still be able to find them if asked by the user.

> Yet another alternative is to build a trie, which would speed up
> prefix (and partial) completion (but not substring completion).

It doesn't seem warranted thus far. Displaying the completions currently 
takes considerably longer than finding them in the obarray.
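
The point about table growth can be checked with a small Python sketch. The function name is hypothetical and the completion table is modelled as a plain set of names, which is only an approximation of an obarray. It compares a table holding only qualified names with one that also holds the unqualified names.

```python
def completion_table_sizes(qualified_names, sep="::"):
    """Return (size with qualified names only, size with both kinds)."""
    qualified = set(qualified_names)
    # Many qualified names can collapse onto the same unqualified name.
    unqualified = {name.rpartition(sep)[2] for name in qualified}
    return len(qualified), len(qualified | unqualified)
```

With four classes that all define draw, the table grows from 4 to 5 names; with two classes whose method names differ, it doubles from 2 to 4.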

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Fri, 29 May 2015 17:14:02 GMT) Full text and rfc822 format available.

Message #161 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Francesco Potortì <pot <at> gnu.org>,
 Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Fri, 29 May 2015 20:13:24 +0300

On 05/28/2015 06:14 PM, Francesco Potortì wrote:

> I see.  Given these constraints, I see no other way than augmenting the
> TAGS format to include an arbitrary number of tags per entry...

That wouldn't be the worst feature to have, but it would break backward 
compatibility.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Fri, 29 May 2015 18:29:02 GMT) Full text and rfc822 format available.

Message #164 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Fri, 29 May 2015 21:28:39 +0300

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Fri, 29 May 2015 17:05:53 +0300
> 
> That can be mitigated by parsing implicit tag names out of patterns,
> however they also don't always contain qualified names (which was my 
> misunderstanding: they do in the toy example provided by Jan).

At least in C++, class members can be qualified explicitly in the
source (which was what Jan's example did), or they can be qualified
implicitly, by including them inside braced constructs, for example.
For the latter, etags (now only under -Q) adds the qualifications when
it generates the tags.

The code I added _removes_ the explicit qualifications, and doesn't
add the ones for implicitly qualified members.

> So, having qualified names in tag completion reliably is out of the
> question, unless etags uses them in tag names.

Exactly.

> > Yes, but I think if we change etags to create duplicate tags, we
> > should have this feature opt-out, unlike Exuberant, otherwise TAGS
> > created by default will be deficient with xref.  Do you agree?
> 
> I'd say no. First, there's value in simply being compatible.

Compatibility aside, I think what most users will want should be the
default.  What Exuberant ctags does now might not yet reflect the
changes in Emacs, from etags.el's UI to xref.  Once they learn about
that, they might turn that flag on by default as well.

> Second, as the ctags man page warns, including both qualified and 
> unqualified names in separate entries, "could potentially more than 
> double the size of the tag file".

Who cares?  Disk space is no longer at a premium.

> Which increases the time it takes to load one, and might (if we make
> more progress on Stefan's suggestion not to pre-build tags
> completion table) also make completion slower, in projects of
> certain size.

For moderate-size projects, the obarray-based completion is
instantaneous, so I'm not afraid of this.

> """
> The actual form of the qualified tag depends upon the language from 
> which the tag was derived (using a form that is most natural for how 
> qualified calls are specified in the language). For C++, it is in the 
> form "class::member"; for Eiffel and Java, it is in the form "class.member".
> """
> 
> If we posit that in each interesting language a qualified tag is of the 
> form CONTEXT-CHAR-NAME, standardizing on CHAR would allow us to extract 
> both qualified and unqualified tag names from a single entry, at a small 
> cost in readability for users where the language traditionally uses a 
> different separator than the one picked by etags.

I don't think we can safely do that, since different characters can
appear in identifiers of different languages.  By using the qualifier
string that is natural for the language, we make sure we never get
conflicts with the identifiers themselves.

Also, these qualified tags are for human consumption, which is another
argument in favor of language-specific syntax.

> For better uniqueness, I'd choose two of them: # before instance 
> methods, and . before class (or static) methods. This notation is fairly 
> popular and is used in Javadocs, as well as in different comment formats 
> Ruby uses.

Which means C++ programmers will probably be confused by them.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Fri, 29 May 2015 19:21:03 GMT) Full text and rfc822 format available.

Message #167 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Fri, 29 May 2015 15:19:35 -0400

> Completion using obarray is currently an order of magnitude (or two) faster
> than the proposed patch that just uses search.

Oh, then it's a non-starter (unless the search can be sped up).


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Fri, 29 May 2015 20:02:02 GMT) Full text and rfc822 format available.

Message #170 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Fri, 29 May 2015 23:01:13 +0300

On 05/29/2015 09:28 PM, Eli Zaretskii wrote:

> Compatibility aside, I think what most users will want should be the
> default.  What Exuberant ctags does now might not yet reflect the
> changes in Emacs, from etags.el's UI to xref.  Once they learn about
> that, they might turn that flag on by default as well.

There's nothing particularly xref-specific in using the one or the other 
approach. xref output buffer doesn't display the tag names, only 
patterns (although printing the tag names as well can be added).

> For moderate-size projects, the obarray-based completion is
> instantaneous,

Yes. I explicitly didn't mention it. Only the time to build the obarray 
the first time, as well as non-obarray based completion. You might be 
better positioned to judge whether these are serious.

> I don't think we can safely do that, since different characters can
> appear in identifiers of different languages.  By using the qualifier
> string that is natural for the language, we make sure we never get
> conflicts with the identifiers themselves.

The name segments could be escaped WRT those two characters.

> Also, these qualified tags are for human consumption, which is another
> argument in favor of language-specific syntax.

Sure, it's a good argument.

> Which means C++ programmers will probably be confused by them.

They are not hard to learn. IMO, "::" is a bad separator for a method
qualifier, since the same operator is used for namespace resolution.
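
The escaping idea mentioned above could look like the following minimal Python sketch. Backslash as the escape character and all three function names are assumptions made for illustration; nothing like this exists in etags.

```python
def escape_segment(segment):
    """Backslash-escape the escape character and the two separators."""
    out = segment.replace("\\", "\\\\")
    return out.replace("#", "\\#").replace(".", "\\.")

def unescape_segment(text):
    """Reverse escape_segment."""
    out, i = [], 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            i += 1  # drop the backslash, keep the character after it
        out.append(text[i])
        i += 1
    return "".join(out)

def split_escaped(tag):
    """Split at the last unescaped '#' or '.'.

    Returns (context, separator, name), or (None, None, name) when the
    tag has no unescaped separator.
    """
    for i in range(len(tag) - 1, -1, -1):
        if tag[i] in "#.":
            j = i - 1
            while j >= 0 and tag[j] == "\\":
                j -= 1
            backslashes = i - 1 - j
            if backslashes % 2 == 0:  # an even count means not escaped
                return (unescape_segment(tag[:i]), tag[i],
                        unescape_segment(tag[i + 1:]))
    return None, None, unescape_segment(tag)
```

A tag would be written as escape_segment(context) + separator + escape_segment(name), so a name that itself contains "." or "#" still splits unambiguously, at the cost of backslashes showing up in what the user reads.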

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Fri, 29 May 2015 20:35:03 GMT) Full text and rfc822 format available.

Message #173 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Fri, 29 May 2015 23:33:58 +0300

On 05/29/2015 10:19 PM, Stefan Monnier wrote:

> Oh, then it's a non-starter (unless the search can be sped up).

I've posted the numbers along with the two versions of the patch:

http://debbugs.gnu.org/cgi/bugreport.cgi?bug=20629#101
http://debbugs.gnu.org/cgi/bugreport.cgi?bug=20629#107

Someone else might have better luck, but it seems to me that 
dramatically improving the performance there would have to involve work 
on the regexp engine and/or the Elisp interpreter.

Byte-compiling the new etags.el, as I measured later, improves the 
worst-case time of the second patch from 3s to 1.9s. No effect on the 
runtimes of the first patch (still 0.7s per any completion).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Fri, 29 May 2015 20:37:02 GMT) Full text and rfc822 format available.

Message #176 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Fri, 29 May 2015 23:35:56 +0300

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Fri, 29 May 2015 23:01:13 +0300
> 
> On 05/29/2015 09:28 PM, Eli Zaretskii wrote:
> 
> > Compatibility aside, I think what most users will want should be the
> > default.  What Exuberant ctags does now might not yet reflect the
> > changes in Emacs, from etags.el's UI to xref.  Once they learn about
> > that, they might turn that flag on by default as well.
> 
> There's nothing particularly xref-specific in using the one or the other 
> approach. xref output buffer doesn't display the tag names, only 
> patterns (although printing the tag names as well can be added).

xref expects more accurate results, because it shows them all at once,
instead of one by one, in some order that assures the users will only
ever see the few first ones.  So yes, I'd say the switch to xref puts
a different kind of pressure on what etags/ctags does.

> > Which means C++ programmers will probably be confused by them.
> 
> They are not hard to learn. IMO, "::" is a bad separator for method 
> qualifier, since the same operator is used for namespace resolution.

foo::bar::baz is standard C++, AFAIK, so the ambiguity is already
known to C++ programmers.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Fri, 29 May 2015 22:38:02 GMT) Full text and rfc822 format available.

Message #179 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Sat, 30 May 2015 01:36:56 +0300

On 05/29/2015 11:35 PM, Eli Zaretskii wrote:

> xref expects more accurate results, because it shows them all at once,
> instead of one by one, in some order that assures the users will only
> ever see the few first ones.  So yes, I'd say the switch to xref puts
> a different kind of pressure on what etags/ctags does.

It does exert some pressure, but mostly to ensure that when a user 
searches for a "symbol" that they see in a buffer, it should have an 
explicit or an implicit tag match.

Whether qualified tag names are included in the completion, and whether 
one can search for a qualified tag name reliably, that hasn't changed 
between find-tag and xref-find-definitions.

> foo::bar::baz is standard C++, AFAIK, so the ambiguity is already
> known to C++ programmers.

I'm not disputing that.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 06:54:03 GMT) Full text and rfc822 format available.

Message #182 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Sat, 30 May 2015 09:52:53 +0300

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Sat, 30 May 2015 01:36:56 +0300
> 
> On 05/29/2015 11:35 PM, Eli Zaretskii wrote:
> 
> > xref expects more accurate results, because it shows them all at once,
> > instead of one by one, in some order that assures the users will only
> > ever see the few first ones.  So yes, I'd say the switch to xref puts
> > a different kind of pressure on what etags/ctags does.
> 
> It does exert some pressure, but mostly to ensure that when a user 
> searches for a "symbol" that they see in a buffer, it should have an 
> explicit or an implicit tag match.

The crucial difference is that the number of matches must now be
small, something that required us to remove the method which could
cope with qualified tags when the symbol at point was unqualified.

> Whether qualified tag names are included in the completion, and whether 
> one can search for a qualified tag name reliably, that hasn't changed 
> between find-tag and xref-find-definitions.

True.  But the original arrangement worked well both with
find-tag and with completions; now that we removed tag-symbol-match-p
and qualified names, completion is less user-friendly.

So I think we should default to having 2 entries for each such tag.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 12:07:02 GMT) Full text and rfc822 format available.

Message #185 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Sat, 30 May 2015 15:06:29 +0300

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Wed, 27 May 2015 02:56:02 +0300
> 
> You should try the patch and see how it goes.

I tried it, and I think the completion display is better now.

Should I install the change for emitting 2 lines in TAGS, for both
unqualified and qualified tag names?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 12:32:02 GMT) Full text and rfc822 format available.

Message #188 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Sat, 30 May 2015 15:30:55 +0300

On 05/30/2015 03:06 PM, Eli Zaretskii wrote:

> I tried it, and I think the completion display is better now.

Better how?

> Should I install the change for emitting 2 lines in TAGS, for both
> unqualified and qualified tag names?

Err, these changes are orthogonal, if not to say complete opposites. If 
there are two lines in TAGS for each item, no change to 
etags-tags-completion-table should be necessary.

I was rather thinking to make tag-implicit-name-match-p more strict, so 
it doesn't match if the explicit tag name is present on that line.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 12:47:02 GMT) Full text and rfc822 format available.

Message #191 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Sat, 30 May 2015 15:46:36 +0300

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Sat, 30 May 2015 15:30:55 +0300
> 
> On 05/30/2015 03:06 PM, Eli Zaretskii wrote:
> 
> > I tried it, and I think the completion display is better now.
> 
> Better how?

It seemed to show both qualified and unqualified names, AFAICS.

> > Should I install the change for emitting 2 lines in TAGS, for both
> > unqualified and qualified tag names?
> 
> Err, these changes are orthogonal, if not to say complete opposites. If 
> there are two lines in TAGS for each item, no change to 
> etags-tags-completion-table should be necessary.

The question still stands.

> I was rather thinking to make tag-implicit-name-match-p more strict, so 
> it doesn't match if the explicit tag name is present on that line.

But tag-implicit-name-match-p is called after tag-exact-match-p, so
the latter cannot be the fallback for the former.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 12:53:02 GMT) Full text and rfc822 format available.

Message #194 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Sat, 30 May 2015 15:52:19 +0300

On 05/30/2015 09:52 AM, Eli Zaretskii wrote:

> The crucial difference is that the number of matches must now be
> small, something that required us to remove the method which could
> cope with qualified tags when the symbol at point was unqualified.

I suppose so.

> True.  But the original arrangement worked well both with
> find-tag and with completions; now that we removed tag-symbol-match-p
> and qualified names, completion is less user-friendly.

But it wasn't ideal either. For instance, with C++, completion couldn't 
offer unqualified method names, because the indexer always qualified 
them. It was up to the user to figure out that typing an unqualified 
method name and pressing RET would still yield something useful.

> So I think we should default to having 2 entries for each such tag.

Another thing to consider is the possibility of merging Ex-Ctags and 
Etags in the future. Compatible behaviors would make it easier on the users.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 13:04:02 GMT) Full text and rfc822 format available.

Message #197 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Sat, 30 May 2015 16:03:24 +0300

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Sat, 30 May 2015 15:52:19 +0300
> 
> Another thing to consider is the possibility of merging Ex-Ctags and 
> Etags in the future. Compatible behaviors would make it easier on the users.

Volunteers are welcome, but based on reading their sources, it will be
a formidable job: the general structure of the code, and the parser in
particular, are completely different.  I'm not even sure you can talk
about "merging".

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 13:44:02 GMT) Full text and rfc822 format available.

Message #200 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Sat, 30 May 2015 16:42:57 +0300

On 05/30/2015 03:46 PM, Eli Zaretskii wrote:

> It seemed to show both qualified and unqualified names, AFAICS.

I'd rather the comparison was made when TAGS is using 2-lines-per-item. 
Having two different ways to obtain the qualified names doesn't sound
good to me, and using implicit tag names is faulty:

- Like you mentioned, it's not always that qualified name occurs in the 
pattern. Sometimes it's created using curly braces and such, and thus
can only occur in an explicit tag name (which the discussed patch won't 
account for). Thus, some qualified names would be present in 
completions, and some won't. This is bad.

- See below (*).

>> Err, these changes are orthogonal, if not to say complete opposites. If
>> there are two lines in TAGS for each item, no change to
>> etags-tags-completion-table should be necessary.
>
> The question still stands.

It's only the question of the default behavior that's still undecided. 
The user will have a way to see qualified names either way.

> But tag-implicit-name-match-p is called after tag-exact-match-p, so
> the latter cannot be the fallback for the former.

I'm not sure what you mean. Why fallback?

(*)

Consider this: if the explicit tag name is present, then the tag name we 
can guess from the pattern implicitly is _incorrect_, so it's wrong to 
match it.

For instance, visit lisp/TAGS and try to navigate to "'edit-abbrevs-map" 
(yes, including a quote). There will be a match, but it was in fact a 
faulty search: an Elisp identifier can't include a quote.

It's harder to present a realistic case of a user looking for one thing 
and getting another, but the point is, if the Etags parser decided that
the implicit tag doesn't match the explicit tag, we should ignore the 
former (because we don't really know the way they mismatch).
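
To make the (*) point concrete, here is a minimal Python sketch. It is not
the code from etags.el: parse_entry, implicit_match, the delimiter set and
the sample entry (including its line number and byte offset) are all
assumptions made for illustration. With strict=False the quoted string
matches the tail of the pattern even though the entry carries an explicit
tag name; with strict=True the pattern is ignored whenever an explicit name
is present.

```python
# Illustrative sketch only: the helpers below are hypothetical and are not
# the predicates from etags.el.
from typing import NamedTuple, Optional

DEL, SOH = "\x7f", "\x01"      # field separators in a TAGS entry
DELIMS = " \t\n()=,;"          # assumed set of name delimiters


class Entry(NamedTuple):
    pattern: str               # source text up to and including the tag
    explicit: Optional[str]    # tag name field, or None when absent


def parse_entry(line: str) -> Entry:
    """Split 'pattern DEL [name SOH] line,offset' into its parts."""
    pattern, rest = line.split(DEL, 1)
    explicit = rest.split(SOH, 1)[0] if SOH in rest else None
    return Entry(pattern, explicit)


def implicit_match(entry: Entry, name: str, strict: bool) -> bool:
    """Does NAME sit at the end of the pattern, delimited on both sides?"""
    if strict and entry.explicit is not None:
        return False           # the parser already said what the name is
    if not name or any(ch in DELIMS for ch in name):
        return False
    text = entry.pattern
    if text and text[-1] in DELIMS:
        text = text[:-1]       # allow one trailing delimiter, e.g. "("
    if not text.endswith(name):
        return False
    before = text[: len(text) - len(name)]
    return before == "" or before[-1] in DELIMS


# Hypothetical entry; the line number and byte offset are made up.
entry = parse_entry(
    "(define-obsolete-variable-alias 'edit-abbrevs-map"
    + DEL + "edit-abbrevs-map" + SOH + "70,2957"
)

print(implicit_match(entry, "'edit-abbrevs-map", strict=False))  # True
print(implicit_match(entry, "'edit-abbrevs-map", strict=True))   # False
```

Entries without a name field are unaffected by the strict flag, since for
them the pattern is the only source of the tag name.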

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 14:22:02 GMT) Full text and rfc822 format available.

Message #203 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Francesco Potortì <pot <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;	Regression: TAGS broken,
 can't find anything in C++ files.
Date: Sat, 30 May 2015 16:21:29 +0200

>Another thing to consider is the possibility of merging Ex-Ctags and 
>Etags in the future. Compatible behaviors would make it easier on the users.

In fact, Exuberant-ctags and Etags are very different, internally.
Etags is faster (which nowadays is not that important) and creates
optimised TAGS files (which nowadays is not that important).  Ex-ctags
generates new-style ctags tags and can recurse directories, both of which
would be easy to implement in Etags.

When I compared them, more than ten years ago, the quality of generated
tags was comparable.  I don't know if things have changed in the
meantime, but I don't think that they have changed a lot.  The code of
Ex-ctags is much more structured and, at first sight, readable.
However, I have never tried to go deeply into it, so I don't know if, in
fact, it is really easier to manage.

So "merging" would mean, in fact, to have the communities managing them
agree on one of them and improve it so that it becomes a superset of the
other, then officially declare the other one as deprecated.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 14:32:01 GMT) Full text and rfc822 format available.

Message #206 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Sat, 30 May 2015 17:31:21 +0300

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Sat, 30 May 2015 16:42:57 +0300
> 
> On 05/30/2015 03:46 PM, Eli Zaretskii wrote:
> 
> > It seemed to show both qualified and unqualified names, AFAICS.
> 
> I'd rather the comparison was made when TAGS is using 2-lines-per-item. 

Feel free to describe a full recipe for comparing.  I needed to guess
what you wanted me to test; I'd be happy to just follow instructions
and report back what I saw.

> - Like you mentioned, it's not always that qualified name occurs in the 
> pattern. Sometimes it's created using curly braces and such, and thus
> can only occur in an explicit tag name (which the discussed patch won't 
> account for). Thus, some qualified names would be present in 
> completions, and some won't. This is bad.

Do you have an actual example?  AFAIU, this shouldn't happen: etags
only decides that an explicit tag name is unneeded when it can be
deduced from the pattern.  So if the explicit tag is not in TAGS, it
means etags.el can find it in the pattern.  (Qualified tag names that
are constructed by etags are never in the pattern, so they will always
get explicit tag names.)

> >> Err, these changes are orthogonal, if not to say complete opposites. If
> >> there are two lines in TAGS for each item, no change to
> >> etags-tags-completion-table should be necessary.
> >
> > The question still stands.
> 
> It's only the question of the default behavior that's still undecided. 

Yes, but I'd like to make a decision before making the change for
placing 2 entries, so that the number of backward-incompatible changes
could be minimized.

> > But tag-implicit-name-match-p is called after tag-exact-match-p, so
> > the latter cannot be the fallback for the former.
> 
> I'm not sure what you mean. Why fallback?

Because if tag-exact-match-p finds a match, tag-implicit-name-match-p
will not be invoked, AFAIU.

> Consider this: if the explicit tag name is present, then the tag name we 
> can guess from the pattern implicitly is _incorrect_, so it's wrong to 
> match it.

And AFAIU we don't, because the match methods are invoked in order,
until one of them yields a match.

> For instance, visit lisp/TAGS and try to navigate to "'edit-abbrevs-map" 
> (yes, including a quote). There will be a match, but it was in fact a 
> faulty search: an Elisp identifier can't include a quote.

Which is why there's an explicit tag there.  But I'm afraid I don't
see what you meant to demonstrate by this example.  Which code will
look for 'edit-abbrevs-map, and under what circumstances?

> It's harder to present a realistic case of a user looking for one thing 
> and getting another, but the point is, if the Etags parser decided that
> the implicit tag doesn't match the explicit tag, we should ignore the 
> former (because we don't really know the way they mismatch).

I think we already do.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 14:45:03 GMT) Full text and rfc822 format available.

Message #209 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Francesco Potortì <pot <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Sat, 30 May 2015 17:44:25 +0300

On 05/30/2015 05:21 PM, Francesco Potortì wrote:

> Ex-ctags
> generates new-style ctags tags and can recurse directories, both of which
> would be easy to implement in Etags.

It seems the users interested in this feature go on to simply use 'ctags 
-e'. I know I did, a while ago.

> When I compared them, more than ten years ago, the quality of generated
> tags was comparable.  I don't know if things have changed in the
> meantime, but I don't think that they have changed a lot.  The code of
> Ex-ctags is much more structured and, at first sight, readable.
> However, I have never tried to go deeply into it, so I don't know if, in
> fact, it is really easier to manage.

Ex-ctags supports 41 languages (by the last count at
http://ctags.sourceforge.net/languages.html, maybe more now), etags 
supports 26.

The version at https://github.com/fishman/ctags also supports delegation 
to external parsers (see the --xcmd argument and 
https://github.com/fishman/ctags/blob/master/docs/xcmd.rst).

> So "merging" would mean, in fact, to have the communities managing them
> agree on one of them and improve it so that it becomes a superset of the
> other, then officially declare the other one as deprecated.

Indeed.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 15:04:02 GMT) Full text and rfc822 format available.

Message #212 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Sat, 30 May 2015 18:03:18 +0300

On 05/30/2015 05:31 PM, Eli Zaretskii wrote:

> Feel free to describe a full recipe for comparing.  I needed to guess
> what you wanted me to test; I'd be happy to just follow instructions
> and report back what I saw.

I'd rather not bother (let's see if we can conclude this thread of 
discussion without that). The patch in question is at 
http://debbugs.gnu.org/cgi/bugreport.cgi?bug=20629#59, and I'm 
officially withdrawing it.

>> - Like you mentioned, it's not always that qualified name occurs in the
>> pattern. Sometimes it's created using curly braces and such, and thus
>> can only occur in an explicit tag name (which the discussed patch won't
>> account for). Thus, some qualified names would be present in
>> completions, and some won't. This is bad.
>
> Do you have an actual example?  AFAIU, this shouldn't happen: etags
> only decides that an explicit tag name is unneeded when it can be
> deduced from the pattern.  So if the explicit tag is not in TAGS, it
> means etags.el can find it in the pattern.  (Qualified tag names that
> are constructed by etags are never in the pattern, so they will always
> get explicit tag names.)

I believe that the current choice is between "etags produces unqualified 
tags" and "etags produces both qualified and unqualified tags for lines 
where the distinction applies" (2 entries per line).

In the latter case the patch above is extraneous. So above I'm 
describing the situation where etags produces unqualified tags (which it 
currently does, by default).

> Yes, but I'd like to make a decision before making the change for
> placing 2 entries, so that the number of backward-incompatible changes
> could be minimized.

I think that would be a mistake. Rather, we should make the choice based 
on correctness.

> Because if tag-exact-match-p finds a match, tag-implicit-name-match-p
> will not be invoked, AFAIU.

It would, but that's not the point. But yes, the above would make sense. 
Anyway...

> And AFAIU we don't, because the match methods are invoked in order,
> until one of them yields a match.

Of course the difference in tag-implicit-name-match-p behavior will only 
matter when tag-exact-match-p returns nil for the entry in question. 
Which is the case in the example I've given in the previous email.

> Which is why there's an explicit tag there.  But I'm afraid I don't
> see what you meant to demonstrate by this example.  Which code will
> look for 'edit-abbrevs-map, and under what circumstances?

find-tag, for instance. After being asked by the user. Like I said, it's 
a contrived example (users don't usually bother with names as tricky as 
this one), but I can try to cook up a more realistic one, if you insist.

An Elisp identifier can actually include a quote if it's escaped, but 
that's not the case with edit-abbrevs-map.

>> It's harder to present a realistic case of a user looking for one thing
>> and getting another, but the point is, if the Etags parser decided that
>> the implicit tag doesn't match the explicit tag, we should ignore the
>> former (because we don't really know the way they mismatch).
>
> I think we already do.

Currently, we don't put the implicit tag into the completion table if 
the explicit tag is present.

But we do match implicit tags during navigation, even when an explicit 
tag is there.

The aforementioned patch would include the implicit tag in the 
completion table anyway. I'm now saying we don't want that, and we also 
don't want navigation to match implicit tags in the entries that contain 
an explicit tag as well.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 16:39:02 GMT) Full text and rfc822 format available.

Message #215 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Sat, 30 May 2015 19:37:48 +0300

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Sat, 30 May 2015 18:03:18 +0300
> 
> >> - Like you mentioned, it's not always that qualified name occurs in the
> >> pattern. Sometimes it's created using curly braces and such, and thus
> >> can only occur in an explicit tag name (which the discussed patch won't
> >> account for). Thus, some qualified names would be present in
> >> completions, and some won't. This is bad.
> >
> > Do you have an actual example?  AFAIU, this shouldn't happen: etags
> > only decides that an explicit tag name is unneeded when it can be
> > deduced from the pattern.  So if the explicit tag is not in TAGS, it
> > means etags.el can find it in the pattern.  (Qualified tag names that
> > are constructed by etags are never in the pattern, so they will always
> > get explicit tag names.)
> 
> I believe that the current choice is between "etags produces unqualified 
> tags" and "etags produces both qualified and unqualified tags for lines 
> where the distinction applies" (2 entries per line).

Yes.

> In the latter case the patch above is extraneous. So above I'm 
> describing the situation where etags produces unqualified tags (which it 
> currently does, by default).

OK.

> > Yes, but I'd like to make a decision before making the change for
> > placing 2 entries, so that the number of backward-incompatible changes
> > could be minimized.
> 
> I think that would be a mistake. Rather, we should make the choice based 
> on correctness.

Won't TAGS file with 2 entries for such symbols facilitate more
correct operation, both from xref-find-definitions and completion?

> >> It's harder to present a realistic case of a user looking for one thing
> >> and getting another, but the point is, if the Etags parser decided that
> >> the implicit tag doesn't match the explicit tag, we should ignore the
> >> former (because we don't really know the way they mismatch).
> >
> > I think we already do.
> 
> Currently, we don't put the implicit tag into the completion table if 
> the explicit tag is present.
> 
> But we do match implicit tags during navigation, even when an explicit 
> tag is there.
> 
> The aforementioned patch would include the implicit tag in the 
> completion table anyway. I'm now saying we don't want that, and we also 
> don't want navigation to match implicit tags in the entries that contain 
> an explicit tag as well.

Then how will you find or complete on "foo" when the explicit tag is
"XX::foo"?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 17:02:02 GMT) Full text and rfc822 format available.

Message #218 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Francesco Potortì <pot <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;	Regression: TAGS broken,
 can't find anything in C++ files.
Date: Sat, 30 May 2015 19:01:42 +0200

Dmitry Gutov:
> It's harder to present a realistic case of a user looking for one thing 
> and getting another, but the point is, if the Etags parser decided that
> the implicit tag doesn't match the explicit tag, we should ignore the 
> former (because we don't really know the way they mismatch).

...

>Currently, we don't put the implicit tag into the completion table if 
>the explicit tag is present.
>
>But we do match implicit tags during navigation, even when an explicit 
>tag is there.
>
>The aforementioned patch would include the implicit tag in the 
>completion table anyway. I'm now saying we don't want that, and we also 
>don't want navigation to match implicit tags in the entries that contain 
>an explicit tag as well.

Sorry if I don't closely follow the discussion (I do not know all the
internals of etags.el), and consequently sorry if I am misunderstanding
anything.  In that case, please discard my observations below.

I fear I can read in the above quotes a fundamental misunderstanding.
If Emacs (etags.el or anything else) treats implicit tags differently
from explicit tags, that's an error.

Implicit tags are semantically the same as explicit tags.  Whether a tag
is implicit or explicit, it's only a matter of efficiency in building
the TAGS file. For a given TAGS file entry, there is either no tag, or
an implicit tag, or an explicit tag.  The latter two cases should be
treated exactly alike by whichever program is reading the TAGS file.
Nor is it possible that for a given entry its implicit tag does not
match its explicit tag, because either the former or the latter are
present, not both.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 17:47:02 GMT) Full text and rfc822 format available.

Message #221 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Sat, 30 May 2015 20:46:37 +0300

On 05/30/2015 07:37 PM, Eli Zaretskii wrote:

> Won't TAGS file with 2 entries for such symbols facilitate more
> correct operation, both from xref-find-definitions and completion?

I suppose. But that's a separate decision, whether to make it the default.

> Then how will you find or complete on "foo" when the explicit tag is
> "XX::foo"?

I'd like to repeat that the current choice is between having only 
unqualified method names in explicit tags, or having both qualified and 
unqualified method names (2 entries per line).

Having only a qualified entry is not a situation we're going to handle.
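
The two alternatives can be compared with a small Python sketch. It is only
an illustration under stated assumptions: index_by_name is a hypothetical
helper, the entries mimic what etags might write for XX::foo in the x.cc
example with both names written explicitly, and the line numbers and byte
offsets are made up.

```python
# Illustrative sketch only; the entries, offsets and helper are hypothetical.
from collections import defaultdict

DEL, SOH = "\x7f", "\x01"      # field separators in a TAGS entry


def index_by_name(lines):
    """Map each explicit tag name to the source line numbers it points at."""
    index = defaultdict(list)
    for line in lines:
        _pattern, rest = line.split(DEL, 1)
        name, position = rest.split(SOH, 1)
        index[name].append(int(position.split(",")[0]))
    return dict(index)


# Alternative 1: only the unqualified method name is recorded.
unqualified_only = [
    "XX::foo(" + DEL + "foo" + SOH + "10,60",
]
# Alternative 2: two entries for the same source line.
two_per_line = [
    "XX::foo(" + DEL + "foo" + SOH + "10,60",
    "XX::foo(" + DEL + "XX::foo" + SOH + "10,60",
]

print(sorted(index_by_name(unqualified_only)))  # ['foo']
print(sorted(index_by_name(two_per_line)))      # ['XX::foo', 'foo']
```

In both alternatives "foo" is a key of the index, so a lookup from an
unqualified symbol at point does not depend on pattern matching; the second
alternative additionally offers "XX::foo" to completion.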

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 18:14:02 GMT) Full text and rfc822 format available.

Message #224 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Francesco Potortì <pot <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Sat, 30 May 2015 21:13:30 +0300

On 05/30/2015 08:01 PM, Francesco Potortì wrote:

> Sorry if I don't closely follow the discussion (I do not know all the
> internals of etags.el), and consequently sorry if I am misunderstanding
> anything.  In that case, please discard my observations below.

I don't think I'm misunderstanding: it's mainly a problem of terminology.

> I fear I can read in the above quotes a fundamental misunderstanding.
> If Emacs (etags.el or anything else) treats implicit tags differently
> from explicit tags, that's an error.

It has different predicates, to determine whether point is after an 
"implicit tag name" for a given string. Or whether it's after an 
"explicit tag name", or some other kind of match.

> Implicit tags are semantically the same as explicit tags.  Whether a tag
> is implicit or explicit, it's only a matter of efficiency in building
> the TAGS file. For a given TAGS file entry, there is either no tag, or
> an implicit tag, or an explicit tag.

Maybe we should say that there's always a "tag name", for a given entry. 
And we can determine it by looking at the tag name field, or, in the 
absence of it, implicitly determine from the pattern.

It's easier to call the value of the tag name field an "explicit tag", 
and the value that we can derive from the pattern an "implicit tag". And 
if the explicit tag is present, naturally they'll be different.

> The latter two cases should be
> treated exactly alike by whichever program is reading the TAGS file.
> Nor is it possible that for a given entry its implicit tag does not
> match its explicit tag, because either the former or the latter are
> present, not both.

This confirms that we should always disregard implicit tag when the 
explicit tag is present.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 18:43:02 GMT) Full text and rfc822 format available.

Message #227 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: pot <at> gnu.org, 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Sat, 30 May 2015 21:42:07 +0300

> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Sat, 30 May 2015 21:13:30 +0300
> Cc: 20629 <at> debbugs.gnu.org
> 
> This confirms that we should always disregard implicit tag when the 
> explicit tag is present.

"Disregard" for what purposes?  If all you need is the symbol name,
then yes, you should always use the explicit tag.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 18:47:01 GMT) Full text and rfc822 format available.

Message #230 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Sat, 30 May 2015 21:46:30 +0300

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Sat, 30 May 2015 20:46:37 +0300
> 
> On 05/30/2015 07:37 PM, Eli Zaretskii wrote:
> 
> > Won't TAGS file with 2 entries for such symbols facilitate more
> > correct operation, both from xref-find-definitions and completion?
> 
> I suppose. But that's a separate decision, whether to make it the default.

You said "based on correctness".  If the 2-entry alternative
facilitates more correct operation, that's the alternative we should
choose, no?

> > Then how will you find or complete on "foo" when the explicit tag is
> > "XX::foo"?
> 
> I'd like to repeat that the current choice is between having only 
> unqualified method names in explicit tags, or having both qualified and 
> unqualified method names (2 entries per line).
> 
> Having only a qualified entry is not a situation we're going to handle.

You elide too much of the previous context, and I cannot afford
reading 2 or 3 previous messages to restore that (and please don't
rely on my memory too much).  So I no longer understand what we are
talking about here.

Including the pattern (what you call "the implicit tag") in the
completion table could serve as context for disambiguating otherwise
similar tag names.  But if you think it's unneeded, I'm not going to
argue.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 19:36:02 GMT) Full text and rfc822 format available.

Message #233 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Francesco Potortì <pot <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken,
 can't find anything in C++ files.
Date: Sat, 30 May 2015 21:35:14 +0200

>> Sorry if I don't closely follow the discussion (I do not know all the
>> internals of etags.el), and consequently sorry if I am misunderstanding
>> anything.  In that case, please discard my observations below.
>
>I don't think I'm misunderstanding: it's mainly a problem of terminology.

I was fearing *I* was the one who misunderstands :)

Anyway, probably it's also a terminology problem, and that's my fault
too.  You are right that what I called "implicit / explicit tag" is in
fact an "implicit / explicit tag name".  Sorry about that, it has been a
long time since I worked on that.  In the etc/ETAGS.EBNF file you can
read the complete description that I made at that time.  Here it is:


======================= 2) discussion of tag names =======================

- WHAT ARE TAG NAMES
Tag lines in a tags file are usually made from the above defined pattern
and by an optional tag name.  The pattern is a string that is searched
in the source file to find the tagged line.

- WHY TAG NAMES ARE GOOD
When a user looks for a tag, Emacs first compares the tag with the tag
names contained in the tags file.  If no match is found, Emacs compares
the tag with the patterns.  The tag name is then the preferred way to
look for tags in the tags file, because when the tag name is present
Emacs can find a tag faster and more accurately.  These tag names are
part of tag lines in the tags file, so we call them "explicit".

- WHY IMPLICIT TAG NAMES ARE EVEN BETTER
When a tag line has no name, but a name can be deduced from the pattern,
we say that the tag line has an implicit tag name.  Often tag names are
redundant; this happens when the name of a tag is an easily guessable
substring of the tag pattern.  We define a set of rules to decide
whether it is possible to deduce the tag name from the pattern, and make
an unnamed tag in those cases.  The name deduced from the pattern of an
unnamed tag is the implicit name of that tag.
  When the user looks for a tag, and Emacs finds no explicit tag names
that match it, Emacs then looks for a tag whose implicit tag name
matches the request.  etags.c uses implicit tag names when possible, in
order to reduce the size of the tags file.
  An implicit tag name is deduced from the pattern by discarding the
last character if it is one of ` \f\t\n\r()=,;', then taking all the
rightmost consecutive characters in the pattern which are not one of
those.

===================== end of discussion of tag names =====================
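
As an illustration of the deduction rule quoted above, here is a minimal
Python sketch.  It follows only the wording of that rule, not the actual
code in etags.c or etags.el; the function name `implicit_tag_name` and the
sample patterns are assumptions made for this example.

```python
# Characters that cannot be part of an implicit tag name, as listed in
# the quoted rule: space, \f, \t, \n, \r, (, ), =, comma, semicolon.
NONAM = set(" \f\t\n\r()=,;")


def implicit_tag_name(pattern):
    """Return the name deduced from PATTERN, or None if there is none.

    Hypothetical helper: drop the last character if it is in NONAM,
    then take the rightmost run of characters that are not in NONAM.
    """
    if pattern and pattern[-1] in NONAM:
        pattern = pattern[:-1]
    start = len(pattern)
    while start > 0 and pattern[start - 1] not in NONAM:
        start -= 1
    return pattern[start:] or None


print(implicit_tag_name("int foo("))   # foo
print(implicit_tag_name("XX::bar("))   # XX::bar
print(implicit_tag_name("foo();"))     # None
```

Under this reading, a pattern that ends with two or more of the listed
characters yields no implicit name, so the tag counts as unnamed unless it
carries an explicit name.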

>> Implicit tags are semantically the same as explicit tags.  Whether a tag
>> is implicit or explicit, it's only a matter of efficiency in building
>> the TAGS file. For a given TAGS file entry, there is either no tag, or
>> an implicit tag, or an explicit tag.
>
>Maybe we should say that there's always a "tag name", for a given entry. 
>And we can determine it by looking at the tag name field, or, in the 
>absence of it, implicitly determine from the pattern.

Right.  But there's more.  We can have either:

1) explicit tag name
2) implicit tag name (unnamed tag whose name can be deduced from pattern)
3) no tag name (unnamed tag)

In some languages, like C++ and derived, most tags are named.  In
others, most are unnamed, usually in the simplest languages.

>It's easier to call the value of the tag name field an "explicit tag", 
>and the value that we can derive from the pattern an "implicit tag". And 
>if the explicit tag is present, naturally they'll be different.

Well, no.  Or at least, it's not how they were intended.  The parser
finds a tag, then it decides whether it should be named or not and in
the latter case, depending on the tag and the language, what is the
appropriate tag name.  If the tag should have no name, an unnamed tag is
created.  If the tag should be named, and if the name can be deduced
from the pattern, then it creates no explicit name (thus creating an
unnamed tag with an implicit tag name), else it creates a tag with an
explicit name.
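
The writer-side decision described in this paragraph can be sketched
roughly as follows.  This is only an illustration in Python, not the
logic of etags.c; `name_is_deducible` and `tag_name_field` are
hypothetical names, and the character set is taken from the rule quoted
from etc/ETAGS.EBNF.

```python
NONAM = set(" \f\t\n\r()=,;")


def name_is_deducible(pattern, name):
    """True if a reader applying the implicit-name rule would get NAME."""
    if not name or any(c in NONAM for c in name):
        return False
    # Ignore one trailing NONAM character, as the rule does.
    body = pattern[:-1] if pattern and pattern[-1] in NONAM else pattern
    if not body.endswith(name):
        return False
    before = body[: len(body) - len(name)]
    # The name must start the pattern or follow a NONAM character.
    return before == "" or before[-1] in NONAM


def tag_name_field(pattern, wanted_name):
    """Return the explicit name to write, or None to write no name field."""
    if wanted_name is None:
        return None              # unnamed tag
    if name_is_deducible(pattern, wanted_name):
        return None              # implicit name: save the space
    return wanted_name           # explicit name


print(tag_name_field("int foo(", "foo"))      # None (implicit)
print(tag_name_field("XX::foo(", "foo"))      # foo (must be explicit)
print(tag_name_field("XX::foo(", "XX::foo"))  # None (implicit)
```

Note that an unnamed tag and an implicitly named tag look the same in the
output: neither has a name field.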

The idea is that when you look for a tag, you first look for the
(explicit) names in the tag, which are contained in the relevant field.
If you find one, it's done.  If not, you can try and see if the
tag you are looking for matches an implicit name.  If you find one, it's
done.  If you don't, then you should try and match the unnamed patterns
(in practice, I think that etags.el tries and matches all the patterns).
So there is no such thing as an explicit name plus an implicit name for
the same tag.  Or at least, it was never intended to work like that.

>> The latter two cases should be
>> treated exactly alike by whichever program is reading the TAGS file.
>> Nor is it possible that for a given entry its implicit tag does not
>> match its explicit tag, because either the former or the latter are
>> present, not both.
>
>This confirms that we should always disregard implicit tag when the
>explicit tag is present.

I suppose you can view it like that.  In my language, I would say
that when an explicit name is present, we have found a name.  That's
all; there is nothing like an "implicit tag name" to be disregarded.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 19:43:02 GMT) Full text and rfc822 format available.

Message #236 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Sat, 30 May 2015 22:42:26 +0300

On 05/30/2015 09:46 PM, Eli Zaretskii wrote:

> You said "based on correctness".  If the 2-entry alternative
> facilitates more correct operation, that's the alternative we should
> choose, no?

It adds a capability (to perform the search based on fully qualified 
name), rather than improving correctness.

But again, it's a separate question. You don't have to persuade me on 
that choice, but I'm inclined toward compatibility with Ex-Ctags.

>>> Then how will you find or complete on "foo" when the explicit tag is
>>> "XX::foo"?
>>
>> I'd like to repeat that the current choice is between having only
>> unqualified method names in explicit tags, or having both qualified and
>> unqualified method names (2 entries per line).
>>
>> Having only a qualified entry is not a situation we're going to handle.
>
> You elide too much of the previous context, and I cannot afford
> reading 2 or 3 previous messages to restore that (and please don't
> rely on my memory too much).  So I no longer understand what we are
> talking about here.

Sorry, I don't know where to start clarifying. In the previous message 
I've explained why your question, quoted above, doesn't make sense: the 
TAGS file must have another entry, for the same line, where the explicit 
tag is "foo". That one will be matched, not "XX::foo".
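
To make the two-entries-per-line case concrete, here is a small Python
sketch that parses tag lines of the form pattern, DEL (0x7f), optional
name followed by SOH (0x01), then "line,offset".  The helper names, the
sample lines, and their line and offset values are illustrative
assumptions, not output copied from etags.

```python
def parse_tag_line(line):
    """Split one TAGS entry into (pattern, explicit_name_or_None, lineno)."""
    pattern, _, rest = line.partition("\x7f")
    name, sep, position = rest.partition("\x01")
    if not sep:                      # no SOH: the entry has no name field
        name, position = None, rest
    lineno, _, _offset = position.partition(",")
    return pattern, name, int(lineno) if lineno else None


def explicit_matches(query, tag_lines):
    """Entries whose explicit tag name equals QUERY."""
    return [e for e in map(parse_tag_line, tag_lines) if e[1] == query]


# Two entries for the same source line: qualified and unqualified.
lines = [
    "XX::foo()\x7fXX::foo\x019,58",
    "XX::foo()\x7ffoo\x019,58",
]
print(explicit_matches("foo", lines))      # [('XX::foo()', 'foo', 9)]
print(explicit_matches("XX::foo", lines))  # [('XX::foo()', 'XX::foo', 9)]
```

With both entries present, a search for "foo" matches the unqualified
entry directly and never has to look inside "XX::foo".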

This discussion has grown quite long already. Francesco seems to agree 
with my conclusions, so I'm going to make the change.

> Including the pattern (what you call "the implicit tag") in the
> completion table could serve as context for disambiguating otherwise
> similar tag names.  But if you think it's unneeded, I'm not going to
> argue.

Here you're using a term that's not part of the usual completion table 
terminology. Context? Apparently you mean annotation, which would be 
possible (*), but using the pattern as annotation for the current 
entry's tag name is not at all the same as including the implicit tag 
name (derived from the pattern) in the completion table. Which means 
adding it as a separate element. For simplicity, think of this 
completion table as a list, especially now that it's implemented as such.

(*) But not necessarily advisable, and would bring its own challenge WRT 
implementation.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 20:05:03 GMT) Full text and rfc822 format available.

Message #239 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Francesco Potortì <pot <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Sat, 30 May 2015 23:04:24 +0300

On 05/30/2015 10:35 PM, Francesco Potortì wrote:

> In the etc/ETAGS.EBNF file you can
> read the complete description that I made at that time.  Here it is:

Yes, thanks. I've read it a few times now. :)

> 1) explicit tag name
> 2) implicit tag name (unnamed tag whose name can be deduced from pattern)
> 3) no tag name (unnamed tag)
>
> In some languages, like C++ and derived, most tags are named.  In
> others, most are unnamed, usually in the simplest languages.

It seems we don't have a suitable predicate for unnamed tags. How does 
one determine whether an entry contains an entirely unnamed tag, or an 
implicitly named one?

>> It's easier to call the value of the tag name field an "explicit tag",
>> and the value that we can derive from the pattern an "implicit tag". And
>> if the explicit tag is present, naturally they'll be different.
>
> The parser
> finds a tag, then it decides whether it should be named or not and in
> the latter case, depending on the tag and the language, what is the
> appropriate tag name.  If the tag should have no name, an unnamed tag is
> created.  If the tag should be named, and if the name can be deduced
> from the pattern, then it creates no explicit name (thus creating an
> unnamed tag with an implicit tag name), else it creates a tag with an
> explicit name.

You're describing how a TAGS file is created. I'm trying to describe how 
one should read it.

> The idea is that when you look for a tag, you first look for the
> (explicit) names in the tag, which are contained in the relevant field.
> If you find one, it's done.

This implies that an explicit name is better than an implicit name (and 
thus should be found/displayed first). That doesn't seem to be the case.

> If not, you can try and see if the
> tag you are looking for matches an implicit name.  If you find one, it's
> done.

You should try using the current Emacs master. Now when the user presses 
M-., we try to list all correct matches, not jump to them one by one (if 
there are more than one, of course).

>> This confirms that we should always disregard implicit tag when the
>> explicit tag is present.
>
> I suppose you can view it like that.  In my language, I would say
> that when an explicit name is present, we have found a name.  That's
> all; there is nothing like an "implicit tag name" to be disregarded.

Present where?

When looking for matches for a given tag, we search through the whole 
TAGS file, and at the end of each textual match try one of the predicates 
(is it an explicit match? is it an implicit match? etc).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sat, 30 May 2015 22:37:02 GMT) Full text and rfc822 format available.

Message #242 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Francesco Potortì <pot <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken,
 can't find anything in C++ files.
Date: Sun, 31 May 2015 00:35:55 +0200

>> 1) explicit tag name
>> 2) implicit tag name (unnamed tag whose name can be deduced from pattern)
>> 3) no tag name (unnamed tag)
>>
>> In some languages, like C++ and derived, most tags are named.  In
>> others, most are unnamed, usually in the simplest languages.
>
>It seems we don't have a suitable predicate for unnamed tags. How does 
>one determine whether an entry contains an entirely unnamed tag, or an 
>implicitly named one?

If you look for an implicit name and you can't find one, then it's an
unnamed tag.  This is the rule of thumb.  In theory, it is not rigorous,
but in practice I think it always works.

I'll rewrite the rest now that I think I have better understood what you
need.

There is no possible comparison between an explicit and an implicit
name.  A tag is named or unnamed.  If it is named, whether the name is
explicit or implicit is only a matter of optimisation.  You can't
have the choice between an explicit and an implicit name for a tag:
either you have a name or not.

My description of the flow used when looking for a tag supposed that you
are satisfied when you find a match.  If instead you look for all the
matches at once, for example because you want to show them all, you
should match against all tag names first (whether the name is implicit
or explicit should not matter): those are the best matches.  Next come
the matches against the patterns, which are of lower quality.

In order to find all the matching names, you make two passes, one
against tags with names: in that case you match against the explicit
names; then one against tags without names: in that case you match
against implicit names.  All the matches found in the first and second
pass should be treated equally.
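
A rough Python sketch of this ranking follows.  It is an illustration
under stated assumptions, not the etags.el implementation: entries are
modelled as (pattern, explicit_name_or_None) pairs, the two passes are
folded into one loop that resolves each entry's name first, and the
lower-quality pattern match is taken to be a plain substring test.

```python
NONAM = set(" \f\t\n\r()=,;")


def implicit_tag_name(pattern):
    """Name deduced from PATTERN per the ETAGS.EBNF rule, or None."""
    if pattern and pattern[-1] in NONAM:
        pattern = pattern[:-1]
    start = len(pattern)
    while start > 0 and pattern[start - 1] not in NONAM:
        start -= 1
    return pattern[start:] or None


def find_all(query, entries):
    """Return name matches first, then pattern-only matches."""
    name_hits, pattern_hits = [], []
    for pattern, explicit in entries:
        # Explicit and implicit names are treated exactly alike.
        name = explicit if explicit is not None else implicit_tag_name(pattern)
        if name == query:
            name_hits.append((pattern, explicit))
        elif query in pattern:       # assumed fallback: substring match
            pattern_hits.append((pattern, explicit))
    return name_hits + pattern_hits


entries = [
    ("int foo(", None),      # implicit name "foo"
    ("XX::foo(", "foo"),     # explicit name "foo"
    ("foo();", None),        # unnamed tag, matches by pattern only
    ("void bar(", None),     # implicit name "bar"
]
for hit in find_all("foo", entries):
    print(hit)
```

The first two entries are reported as equally good name matches; the
unnamed entry comes last because it matches only through its pattern.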

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sun, 31 May 2015 00:36:02 GMT) Full text and rfc822 format available.

Message #245 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Francesco Potortì <pot <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Sun, 31 May 2015 03:34:57 +0300

On 05/31/2015 01:35 AM, Francesco Potortì wrote:

> If you look for an implicit name and you can't find one, then it's an
> unnamed tag.  This is the rule of thumb.  In theory, it is not rigorous,
> but in practice I think it always works.

It seems that the main (only?) case when we would consider a tag unnamed 
is when its pattern ends with 2 or more NONAM characters, and there's no 
explicit name.

> I'll rewrite the rest now that I think I have better understood what you
> need.

Thanks, that matches my understanding as well.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sun, 31 May 2015 21:47:02 GMT) Full text and rfc822 format available.

Message #248 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Francesco Potortì <pot <at> gnu.org>
To: 20629 <at> debbugs.gnu.org
Subject: Fwd: bug#20703: 24.4; Stack overflow in regexp matcher
Date: Sun, 31 May 2015 23:46:24 +0200

This unrelated bug report contains interesting info: maybe what I and
others have assumed is not true and optimising the size of the TAGS file
is still a worthwhile objective.

If it is indeed important to optimise for size, and if it is important
to have tag names both fully qualified and unqualified, then one should
consider augmenting the TAGS syntax with an arbitrary number of names
per tag.

------- Start of forwarded message -------
Date: Sun, 31 May 2015 18:46:21 +0200
Resent-from: lee <at> yagibdah.de
From: lee <at> yagibdah.de
Subject: bug#20703: 24.4; Stack overflow in regexp matcher
To: 20703 <at> debbugs.gnu.org

using projectile, trying to find a tag with C-p j

The TAGS file is 1.8GB.

[...]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sun, 31 May 2015 22:22:01 GMT) Full text and rfc822 format available.

Message #251 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Francesco Potortì <pot <at> gnu.org>, 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: Fwd: bug#20703: 24.4; Stack overflow in regexp matcher
Date: Mon, 1 Jun 2015 01:20:44 +0300

On 06/01/2015 12:46 AM, Francesco Potortì wrote:
> This unrelated bug report contains interesting info: maybe what I and
> others have assumed is not true and optimising the size of the TAGS file
> is still a worthwhile objective.

I'm guessing it's not the file size that led to the stack overflow 
problem there.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Sun, 31 May 2015 22:41:03 GMT) Full text and rfc822 format available.

Message #254 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Francesco Potortì <pot <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: Fwd: bug#20703: 24.4; Stack overflow in regexp matcher
Date: Mon, 01 Jun 2015 00:40:05 +0200

>> This unrelated bug report contains interesting info: maybe what I and
>> others have assumed is not true and optimising the size of the TAGS file
>> is still a worthwhile objective.
>
>I'm guessing it's not the file size that led to the stack overflow 
>problem there.

Most probably, in fact.  That's why I said the bug was unrelated.

However, it contains interesting info: TAGS file sizes of 2 GB are not
out of the ordinary.  This means that caring about file size is a worthwhile
goal.  With the consequences I had mentioned in my previous mail.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Thu, 26 Nov 2015 03:25:02 GMT) Full text and rfc822 format available.

Message #257 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Thu, 26 Nov 2015 05:23:58 +0200

Is there anything left to fix in this bug?

The example described in the first message now works (after one
re-generates their TAGS using the latest etags, or even 'ctags -e').

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Thu, 26 Nov 2015 15:44:02 GMT) Full text and rfc822 format available.

Message #260 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Thu, 26 Nov 2015 17:43:08 +0200

> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Cc: 20629 <at> debbugs.gnu.org
> Date: Thu, 26 Nov 2015 05:23:58 +0200
> 
> Is there anything left to fix in this bug?
> 
> The example described in the first message now works (after one
> re-generates their TAGS using the latest etags, or even 'ctags -e').

If we don't want to have duplicate qualified+unqualified entries for
OO languages, then no, there's nothing left to fix.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Thu, 26 Nov 2015 16:13:02 GMT) Full text and rfc822 format available.

Message #263 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Thu, 26 Nov 2015 18:12:01 +0200

On 11/26/2015 05:43 PM, Eli Zaretskii wrote:

> If we don't want to have duplicate qualified+unqualified entries for
> OO languages, then no, there's nothing left to fix.

Right. I thought that was also already done, for some reason.

Should we create a dedicated issue for that? It concerns not only C++, 
but also Lua (as we've found out recently), and other languages (whether 
we want to tackle them now or not).

We could also continue the discussion there about extending etags format 
to support several tag names on one line (aside from compatibility 
issues, it should be trivial).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Thu, 26 Nov 2015 16:34:01 GMT) Full text and rfc822 format available.

Message #266 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50;
 Regression: TAGS broken, can't find anything in C++ files.
Date: Thu, 26 Nov 2015 18:32:49 +0200

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Thu, 26 Nov 2015 18:12:01 +0200
> 
> On 11/26/2015 05:43 PM, Eli Zaretskii wrote:
> 
> > If we don't want to have duplicate qualified+unqualified entries for
> > OO languages, then no, there's nothing left to fix.
> 
> Right. I thought that was also already done, for some reason.

It wasn't done because the discussion didn't reach any consensus.

> Should we create a dedicated issue for that? It concerns not only C++, 
> but also Lua (as we've found out recently), and other languages (whether 
> we want to tackle them now or not).

Yes, we could create a separate issue.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20629; Package emacs. (Fri, 27 Nov 2015 03:55:02 GMT) Full text and rfc822 format available.

Message #269 received at 20629 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20629 <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Fri, 27 Nov 2015 05:54:39 +0200

On 11/26/2015 06:32 PM, Eli Zaretskii wrote:

> It wasn't done because the discussion didn't reach any consensus.

FWIW, I left it with the understanding that we should learn to generate both 
qualified and unqualified tag names for C++. Whether to do that by 
default or not, I'm not sure.

But Exuberant Ctags defaults to the latter option, and only generates 
unqualified tag names by default. It would be a good idea to follow 
suit, for consistency if nothing else.

And I'd like to revisit your previous comment:

> Including the pattern (what you call "the implicit tag") in the
> completion table could serve as context for disambiguating otherwise
> similar tag names.

Even if that can work in many cases (patterns are displayed in the xref 
buffer, for example), the pattern won't necessarily contain the qualified 
name either.

In Java, it never will, as long as the pattern is created from the 
contents of the line with the method's definition (because there's no 
class name on that line).

In C++, it won't if the method is defined inside the class definition 
(Java-style), which seems to be recommended for short methods.

Reply sent to Eli Zaretskii <eliz <at> gnu.org>:
You have taken responsibility. (Sat, 19 Mar 2016 18:46:01 GMT) Full text and rfc822 format available.

Notification sent to "Jan D." <jan.h.d <at> swipnet.se>:
bug acknowledged by developer. (Sat, 19 Mar 2016 18:46:02 GMT) Full text and rfc822 format available.

Message #274 received at 20629-done <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 20629-done <at> debbugs.gnu.org
Subject: Re: bug#20629: 25.0.50; Regression: TAGS broken, can't find anything
 in C++ files.
Date: Sat, 19 Mar 2016 20:45:20 +0200

> Cc: 20629 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Fri, 27 Nov 2015 05:54:39 +0200
> 
> On 11/26/2015 06:32 PM, Eli Zaretskii wrote:
> 
> > It wasn't done because the discussion didn't reach any consensus.
> 
> FWIW, I left it with the understanding that we should learn to generate both 
> qualified and unqualified tag names for C++. Whether to do that by 
> default or not, I'm not sure.
> 
> But Exuberant Ctags defaults to the latter option, and only generates 
> unqualified tag names by default. It would be a good idea to follow 
> suit, for consistency if nothing else.
> 
> And I'd like to revisit your previous comment:
> 
>  > Including the pattern (what you call "the implicit tag") in the
>  > completion table could serve as context for disambiguating otherwise
>  > similar tag names.
> 
> Even if that can work in many cases (patterns are displayed in the xref 
> buffer, for example), the pattern won't necessarily contain the qualified 
> name either.
> 
> In Java, it never will, as long as the pattern is created from the 
> contents of the line with the method's definition (because there's no 
> class name on that line).
> 
> In C++, it won't if the method is defined inside the class definition 
> (Java-style), which seems to be recommended for short methods.

As we now have a dedicated feature request (bug#22995), I'm closing
this bug.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 17 Apr 2016 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 9 years and 115 days ago.


