GNU bug report logs - #49836
Support ripgrep in semantic-symref-tool-grep

Previous Next

Package: emacs;

Reported by: Juri Linkov <juri <at> linkov.net>

Date: Mon, 2 Aug 2021 21:40:01 UTC

Severity: wishlist

Tags: patch

Full log


Message #19 received at 49836 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Juri Linkov <juri <at> linkov.net>
Cc: 49836 <at> debbugs.gnu.org
Subject: Re: bug#49836: Support ripgrep in semantic-symref-tool-grep
Date: Thu, 5 Aug 2021 06:03:08 +0300
On 05.08.2021 00:23, Juri Linkov wrote:
>> I think the original idea (surrounding with \W) is sound: after all, not
>> every symbol boundary in Emacs sense is a word boundary in Grep or RG. If
>> a method, say, ends with ?, then it won't be.
> I tried to search for 'soap-type-is-array?' in the Emacs tree,
> and ripgrep can find it with "\\b%s\\b", but Grep can't.

Did you search through symref, or in console? If the former, it seems 
some regexp-quoting is missing somewhere (the question mark was no 
escaped). Because see what the console says:

$ rg "\bsoap-type-is-array?\b"
lisp/net/soap-client.el
950:(defun soap-type-is-array? (type)
990:                       (if (soap-type-is-array? type)

ChangeLog.2
19080:	* lisp/net/soap-client.el (soap-type-is-array?): new defun

$ rg "\bsoap-type-is-array\?\b"

^^ no matches

And

$ rg "\bsoap-type-is-array\?"

has matches, of course.

> It would be more preferable not to change the existing default logic
> to avoid possible troubles.  Since Grep with Basic syntax works fine,
> then better not to switch to Extended syntax.

See above. But also consider what happens if a user sees that 
grep-program is now customizable and ripgrep is an officially supported 
value. They change it to "rg", and then suddenly their 'M-x rgrep' input 
has to use the extended regexp format?

Worse than that, any third-party package that uses grep-find-template 
will suddenly have a high chance of failing if they pass any nontrivial 
regexps to it, especially if those have groupings or alternations.

It's a hard problem: grep.el is not prepared for abstracting like that. 
If we at least standardized it internally on Extended format, that would 
at least remove one source of uncertainty and bugs. The user still can 
input basic regexps interactively, we can translate them easily.

Further: seeing xref-search-program-alist, people asked for support for 
other similar programs, such as ag and pt. Any solution we end up with, 
we should try to ensure they are valid values of grep-program as well.

> The new user option is already used in many places in grep.el
> in the previous patch, so it should be ok to use it in semantic-symref
> as well:
> 
> diff --git a/lisp/cedet/semantic/symref/grep.el b/lisp/cedet/semantic/symref/grep.el
> index 180d779a78..b7d08409aa 100644
> --- a/lisp/cedet/semantic/symref/grep.el
> +++ b/lisp/cedet/semantic/symref/grep.el
> @@ -150,15 +150,22 @@ semantic-symref-perform-search
>                              "-l ")
>                             ((eq (oref tool searchtype) 'regexp)
>                              "-nE ")
> -                          (t "-n ")))
> +                          (t (if (equal grep-program "rg")
> +                                 ;; TODO: remove this after ripgrep is fixed (bug#49836)
> +                                 (unless (string-search "rg <C> -nH" grep-find-template)
> +                                   "-n ")
> +                               "-n "))))

I'm actually fine with this part.

>            (greppat (cond ((eq (oref tool searchtype) 'regexp)
>                            (oref tool searchfor))
>                           (t
>                            ;; Can't use the word boundaries: Grep
>                            ;; doesn't always agree with the language
>                            ;; syntax on those.
> -                         (format "\\(^\\|\\W\\)%s\\(\\W\\|$\\)"
> -                                 (oref tool searchfor)))))
> +                         (if (equal grep-program "rg")
> +                             (format "(^|\\W)%s(\\W|$)"
> +                                     (oref tool searchfor))
> +                           (format "\\(^\\|\\W\\)%s\\(\\W\\|$\\)"
> +                                   (oref tool searchfor))))))

This can work. Except the comparison should be with "grep", I think: all 
other alternatives only work with the Extended format.




This bug report was last modified 123 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.