GNU bug report logs - #34023: Support double colons in Info index entries
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#34023; Package emacs. (Wed, 09 Jan 2019 21:14:01 GMT)
Acknowledgement sent to Gavin Smith <gavinsmith0123 <at> gmail.com>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 09 Jan 2019 21:14:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Emacs version checked: 26.1.
In the Info format colons are special, and for this reason, there is
limited support for colons in index entries. The Emacs Info mode
supports single colons in index entries as long as they are not followed
by a space.
There is this comment at the start of info.el:
;; Note that nowadays we expect Info files to be made using makeinfo.
;; In particular we make these assumptions:
;; - a menu item MAY contain colons but not colon-space ": "
;; - a menu item ending with ": " (but not ":: ") is an index entry
;; - a node name MAY NOT contain a colon
;; This distinction is to support indexing of computer programming
;; language terms that may contain ":" but not ": ".
The comment doesn't say so, but when I tested it, double colons didn't
work even if they were not followed by a space.
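To make the assumptions in that comment concrete, here is a minimal Python sketch of one way to read them. It is not the info.el implementation; the function name `split_menu_label` and the two kind strings are made up for illustration. A colon ends the label only when it is followed by a second colon or by whitespace.

```python
def split_menu_label(line):
    """Return (label, kind) for a '* ' menu line, or None if it is not one."""
    if not line.startswith("* "):
        return None
    body = line[2:]
    for i, ch in enumerate(body):
        if ch != ":":
            continue
        following = body[i + 1:i + 2]
        if following == ":":
            # 'label::' form: the label doubles as the node name.
            return body[:i], "label-is-node"
        if following in ("", " ", "\t"):
            # 'label: node.' form, which is also what index entries use.
            return body[:i], "label-then-node"
        # A colon followed by any other character stays inside the label.
    return None


print(split_menu_label("* std:vector: Containers.   (line 10)"))
print(split_menu_label("* a::b: a colon b.   (line 129)"))
```

Under this reading a single embedded colon survives (`std:vector`), but `a::b` is cut at the first `::` and the label comes back as `a`, which is consistent with the failure described here.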
There is a fairly simple solution to this problem that I haven't seen
suggested in any of the messages posted on this topic in the mailing list
archives. In index nodes only (which have a special marker included,
^@^H[index^@^H]), use a colon to terminate the text of the index entry,
but instead of looking for the first colon in the line, look for the
last. So this entry:
* a::b: a colon b. (line 129)
would refer to line 129 of the node "a colon b". This is possible
because node names cannot contain colons. This restriction is not too
important, whereas the inability to index items containing colons is
quite important. This is what is implemented in the standalone info
browser (since change on 2017-04-08).
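The last-colon rule can be sketched in a few lines of Python. This is only an illustration of the idea, not code from the standalone info browser or from info.el; the names `is_index_node` and `parse_index_entry` are hypothetical, and the handling of the trailing "(line N)" comment is an assumption about the usual layout of index lines.

```python
import re

# ^@^H[index^@^H] written out as bytes: NUL, backspace, "[index", NUL, backspace, "]"
INDEX_MARKER = "\x00\x08[index\x00\x08]"

# Optional trailing "(line N)" comment on an index line.
LINE_COMMENT = re.compile(r"\s*\(line\s+(\d+)\)\s*$")


def is_index_node(node_text):
    """True if the node text carries the special index marker."""
    return INDEX_MARKER in node_text


def parse_index_entry(line):
    """Split '* ENTRY: NODE. (line N)' at the LAST colon on the line."""
    if not line.startswith("* "):
        return None
    body = line[2:]
    m = LINE_COMMENT.search(body)
    line_no = int(m.group(1)) if m else None
    if m:
        body = body[:m.start()]
    # Node names cannot contain colons, so the last colon ends the entry text.
    entry, sep, node = body.rpartition(":")
    if not sep:
        return None
    node = node.strip()
    if node.endswith("."):
        node = node[:-1]
    return entry, node, line_no


print(parse_index_entry("* a::b: a colon b.   (line 129)"))
# → ('a::b', 'a colon b', 129)
```

A reader would apply `parse_index_entry` only to nodes for which `is_index_node` is true, leaving ordinary menus on the existing first-colon rule.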
This change shouldn't be made for all nodes, because the comment after
the closing '.' could contain a colon:
* label: node. comment: with a colon.
This shouldn't be interpreted as referring to a node "with a colon".
However, the "(line ...)" comment can't contain a colon.
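A small Python sketch shows why the rule has to be limited to index nodes. Both helper names are hypothetical, and taking the node name to run up to the first '.' is a simplification for this example only.

```python
def node_by_first_colon(line):
    """Ordinary menu rule: the label ends at the first colon."""
    label, _, rest = line[2:].partition(":")
    return label, rest.split(".", 1)[0].strip()


def node_by_last_colon(line):
    """Index-node rule: the label ends at the last colon."""
    label, _, rest = line[2:].rpartition(":")
    return label, rest.split(".", 1)[0].strip()


menu_line = "* label: node. comment: with a colon."
print(node_by_first_colon(menu_line))  # → ('label', 'node')
print(node_by_last_colon(menu_line))   # → ('label: node. comment', 'with a colon')
```

Applied to an ordinary menu line, the last-colon rule picks up the colon inside the free-form comment and produces the wrong node, which is exactly the misreading to avoid.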
I'm not familiar with Emacs Lisp enough to propose a patch to implement
this change myself.
The standalone info program also implemented a quoting mechanism
(surrounding the text with a pair of 0x7F bytes) to allow nearly all
characters to be included in node names and index entries. This has
never been implemented in Emacs Info and has never been used by default
in texi2any's output. I think my suggestion above would be sufficient
and would work with existing Info files and versions of
texi2any/makeinfo without anything breaking. The quoting mechanism could
potentially be removed from texi2any and info as nobody has ever used it
and it makes things more complicated for no reason.
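For comparison, the quoting mechanism could look roughly like the following Python sketch. Only the "pair of 0x7F bytes" detail comes from the description above; everything else (the function name `read_label`, requiring a colon straight after the closing byte) is an assumption made for illustration and may not match what the standalone info program does.

```python
DEL = "\x7f"  # the 0x7F byte used as a quote character


def read_label(body):
    """Return (label, rest); rest is the text after the terminating colon."""
    if body.startswith(DEL):
        end = body.find(DEL, 1)
        # Assumed here: the closing DEL must be followed directly by a colon.
        if end == -1 or body[end + 1:end + 2] != ":":
            return None
        return body[1:end], body[end + 2:]
    # Unquoted labels fall back to the plain first-colon rule.
    label, sep, rest = body.partition(":")
    return (label, rest) if sep else None


print(read_label("\x7fa: b\x7f: Some Node."))
# → ('a: b', ' Some Node.')
```

Quoting allows even colon-space inside a label, but it needs support in every reader and in the generator, whereas the last-colon rule works on Info files as they are already produced.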
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#34023; Package emacs. (Fri, 11 Jan 2019 00:10:02 GMT)
Message #8 received at 34023 <at> debbugs.gnu.org (full text, mbox):
Hi Gavin,
> In the Info format colons are special, and for this reason, there is
> limited support for colons in index entries. The Emacs Info mode
> supports single colons in index entries as long as they are not followed
> by a space.
Thanks for the detailed description.
> It doesn't state it, but when I tested it double colons don't work even
> if they are not followed by a space.
>
> There is a fairly simple solution to this problem that I haven't seen
> suggested in all the messages posted on this topic in the mailing list
> archives. In index nodes only (which have a special marker included,
> ^@^H[index^@^H]), use a colon to terminate the text of the index entry,
> but instead of looking for the first colon in the line, look for the
> last. So this entry:
>
> * a::b: a colon b. (line 129)
>
> would refer to line 129 of the node "a colon b". This is possible
> because node names cannot contain colons. This restriction is not too
> important, whereas the inability to index items containing colons is
> quite important. This is what is implemented in the standalone info
> browser (since change on 2017-04-08).
The following patch handles the cases that you presented,
but it's hard to predict what other cases it might break.
Do you have a sample test file that covers different cases?
We could add such a file to the Emacs regression tests.
> This change shouldn't be made for all nodes, because the comment after
> the closing '.' could contain a colon:
>
> * label: node. comment: with a colon.
>
> This shouldn't be interpreted as referring to a node "with a colon".
>
> However, the "(line ...)" comment can't contain a colon.
The following change is made only for index nodes.
I have to say that the current regexp-based parsing is
an inherently fragile approach. Do you think it would be possible
to add more markup to Info files instead of relying on regexps?
Just as index nodes have a special marker ^@^H[index^@^H],
maybe markers could be added to identify index entries,
node references, and line numbers?
Better yet would be to read Info manuals in HTML format in the Info reader.
That would allow extracting all information unambiguously.
[info.el.support-double-colons-in-Info-index-entries.patch (text/x-diff, inline)]
diff --git a/lisp/info.el b/lisp/info.el
index 6038273c37..2f7e293297 100644
--- a/lisp/info.el
+++ b/lisp/info.el
@@ -2664,9 +2664,15 @@ Info-menu-entry-name-re
Because of ambiguities, this should be concatenated with something like
`:' and `Info-following-node-name-re'.")
+(defconst Info-index-entry-name-re "\\(?:[^:]\\|:[^,.;() \t\n]\\)*"
+ "Regexp that matches an index entry name possibly including a colon.")
+
(defun Info-extract-menu-node-name (&optional multi-line index-node)
(skip-chars-forward " \t\n")
- (when (looking-at (concat Info-menu-entry-name-re ":\\(:\\|"
+ (when (looking-at (concat (if index-node
+ Info-index-entry-name-re
+ Info-menu-entry-name-re
+ ) ":\\(:\\|"
(Info-following-node-name-re
(cond
(index-node "^,\t\n")
@@ -2741,7 +2747,9 @@ Info-complete-menu-item
(t
(let ((pattern (concat "\n\\* +\\("
(regexp-quote string)
- Info-menu-entry-name-re "\\):"
+ (if (Info-index-node)
+ Info-index-entry-name-re
+ Info-menu-entry-name-re) "\\):"
Info-node-spec-re))
completions
(complete-nodes Info-complete-nodes))
@@ -3966,7 +3974,8 @@ Info-try-follow-nearest-node
(setq node t))
(setq node nil))))
;; menu item: node name
- ((setq node (Info-get-token (point) "\\* +" "\\* +\\([^:]*\\)::"))
+ ((setq node (unless (Info-index-node)
+ (Info-get-token (point) "\\* +" "\\* +\\([^:]*\\)::")))
(Info-goto-node node fork))
;; menu item: node name or index entry
((Info-get-token (point) "\\* +" "\\* +\\(.*\\): ")
@@ -4929,7 +4938,9 @@ Info-fontify-node
(let ((n 0)
cont)
(while (re-search-forward
- (concat "^\\* Menu:\\|\\(?:^\\* +\\(" Info-menu-entry-name-re "\\)\\(:"
+ (concat "^\\* Menu:\\|\\(?:^\\* +\\(" (if (Info-index-node)
+ Info-index-entry-name-re
+ Info-menu-entry-name-re) "\\)\\(:"
Info-node-spec-re "\\([ \t]*\\)\\)\\)")
nil t)
(when (match-beginning 1)
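To see what the new `Info-index-entry-name-re` accepts, the regexp can be transliterated into Python's `re` syntax. This is a sketch for experimentation, not a replacement for testing the patch in Emacs: the trailing `:[ \t]` is a simplification of the `":\\(:\\|"` plus `Info-following-node-name-re` tail that the patch concatenates, and `entry_name` is a hypothetical helper.

```python
import re

# Python spelling of "\\(?:[^:]\\|:[^,.;() \t\n]\\)*" from the patch:
# any non-colon character, or a colon followed by a character that is
# not punctuation or whitespace (a second colon is allowed).
INDEX_ENTRY_NAME = r"(?:[^:]|:[^,.;() \t\n])*"

# Simplified tail: the terminating colon must be followed by a space or tab.
PATTERN = re.compile(r"\* +(" + INDEX_ENTRY_NAME + r"):[ \t]")


def entry_name(line):
    """Return the index entry text matched at the start of LINE, or None."""
    m = PATTERN.match(line)
    return m.group(1) if m else None


print(entry_name("* a::b: a colon b.   (line 129)"))  # → a::b
print(entry_name("* :type: Some Node."))              # → :type
```

Because the second alternative consumes a colon together with the character after it, `::` inside an entry is swallowed as part of the name, and the name ends only at a colon followed by whitespace or punctuation.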
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#34023; Package emacs. (Fri, 11 Jan 2019 00:29:01 GMT)
Message #11 received at 34023 <at> debbugs.gnu.org (full text, mbox):
> The Emacs Info mode supports single colons in index
> entries as long as they are not followed by a space.
I thought they were verboten altogether. Does this
mean that we can finally have index entries such as
`:type'? That would be good.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#34023; Package emacs. (Fri, 11 Jan 2019 00:54:01 GMT)
Message #14 received at 34023 <at> debbugs.gnu.org (full text, mbox):
Gavin Smith wrote:
> This is what is implemented in the standalone info browser (since
> change on 2017-04-08).
"Defining the Entries of an Index" in the Texinfo manual continues to
say (through Texinfo 6.5.90) "Caution: Do not use a colon in an index entry".
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#34023; Package emacs. (Fri, 11 Jan 2019 19:49:02 GMT)
Message #17 received at 34023 <at> debbugs.gnu.org (full text, mbox):
On Fri, Jan 11, 2019 at 07:46:31PM +0000, Gavin Smith wrote:
> I've attached a file that includes different possibilities.
Attaching file.
[index-test-cases.info (text/plain, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#34023; Package emacs. (Fri, 11 Jan 2019 19:53:02 GMT)
Message #20 received at 34023 <at> debbugs.gnu.org (full text, mbox):
On Fri, Jan 11, 2019 at 02:04:32AM +0200, Juri Linkov wrote:
> The following patch handles the cases that you presented,
> but it's hard to predict what other cases it might break.
>
> Do you have a sample test file that covers different cases?
> We could add such a file to the Emacs regression tests.
I've attached a file that includes different possibilities.
> I have to say that the current regexp-based parsing is
> an inherently fragile approach. Do you think it would be possible
> to add more markup to Info files instead of relying on regexps?
I don't understand. Whatever markup is added has to be read somehow,
with regexp or other.
> Better yet would be to read Info manuals in HTML format in the Info reader.
> That would allow extracting all information unambiguously.
That would be a different project with several unresolved questions; this
could be the way forward in the long term. I would be opposed to making
the standalone info program read HTML as this would be a complete
rewrite of the program and there are probably better ways of dealing
with it.
> diff --git a/lisp/info.el b/lisp/info.el
> index 6038273c37..2f7e293297 100644
> --- a/lisp/info.el
> +++ b/lisp/info.el
> @@ -2664,9 +2664,15 @@ Info-menu-entry-name-re
> Because of ambiguities, this should be concatenated with something like
> `:' and `Info-following-node-name-re'.")
>
> +(defconst Info-index-entry-name-re "\\(?:[^:]\\|:[^,.;() \t\n]\\)*"
> + "Regexp that matches an index entry name possibly including a colon.")
> +
> (defun Info-extract-menu-node-name (&optional multi-line index-node)
> (skip-chars-forward " \t\n")
> - (when (looking-at (concat Info-menu-entry-name-re ":\\(:\\|"
> + (when (looking-at (concat (if index-node
> + Info-index-entry-name-re
> + Info-menu-entry-name-re
> + ) ":\\(:\\|"
> (Info-following-node-name-re
> (cond
> (index-node "^,\t\n")
> @@ -2741,7 +2747,9 @@ Info-complete-menu-item
> (t
> (let ((pattern (concat "\n\\* +\\("
> (regexp-quote string)
> - Info-menu-entry-name-re "\\):"
> + (if (Info-index-node)
> + Info-index-entry-name-re
> + Info-menu-entry-name-re) "\\):"
> Info-node-spec-re))
> completions
> (complete-nodes Info-complete-nodes))
> @@ -3966,7 +3974,8 @@ Info-try-follow-nearest-node
> (setq node t))
> (setq node nil))))
> ;; menu item: node name
> - ((setq node (Info-get-token (point) "\\* +" "\\* +\\([^:]*\\)::"))
> + ((setq node (unless (Info-index-node)
> + (Info-get-token (point) "\\* +" "\\* +\\([^:]*\\)::")))
> (Info-goto-node node fork))
> ;; menu item: node name or index entry
> ((Info-get-token (point) "\\* +" "\\* +\\(.*\\): ")
> @@ -4929,7 +4938,9 @@ Info-fontify-node
> (let ((n 0)
> cont)
> (while (re-search-forward
> - (concat "^\\* Menu:\\|\\(?:^\\* +\\(" Info-menu-entry-name-re "\\)\\(:"
> + (concat "^\\* Menu:\\|\\(?:^\\* +\\(" (if (Info-index-node)
> + Info-index-entry-name-re
> + Info-menu-entry-name-re) "\\)\\(:"
> Info-node-spec-re "\\([ \t]*\\)\\)\\)")
> nil t)
> (when (match-beginning 1)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#34023; Package emacs. (Fri, 11 Jan 2019 20:12:01 GMT)
Message #23 received at 34023 <at> debbugs.gnu.org (full text, mbox):
On Thu, Jan 10, 2019 at 07:53:52PM -0500, Glenn Morris wrote:
> Gavin Smith wrote:
>
> > This is what is implemented in the standalone info browser (since
> > change on 2017-04-08).
>
> "Defining the Entries of an Index" in the Texinfo manual continues to
> say (through Texinfo 6.5.90) "Caution: Do not use a colon in an index entry".
Even if Info mode and the standalone Info browser are changed to
support colons in index entries, people running older versions of these
won't be able to read them. However, texi2any does output the colon in
the index entry without complaint. See attached Texinfo input and Info
output. Newer versions of 'info' can deal with the colons in the index
entries that are output here.
[colon-index.info (text/plain, attachment)]
[colon-index.texi (application/x-texinfo, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#34023; Package emacs. (Fri, 11 Jan 2019 20:14:02 GMT)
Message #26 received at 34023 <at> debbugs.gnu.org (full text, mbox):
On Fri, Jan 11, 2019 at 08:13:23PM +0000, Gavin Smith wrote:
> On Thu, Jan 10, 2019 at 07:53:52PM -0500, Glenn Morris wrote:
> > Gavin Smith wrote:
> >
> > > This is what is implemented in the standalone info browser (since
> > > change on 2017-04-08).
> >
> > "Defining the Entries of an Index" in the Texinfo manual continues to
> > say (through Texinfo 6.5.90) "Caution: Do not use a colon in an index entry".
>
> Even if Info mode and the standalone Info browser are changed to
> support colons in index entries, people running older versions of these
> won't be able to read them. However, texi2any does output the colon in
> the index entry without complaint. See attached Texinfo input and Info
> output. Newer versions of 'info' can deal with the colons in the index
> entries that are output here.
>
There should still be a warning about this in the Texinfo manual, but it
could be toned down.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#34023; Package emacs. (Fri, 11 Jan 2019 20:33:01 GMT)
Message #29 received at 34023 <at> debbugs.gnu.org (full text, mbox):
Gavin Smith wrote:
> Even if Info mode and the standalone Info browser are changed to
> support colons in index entries, people running older versions of these
> won't be able to read them.
Sure. However, if Texinfo is intending to support them from version X,
IMO it should document that.
> However, texi2any does output the colon in the index entry without
> complaint.
Personally I think this is a bug, but Texinfo's previous maintainer
disagreed about what warnings were appropriate.
http://lists.gnu.org/r/bug-texinfo/2014-02/msg00029.html
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#34023; Package emacs. (Sun, 13 Jan 2019 03:05:03 GMT)
Message #32 received at 34023 <at> debbugs.gnu.org (full text, mbox):
>> The following patch handles the cases that you presented,
>> but it's hard to predict what other cases it might break.
>>
>> Do you have a sample test file that covers different cases?
>> We could add such a file to the Emacs regression tests.
>
> I've attached a file that includes different possibilities.
Thanks.
>> I have to say that the current regexp-based parsing is
>> an inherently fragile approach. Do you think it would be possible
>> to add more markup to Info files instead of relying on regexps?
>
> I don't understand. Whatever markup is added has to be read somehow,
> with regexp or other.
I was hinting at using a more XML-like markup language that can be
parsed more reliably.
>> Better yet would be to read Info manuals in HTML format in the Info reader.
>> That would allow extracting all information unambiguously.
>
> That would be a different project with several unresolved questions; this
> could be the way forward in the long term. I would be opposed to making
> the standalone info program read HTML as this would be a complete
> rewrite of the program and there are probably better ways of dealing
> with it.
Maybe not a rewrite, but just adding an HTML "add-on" to the info program.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#34023; Package emacs. (Wed, 16 Jan 2019 19:17:02 GMT)
Message #35 received at 34023 <at> debbugs.gnu.org (full text, mbox):
On Fri, Jan 11, 2019 at 03:32:35PM -0500, Glenn Morris wrote:
> Gavin Smith wrote:
>
> > Even if Info mode and the standalone Info browser are changed to
> > support colons in index entries, people running older versions of these
> > won't be able to read them.
>
> Sure. However, if Texinfo is intending to support them from version X,
> IMO it should document that.
I changed the wording a bit in git revision 3381bcb.
This bug report was last modified 6 years and 158 days ago.