GNU bug report logs - #27978: Detection of section name in man.el
Message #10 received at 27978-done <at> debbugs.gnu.org:
> From: Grégory Mounié
> <Gregory.Mounie <at> imag.fr>
> Date: Sun, 6 Aug 2017 01:44:19 +0200
>
> When parsing manual pages in languages with non-ASCII letters, section
> names that use non-ASCII letters are not added to the table of contents.
>
> I noticed the bug while reading the French bash manual: the quite useful
> "COMMANDES INTERNES DE l'INTERPRÉTEUR" section (SHELL BUILTIN COMMANDS)
> does not appear, because of the letter É.
>
> I propose using character classes instead of ASCII ranges in the
> appropriate regexp defvars. This should not change anything for English
> manuals, and it should work for many other languages.
Thanks, I pushed these changes with some minor adjustments.
Specifically:
> -(defvar Man-section-regexp "[0-9][a-zA-Z0-9+]*\\|[LNln]"
> +(defvar Man-section-regexp "[[:digit:]][[:alnum:]+]*\\|[LNln]"
> "Regular expression describing a manpage section within parentheses.")
I didn't change this one, because I think a section always uses only
ASCII letters and numbers, as in ".1n". If you disagree, can you show
an example where this is not so?
> -(defvar Man-heading-regexp "^\\([A-Z][A-Z0-9 /-]+\\)$"
> +(defvar Man-heading-regexp "^\\([[:upper:]][[:upper:][:digit:] /-]+\\)$"
> "Regular expression describing a manpage heading entry.")
I see no reason to replace 0-9 with [:digit:] here, since I think
non-ASCII digits will never be used in this context. Do you agree?
Incidentally, I see quite a few similar regexps elsewhere in man.el.
Did you audit all of them and establish that they don't need similar
changes? If not, would you like to propose similar changes there?
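
The effect of the Man-heading-regexp change discussed above can be
checked with a small sketch. It assumes a variant of the proposed regexp
that uses [:upper:] for letters but keeps 0-9 for digits, in line with
the comment above; this is not necessarily the exact text that was
pushed. The `my-' names and the sample heading lines are hypothetical
and are not part of man.el.

```elisp
;; Sketch: compare the old ASCII-only heading regexp with a variant that
;; uses [:upper:] for letters but keeps 0-9 for digits.
;; The `my-' names and the sample headings are hypothetical.
(defconst my-old-heading-regexp "^\\([A-Z][A-Z0-9 /-]+\\)$")
(defconst my-new-heading-regexp "^\\([[:upper:]][[:upper:]0-9 /-]+\\)$")

(defun my-heading-p (regexp line)
  "Return t if LINE matches REGEXP as a whole heading line."
  ;; Bind `case-fold-search' to nil so that neither [A-Z] nor [:upper:]
  ;; also matches lower-case letters.
  (let ((case-fold-search nil))
    (and (string-match regexp line) t)))

;; An ASCII heading is accepted by both regexps.
(my-heading-p my-old-heading-regexp "SEE ALSO")
(my-heading-p my-new-heading-regexp "SEE ALSO")

;; A heading containing É is rejected by the ASCII range and accepted
;; by the character class.
(my-heading-p my-old-heading-regexp "VALEUR RENVOYÉE")
(my-heading-p my-new-heading-regexp "VALEUR RENVOYÉE")
```

Evaluating the four calls in a scratch buffer shows which heading lines
each regexp accepts. The same helper can be pointed at other regexps in
man.el to see whether they depend on ASCII ranges.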
This bug report was last modified 7 years and 274 days ago.