GNU bug report logs - #23647
25.1.50; In man pages, links on hyphenated words don't work


Package: emacs;

Reported by: Stephen Berman <stephen.berman <at> gmx.net>

Date: Sun, 29 May 2016 09:53:01 UTC

Severity: minor

Tags: patch

Found in version 25.1.50

Done: Stephen Berman <stephen.berman <at> gmx.net>

Bug is archived. No further changes may be made.




Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#23647; Package emacs. (Sun, 29 May 2016 09:53:02 GMT)

Acknowledgement sent to Stephen Berman <stephen.berman <at> gmx.net>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 29 May 2016 09:53:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Stephen Berman <stephen.berman <at> gmx.net>
To: bug-gnu-emacs <at> gnu.org
Subject: 25.1.50; In man pages, links on hyphenated words don't work
Date: Sun, 29 May 2016 11:52:29 +0200

0. emacs -Q
1. Open a man page that has a link on a hyphenated word, e.g. on my
   system: M-x man RET signal RET, put point on the word spanning lines
   129-130, which is displayed as `sig-
   nalfd(2)'.
2. Type RET (or click mouse-1 or mouse-2) on that link.
=> The error message "Can’t find the 2 sig-nalfd manpage" is displayed.

The following patch makes the link DTRT:

diff --git a/lisp/man.el b/lisp/man.el
index 5acf90b..5d4cacc 100644
--- a/lisp/man.el
+++ b/lisp/man.el
@@ -1430,8 +1430,14 @@ Man-bgproc-sentinel
 			(quit-restore-window
 			 (get-buffer-window (current-buffer) t) 'kill)
 		      (kill-buffer (current-buffer)))
-		    (message "Can't find the %s manpage"
-			     (Man-page-from-arguments args)))
+                    ;; Entries hyphenated due to the window width
+                    ;; won't be found in the man database, so remove
+                    ;; the hyphenation and look again.
+		    (if (string-match "-" args)
+			(let ((str (replace-match "" nil nil args)))
+			  (Man-getpage-in-background str))
+                      (message "Can't find the %s manpage"
+                               (Man-page-from-arguments args))))
 
 		(if Man-fontify-manpage-flag
 		    (message "%s man page formatted"

This is a long-standing bug (presumably since commit
162a12b1d7b1e985a8810bad24d068c825286f56 of Sep 13 2007), but although
the fix seems safe, I suppose it's too late for emacs-25.  So if there
are no objections, should I commit it to master, or is it ok for the
upcoming release?
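
A minimal sketch of what the patched condition does to the argument string, assuming `args' is a plain string such as "2 sig-nalfd" (as the error message above suggests). The helper name is hypothetical and not part of man.el:

```elisp
;; Hypothetical helper, not part of man.el: mirrors the
;; `string-match' / `replace-match' pair used in the patch.
(defun my-man-strip-first-hyphen (args)
  "Return ARGS with its first ASCII hyphen removed, or nil if it has none."
  (when (string-match "-" args)
    ;; Passing ARGS as the fourth argument makes `replace-match'
    ;; return a new string instead of editing the current buffer.
    (replace-match "" nil nil args)))

(my-man-strip-first-hyphen "2 sig-nalfd")  ; => "2 signalfd"
```

Since `string-match' stops at the first match, only one hyphen is removed per call; a string such as "a-b-c" would come back as "ab-c".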


In GNU Emacs 25.1.50.19 (x86_64-suse-linux-gnu, GTK+ Version 3.14.15)
 of 2016-05-28 built on rosalinde
Repository revision: 4ef0fc192b8a10625053dbb9376c814e68612eb6
Windowing system distributor 'The X.Org Foundation', version 11.0.11601000
System Description:	openSUSE 13.2 (Harlequin) (x86_64)

Configured using:
 'configure --with-xwidgets 'CFLAGS=-Og -g3''

Configured features:
XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND DBUS GCONF GSETTINGS NOTIFY
GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB TOOLKIT_SCROLL_BARS
GTK3 X11 XWIDGETS

Important settings:
  value of $LANG: en_US.UTF-8
  value of $XMODIFIERS: @im=ibus
  locale-coding-system: utf-8-unix




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#23647; Package emacs. (Sun, 29 May 2016 14:43:02 GMT)

Message #8 received at 23647 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stephen Berman <stephen.berman <at> gmx.net>
Cc: 23647 <at> debbugs.gnu.org
Subject: Re: bug#23647: 25.1.50;
 In man pages, links on hyphenated words don't work
Date: Sun, 29 May 2016 17:42:13 +0300

> From: Stephen Berman <stephen.berman <at> gmx.net>
> Date: Sun, 29 May 2016 11:52:29 +0200
> 
> 0. emacs -Q
> 1. Open a man page that has a link on a hyphenated word, e.g. on my
>    system: M-x man RET signal RET, put point on the word spanning lines
>    129-130, which is displayed as `sig-
>    nalfd(2)'.
> 2. Type RET (or click mouse-1 or mouse-2) on that link.
> => The error message "Can’t find the 2 sig-nalfd manpage" is displayed.
> 
> The following patch makes the link DTRT:
> 
> diff --git a/lisp/man.el b/lisp/man.el
> index 5acf90b..5d4cacc 100644
> --- a/lisp/man.el
> +++ b/lisp/man.el
> @@ -1430,8 +1430,14 @@ Man-bgproc-sentinel
>  			(quit-restore-window
>  			 (get-buffer-window (current-buffer) t) 'kill)
>  		      (kill-buffer (current-buffer)))
> -		    (message "Can't find the %s manpage"
> -			     (Man-page-from-arguments args)))
> +                    ;; Entries hyphenated due to the window width
> +                    ;; won't be found in the man database, so remove
> +                    ;; the hyphenation and look again.
> +		    (if (string-match "-" args)

Is it only the ASCII hyphen/minus, or could there be other characters
(e.g., if Groff/troff are invoked with some exotic -Tfoo switch)?

> This is a long-standing bug (presumably since commit
> 162a12b1d7b1e985a8810bad24d068c825286f56 of Sep 13 2007), but although
> the fix seems safe, I suppose it's too late for emacs-25.  So if there
> are no objections, should I commit it to master, or is it ok for the
> upcoming release?

Master, please.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#23647; Package emacs. (Sun, 29 May 2016 23:10:02 GMT)

Message #11 received at 23647 <at> debbugs.gnu.org:

From: Stephen Berman <stephen.berman <at> gmx.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 23647 <at> debbugs.gnu.org
Subject: Re: bug#23647: 25.1.50;
 In man pages, links on hyphenated words don't work
Date: Mon, 30 May 2016 01:09:21 +0200

On Sun, 29 May 2016 17:42:13 +0300 Eli Zaretskii <eliz <at> gnu.org> wrote:

>> From: Stephen Berman <stephen.berman <at> gmx.net>
>> Date: Sun, 29 May 2016 11:52:29 +0200
>> 
>> 0. emacs -Q
>> 1. Open a man page that has a link on a hyphenated word, e.g. on my
>>    system: M-x man RET signal RET, put point on the word spanning lines
>>    129-130, which is displayed as `sig-
>>    nalfd(2)'.
>> 2. Type RET (or click mouse-1 or mouse-2) on that link.
>> => The error message "Can’t find the 2 sig-nalfd manpage" is displayed.
>> 
>> The following patch makes the link DTRT:
>> 
>> diff --git a/lisp/man.el b/lisp/man.el
>> index 5acf90b..5d4cacc 100644
>> --- a/lisp/man.el
>> +++ b/lisp/man.el
>> @@ -1430,8 +1430,14 @@ Man-bgproc-sentinel
>>  			(quit-restore-window
>>  			 (get-buffer-window (current-buffer) t) 'kill)
>>  		      (kill-buffer (current-buffer)))
>> -		    (message "Can't find the %s manpage"
>> -			     (Man-page-from-arguments args)))
>> +                    ;; Entries hyphenated due to the window width
>> +                    ;; won't be found in the man database, so remove
>> +                    ;; the hyphenation and look again.
>> +		    (if (string-match "-" args)
>
> Is it only the ASCII hyphen/minus, or could there be other characters
> (e.g., if Groff/troff are invoked with some exotic -Tfoo switch)?

That possibility didn't occur to me but according to Wikipedia, groff
also outputs soft hyphens (octal 255) and indeed I see that the function
Man-build-references-alist, which also removes hyphenation (in a more
complicated way that doesn't seem to be needed in the present case),
also takes the soft hyphen into account.  That can be done here too by
changing the above string-match regexp to "[-­]".  If someone knows of
other possibilities allowed by [gt]roff, maybe the regexp could be
further extended, or the condition reformulated as required.  What do
you think?

>> This is a long-standing bug (presumably since commit
>> 162a12b1d7b1e985a8810bad24d068c825286f56 of Sep 13 2007), but although
>> the fix seems safe, I suppose it's too late for emacs-25.  So if there
>> are no objections, should I commit it to master, or is it ok for the
>> upcoming release?
>
> Master, please.

Ok.  I'll wait another day or two in case there's more feedback.
Thanks.

Steve Berman




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#23647; Package emacs. (Mon, 30 May 2016 00:23:01 GMT)

Message #14 received at 23647 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stephen Berman <stephen.berman <at> gmx.net>
Cc: 23647 <at> debbugs.gnu.org
Subject: Re: bug#23647: 25.1.50;
 In man pages, links on hyphenated words don't work
Date: Mon, 30 May 2016 03:22:58 +0300

> From: Stephen Berman <stephen.berman <at> gmx.net>
> Cc: 23647 <at> debbugs.gnu.org
> Date: Mon, 30 May 2016 01:09:21 +0200
> 
> > Is it only the ASCII hyphen/minus, or could there be other characters
> > (e.g., if Groff/troff are invoked with some exotic -Tfoo switch)?
> 
> That possibility didn't occur to me but according to Wikipedia, groff
> also outputs soft hyphens (octal 255) and indeed I see that the function
> Man-build-references-alist, which also removes hyphenation (in a more
> complicated way that doesn't seem to be needed in the present case),
> also takes the soft hyphen into account.  That can be done here too by
> changing the above string-match regexp to "[-­]".  If someone knows of
> other possibilities allowed by [gt]roff, maybe the regexp could be
> further extended, or the condition reformulated as required.  What do
> you think?

I'm not enough of a roff expert to tell, but how about asking on the
Groff list?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#23647; Package emacs. (Mon, 30 May 2016 13:56:02 GMT)

Message #17 received at 23647 <at> debbugs.gnu.org:

From: Stephen Berman <stephen.berman <at> gmx.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 23647 <at> debbugs.gnu.org
Subject: Re: bug#23647: 25.1.50;
 In man pages, links on hyphenated words don't work
Date: Mon, 30 May 2016 15:55:47 +0200

On Mon, 30 May 2016 03:22:58 +0300 Eli Zaretskii <eliz <at> gnu.org> wrote:

>> From: Stephen Berman <stephen.berman <at> gmx.net>
>> Cc: 23647 <at> debbugs.gnu.org
>> Date: Mon, 30 May 2016 01:09:21 +0200
>> 
>> > Is it only the ASCII hyphen/minus, or could there be other characters
>> > (e.g., if Groff/troff are invoked with some exotic -Tfoo switch)?
>> 
>> That possibility didn't occur to me but according to Wikipedia, groff
>> also outputs soft hyphens (octal 255) and indeed I see that the function
>> Man-build-references-alist, which also removes hyphenation (in a more
>> complicated way that doesn't seem to be needed in the present case),
>> also takes the soft hyphen into account.  That can be done here too by
>> changing the above string-match regexp to "[-­]".  If someone knows of
>> other possibilities allowed by [gt]roff, maybe the regexp could be
>> further extended, or the condition reformulated as required.  What do
>> you think?
>
> I'm not enough of a roff expert to tell, but how about asking on the
> Groff list?

I did that and got this feedback from Steffen Nurpmeso:

> I have been convinced that soft hyphen is a control character and
> not something visual, it should be used as a «break-indicator»
> rather than as a hyphenation character, interpretation of which is
> left as an exercise for the processing software.  I have no idea
> still but would guess groff uses "hyphen minus" U+002D or hyphen
> U+2010 if Unicode is possible.

In a followup to another response he added:

> For display purposes however i think U+00AD can't be used
> directly, but will be replaced by the renderer to either nothing,
> if no wrap is to be applied at the character position, or
> something appropriate, like ASCII hyphen-minus or some extended
> Unicode "Pd" letter, of which there are some (e.g., U+058A
> ARMENIAN HYPHEN, U+1400 CANADIAN SYLLABICS HYPHEN, and more).

And he also made this suggestion:

> Eli Zaretskii is so active on the
> Unicode list, why don't you use the Pd character class for
> detecting «hyphen»?  I guess this should cover all such things
> already as of today, thanks to Werner Lemberg?!

So how should we proceed from here?  We could add U+2010 to the regexp
in my patch, which would then be this: "[-‐­]" (hyphen-minus (ASCII 45),
hyphen (U+2010), soft hyphen (U+00AD) -- it seems harmless to retain the
latter, given that man.el already uses it elsewhere), but if these are
all included in the Unicode Pd character class along with other possible
hyphen characters, maybe a different approach is required.  I know
nothing about the Pd character class and how to detect it with Elisp; I
also don't know if doing that would lead to further changes in man.el,
making this a larger undertaking.  What do you suggest?

Steve Berman
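
On the question of detecting the Pd class from Elisp, here is a minimal sketch. It assumes `get-char-code-property' with the `general-category' property is available in the running Emacs; the helper name is hypothetical and not part of man.el:

```elisp
;; Hypothetical helper, not part of man.el.
(defun my-pd-char-p (char)
  "Return non-nil if CHAR has Unicode general category Pd (dash punctuation)."
  (eq (get-char-code-property char 'general-category) 'Pd))

(my-pd-char-p ?-)        ; U+002D HYPHEN-MINUS
(my-pd-char-p ?\u2010)   ; U+2010 HYPHEN
(my-pd-char-p ?\u058A)   ; U+058A ARMENIAN HYPHEN
(my-pd-char-p ?\u00AD)   ; U+00AD SOFT HYPHEN
```

In the Unicode character data the soft hyphen U+00AD is classified as Cf (format), not Pd, so a test based on Pd alone would not cover it and the soft hyphen would still have to be listed separately.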




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#23647; Package emacs. (Sat, 04 Jun 2016 15:36:01 GMT)

Message #20 received at 23647 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stephen Berman <stephen.berman <at> gmx.net>
Cc: 23647 <at> debbugs.gnu.org
Subject: Re: bug#23647: 25.1.50;
 In man pages, links on hyphenated words don't work
Date: Sat, 04 Jun 2016 18:35:46 +0300

> From: Stephen Berman <stephen.berman <at> gmx.net>
> Cc: 23647 <at> debbugs.gnu.org
> Date: Mon, 30 May 2016 15:55:47 +0200
> 
> > I'm not enough of a roff expert to tell, but how about asking on the
> > Groff list?
> 
> I did that and got this feedback from Steffen Nurpmeso:
> 
> > I have been convinced that soft hyphen is a control character and
> > not something visual, it should be used as a «break-indicator»
> > rather than as a hyphenation character, interpretation of which is
> > left as an exercise for the processing software.  I have no idea
> > still but would guess groff uses "hyphen minus" U+002D or hyphen
> > U+2010 if Unicode is possible.
> 
> In a followup to another response he added:
> 
> > For display purposes however i think U+00AD can't be used
> > directly, but will be replaced by the renderer to either nothing,
> > if no wrap is to be applied at the character position, or
> > something appropriate, like ASCII hyphen-minus or some extended
> > Unicode "Pd" letter, of which there are some (e.g., U+058A
> > ARMENIAN HYPHEN, U+1400 CANADIAN SYLLABICS HYPHEN, and more).
> 
> And he also made this suggestion:
> 
> > Eli Zaretskii is so active on the
> > Unicode list, why don't you use the Pd character class for
> > detecting «hyphen»?  I guess this should cover all such things
> > already as of today, thanks to Werner Lemberg?!
> 
> So how should we proceed from here?  We could add U+2010 to the regexp
> in my patch, which would then be this: "[-‐­]" (hyphen-minus (ASCII 45),
> hyphen (U+2010), soft hyphen (U+00AD) -- it seems harmless to retain the
> latter, given that man.el already uses it elsewhere), but if these are
> all included in the Unicode Pd character class along with other possible
> hyphen characters, maybe a different approach is required.  I know
> nothing about the Pd character class and how to detect it with Elisp; I
> also don't know if doing that would lead to further changes in man.el,
> making this a larger undertaking.  What do you suggest?

I'd go with just those 3, I think the others will not be produced by
Groff.

Thanks.
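
The three characters agreed on here can be written with escapes, so that the otherwise invisible soft hyphen shows up in the source. This is a sketch only: the constant and function names are hypothetical, and the code is not taken from man.el:

```elisp
;; Hypothetical names, not part of man.el.
(defconst my-man-hyphen-regexp "[-\u2010\u00AD]"
  "Match hyphen-minus (U+002D), hyphen (U+2010) or soft hyphen (U+00AD).
The leading `-' is literal because it comes first in the bracket expression.")

(defun my-man-dehyphenate (topic)
  "Return TOPIC with every character matching `my-man-hyphen-regexp' removed."
  (replace-regexp-in-string my-man-hyphen-regexp "" topic))

(my-man-dehyphenate "sig-nalfd")        ; => "signalfd"
(my-man-dehyphenate "sig\u2010nalfd")   ; => "signalfd"
(my-man-dehyphenate "sig\u00ADnalfd")   ; => "signalfd"
```

Note that this strips every such character, including hyphens that genuinely belong to a page name, which is one reason to apply it only as a fallback after the first lookup has failed, as the patch does.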




Reply sent to Stephen Berman <stephen.berman <at> gmx.net>:
You have taken responsibility. (Sun, 05 Jun 2016 11:19:02 GMT)

Notification sent to Stephen Berman <stephen.berman <at> gmx.net>:
bug acknowledged by developer. (Sun, 05 Jun 2016 11:19:02 GMT)

Message #25 received at 23647-done <at> debbugs.gnu.org:

From: Stephen Berman <stephen.berman <at> gmx.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 23647-done <at> debbugs.gnu.org
Subject: Re: bug#23647: 25.1.50;
 In man pages, links on hyphenated words don't work
Date: Sun, 05 Jun 2016 13:17:59 +0200

On Sat, 04 Jun 2016 18:35:46 +0300 Eli Zaretskii <eliz <at> gnu.org> wrote:

>> So how should we proceed from here?  We could add U+2010 to the regexp
>> in my patch, which would then be this: "[-‐­]" (hyphen-minus (ASCII 45),
>> hyphen (U+2010), soft hyphen (U+00AD) -- it seems harmless to retain the
>> latter, given that man.el already uses it elsewhere), but if these are
>> all included in the Unicode Pd character class along with other possible
>> hyphen characters, maybe a different approach is required.  I know
>> nothing about the Pd character class and how to detect it with Elisp; I
>> also don't know if doing that would lead to further changes in man.el,
>> making this a larger undertaking.  What do you suggest?
>
> I'd go with just those 3, I think the others will not be produced by
> Groff.

Done in commit 75de364 on master, and closing the bug.

Steve Berman




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 03 Jul 2016 11:24:04 GMT)
