GNU bug report logs - #48149
27.2; Wrong underline width when the line char has a width of 2

Package: org-mode

Reported by: Shingo Tanaka <shingo.fg8 <at> gmail.com>

Date: Sun, 2 May 2021 01:13:02 UTC

Severity: normal

To reply to this bug, email your comments to 48149 AT debbugs.gnu.org.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#48149; Package emacs. (Sun, 02 May 2021 01:13:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Shingo Tanaka <shingo.fg8 <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 02 May 2021 01:13:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Shingo Tanaka <shingo.fg8 <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 27.2; Wrong underline width when the line char has a width of 2
Date: Sun, 02 May 2021 10:12:43 +0900
[Message part 1 (text/plain, inline)]
Hi,

When exporting an org-mode document to plain text (ascii, unicode or utf-8)
with `org-export-dispatch', Emacs inserts lines under headlines, inline
tasks, table rows and the titles of the document, TOC, list of listings, list
of tables and footnotes.  The problem is that the inserted line is too long
(double width) when the line character has a width of 2.

Those lines are built from one of the following three characters (in
ox-ascii.el):
1) org-ascii-underline
2) (if (eq (plist-get info :ascii-charset) 'utf-8) ?─ ?_)
3) (if utf8p ?━ ?_)

In case 1), `org-ascii--build-title' correctly handles a character of
width 2 by dividing the line width by `(char-width under-char)' (lines
700-701), maybe because the character is user-configurable and its width is
unknown.  In cases 2) and 3), however, maybe because the characters are
embedded in the code, the code seems to assume that the character always has
a width of 1.  In reality, ?─ or ?━ can have a width of 2 on screen with
some fonts (e.g. "Noto Sans Mono CJK JP"), and in that case the line becomes
twice the expected width.

Attached is a potential patch.  The basic ideas are:

a) Do the same in cases 2) and 3) as in case 1)
   (dividing the line width by `(char-width under-char)',
    assuming `char-width-table' is correctly set)

b) Prefer the longer line if the width is odd, even in case 1)
   (adding `(1- (char-width under-char))' to the dividend,
    just because it should be more beautiful ;-) )
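
Taken together, points (a) and (b) amount to a ceiling division of the
target column width by the width of the line character.  A minimal sketch of
that arithmetic follows; it is not the attached patch, and the names
`sketch-underline-count' and `sketch-underline' are hypothetical.  Only
`char-width', `string-width' and `make-string' are real Emacs functions.

```elisp
;; Hypothetical helpers, not taken from ox-ascii.el or the attached patch.
(defun sketch-underline-count (columns char-columns)
  "Return how many characters of CHAR-COLUMNS columns each cover COLUMNS.
The division rounds up, so an odd COLUMNS with a 2-column character
yields the longer line, as in point (b)."
  (/ (+ columns (1- char-columns)) char-columns))

(defun sketch-underline (text under-char)
  "Return a line of UNDER-CHAR spanning at least the columns of TEXT."
  (make-string (sketch-underline-count (string-width text)
                                       (char-width under-char))
               under-char))
```

For example, `(sketch-underline-count 7 2)' evaluates to 4, i.e. a line of 8
columns rather than 6, while `(sketch-underline-count 7 1)' stays at 7.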

Regards,
---
Shingo Tanaka
[ox-ascii.el.patch (application/octet-stream, attachment)]

bug reassigned from package 'emacs' to 'org-mode,emacs'. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sun, 02 May 2021 06:58:02 GMT) Full text and rfc822 format available.

bug No longer marked as found in versions 27.2. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sun, 02 May 2021 06:58:02 GMT) Full text and rfc822 format available.

Information forwarded to emacs-orgmode <at> gnu.org, bug-gnu-emacs <at> gnu.org:
bug#48149; Package org-mode,emacs. (Sun, 02 May 2021 07:19:02 GMT) Full text and rfc822 format available.

Message #12 received at 48149 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Shingo Tanaka <shingo.fg8 <at> gmail.com>
Cc: 48149 <at> debbugs.gnu.org
Subject: Re: bug#48149: 27.2;
 Wrong underline width when the line char has a width of 2
Date: Sun, 02 May 2021 10:17:53 +0300
> Date: Sun, 02 May 2021 10:12:43 +0900
> From: Shingo Tanaka <shingo.fg8 <at> gmail.com>
> 
> When exporting org-mode document to plain text (either ascii/unicode/utf-8)
> with `org-export-dispatch', Emacs inserts lines under headlines, inline
> tasks, table rows and titles of the document, TOC, list of listings, list of
> tables and footnotes.  The problem is it inserts too long (double width) line
> when the line character has a width of 2.
> 
> Those lines are made of 3 types of characters below (in ox-ascii.el):
> 1) org-ascii-underline
> 2) (if (eq (plist-get info :ascii-charset) 'utf-8) ?─ ?_)
> 3) (if utf8p ?━ ?_)
> 
> In case of 1), it correctly takes account of the case in which the character
> has a width of 2 in `org-ascii--build-title', by dividing the line width by
> `(char-width under-char)' (line 700-701), maybe because the character is user
> configurable and its width in unknown.  However, in case of 2) and
> 3), maybe because the characters is embedded in the code, it looks like only
> considering the character always has a width of 1.  But the reality is
> character ?─ or ?━ can have a width of 2 in the screen displayed with some
> fonts (ex. "Noto Sans Mono CJK JP"), and in that case the line width gets
> doubled of the expected width.
> 
> Attached one is a potential patch.  The basic concepts are:
> 
> a) Do the same in case of 2) and 3) as in case of 1)
>    (dividing the line width by `(char-width under-char)',
>     assuming `char-width-table' is correctly set)
>     
> b) Prefer the longer line width if the width is odd, even in case of 1)
>    (adding `(1- (char-width under-char))' to dividend,
>     just because it should be more beautiful ;-) )

You reported a similar bug already, and I replied there that TRT in
these cases is to use window-text-pixel-size, which will automatically
account for the actual width on display of any characters and any
fonts specified for displaying them.  char-width is an approximation,
and is accurate only on TTY frames.
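
A minimal sketch of the measurement described above, assuming the string is
measured in the selected window with no faces applied (so the frame's
default font is used).  The name `sketch-pixel-width' is hypothetical;
`window-text-pixel-size' is the real function, and it needs the text to be
shown in a live window.

```elisp
;; Hypothetical helper; only the functions it calls are real Emacs API.
(defun sketch-pixel-width (string)
  "Return the display width of STRING in pixels in the selected window.
On a TTY frame each column counts as one pixel.  X-LIMIT is left at
its default, so the result is capped at the window body width."
  (with-temp-buffer
    (insert string)
    (save-window-excursion
      ;; Temporarily show the temp buffer so the display engine can
      ;; lay the text out; the window configuration is restored after.
      (set-window-dedicated-p nil nil)
      (set-window-buffer nil (current-buffer))
      (car (window-text-pixel-size nil (point-min) (point-max))))))
```

Comparing `(sketch-pixel-width s)' with `(* (string-width s)
(frame-char-width))' shows where the two approaches diverge: they agree on a
TTY frame, but can differ on a GUI frame when the font draws a character
wider or narrower than `char-width' predicts.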




Information forwarded to emacs-orgmode <at> gnu.org, bug-gnu-emacs <at> gnu.org:
bug#48149; Package org-mode,emacs. (Sun, 02 May 2021 08:40:01 GMT) Full text and rfc822 format available.

Message #15 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Rudolf Schlatte <rudi <at> constantly.at>
To: bug-gnu-emacs <at> gnu.org
Subject: Re: bug#48149: 27.2;
 Wrong underline width when the line char has a width of 2
Date: Sun, 02 May 2021 10:36:14 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

>> Date: Sun, 02 May 2021 10:12:43 +0900
>> From: Shingo Tanaka <shingo.fg8 <at> gmail.com>
>> 
>> When exporting org-mode document to plain text (either ascii/unicode/utf-8)
>> with `org-export-dispatch', Emacs inserts lines under headlines, inline
>> tasks, table rows and titles of the document, TOC, list of listings, list of
>> tables and footnotes.  The problem is it inserts too long (double width) line
>> when the line character has a width of 2.
>> 
>> Those lines are made of 3 types of characters below (in ox-ascii.el):
>> 1) org-ascii-underline
>> 2) (if (eq (plist-get info :ascii-charset) 'utf-8) ?─ ?_)
>> 3) (if utf8p ?━ ?_)
>> 
>> In case of 1), it correctly takes account of the case in which the character
>> has a width of 2 in `org-ascii--build-title', by dividing the line width by
>> `(char-width under-char)' (line 700-701), maybe because the character is user
>> configurable and its width in unknown.  However, in case of 2) and
>> 3), maybe because the characters is embedded in the code, it looks like only
>> considering the character always has a width of 1.  But the reality is
>> character ?─ or ?━ can have a width of 2 in the screen displayed with some
>> fonts (ex. "Noto Sans Mono CJK JP"), and in that case the line width gets
>> doubled of the expected width.
>> 
>> Attached one is a potential patch.  The basic concepts are:
>> 
>> a) Do the same in case of 2) and 3) as in case of 1)
>>    (dividing the line width by `(char-width under-char)',
>>     assuming `char-width-table' is correctly set)
>>     
>> b) Prefer the longer line width if the width is odd, even in case of 1)
>>    (adding `(1- (char-width under-char))' to dividend,
>>     just because it should be more beautiful ;-) )
>
> You reported a similar bug already, and I replied there that TRT in
> these cases is to use window-text-pixel-size, which will automatically
> account for the actual width on display of any characters and any
> fonts specified for displaying them.  char-width is an approximation,
> and is accurate only on TTY frames.

Isn't the primary result of org-export a plain (UTF-8) text file,
instead of an emacs buffer to be displayed in a GUI or TTY frame?

If so, maybe the criterion for correctness should be that "cat
filename.txt" looks as expected in a terminal, even if opening that file
in Emacs shows lines of different lengths due to variable-pitch faces
etc.





Information forwarded to emacs-orgmode <at> gnu.org, bug-gnu-emacs <at> gnu.org:
bug#48149; Package org-mode,emacs. (Sun, 02 May 2021 09:17:02 GMT) Full text and rfc822 format available.

Message #18 received at 48149 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Rudolf Schlatte <rudi <at> constantly.at>
Cc: 48149 <at> debbugs.gnu.org
Subject: Re: bug#48149: 27.2;
 Wrong underline width when the line char has a width of 2
Date: Sun, 02 May 2021 12:16:32 +0300
> From: Rudolf Schlatte <rudi <at> constantly.at>
> Date: Sun, 02 May 2021 10:36:14 +0200
> 
> > You reported a similar bug already, and I replied there that TRT in
> > these cases is to use window-text-pixel-size, which will automatically
> > account for the actual width on display of any characters and any
> > fonts specified for displaying them.  char-width is an approximation,
> > and is accurate only on TTY frames.
> 
> Isn't the primary result of org-export a plain (UTF-8) text file,
> instead of an emacs buffer to be displayed in a GUI or TTY frame?
> 
> If so, maybe the criterion for correctness should be that "cat
> filename.txt" looks as expected in a terminal, even if opening that file
> in Emacs shows lines of different lengths due to variable-pitch faces
> etc.

If the result is supposed to be displayed only on text-mode terminals,
then indeed string-width is the way to go (assuming that the terminal
in question will use fonts that will not break the alignment).
However, if the result is supposed to be displayed by a GUI program
such as Emacs, then string-width will not produce accurate results.
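
For the terminal-only case, column arithmetic is enough.  A minimal sketch,
with the hypothetical name `sketch-terminal-underline', showing why
`string-width' rather than `length' is the right count when the title
contains double-width characters:

```elisp
;; Hypothetical helper for illustration; not part of ox-ascii.el.
(defun sketch-terminal-underline (title)
  "Return TITLE followed by a newline and a `=' line of the same columns.
`string-width' counts terminal columns, so a CJK ideograph counts as 2,
whereas `length' would count it as 1 and produce a line half as long."
  (concat title "\n" (make-string (string-width title) ?=)))
```

With the title "日本語", `length' is 3 but `string-width' is 6, so the
underline consists of six `=' characters.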

Maybe this is not important in this kind of export, in which case I
apologize for the noise.




Information forwarded to emacs-orgmode <at> gnu.org, bug-gnu-emacs <at> gnu.org:
bug#48149; Package org-mode,emacs. (Sun, 02 May 2021 16:10:03 GMT) Full text and rfc822 format available.

Message #21 received at 48149 <at> debbugs.gnu.org (full text, mbox):

From: Nicolas Goaziou <mail <at> nicolasgoaziou.fr>
To: Shingo Tanaka <shingo.fg8 <at> gmail.com>
Cc: 48149 <at> debbugs.gnu.org
Subject: Re: bug#48149: 27.2; Wrong underline width when the line char has a
 width of 2
Date: Sun, 02 May 2021 18:08:50 +0200
Hello,

Shingo Tanaka <shingo.fg8 <at> gmail.com> writes:

> When exporting org-mode document to plain text (either ascii/unicode/utf-8)
> with `org-export-dispatch', Emacs inserts lines under headlines, inline
> tasks, table rows and titles of the document, TOC, list of listings, list of
> tables and footnotes.  The problem is it inserts too long (double width) line
> when the line character has a width of 2.
>
> Those lines are made of 3 types of characters below (in ox-ascii.el):
> 1) org-ascii-underline
> 2) (if (eq (plist-get info :ascii-charset) 'utf-8) ?─ ?_)
> 3) (if utf8p ?━ ?_)
>
> In case of 1), it correctly takes account of the case in which the character
> has a width of 2 in `org-ascii--build-title', by dividing the line width by
> `(char-width under-char)' (line 700-701), maybe because the character is user
> configurable and its width in unknown.  However, in case of 2) and
> 3), maybe because the characters is embedded in the code, it looks like only
> considering the character always has a width of 1.  But the reality is
> character ?─ or ?━ can have a width of 2 in the screen displayed with some
> fonts (ex. "Noto Sans Mono CJK JP"), and in that case the line width gets
> doubled of the expected width.
>
> Attached one is a potential patch.  The basic concepts are:
>
> a) Do the same in case of 2) and 3) as in case of 1)
>    (dividing the line width by `(char-width under-char)',
>     assuming `char-width-table' is correctly set)
>     
> b) Prefer the longer line width if the width is odd, even in case of 1)
>    (adding `(1- (char-width under-char))' to dividend,
>     just because it should be more beautiful ;-) )

Thank you. This looks good. I cannot apply it on "maint" branch,
however. Also, a proper commit message would be nice. Could you send an
updated patch?

Moreover, have you signed the FSF papers already? This is above the limit
for tiny changes.

Regards,
-- 
Nicolas Goaziou




Information forwarded to emacs-orgmode <at> gnu.org, bug-gnu-emacs <at> gnu.org:
bug#48149; Package org-mode,emacs. (Sun, 02 May 2021 16:24:01 GMT) Full text and rfc822 format available.

Message #24 received at 48149 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Nicolas Goaziou <mail <at> nicolasgoaziou.fr>
Cc: shingo.fg8 <at> gmail.com, 48149 <at> debbugs.gnu.org
Subject: Re: bug#48149: 27.2;
 Wrong underline width when the line char has a width of 2
Date: Sun, 02 May 2021 19:23:02 +0300
> From: Nicolas Goaziou <mail <at> nicolasgoaziou.fr>
> Date: Sun, 02 May 2021 18:08:50 +0200
> Cc: 48149 <at> debbugs.gnu.org
> 
> > In case of 1), it correctly takes account of the case in which the character
> > has a width of 2 in `org-ascii--build-title', by dividing the line width by
> > `(char-width under-char)' (line 700-701), maybe because the character is user
> > configurable and its width in unknown.  However, in case of 2) and
> > 3), maybe because the characters is embedded in the code, it looks like only
> > considering the character always has a width of 1.  But the reality is
> > character ?─ or ?━ can have a width of 2 in the screen displayed with some
> > fonts (ex. "Noto Sans Mono CJK JP"), and in that case the line width gets
> > doubled of the expected width.
> >
> > Attached one is a potential patch.  The basic concepts are:
> >
> > a) Do the same in case of 2) and 3) as in case of 1)
> >    (dividing the line width by `(char-width under-char)',
> >     assuming `char-width-table' is correctly set)
> >     
> > b) Prefer the longer line width if the width is odd, even in case of 1)
> >    (adding `(1- (char-width under-char))' to dividend,
> >     just because it should be more beautiful ;-) )
> 
> Thank you. This looks good. I cannot apply it on "maint" branch,
> however. Also, a proper commit message would be nice. Could you send an
> updated patch?

Please note that using char-width cannot solve the problem of a
character whose width depends on the font, because char-width is
oblivious to fonts, it only knows about the character's codepoint.
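
A minimal sketch of that point: `char-width' looks the character up in
`char-width-table', so its answer changes with the table entry and never
with the font.  The helper name `sketch-char-width-with-table-entry' is
hypothetical; it works on a temporary copy of the table so the global
setting is left alone.

```elisp
;; Hypothetical helper; `char-width-table', `set-char-table-range' and
;; `char-width' are real Emacs names.
(defun sketch-char-width-with-table-entry (char columns)
  "Return `char-width' of CHAR when the width table maps CHAR to COLUMNS.
The binding is local, so the global `char-width-table' is unchanged."
  (let ((char-width-table (copy-sequence char-width-table)))
    (set-char-table-range char-width-table char columns)
    (char-width char)))
```

`(sketch-char-width-with-table-entry ?─ 2)' returns 2 and the same call with
1 returns 1, whichever font would actually draw the character.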




Information forwarded to emacs-orgmode <at> gnu.org, bug-gnu-emacs <at> gnu.org:
bug#48149; Package org-mode,emacs. (Sun, 09 May 2021 13:58:01 GMT) Full text and rfc822 format available.

Message #27 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Shingo Tanaka <shingo.fg8 <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: bug-gnu-emacs <at> gnu.org, shingo.fg8 <at> gmail.com,
 Nicolas Goaziou <mail <at> nicolasgoaziou.fr>, 48149 <at> debbugs.gnu.org
Subject: Re: bug#48149: 27.2;
 Wrong underline width when the line char has a width of 2
Date: Sun, 09 May 2021 22:57:43 +0900
[Message part 1 (text/plain, inline)]
Hi,

> Please note that using char-width cannot solve the problem of a
> character whose width depends on the font, because char-width is
> oblivious to fonts, it only knows about the character's codepoint.

Based on Eli's advice, I updated my proposed patch (attached) to use
window-text-pixel-size.  Could you check whether it meets your expectations?
It works well in my environment with several fonts of different character
widths.

Here are some comments:

- New internal functions org-ascii--make-string and org-ascii--pixel-width
  are introduced just to improve the readability of this modification

- Line width is decided by org-ascii--make-string, which is a
  pixel-width-based make-string

- org-ascii--make-string uses org-ascii--pixel-width, which returns the
  actual pixel width of characters and strings by using
  window-text-pixel-size with the frame default font

- Line justification is also changed to work on a pixel-width basis
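
The idea of a pixel-width-based make-string can be sketched as follows.
This is an illustration only and not the code in the attached patch: the
name `sketch-make-string-to-width', its argument list and the round-up
behaviour are all assumptions.  The measuring function is passed in as a
parameter so that the sketch does not depend on a particular way of
obtaining pixel widths; in practice it would be a wrapper around
window-text-pixel-size.

```elisp
;; Hypothetical sketch; the real org-ascii--make-string may differ.
(defun sketch-make-string-to-width (target-px char measure-fn)
  "Return a string of CHAR that is at least TARGET-PX pixels wide.
MEASURE-FN is called with a string and must return its width in
pixels.  Rounding up is an assumption made for this sketch."
  (let ((char-px (max 1 (funcall measure-fn (char-to-string char)))))
    (make-string (ceiling target-px char-px) char)))
```

With a measuring function that reports 10 pixels per character, a 70-pixel
target yields 7 characters; if the line character is 20 pixels wide, the
same target yields 4.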

Since this is not a simple modification, it may need further improvement, so
any feedback is appreciated.  In particular, table alignment could be better;
it is not easy because the pixel width of the line character and that of the
space character are not always the same.

Anyway, I would appreciate it if you could give it a try.  I am going through
the FSF signing process in parallel, just in case.

---
Shingo Tanaka


[ox-ascii.el.patch (application/octet-stream, attachment)]

Information forwarded to emacs-orgmode <at> gnu.org, bug-gnu-emacs <at> gnu.org:
bug#48149; Package org-mode,emacs. (Sun, 09 May 2021 13:58:02 GMT) Full text and rfc822 format available.

bug reassigned from package 'org-mode,emacs' to 'org-mode'. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Fri, 01 Jul 2022 11:15:03 GMT) Full text and rfc822 format available.

This bug report was last modified 2 years and 349 days ago.
