GNU bug report logs - #48148
27.2; ox-ascii breaks TITLE line wrongly when 2 width char is used


Package: org-mode;

Reported by: Shingo Tanaka <shingo.fg8 <at> gmail.com>

Date: Sat, 1 May 2021 23:53:02 UTC

Severity: normal

Found in version 27.2

Done: Nicolas Goaziou <mail <at> nicolasgoaziou.fr>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 48148 in the body.
You can then email your comments to 48148 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#48148; Package emacs. (Sat, 01 May 2021 23:53:02 GMT)

Acknowledgement sent to Shingo Tanaka <shingo.fg8 <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sat, 01 May 2021 23:53:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Shingo Tanaka <shingo.fg8 <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 27.2;  ox-ascii breaks TITLE line wrongly when 2 width char is used
Date: Sun, 02 May 2021 08:52:13 +0900
Hi,

When exporting an org-mode document to plain text (either ascii/unicode/utf-8)
with `org-export-dispatch', Emacs formats the document title with
`org-ascii-template--document-title'.  However, when double-width characters
are used, it computes the title line's width incorrectly and breaks the line
even though it is not too wide.

For example, when the title is "ABCDEF" (each character has a width of
2), the expected title would look like:

                     ━━━━━━━━━━━━━━━
                              ABCDEF
                     ━━━━━━━━━━━━━━━

However, the reality is:

                     ━━━━━━━━━━━━━━━
                                 ABC
                                 DEF
                     ━━━━━━━━━━━━━━━
                     
This is because it uses `length' to detect the width, which returns only the
number of characters (6 in this case) rather than the displayed width (12
in this case), so it tries to fill the line to half the actual width.
`string-width' should be used instead.
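
The mismatch can be reproduced in isolation.  The following is a minimal
sketch, not the actual ox-ascii code: `my/title-len' and `my/wide-title' are
hypothetical names, `split-string' stands in for `org-split-string', and a
text width of 72 is assumed.

```elisp
;; Hypothetical helper mirroring the `title-len' computation in
;; `org-ascii-template--document-title'.  WIDTH-FN is the function used
;; to measure each line, e.g. `length' or `string-width'.
(defun my/title-len (width-fn title subtitle text-width)
  "Return the fill width for TITLE and SUBTITLE measured with WIDTH-FN.
The result is capped at two thirds of TEXT-WIDTH."
  (min (apply #'max
              (mapcar width-fn
                      (split-string (concat title "\n" subtitle) "\n")))
       (/ (* 2 text-width) 3)))

;; Six fullwidth Latin letters (U+FF21..U+FF26), used here as an example
;; of characters that occupy two columns each.
(defvar my/wide-title (apply #'string (number-sequence #xFF21 #xFF26)))

;; Measured in characters: the title appears half as wide as it is.
(my/title-len #'length my/wide-title "" 72)

;; Measured in columns: matches what column-based filling will see.
(my/title-len #'string-width my/wide-title "" 72)
```

With `length' the computed width is the character count, so filling to that
width splits the title in the middle; with `string-width' the computed width
covers the whole title.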

Here is a potential patch.

--- ox-ascii.el.org	2021-03-26 09:28:44.000000000 +0900
+++ ox-ascii.el	2021-05-02 08:11:57.657347150 +0900
@@ -1033,7 +1033,7 @@
 	     ;; Format TITLE.  It may be filled if it is too wide,
 	     ;; that is wider than the two thirds of the total width.
 	     (title-len (min (apply #'max
-				    (mapcar #'length
+				    (mapcar #'string-width
 					    (org-split-string
 					     (concat title "\n" subtitle) "\n")))
 			     (/ (* 2 text-width) 3)))

---
Shingo Tanaka




Information forwarded to emacs-orgmode <at> gnu.org:
bug#48148; Package org-mode. (Sun, 02 May 2021 07:05:02 GMT)

Message #8 received at 48148 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Shingo Tanaka <shingo.fg8 <at> gmail.com>
Cc: 48148 <at> debbugs.gnu.org
Subject: Re: bug#48148: 27.2;
 ox-ascii breaks TITLE line wrongly when 2 width char is used
Date: Sun, 02 May 2021 10:03:36 +0300
> Date: Sun, 02 May 2021 08:52:13 +0900
> From: Shingo Tanaka <shingo.fg8 <at> gmail.com>
> 
> For example, when the title is "ABCDEF" (each character has width of
> 2), expected title would be like:
> 
>                      ━━━━━━━━━━━━━━━
>                               ABCDEF
>                      ━━━━━━━━━━━━━━━
> 
> However, the reality is:
> 
>                      ━━━━━━━━━━━━━━━
>                                  ABC
>                                  DEF
>                      ━━━━━━━━━━━━━━━
>                      
> This is because it uses `length' to detect the width, which only returns the
> number of characters (6 in this case) but not the actual width displayed (12
> in this case), and it tries to fill the line with that half width.
> `string-width' should be used instead.
> 
> Here is a potential patch.
> 
> --- ox-ascii.el.org	2021-03-26 09:28:44.000000000 +0900
> +++ ox-ascii.el	2021-05-02 08:11:57.657347150 +0900
> @@ -1033,7 +1033,7 @@
>  	     ;; Format TITLE.  It may be filled if it is too wide,
>  	     ;; that is wider than the two thirds of the total width.
>  	     (title-len (min (apply #'max
> -				    (mapcar #'length
> +				    (mapcar #'string-width
>  					    (org-split-string
>  					     (concat title "\n" subtitle) "\n")))
>  			     (/ (* 2 text-width) 3)))

Thanks, but the change you propose will not work reliably on GUI
frames, because the actual width of double-width characters on display
is not necessarily twice the width of a "normal" character.
Especially if this is done in a non-CJK locale, where the default font
is likely to be different from the font used for double-width
characters.

The accurate method of lining up in these cases is to use
window-text-pixel-size instead.  That function will return the exact
width of a string as it will be displayed, in pixels, because it uses the
same code as the display engine.





Information forwarded to emacs-orgmode <at> gnu.org:
bug#48148; Package org-mode. (Sun, 02 May 2021 08:24:02 GMT)

Message #11 received at 48148 <at> debbugs.gnu.org:

From: Nicolas Goaziou <mail <at> nicolasgoaziou.fr>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Shingo Tanaka <shingo.fg8 <at> gmail.com>, 48148 <at> debbugs.gnu.org
Subject: Re: bug#48148: 27.2; ox-ascii breaks TITLE line wrongly when 2
 width char is used
Date: Sun, 02 May 2021 10:23:34 +0200
Hello,

Eli Zaretskii <eliz <at> gnu.org> writes:

>> Date: Sun, 02 May 2021 08:52:13 +0900
>> From: Shingo Tanaka <shingo.fg8 <at> gmail.com>
>> 
>> For example, when the title is "ABCDEF" (each character has width of
>> 2), expected title would be like:
>> 
>>                      ━━━━━━━━━━━━━━━
>>                               ABCDEF
>>                      ━━━━━━━━━━━━━━━
>> 
>> However, the reality is:
>> 
>>                      ━━━━━━━━━━━━━━━
>>                                  ABC
>>                                  DEF
>>                      ━━━━━━━━━━━━━━━
>>                      
>> This is because it uses `length' to detect the width, which only returns the
>> number of characters (6 in this case) but not the actual width displayed (12
>> in this case), and it tries to fill the line with that half width.
>> `string-width' should be used instead.
>> 
>> Here is a potential patch.
>> 
>> --- ox-ascii.el.org	2021-03-26 09:28:44.000000000 +0900
>> +++ ox-ascii.el	2021-05-02 08:11:57.657347150 +0900
>> @@ -1033,7 +1033,7 @@
>>  	     ;; Format TITLE.  It may be filled if it is too wide,
>>  	     ;; that is wider than the two thirds of the total width.
>>  	     (title-len (min (apply #'max
>> -				    (mapcar #'length
>> +				    (mapcar #'string-width
>>  					    (org-split-string
>>  					     (concat title "\n" subtitle) "\n")))
>>  			     (/ (* 2 text-width) 3)))
>
> Thanks, but the change you propose will not work reliably on GUI
> frames, because the actual width of double-width characters on display
> is not necessarily twice the width of a "normal" character.
> Especially if this is done in a non-CJK locale, where the default font
> is likely to be different from the font used for double-width
> characters.
>
> The accurate method of lining up in these cases is to use
> window-text-pixel-size instead.  That function will return the exact
> width of a string as it will be displayed, in pixels, because it uses the
> same code as the display engine.

Would you mind giving an example of `window-text-pixel-size' usage in
this situation?

AFAIU, `window-text-pixel-size' returns the size of the window, but
I fail to see how it is relevant here. Note that `text-width' in the
code above is not related to the width of the window, but is a maximum
number of allowed characters on a line.

Thank you!

Regards,
-- 
Nicolas Goaziou




Information forwarded to emacs-orgmode <at> gnu.org:
bug#48148; Package org-mode. (Sun, 02 May 2021 09:13:01 GMT)

Message #14 received at 48148 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Nicolas Goaziou <mail <at> nicolasgoaziou.fr>
Cc: shingo.fg8 <at> gmail.com, 48148 <at> debbugs.gnu.org
Subject: Re: bug#48148: 27.2; ox-ascii breaks TITLE line wrongly when 2
 width char is used
Date: Sun, 02 May 2021 12:11:41 +0300
> From: Nicolas Goaziou <mail <at> nicolasgoaziou.fr>
> Cc: Shingo Tanaka <shingo.fg8 <at> gmail.com>,  48148 <at> debbugs.gnu.org
> Date: Sun, 02 May 2021 10:23:34 +0200
> 
> > The accurate method of lining up in these cases is to use
> > window-text-pixel-size instead.  That function will return the exact
> > width of a string as it will be displayed, in pixels, because it uses the
> > same code as the display engine.
> 
> Would you mind giving an example about `window-text-pixel-size' usage in
> this situation?

I'm not sure what kind of example is necessary.  How about if you ask
specific questions about the arguments of that function which you
don't understand clearly how to use?

> AFAIU, `window-text-pixel-size' returns the size of the window

No, it returns the size of _text_ when displayed in a window.

> Note that `text-width' in the code above is not related to the width
> of the window, but is a maximum number of allowed characters on a
> line.

I didn't mean text-width, I meant the use of string-width: it should
be replaced by a call to window-text-pixel-size.




Information forwarded to emacs-orgmode <at> gnu.org:
bug#48148; Package org-mode. (Sun, 02 May 2021 11:34:01 GMT)

Message #17 received at 48148 <at> debbugs.gnu.org:

From: Shingo Tanaka <shingo.fg8 <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: shingo.fg8 <at> gmail.com, 48148 <at> debbugs.gnu.org,
 Nicolas Goaziou <mail <at> nicolasgoaziou.fr>
Subject: Re: bug#48148: 27.2;
 ox-ascii breaks TITLE line wrongly when 2 width char is used
Date: Sun, 02 May 2021 20:33:23 +0900
Thank you for the advice.  I see that `window-text-pixel-size' returns the true
displayed width, but I think that's TRT only for the other bug I reported
(bug#48149).  These two bugs look similar, but their root causes are completely
different.

This bug (bug#48148) is actually caused by a difference in width
detection methods between line 1036 in ox-ascii.el (`length') and
`fill-region'.  `org-ascii-template--document-title' first measures the
title width with `length' and then fills the title with
`org-ascii--fill-string', which calls `fill-region' internally.  Since the
fill point in `fill-region' is based on `move-to-column', which appears to
give the same result as `string-width', I think `string-width' is TRT for
this bug.

In other words, for this bug specifically, all that is required is the same
width detection method that `fill-region' uses, even if it doesn't give the
precise displayed width.
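
The agreement between column-based motion and `string-width' can be checked
with a small sketch.  `my/columns-of' and `my/chars-before-column' are
hypothetical helpers, and the sketch assumes default display settings (no
display table or text properties that alter character widths).

```elisp
(defun my/columns-of (string)
  "Return the column reached after inserting STRING into an empty buffer."
  (with-temp-buffer
    (insert string)
    (current-column)))

(defun my/chars-before-column (string column)
  "Return how many characters of STRING lie before COLUMN on its line."
  (with-temp-buffer
    (insert string)
    (move-to-column column)
    (1- (point))))

;; Six fullwidth Latin letters (U+FF21..U+FF26), two columns each.
(let ((title (apply #'string (number-sequence #xFF21 #xFF26))))
  (list (length title)                     ; number of characters
        (string-width title)               ; columns per `string-width'
        (my/columns-of title)              ; columns per `current-column'
        (my/chars-before-column title 6))) ; characters that fit in 6 columns
```

If the second and third values agree, a fill width computed with
`string-width' is expressed in the same units that column-based filling uses,
whereas a width computed with `length' is not.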

Please correct me if I am wrong.

---
Shingo Tanaka


On Sun, 02 May 2021 18:11:41 +0900,
Eli Zaretskii wrote:
> 
> > From: Nicolas Goaziou <mail <at> nicolasgoaziou.fr>
> > Cc: Shingo Tanaka <shingo.fg8 <at> gmail.com>,  48148 <at> debbugs.gnu.org
> > Date: Sun, 02 May 2021 10:23:34 +0200
> > 
> > > The accurate method of lining up in these cases is to use
> > > window-text-pixel-size instead.  That function will return the exact
> > > width of a string as it will be displayed, in pixels, because it uses the
> > > same code as the display engine.
> > 
> > Would you mind giving an example about `window-text-pixel-size' usage in
> > this situation?
> 
> I'm not sure what kind of example is necessary.  How about if you ask
> specific questions about the arguments of that function which you
> don't understand clearly how to use?
> 
> > AFAIU, `window-text-pixel-size' returns the size of the window
> 
> No, it returns the size of _text_ when displayed in a window.
> 
> > Note that `text-width' in the code above is not related to the width
> > of the window, but is a maximum number of allowed characters on a
> > line.
> 
> I didn't mean text-width, I meant the use of string-width: it should
> be replaced by a call to window-text-pixel-size.




Information forwarded to emacs-orgmode <at> gnu.org:
bug#48148; Package org-mode. (Sun, 02 May 2021 12:19:01 GMT)

Message #20 received at 48148 <at> debbugs.gnu.org:

From: Nicolas Goaziou <mail <at> nicolasgoaziou.fr>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: shingo.fg8 <at> gmail.com, 48148 <at> debbugs.gnu.org
Subject: Re: bug#48148: 27.2; ox-ascii breaks TITLE line wrongly when 2
 width char is used
Date: Sun, 02 May 2021 14:18:24 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

> I'm not sure what kind of example is necessary.  How about if you ask
> specific questions about the arguments of that function which you
> don't understand clearly how to use?

Fair enough. However, I don't think my misunderstanding is related to
the arguments of that function.

>> AFAIU, `window-text-pixel-size' returns the size of the window
>
> No, it returns the size of _text_ when displayed in a window.

True. 

My problem is that I have some string, _which is not displayed anywhere_
yet. I need to obtain its real width along with the width of a single
character in order to compute the length argument in `make-string'.

I may be missing something obvious.

> I didn't mean text-width, I meant the use of string-width: it should
> be replaced by a call to window-text-pixel-size.

`string-width' applies to a string. `window-text-pixel-size' doesn't.
This is the root of my misunderstanding, I guess.

Regards,




Information forwarded to emacs-orgmode <at> gnu.org:
bug#48148; Package org-mode. (Sun, 02 May 2021 12:44:02 GMT)

Message #23 received at 48148 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Shingo Tanaka <shingo.fg8 <at> gmail.com>
Cc: shingo.fg8 <at> gmail.com, 48148 <at> debbugs.gnu.org, mail <at> nicolasgoaziou.fr
Subject: Re: bug#48148: 27.2;
 ox-ascii breaks TITLE line wrongly when 2 width char is used
Date: Sun, 02 May 2021 15:43:28 +0300
> Date: Sun, 02 May 2021 20:33:23 +0900
> From: Shingo Tanaka <shingo.fg8 <at> gmail.com>
> Cc: Nicolas Goaziou <mail <at> nicolasgoaziou.fr>,
> 	shingo.fg8 <at> gmail.com,
> 	48148 <at> debbugs.gnu.org
> 
> This bug (bug#48148) is actually caused by the difference of the width
> detection methods between line 1036 in ox-ascii.el (`length') and
> `fill-region'.  This is because `org-ascii-template--document-title' first
> detects the title width by `length' and then tries to fill it by
> `org-ascii--fill-string' which does the action by `fill-region' inside.  And
> since the filling point in `fill-region' is based on `move-to-column' and it
> looks like giving the same result as `string-width', I think `string-width'
> is TRT for this bug.
> 
> In other words, specific to this bug, only the same width detection method as
> `fill-region' is required, even if it doesn't give you the precise width
> displayed.
> 
> Please correct me if I am wrong.

I think you are right, thanks.




Information forwarded to emacs-orgmode <at> gnu.org:
bug#48148; Package org-mode. (Sun, 02 May 2021 12:49:01 GMT)

Message #26 received at 48148 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Nicolas Goaziou <mail <at> nicolasgoaziou.fr>
Cc: shingo.fg8 <at> gmail.com, 48148 <at> debbugs.gnu.org
Subject: Re: bug#48148: 27.2; ox-ascii breaks TITLE line wrongly when 2
 width char is used
Date: Sun, 02 May 2021 15:48:14 +0300
> From: Nicolas Goaziou <mail <at> nicolasgoaziou.fr>
> Cc: shingo.fg8 <at> gmail.com,  48148 <at> debbugs.gnu.org
> Date: Sun, 02 May 2021 14:18:24 +0200
> 
> My problem is that I have some string, _which is not displayed anywhere_
> yet. I need to obtain its real width along with the width of a single
> character in order to compute the length argument in `make-string'.

The width of any text on display is meaningless unless you also tell
in what window it will be displayed.  That's because some of the
factors that affect the display width depend on the window and the
buffer shown by that window.

So assuming the string you have will eventually be displayed in some
window -- and most strings in Emacs are of that kind -- you should use
that window up front.  Otherwise, the value you get from other methods
can only be an approximation, which will sometimes be close, and
sometimes quite far from the truth.




Reply sent to Nicolas Goaziou <mail <at> nicolasgoaziou.fr>:
You have taken responsibility. (Sun, 02 May 2021 15:57:02 GMT)

Notification sent to Shingo Tanaka <shingo.fg8 <at> gmail.com>:
bug acknowledged by developer. (Sun, 02 May 2021 15:57:02 GMT)

Message #31 received at 48148-done <at> debbugs.gnu.org:

From: Nicolas Goaziou <mail <at> nicolasgoaziou.fr>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: shingo.fg8 <at> gmail.com, 48148-done <at> debbugs.gnu.org
Subject: Re: bug#48148: 27.2; ox-ascii breaks TITLE line wrongly when 2
 width char is used
Date: Sun, 02 May 2021 17:56:04 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: Nicolas Goaziou <mail <at> nicolasgoaziou.fr>
>> Cc: shingo.fg8 <at> gmail.com,  48148 <at> debbugs.gnu.org
>> Date: Sun, 02 May 2021 14:18:24 +0200
>> 
>> My problem is that I have some string, _which is not displayed anywhere_
>> yet. I need to obtain its real width along with the width of a single
>> character in order to compute the length argument in `make-string'.
>
> The width of any text on display is meaningless unless you also tell
> in what window it will be displayed.  That's because some of the
> factors that affect the display width depend on the window and the
> buffer shown by that window.

I understand. More than the width of the text, I'm interested in the
ratio between the width of the text and the width of an underline
character (assuming monospace).

> So assuming the string you have will eventually be displayed in some
> window -- and most strings in Emacs are of that kind -- you should use
> that window up front.  Otherwise, the value you get from other methods
> can only be an approximation, which will sometimes be close, and
> sometimes quite far from the truth.

The string may not be displayed at all. Since it is the output of an
export process, it could, e.g., be written to a file.

I applied Shingo Tanaka's suggestion using `string-width', which is the
best we can do considering our requirements.

Thank you for your answer, and to Shingo Tanaka for the report and the
patch.

Regards,





Information forwarded to emacs-orgmode <at> gnu.org:
bug#48148; Package org-mode. (Sun, 02 May 2021 16:13:02 GMT)

Message #34 received at 48148 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Nicolas Goaziou <mail <at> nicolasgoaziou.fr>
Cc: shingo.fg8 <at> gmail.com, 48148 <at> debbugs.gnu.org
Subject: Re: bug#48148: 27.2; ox-ascii breaks TITLE line wrongly when 2
 width char is used
Date: Sun, 02 May 2021 19:11:49 +0300
> From: Nicolas Goaziou <mail <at> nicolasgoaziou.fr>
> Cc: shingo.fg8 <at> gmail.com,  48148-done <at> debbugs.gnu.org
> Date: Sun, 02 May 2021 17:56:04 +0200
> 
> More than the width of the text, I'm interested in the
> ratio between the width of the text and the width of an underline
> character (assuming monospace).

And that is rarely 2 for double-width characters.  Especially if the
underline and the double-width characters are displayed using
different fonts.

> The string may not be displayed at all. Since it is the output of an
> export process, it could, e.g., be written to a file.

If that text file can be displayed using some software you know
nothing about, then this problem doesn't have an accurate solution,
only approximate ones.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 31 May 2021 11:24:07 GMT)

This bug report was last modified 4 years and 16 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.