Package: emacs;
Reported by: Halim <mhalimln <at> outlook.com>
Date: Sat, 4 Feb 2023 07:47:02 UTC
Severity: normal
Found in version 28.2
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
View this message in rfc822 format
From: Halim <mhalimln <at> outlook.com> To: 61269 <at> debbugs.gnu.org Subject: bug#61269: 28.2; Sequence of spaces preceding tab in bidirectional line Date: Sun, 05 Feb 2023 23:55:38 +0700
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: Halim <mhalimln <at> outlook.com> >> Date: Sat, 04 Feb 2023 02:41:35 +0700 >> >> >> In a left-to-right line emacs display a sequence of one or more >> spaces (U+0020), where the spaces precede a tab (U+0009) and they >> both appear between two right-to-left alphabet, to the left of the >> first (in typing order) rtl alphabet. >> >> The bug does not present when the rtl text is inside an rtl >> isolate. >> >> Let s represent space, t represet tab, l represent itself, r and >> m represent arabic alphabet. The following example have this format >> in typing order from left to right. >> >> Format: >> lsrssstm >> >> Example text: >> l ح م >> >> The expected display is 'lsrssstm', the actual is 'lssssrtm'. >> The spaces following 'r' in the format is displayed to the left >> of 'r' in the actual display. Using 'C-f' from 'r' moves the >> cursor to the left until it hits 't' where the cursor move to >> the right of 'r'. >> >> I have tried to view the file containing the buggy text in >> focuswriter and fribidi. They both display the same expected >> way. >> >> Extra Info >> >> The bug also present to ltr text on rtl line. I believe >> this is generic and is caused by this line >> '&& level != bidi_it->level_stack[0].level' (see below). >> >> The bug also present in emacs built from commit >> 'ac7ec87a7a0db887e4ae7fe9005aea517958b778' with >> --without-all. In this commit I make the following >> modification. >> >> --------------- >> $ git diff ac7ec87a7a0db887e4ae7fe9005aea517958b778 >> diff --git a/src/bidi.c b/src/bidi.c >> index e012512..fe6e4d6 100644 >> --- a/src/bidi.c >> +++ b/src/bidi.c >> @@ -3302,10 +3302,7 @@ bidi_level_of_next_char (struct bidi_it *bidi_it) >> if ((bidi_it->orig_type == NEUTRAL_WS >> || bidi_it->orig_type == WEAK_BN >> || bidi_isolate_fmt_char (bidi_it->orig_type)) >> - && bidi_it->next_for_ws.charpos < bidi_it->charpos >> - /* If this character is already at base level, we don't need to >> - reset it, so avoid the potentially costly loop below. */ >> - && level != bidi_it->level_stack[0].level) >> + && bidi_it->next_for_ws.charpos < bidi_it->charpos) >> { >> int ch; >> ptrdiff_t clen = bidi_it->ch_len; >> --------------- >> >> It fixes the bug. > > Thanks. > > You are right that the logic there was flawed. However, just removing > the base-level test is sub-optimal: that test was added to speed up > redisplay when the buffer has a lot of control characters (e.g., > binary null bytes) that don't need to be reordered; see bug#22739. > > So I have installed a slightly different change, reproduced below; > please see that it solves the problem, including (presumably) some > real-life problems you had in displaying RTL text with embedded TABs. > > diff --git a/src/bidi.c b/src/bidi.c > index e012512..93875d2 100644 > --- a/src/bidi.c > +++ b/src/bidi.c > @@ -3300,12 +3300,15 @@ bidi_level_of_next_char (struct bidi_it *bidi_it) > it belongs to a sequence of WS characters preceding a newline > or a TAB or a paragraph separator. */ > if ((bidi_it->orig_type == NEUTRAL_WS > - || bidi_it->orig_type == WEAK_BN > + || (bidi_it->orig_type == WEAK_BN > + /* If this BN character is already at base level, we don't > + need to consider resetting it, since I1 and I2 below > + will not change the level, so avoid the potentially > + costly loop below. */ > + && level != bidi_it->level_stack[0].level) > || bidi_isolate_fmt_char (bidi_it->orig_type)) > - && bidi_it->next_for_ws.charpos < bidi_it->charpos > - /* If this character is already at base level, we don't need to > - reset it, so avoid the potentially costly loop below. */ > - && level != bidi_it->level_stack[0].level) > + /* This means the informaition about WS resolution is not valid. */ > + && bidi_it->next_for_ws.charpos < bidi_it->charpos) > { > int ch; > ptrdiff_t clen = bidi_it->ch_len; > @@ -3340,7 +3343,7 @@ bidi_level_of_next_char (struct bidi_it *bidi_it) > || bidi_it->orig_type == NEUTRAL_S > || bidi_it->ch == '\n' || bidi_it->ch == BIDI_EOB > || ((bidi_it->orig_type == NEUTRAL_WS > - || bidi_it->orig_type == WEAK_BN > + || bidi_it->orig_type == WEAK_BN /* L1/Retaining */ > || bidi_isolate_fmt_char (bidi_it->orig_type) > || bidi_explicit_dir_char (bidi_it->ch)) > && (bidi_it->next_for_ws.type == NEUTRAL_B I have done the same test as I did before and your patch does fix the problem. Unfortunately I never had any real-life problems as I did not write any bidi text (I does write, but its only to help my understanding on UBA), so I cant give any result on this. Thanks.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.