GNU bug report logs - #11082
24.0.94; u.glyphless member in struct glyph does not fit in 32 bits
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 11082 in the body.
You can then email your comments to 11082 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#11082; Package emacs. (Sat, 24 Mar 2012 05:55:01 GMT)
Acknowledgement sent to YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sat, 24 Mar 2012 05:55:01 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
In dispextern.h:
struct glyph
{
  (snip)
  /* A union of sub-structures for different glyph types.  */
  union
  {
    (snip)
    /* Sub-stretch for type == GLYPHLESS_GLYPH.  */
    struct
    {
      /* Value is an enum of the type glyphless_display_method.  */
      unsigned method : 2;
      /* 1 iff this glyph is for a character of no font.  */
      unsigned for_no_font : 1;
      /* Length of acronym or hexadecimal code string (at most 8).  */
      unsigned len : 4;
      /* Character to display.  Actually we need only 22 bits.  */
      unsigned ch : 26;
    } glyphless;

    /* Used to compare all bit-fields above in one step.  */
    unsigned val;
  } u;
};
The member `u.glyphless' above requires at least 33 bits and does not
fit in the size (32 bits) of `u.val' on many environments. As a
result, equality with respect to the `u.val' member (e.g., used in
GLYPH_EQUAL_P) does not necessarily mean the equality of glyphless
glyphs.
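As a rough illustration of the arithmetic (a sketch, not the Emacs
source): the union below reproduces only the glyphless member and
`val'. The names glyph_u, GLYPHLESS_BITS and val_misses_ch_difference
are made up for this example. Bit-field layout is
implementation-defined, so the helper's result is what one would
expect from GCC or Clang with a 32-bit `unsigned', not something the C
standard guarantees.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical stand-in for the union `u' in struct glyph; only the
   glyphless member and `val' are reproduced here.  */
union glyph_u
{
  struct
  {
    unsigned method : 2;
    unsigned for_no_font : 1;
    unsigned len : 4;
    unsigned ch : 26;
  } glyphless;
  unsigned val;
};

/* Total declared width of the bit-fields above: 33 bits.  */
enum { GLYPHLESS_BITS = 2 + 1 + 4 + 26 };

/* Return 1 if two glyphless glyphs that differ only in `ch' still
   have equal `val' members, i.e. if a comparison through `val'
   would wrongly treat them as the same glyph.  */
static int
val_misses_ch_difference (unsigned ch1, unsigned ch2)
{
  union glyph_u a, b;

  /* Zero everything first so that padding does not affect `val'.  */
  memset (&a, 0, sizeof a);
  memset (&b, 0, sizeof b);
  a.glyphless.ch = ch1;
  b.glyphless.ch = ch2;
  return a.glyphless.ch != b.glyphless.ch && a.val == b.val;
}
```

On such a platform the 26-bit `ch' cannot share the first 32-bit unit
with the 7 bits before it, so it is placed in a second unit that `val'
does not cover, and val_misses_ch_difference (0xE0100, 0xE0101) is
expected to return 1.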
According to the comment above, it seems to be OK to shorten the
length of `u.glyphless.ch' member from 26 to 25. Could someone
confirm this?
YAMAMOTO Mitsuharu
mituharu <at> math.s.chiba-u.ac.jp
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#11082; Package emacs. (Sat, 24 Mar 2012 07:33:01 GMT)
Message #8 received at 11082 <at> debbugs.gnu.org (full text, mbox):
> Date: Sat, 24 Mar 2012 14:23:28 +0900
> From: YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>
>
> In dispextern.h:
>
> struct glyph
> {
>   (snip)
>   /* A union of sub-structures for different glyph types.  */
>   union
>   {
>     (snip)
>     /* Sub-stretch for type == GLYPHLESS_GLYPH.  */
>     struct
>     {
>       /* Value is an enum of the type glyphless_display_method.  */
>       unsigned method : 2;
>       /* 1 iff this glyph is for a character of no font.  */
>       unsigned for_no_font : 1;
>       /* Length of acronym or hexadecimal code string (at most 8).  */
>       unsigned len : 4;
>       /* Character to display.  Actually we need only 22 bits.  */
>       unsigned ch : 26;
>     } glyphless;
>
>     /* Used to compare all bit-fields above in one step.  */
>     unsigned val;
>   } u;
> };
>
> The member `u.glyphless' above requires at least 33 bits and does not
> fit in the size (32 bits) of `u.val' on many environments. As a
> result, equality with respect to the `u.val' member (e.g., used in
> GLYPH_EQUAL_P) does not necessarily mean the equality of glyphless
> glyphs.
?? Isn't the size of a union defined by its widest member? If so, we
just end up wasting some storage here, but we should never truncate a
bit field. Do you have an actual test case that shows this kind of
bug?
> According to the comment above, it seems to be OK to shorten the
> length of `u.glyphless.ch' member from 26 to 25. Could someone
> confirm this?
Confirmed. From the ELisp manual:
To support this multitude of characters and scripts, Emacs closely
follows the "Unicode Standard". The Unicode Standard assigns a unique
number, called a "codepoint", to each and every character. The range
of codepoints defined by Unicode, or the Unicode "codespace", is
`0..#x10FFFF' (in hexadecimal notation), inclusive. Emacs extends this
range with codepoints in the range `#x110000..#x3FFFFF', which it uses
for representing characters that are not unified with Unicode and "raw
8-bit bytes" that cannot be interpreted as characters. Thus, a
character codepoint in Emacs is a 22-bit integer number.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I would actually suggest using 22 bits for this field, to avoid
confusion in the future.
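The 22-bit figure can be checked with a few lines of C. This is only
an illustrative sketch: the enum names and the bits_needed helper are
invented for the example, and the two limits are the ones quoted from
the manual above.

```c
#include <assert.h>

/* Upper ends of the two ranges quoted from the ELisp manual.
   The names are illustrative only.  */
enum
{
  MAX_UNICODE_CODEPOINT = 0x10FFFF,
  MAX_EMACS_CODEPOINT = 0x3FFFFF
};

/* Number of bits needed to represent N as an unsigned value.  */
static int
bits_needed (unsigned long n)
{
  int bits = 0;

  while (n != 0)
    {
      bits++;
      n >>= 1;
    }
  return bits;
}
```

Here bits_needed (MAX_UNICODE_CODEPOINT) is 21 and
bits_needed (MAX_EMACS_CODEPOINT) is 22, so any width of 22 bits or
more can hold every Emacs character code.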
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#11082; Package emacs. (Sat, 24 Mar 2012 09:26:02 GMT)
Message #11 received at 11082 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii <eliz <at> gnu.org> writes:
>> The member `u.glyphless' above requires at least 33 bits and does not
>> fit in the size (32 bits) of `u.val' on many environments. As a
>> result, equality with respect to the `u.val' member (e.g., used in
>> GLYPH_EQUAL_P) does not necessarily mean the equality of glyphless
>> glyphs.
This has been broken since GLYPHLESS_GLYPH was added.
> ?? Isn't the size of a union defined by its widest member?
The size of u.val is defined by the size of unsigned.
> If so, we just end up wasting some storage here, but we should never
> truncate a bit field.
It's not about truncation, but about ignored bits in GLYPH_EQUAL_P.
> I would actually suggest using 22 bits for this field, to avoid
> confusion in the future.
Making the struct exactly 32 bits may be better since it can make access
to the ch member simpler.
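One way to read the point about simpler access, sketched with explicit
shifts instead of bit-fields. The layout is an assumption made for the
example (the three small fields in the low 7 bits, `ch' directly above
them); real bit-field order is chosen by the compiler, and all names
here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: method, for_no_font and len in the low 7 bits,
   ch directly above them.  */
enum { FLAG_BITS = 2 + 1 + 4 };

/* Pack the 7 flag bits and the character code into one word.  */
static uint32_t
pack (uint32_t flags, uint32_t ch)
{
  return (ch << FLAG_BITS) | (flags & 0x7F);
}

/* ch declared as 25 bits: it fills the rest of the word, so a
   single shift extracts it.  */
static uint32_t
ch_from_full_word (uint32_t w)
{
  return w >> FLAG_BITS;
}

/* ch declared as 22 bits: the top 3 bits of the word are not part
   of it, so a mask is needed as well as the shift.  */
static uint32_t
ch_from_partial_word (uint32_t w)
{
  return (w >> FLAG_BITS) & 0x3FFFFF;
}
```

With this layout, a word whose top three bits happen to be set still
yields the right character through ch_from_partial_word, while
ch_from_full_word relies on those bits belonging to `ch' itself.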
Andreas.
--
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#11082; Package emacs. (Sun, 25 Mar 2012 01:02:01 GMT)
Message #14 received at 11082 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Sat, 24 Mar 2012 09:54:09 +0100, Andreas Schwab <schwab <at> linux-m68k.org> said:
>> ?? Isn't the size of a union defined by its widest member?
> The size of u.val is defined by the size of unsigned.
>> If so, we just end up wasting some storage here, but we should
>> never truncate a bit field.
> It's not about truncation, but about ignored bits in GLYPH_EQUAL_P.
Yes, I meant that.
A test case is as follows. I checked it on Ubuntu 11.10 with a GTK+
build.
1. emacs -Q
2. (insert #xe0100) C-j
3. C-p C-p C-e C-b C-b C-d 1 C-e
Now the line at the cursor is "(insert #xe0101)".
4. C-j
The glyphless glyph just added is shown as "0E0100" instead of
"0E0101".
YAMAMOTO Mitsuharu
mituharu <at> math.s.chiba-u.ac.jp
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#11082; Package emacs. (Mon, 26 Mar 2012 06:20:02 GMT)
Message #17 received at 11082 <at> debbugs.gnu.org (full text, mbox):
In article <83zkb6trdb.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:
> >     /* Sub-stretch for type == GLYPHLESS_GLYPH.  */
> >     struct
> >     {
> >       /* Value is an enum of the type glyphless_display_method.  */
> >       unsigned method : 2;
> >       /* 1 iff this glyph is for a character of no font.  */
> >       unsigned for_no_font : 1;
> >       /* Length of acronym or hexadecimal code string (at most 8).  */
> >       unsigned len : 4;
> >       /* Character to display.  Actually we need only 22 bits.  */
> >       unsigned ch : 26;
> >     } glyphless;
[...]
> > According to the comment above, it seems to be OK to shorten the
> > length of `u.glyphless.ch' member from 26 to 25. Could someone
> > confirm this?
> Confirmed. From the ELisp manual:
[...]
> I would actually suggest using 22 bits for this field, to avoid
> confusion in the future.
I agree with changing the bit length. I don't remember well, but I
think the current bit length setting was just my mistake.
In article <jwv4ntch6w6.fsf-monnier+emacs <at> gnu.org>, Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
> > dispextern.h (struct glyph): Change the bit length of glyphless.ch to 22
> > to make the member glyphless fit in 32 bits.
> I think it's safer to reduce it to 25 bits, otherwise the `val' field
> will refer to undefined bits.
Ok. I've just installed that change.
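For reference, a minimal sketch of the 25-bit layout discussed above,
assuming a 32-bit `unsigned' and a compiler such as GCC or Clang. The
union and helper names are hypothetical and only the glyphless member
is reproduced; this is not the installed change itself.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical stand-in for the union `u' with ch reduced to 25 bits.  */
union fixed_glyph_u
{
  struct
  {
    unsigned method : 2;
    unsigned for_no_font : 1;
    unsigned len : 4;
    unsigned ch : 25;
  } glyphless;
  unsigned val;
};

/* Total declared width of the bit-fields above: exactly 32 bits.  */
enum { FIXED_GLYPHLESS_BITS = 2 + 1 + 4 + 25 };

/* Return 1 if two glyphless glyphs that differ only in `ch' also
   differ in `val'.  */
static int
val_sees_ch_difference (unsigned ch1, unsigned ch2)
{
  union fixed_glyph_u a, b;

  memset (&a, 0, sizeof a);
  memset (&b, 0, sizeof b);
  a.glyphless.ch = ch1;
  b.glyphless.ch = ch2;
  return a.val != b.val;
}
```

With 25 bits the four fields cover every bit of `val', so comparing
`val' compares all of them. With 22 bits the struct would also fit,
but three bits of `val' would belong to no field, which is the concern
raised in the quoted reply.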
---
Kenichi Handa
handa <at> m17n.org
bug closed, send any further explanations to 11082 <at> debbugs.gnu.org and YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>. Request was from Chong Yidong <cyd <at> gnu.org> to control <at> debbugs.gnu.org. (Mon, 26 Mar 2012 06:39:02 GMT)
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 23 Apr 2012 11:24:03 GMT)
This bug report was last modified 13 years and 138 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.