GNU bug report logs - #11082
24.0.94; u.glyphless member in struct glyph does not fit in 32 bits

Package: emacs;

Reported by: YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>

Date: Sat, 24 Mar 2012 05:55:01 UTC

Severity: normal

Found in version 24.0.94

Done: Chong Yidong <cyd <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 11082 in the body.
You can then email your comments to 11082 AT debbugs.gnu.org in the normal way.

Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#11082; Package emacs. (Sat, 24 Mar 2012 05:55:01 GMT)

Acknowledgement sent to YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sat, 24 Mar 2012 05:55:01 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>
To: bug-gnu-emacs <at> gnu.org
Subject: 24.0.94; u.glyphless member in struct glyph does not fit in 32 bits
Date: Sat, 24 Mar 2012 14:23:28 +0900

In dispextern.h:

struct glyph
{
(snip)
  /* A union of sub-structures for different glyph types.  */
  union
  {
(snip)
    /* Sub-stretch for type == GLYPHLESS_GLYPH.  */
    struct
    {
      /* Value is an enum of the type glyphless_display_method.  */
      unsigned method : 2;
      /* 1 iff this glyph is for a character of no font. */
      unsigned for_no_font : 1;
      /* Length of acronym or hexadecimal code string (at most 8).  */
      unsigned len : 4;
      /* Character to display.  Actually we need only 22 bits.  */
      unsigned ch : 26;
    } glyphless;

    /* Used to compare all bit-fields above in one step.  */
    unsigned val;
  } u;
};

The member `u.glyphless' above requires at least 33 bits and does not
fit in the size (32 bits) of `u.val' on many environments.  As a
result, equality with respect to the `u.val' member (e.g., used in
GLYPH_EQUAL_P) does not necessarily mean the equality of glyphless
glyphs.

According to the comment above, it seems to be OK to shorten the
length of `u.glyphless.ch' member from 26 to 25.  Could someone
confirm this?

				     YAMAMOTO Mitsuharu
				mituharu <at> math.s.chiba-u.ac.jp
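
The arithmetic behind the report can be checked with a small standalone C sketch. The union below is a hypothetical stand-in (`glyph_u`) that copies only the glyphless bit-fields and `val` from the excerpt; the helper names are assumptions for illustration, not Emacs code. It adds up the declared widths (2 + 1 + 4 + 26 = 33) and compares the total with the number of bits in `unsigned` on the build machine.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the union inside struct glyph; only the
   glyphless sub-structure and val are reproduced from the excerpt.  */
union glyph_u
{
  struct
  {
    unsigned method : 2;
    unsigned for_no_font : 1;
    unsigned len : 4;
    unsigned ch : 26;
  } glyphless;
  unsigned val;
};

/* Sum of the declared bit-field widths.  */
enum { GLYPHLESS_BITS = 2 + 1 + 4 + 26 };

/* Number of bits covered by the val member on this platform.  */
unsigned
val_bits (void)
{
  return (unsigned) (sizeof (unsigned) * CHAR_BIT);
}

/* Nonzero if the bit-fields need more bits than val can cover.  */
int
glyphless_overflows_val (void)
{
  return (unsigned) GLYPHLESS_BITS > val_bits ();
}
```

On a platform where `unsigned` is 32 bits wide, `glyphless_overflows_val ()` returns nonzero and the union ends up larger than `val`, so a comparison through `val` cannot see every bit-field.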




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#11082; Package emacs. (Sat, 24 Mar 2012 07:33:01 GMT)

Message #8 received at 11082 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>
Cc: 11082 <at> debbugs.gnu.org
Subject: Re: bug#11082: 24.0.94; u.glyphless member in struct glyph does not fit in 32 bits
Date: Sat, 24 Mar 2012 09:01:04 +0200

> Date: Sat, 24 Mar 2012 14:23:28 +0900
> From: YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>
> 
> In dispextern.h:
> 
> struct glyph
> {
> (snip)
>   /* A union of sub-structures for different glyph types.  */
>   union
>   {
> (snip)
>     /* Sub-stretch for type == GLYPHLESS_GLYPH.  */
>     struct
>     {
>       /* Value is an enum of the type glyphless_display_method.  */
>       unsigned method : 2;
>       /* 1 iff this glyph is for a character of no font. */
>       unsigned for_no_font : 1;
>       /* Length of acronym or hexadecimal code string (at most 8).  */
>       unsigned len : 4;
>       /* Character to display.  Actually we need only 22 bits.  */
>       unsigned ch : 26;
>     } glyphless;
> 
>     /* Used to compare all bit-fields above in one step.  */
>     unsigned val;
>   } u;
> };
> 
> The member `u.glyphless' above requires at least 33 bits and does not
> fit in the size (32 bits) of `u.val' on many environments.  As a
> result, equality with respect to the `u.val' member (e.g., used in
> GLYPH_EQUAL_P) does not necessarily mean the equality of glyphless
> glyphs.

?? Isn't the size of a union defined by its widest member?  If so, we
just end up wasting some storage here, but we should never truncate a
bit field.  Do you have an actual test case that shows this kind of
bug?

> According to the comment above, it seems to be OK to shorten the
> length of `u.glyphless.ch' member from 26 to 25.  Could someone
> confirm this?

Confirmed.  From the ELisp manual:

     To support this multitude of characters and scripts, Emacs closely
  follows the "Unicode Standard".  The Unicode Standard assigns a unique
  number, called a "codepoint", to each and every character.  The range
  of codepoints defined by Unicode, or the Unicode "codespace", is
  `0..#x10FFFF' (in hexadecimal notation), inclusive.  Emacs extends this
  range with codepoints in the range `#x110000..#x3FFFFF', which it uses
  for representing characters that are not unified with Unicode and "raw
  8-bit bytes" that cannot be interpreted as characters.  Thus, a
  character codepoint in Emacs is a 22-bit integer number.
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I would actually suggest using 22 bits for this field, to avoid
confusion in the future.
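
The 22-bit figure in the manual excerpt can be verified by counting the bits of the largest codepoint. The following is a minimal sketch; the enumerator and function names are hypothetical and only the two numeric limits come from the quoted text.

```c
#include <assert.h>

/* Largest codepoints named in the manual excerpt; the enumerator
   names are hypothetical.  */
enum
{
  UNICODE_MAX_CODEPOINT = 0x10FFFF,
  EMACS_MAX_CODEPOINT = 0x3FFFFF
};

/* Return the number of bits needed to represent VALUE (0 for 0).  */
unsigned
bits_needed (unsigned long value)
{
  unsigned bits = 0;
  while (value != 0)
    {
      bits++;
      value >>= 1;
    }
  return bits;
}

/* Nonzero if every value up to MAX_VALUE fits in WIDTH bits.  */
int
fits_in_width (unsigned long max_value, unsigned width)
{
  return bits_needed (max_value) <= width;
}
```

Since #x3FFFFF is 2^22 - 1, `bits_needed (EMACS_MAX_CODEPOINT)` is 22, so any `ch` width of 22 bits or more (including 25) can hold every Emacs character.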




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#11082; Package emacs. (Sat, 24 Mar 2012 09:26:02 GMT)

Message #11 received at 11082 <at> debbugs.gnu.org:

From: Andreas Schwab <schwab <at> linux-m68k.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 11082 <at> debbugs.gnu.org, YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>
Subject: Re: bug#11082: 24.0.94; u.glyphless member in struct glyph does not fit in 32 bits
Date: Sat, 24 Mar 2012 09:54:09 +0100

Eli Zaretskii <eliz <at> gnu.org> writes:

>> The member `u.glyphless' above requires at least 33 bits and does not
>> fit in the size (32 bits) of `u.val' on many environments.  As a
>> result, equality with respect to the `u.val' member (e.g., used in
>> GLYPH_EQUAL_P) does not necessarily mean the equality of glyphless
>> glyphs.

This has been broken since GLYPHLESS_GLYPH was added.

> ?? Isn't the size of a union defined by its widest member?

The size of u.val is defined by the size of unsigned.

> If so, we just end up wasting some storage here, but we should never
> truncate a bit field.

It's not about truncation, but about ignored bits in GLYPH_EQUAL_P.

> I would actually suggest using 22 bits for this field, to avoid
> confusion in the future.

Making the struct exactly 32 bits may be better since it can make access
to the ch member simpler.

Andreas.

-- 
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#11082; Package emacs. (Sun, 25 Mar 2012 01:02:01 GMT)

Message #14 received at 11082 <at> debbugs.gnu.org:

From: YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>
To: Andreas Schwab <schwab <at> linux-m68k.org>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 11082 <at> debbugs.gnu.org
Subject: Re: bug#11082: 24.0.94; u.glyphless member in struct glyph does not fit in 32 bits
Date: Sun, 25 Mar 2012 09:29:25 +0900

>>>>> On Sat, 24 Mar 2012 09:54:09 +0100, Andreas Schwab <schwab <at> linux-m68k.org> said:

>> ?? Isn't the size of a union defined by its widest member?

> The size of u.val is defined by the size of unsigned.

>> If so, we just end up wasting some storage here, but we should
>> never truncate a bit field.

> It's not about truncation, but about ignored bits in GLYPH_EQUAL_P.

Yes, I meant that.

A test case is as follows.  I checked it with Ubuntu 11.10, GTK+
build.

1. emacs -Q
2. (insert #xe0100) C-j
3. C-p C-p C-e C-b C-b C-d 1 C-e
   Now the line at the cursor is "(insert #xe0101)".
4. C-j
   The glyphless glyph just added is shown as "0E0100" instead of
   "0E0101".

				     YAMAMOTO Mitsuharu
				mituharu <at> math.s.chiba-u.ac.jp
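
The symptom in the test case above can be imitated outside Emacs. The sketch below is an assumption-laden model, not the real GLYPH_EQUAL_P: `glyph_u`, `make_glyphless`, `equal_by_val`, and `equal_by_fields` are hypothetical names, and the `method` value is arbitrary. Bit-field layout is implementation-defined; on common ABIs with a 32-bit `unsigned`, the 26-bit `ch` does not fit in the first word and is placed in a second one, which `val` does not cover.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the union in struct glyph, using the
   widths from the report (before the fix).  */
union glyph_u
{
  struct
  {
    unsigned method : 2;
    unsigned for_no_font : 1;
    unsigned len : 4;
    unsigned ch : 26;
  } glyphless;
  unsigned val;
};

/* Build a glyphless value for CH with all other bits zeroed.  */
union glyph_u
make_glyphless (unsigned ch)
{
  union glyph_u u;
  memset (&u, 0, sizeof u);
  u.glyphless.method = 2;   /* arbitrary illustrative value */
  u.glyphless.len = 6;      /* "0E0100" has six hex digits */
  u.glyphless.ch = ch;
  return u;
}

/* Comparison through val only, in the spirit of GLYPH_EQUAL_P.  */
int
equal_by_val (union glyph_u a, union glyph_u b)
{
  return a.val == b.val;
}

/* Field-by-field comparison, which sees every bit-field.  */
int
equal_by_fields (union glyph_u a, union glyph_u b)
{
  return (a.glyphless.method == b.glyphless.method
          && a.glyphless.for_no_font == b.glyphless.for_no_font
          && a.glyphless.len == b.glyphless.len
          && a.glyphless.ch == b.glyphless.ch);
}
```

With such a layout, `make_glyphless (0xE0100)` and `make_glyphless (0xE0101)` differ by fields yet compare equal through `val`, which is consistent with the stale "0E0100" glyph being kept on screen in step 4.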




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#11082; Package emacs. (Mon, 26 Mar 2012 06:20:02 GMT)

Message #17 received at 11082 <at> debbugs.gnu.org:

From: Kenichi Handa <handa <at> m17n.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 11082 <at> debbugs.gnu.org, mituharu <at> math.s.chiba-u.ac.jp
Subject: Re: bug#11082: 24.0.94; u.glyphless member in struct glyph does not fit in 32 bits
Date: Mon, 26 Mar 2012 14:47:16 +0900

In article <83zkb6trdb.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:

> >     /* Sub-stretch for type == GLYPHLESS_GLYPH.  */
> >     struct
> >     {
> >       /* Value is an enum of the type glyphless_display_method.  */
> >       unsigned method : 2;
> >       /* 1 iff this glyph is for a character of no font. */
> >       unsigned for_no_font : 1;
> >       /* Length of acronym or hexadecimal code string (at most 8).  */
> >       unsigned len : 4;
> >       /* Character to display.  Actually we need only 22 bits.  */
> >       unsigned ch : 26;
> >     } glyphless;
[...]
> > According to the comment above, it seems to be OK to shorten the
> > length of `u.glyphless.ch' member from 26 to 25.  Could someone
> > confirm this?

> Confirmed.  From the ELisp manual:
[...]
> I would actually suggest using 22 bits for this field, to avoid
> confusion in the future.

I agree with changing the bit length.  I don't remember well, but I
think the current bit length setting was just my mistake.

In article <jwv4ntch6w6.fsf-monnier+emacs <at> gnu.org>, Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

> >   dispextern.h (struct glyph): Change the bit length of glyphless.ch to 22
> > to make the member glyphless fit in 32 bits.

> I think it's safer to reduce it to 25 bits, otherwise the `val' field
> will refer to undefined bits.

Ok.  I've just installed that change.

---
Kenichi Handa
handa <at> m17n.org
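
The effect of the 25-bit choice can be sketched in the same style. Everything named here (`fixed_glyph_u` and the helpers) is hypothetical and assumes a 32-bit `unsigned`; it is not the installed change itself. With widths 2 + 1 + 4 + 25 = 32 the bit-fields fill `val` exactly, so no padding bits with unspecified contents are left over, which is the concern raised about a 22-bit field.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the union after the change: ch is 25 bits,
   so the widths add up to 32.  */
union fixed_glyph_u
{
  struct
  {
    unsigned method : 2;
    unsigned for_no_font : 1;
    unsigned len : 4;
    unsigned ch : 25;
  } glyphless;
  unsigned val;
};

/* Nonzero if the glyphless bit-fields occupy exactly the same storage
   as val on this platform (expected when unsigned is 32 bits).  */
int
fixed_fits_in_val (void)
{
  return sizeof (union fixed_glyph_u) == sizeof (unsigned);
}

/* Build a glyphless value for CH with all other bits zeroed.  */
union fixed_glyph_u
make_fixed_glyphless (unsigned ch)
{
  union fixed_glyph_u u;
  memset (&u, 0, sizeof u);
  u.glyphless.method = 2;   /* arbitrary illustrative value */
  u.glyphless.len = 6;      /* six hex digits, as in the test case */
  u.glyphless.ch = ch;
  return u;
}

/* Comparison through val only, in the spirit of GLYPH_EQUAL_P.  */
int
fixed_equal_by_val (union fixed_glyph_u a, union fixed_glyph_u b)
{
  return a.val == b.val;
}
```

Under that assumption, values built for #xe0100 and #xe0101 no longer compare equal through `val`, because `ch` now lives inside the word that `val` reads.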




bug closed, send any further explanations to 11082 <at> debbugs.gnu.org and YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp> Request was from Chong Yidong <cyd <at> gnu.org> to control <at> debbugs.gnu.org. (Mon, 26 Mar 2012 06:39:02 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 23 Apr 2012 11:24:03 GMT)

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.