GNU bug report logs -
#6283
doc/lispref/searching.texi reference to octal code `0377' correct?
Reported by: MON KEY <monkey <at> sandpframing.com>
Date: Thu, 27 May 2010 17:29:02 UTC
Severity: minor
Done: Chong Yidong <cyd <at> stupidchicken.com>
Bug is archived. No further changes may be made.
Message #41 received at 6283 <at> debbugs.gnu.org (full text, mbox):
On Mon, May 31, 2010 at 2:49 PM, Eli Zaretskii <eliz <at> gnu.org> wrote:
>> > In Unicode, it's a codepoint of LATIN SMALL LETTER Y WITH DIAERESIS.
>>
>> I don't understand this.
>
> I don't know how to express this more clearly. Perhaps you could ask
> specific questions.
>
If you step through the Emacs Lisp example I sent along previously you
may notice that the search doesn't match either of the `ÿ's.
It does, however, match the characters with these numeric notations:
4194303, #o17777777, #x3fffff
4194221, #o17777655, #x3fffad
That is, these raw bytes, which Emacs presents as characters:
(insert-byte (multibyte-char-to-unibyte 4194221) 1)
(insert-byte (multibyte-char-to-unibyte 4194303) 1)
This is what I don't understand.
If I evaluate the following:
(progn
  (save-excursion
    (insert-byte (multibyte-char-to-unibyte 4194221) 1)
    (insert-byte (multibyte-char-to-unibyte 4194303) 1))
  (search-forward-regexp "ÿ" nil t))
I don't get a match.
Whereas if I evaluate:
(progn
  (save-excursion (insert 10 #o377))
  (search-forward-regexp "ÿ" nil t))
I get a match.
Likewise, if I evaluate:
(progn
  (save-excursion (insert 10 4194303))
  (search-forward-regexp "\377" nil t))
I get a match.
Which is to say, given the example regexp from the manual, i.e.:
,----
| You cannot always match all non-ASCII characters with the regular
| expression `"[\200-\377]"'
`----
I am unable to locate the character ÿ (255, #o377, #xff), i.e.
LATIN SMALL LETTER Y WITH DIAERESIS.
To be clear, my issue isn't that I am unable to match `ÿ', but rather
that I am able to match the raw-byte character whose visual appearance
coincides with the octal value of the `ÿ' character code, i.e. #o377,
which is otherwise widely understood as `octal 0377'.
I hope this is clearer than the previous mail. I apologize if it is not.
--
/s_P]
This bug report was last modified 14 years and 358 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.