#6283 - doc/lispref/searching.texi reference to octal code `0377' correct?

GNU bug report logs - #6283
doc/lispref/searching.texi reference to octal code `0377' correct?

Package: emacs;

Reported by: MON KEY <monkey <at> sandpframing.com>

Date: Thu, 27 May 2010 17:29:02 UTC

Severity: minor

Done: Chong Yidong <cyd <at> stupidchicken.com>

Bug is archived. No further changes may be made.

Message #17 received at 6283 <at> debbugs.gnu.org (full text, mbox):

From: MON KEY <monkey <at> sandpframing.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 6283 <at> debbugs.gnu.org
Subject: Re: bug#6283: doc/lispref/searching.texi reference to octal code `0377' correct?
Date: Fri, 28 May 2010 19:20:18 -0400

On Fri, May 28, 2010 at 3:15 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
> Sorry, I don't see the relevance. The manual talks about the
> _numeric_ code of characters, not about their read syntax.

I must be misunderstanding something.
What is the numeric code of \255 ?

> It uses "octal 0377" to present values because octal notation of
> single-byte characters is something many people are familiar with,

Where is this convention detailed/discussed in the manual?
I don't find it mentioned in (info "(elisp)Conventions").
Should it be, esp. as 0377 is not a representation exposed by the Emacs user level interface (at least none that I'm aware of)?

> After all, that is the codepoint of the character.

Of which character? 0377 doesn't have a character that I'm aware of.

> This is explained in "Non-ASCII Characters". But we generally try not

But this is my point: that section (being the most relevant to Non-ASCII notation) tends to use the #<Radian> notation.

> to advertise this issue too much, because there should be no good
> reason for a Lisp program to create raw bytes. Emacs is a text
> editor, while raw bytes are not text

That's just silly. Emacs accommodates noodling w/ raw-bytes because it is necessary to edit them on occasion. Heck, Emacs w32 distributes with a dedicated executable just to edit binary data in hexadecimal form.

>> whenever I need to manually revert some raw-bytes or improperly
>> encoded bit-rotted text using regexps.
>
> It's hard to believe Emacs couldn't handle any such text in some other
> way.

It generally can. However, sometimes file encodings get out of whack over time, and once they are more than a generation away from rightedness Emacs isn't always able to revert them. The good thing is Emacs can do this and I'm very glad it does :)

Besides, it's my prerogative how I choose to abuse Emacs into abusing my data.

> What "improper encoding" was that which Emacs couldn't handle?

The "mixed bag encoding".
Not all of my files originated in Emacs. Not all of them get read into an Emacs buffer without problems. GIGO, c'est la vie.

FWIW I have entire SQL databases of multi-lingual, multi-encoding data that was improperly uploaded into them via a misconfigured PHP script with a funky encoding declaration, which itself got its input from a certain legacy proprietary w32 web-browser that understood (read: willfully misinterpreted) UTF-8 according to its own whims, and I can assure you that encodings don't translate perfectly, nor are the mis-translations always easily caught or corrected.

Stuff like this can sometimes happen with system locales too. Transitioning files from vfat will clobber file names too if you're not careful.

Sometimes I need to find the raw-bytes and replace them with their character equivalent.

> Could it be that you simply gave up too early and tried to solve the
> problem by treating text as bytes, while it really wasn't?

Nope. I'm usually pretty good about _not_ approaching these problems with this type of hammer unless it is a last resort.

--
/s_P\

This bug report was last modified 15 years and 42 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.
