GNU bug report logs - #72879
30.0.90; string-lines hardcoded eol

Package: emacs;

Reported by: Christopher Howard <christopher <at> librehacker.com>

Date: Thu, 29 Aug 2024 18:04:01 UTC

Severity: normal

Found in version 30.0.90

Done: Christopher Howard <christopher <at> librehacker.com>

Bug is archived. No further changes may be made.

From: Eli Zaretskii <eliz <at> gnu.org>
To: Christopher Howard <christopher <at> librehacker.com>
Cc: 72879 <at> debbugs.gnu.org
Subject: bug#72879: 30.0.90; string-lines hardcoded eol
Date: Fri, 30 Aug 2024 09:15:15 +0300
> From: Christopher Howard <christopher <at> librehacker.com>
> Cc: 72879 <at> debbugs.gnu.org
> Date: Thu, 29 Aug 2024 12:15:06 -0800
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Well, "newline" is the name of the \n character, so I think the doc
> > string is okay.
> 
> So, in Emacs, \n = newline = LINEFEED by definition?

Yes.  And not only in Emacs, AFAIK.

> I notice that if I open a new buffer, set it to DOS encoding, and then run (newline), it inserts 0d0a. But the newline docstring doesn't mention anything about making adjustments for encoding.

I think you are conflating two concepts: "newline" is the \n
character, whereas "end of line" (a.k.a. "EOL") is one or two
characters that indicate the end of the line.  The DOS-style CR-LF
end-of-line format is what you are concerned about.
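
As a rough illustration of the difference (a sketch only; the sample string is made up and does not come from the report), string-lines splits on the newline character alone, so a CR that is part of a CR-LF EOL stays attached to the line unless the text was decoded first:

```elisp
;; Sketch: the sample string below is invented for illustration.
(let* ((raw "first\r\nsecond")   ; one DOS-style CR-LF end of line
       ;; string-lines splits on the newline character (\n) only,
       ;; so the carriage return remains part of the first line.
       (undecoded (string-lines raw))
       ;; Decoding with a -dos coding-system converts CR-LF to a
       ;; single newline before the string is split.
       (decoded (string-lines (decode-coding-string raw 'utf-8-dos))))
  (list undecoded decoded))
;; => (("first\r" "second") ("first" "second"))
```

In real code the decoding step would normally happen where the text enters Emacs, not on a string afterwards; decode-coding-string is used here only to keep the example self-contained.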

If you want Emacs to convert CR-LF to a single newline, this must be
done when the process output is read, and you should define a proper
coding-system for your process.  If the incoming text _always_ has the
DOS-style CR-LF EOLs, the coding-system for reading from the process
should be either 'dos' or SOMETHING-dos (like 'utf-8-dos') if the
encoding of text is also known.  If the EOL format of incoming text is
not known, and could sometimes be DOS and sometimes not, then leave
the EOL part of the coding-system unspecified, as in 'utf-8', and
Emacs will then auto-detect the EOL format and convert if needed.
(Actually, I think it is safe to always use 'dos' or SOMETHING-dos,
unless you could also see the Mac-specific CR-only EOL, which is
nowadays extremely rare.)
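
A minimal sketch of what that could look like, assuming the text comes from a subprocess started with make-process and is encoded in UTF-8. The helper name my/start-crlf-process is hypothetical, and the choice of utf-8-unix for text sent to the process is likewise an assumption:

```elisp
;; Sketch only: `my/start-crlf-process' is a hypothetical helper,
;; and UTF-8 is an assumed text encoding.
(defun my/start-crlf-process (name buffer command)
  "Start COMMAND (a list of strings) as process NAME, writing to BUFFER.
Output is decoded as UTF-8 with DOS (CR-LF) line endings, so each
CR-LF pair arrives in BUFFER as a single newline."
  (make-process
   :name name
   :buffer buffer
   :command command
   ;; (DECODING . ENCODING): decode what is read as utf-8-dos,
   ;; encode what is sent as utf-8-unix (the latter is an assumption).
   :coding '(utf-8-dos . utf-8-unix)
   ;; Keep the default sentinel from inserting status text in BUFFER.
   :sentinel #'ignore))
```

For a process that already exists, (set-process-coding-system PROC 'utf-8-dos 'utf-8-unix) sets the same pair. Replacing utf-8-dos with plain utf-8 leaves the EOL part unspecified, which is the auto-detection case described above.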

The doc string of string-lines does not mention anything about EOL
format decoding because it is a function for processing strings, not a
function for receiving input from outside Emacs.  The decoding issues
are part of receiving input, and are mentioned in the relevant APIs.




This bug report was last modified 202 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.