Thanks for digging into this; it does illustrate the point well.

Just for the record:

Here's what I get on FreeBSD 10.1.2 and on macOS 10.12.4:

$ printf 'a' | sed '' | od -tx1
0000000    61  0a                                                        
0000002

macOS typically ships with an older version of the BSD implementation. Neither version supports --version, but the man pages are dated June 20, 2014 (FreeBSD) and May 10, 2005 (macOS).

Another (minor) point of interest:

On macOS 10.12.4 (but not FreeBSD 10.1.2), sed fails on bytes that are not valid UTF-8 when regex-based functionality is used:

$ printf '\xfc\n' | sed  -n '/./p'
sed: RE error: illegal byte sequence
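
A commonly used workaround, sketched here on the assumption that the error comes from the UTF-8 locale's character decoding, is to run sed in the C locale so that every byte counts as a character. The wrapper name sed_bytes is made up for illustration, and \374 is the octal spelling of \xfc, used because not every printf understands \x:

```shell
# Hypothetical helper: run sed with the C locale forced for this one call,
# so input is treated as raw bytes rather than decoded as UTF-8.
sed_bytes() {
    LC_ALL=C sed "$@"
}

# Same test as above, with the byte 0xfc written in octal.
printf '\374\n' | sed_bytes -n '/./p' | od -An -tx1
```

Note that this changes what '.' and bracket expressions match (bytes rather than multibyte characters), so it is not a drop-in fix for scripts that rely on UTF-8 semantics.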

On Apr 20, 2017, at 2:32 PM, Assaf Gordon <assafgordon@gmail.com> wrote:

Hello,

On Thu, Apr 20, 2017 at 11:46:15AM -0500, Eric Blake wrote:
On 04/20/2017 11:36 AM, Michael Klement wrote:
Thanks for the detailed feedback, Eric.

The POSIX spec. is, unfortunately, vague on this topic:

The definition of a line (which you quote) is complemented with the definition of an incomplete line <http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_195>:

A sequence of one or more non- <newline> characters at the end of the file.


So while the standard is aware of this possibility, and gives it a name suggesting a kind of line with something missing, it prescribes precious little behavior for such incomplete lines.


You're welcome to submit a bug report to get POSIX to more clearly word
its intentions that a file with an incomplete line is NOT a text file
(http://austingroupbugs.net/main_page.php), but everyone on the Austin
Group (myself included) has already agreed that the intention is there
(even if the wording could be improved): Omitting a trailing newline
causes sed to enter into the realm of undefined behavior - and this is
BECAUSE there are existing sed implementations that behave differently
when a trailing newline is omitted.  Some do not do anything with an
incomplete line (sed behaves as though the file were truncated at the
last newline).


For completeness, here's the behaviour of several implementations:

sed implementations that do not add a newline (like GNU sed):
 FreeBSD 10
 OpenBSD 5.9
 BusyBox 1.22
 ToyBox 7.2
 AIX 7

sed implementations that do add a newline:
 NetBSD 7.0
 Heirloom

SunOS 5.11's sed prints nothing if there is no newline:
 $ printf 'a' | sed '' | od -tx1
 0000000
 $ printf 'a\n' | sed '' | od -tx1
 0000000 61 0a
 0000002
 $ uname -a
 SunOS unstable11s 5.11 11.2 sun4u sparc SUNW,SPARC-Enterprise
 $ which sed
 /usr/bin/sed


The behaviour (of processing a file without newline at the last line) also differs in other programs/languages/implementations:

 $ printf a | perl -npe '' | od -tx1
 0000000 61
 0000001

 $ printf a | perl -lnpe '' | od -tx1
 0000000 61 0a
 0000002

 $ printf a | awk '{print}' | od -tx1
 0000000 61 0a
 0000002

 $ printf 'a' | sh -c 'while read A ; do echo $A ; done' | od -tx1
 0000000
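
 The loop above prints nothing because read returns a non-zero status when
 it hits end-of-file before a newline, even though it has already stored the
 partial line in the variable. A common idiom, shown here as a minimal sketch
 (the function name read_all_lines is made up for illustration), is to also
 run the loop body when the variable is non-empty:

```shell
# Hypothetical helper: copy stdin to stdout line by line, including a
# final line that lacks its terminating newline.
read_all_lines() {
    # IFS= and -r keep leading/trailing blanks and backslashes intact.
    # The "|| [ -n ... ]" part catches the incomplete last line, which
    # read has stored in $line despite returning non-zero.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "$line"
    done
}
```

 With this, printf 'a' | read_all_lines emits the 'a' followed by a newline,
 i.e. the loop takes the "add a newline" side of the split shown here.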

 $ printf 'a' \
    | python3 -c 'import sys; [print(x,end="") for x in sys.stdin]' \
    | od -tx1
 0000000 61
 0000001

 $ printf a | uniq-gnu | od -t x1
 0000000 61 0a
 0000002

 $ printf a | uniq-freebsd-11 | od -t x1
 0000000    61
 0000001

 $ printf a | cut-gnu -f1 | od -tx1
 0000000 61 0a
 0000002

 $ printf a | cut-freebsd-11 -f1 | od -tx1
 0000000    61
 0000001

 $ printf a | sort | od -t x1
 0000000 61 0a
 0000002


And this reinforces what Eric wrote: there is simply no
'one correct' (or agreed-upon) way to deal with files without newlines on the last line.
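
Given that, a script that must behave the same everywhere may want to check its input before handing it to any of these tools. A minimal sketch, assuming a regular file and a tail that supports -c (the function name ends_with_newline is made up for illustration):

```shell
# Hypothetical helper: succeed if the file is empty or its last byte is a
# newline, fail if the file ends in an incomplete line.
ends_with_newline() {
    # Command substitution strips a trailing newline, so the result is
    # empty exactly when the last byte is a newline. Caveat: a trailing
    # NUL byte is also dropped by the shell and would be misreported.
    [ ! -s "$1" ] || [ -z "$(tail -c 1 "$1")" ]
}
```

Usage would be along the lines of: ends_with_newline "$f" || printf '\n' >> "$f". Whether appending is the right repair is of course a policy decision, which is exactly the point above.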


regards,
- assaf