GNU bug report logs - #26574
v4.4: POSIX violation with respect to output of a trailing newline, even with --posix

Package: sed

Reported by: Michael Klement <michael.klement <at> usa.net>

Date: Thu, 20 Apr 2017 04:00:02 UTC

Severity: normal

Tags: notabug

Done: Eric Blake <eblake <at> redhat.com>

Bug is archived. No further changes may be made.


Message #21 received at 26574-done <at> debbugs.gnu.org (full text, mbox):

From: Assaf Gordon <assafgordon <at> gmail.com>
To: Eric Blake <eblake <at> redhat.com>
Cc: Michael Klement <michael.klement <at> usa.net>, 26574-done <at> debbugs.gnu.org
Subject: Re: bug#26574: v4.4: POSIX violation with respect to output of a
 trailing newline, even with --posix
Date: Thu, 20 Apr 2017 18:32:22 +0000

Hello,

On Thu, Apr 20, 2017 at 11:46:15AM -0500, Eric Blake wrote:
>On 04/20/2017 11:36 AM, Michael Klement wrote:
>> Thanks for the detailed feedback, Eric.
>>
>> The POSIX spec. is, unfortunately, vague on this topic:
>>
>> The definition of a line (which you quote) is complemented with the definition of an incomplete line <http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_195>:
>>
>>> A sequence of one or more non- <newline> characters at the end of the file.
>>
>>
>> So while the standard is aware of this possibility and gives it a name that suggests it is a kind of line, but something's missing, there is precious little behavior prescribed with respect to such incomplete lines.
>>
>
>You're welcome to submit a bug report to get POSIX to more clearly word
>its intentions that a file with an incomplete line is NOT a text file
>(http://austingroupbugs.net/main_page.php), but everyone on the Austin
>Group (myself included) has already agreed that the intention is there
>(even if the wording could be improved): Omitting a trailing newline
>causes sed to enter into the realm of undefined behavior - and this is
>BECAUSE there are existing sed implementations that behave differently
>when a trailing newline is omitted.  Some do not do anything with an
>incomplete line (sed behaves as though the file were truncated at the
>last newline).
>

For completeness, here's the behaviour of several implementations:

sed implementations that do not add a newline (like GNU sed):
  FreeBSD 10
  OpenBSD 5.9
  BusyBox 1.22
  ToyBox 7.2
  AIX 7

sed implementations that do add a newline:
  NetBSD 7.0
  Heirloom

SunOS 5.11's sed prints nothing if the input has no trailing newline:
  $ printf 'a' | sed '' | od -tx1
  0000000
  $ printf 'a\n' | sed '' | od -tx1
  0000000 61 0a
  0000002
  $ uname -a
  SunOS unstable11s 5.11 11.2 sun4u sparc SUNW,SPARC-Enterprise
  $ which sed
  /usr/bin/sed


The behaviour when processing a file whose last line lacks a newline
also differs across other programs, languages and implementations:

  $ printf a | perl -npe '' | od -tx1
  0000000 61
  0000001

  $ printf a | perl -lnpe '' | od -tx1
  0000000 61 0a
  0000002

  $ printf a | awk '{print}' | od -tx1
  0000000 61 0a
  0000002

  $ printf 'a' | sh -c 'while read A ; do echo $A ; done' | od -tx1
  0000000

  $ printf 'a' \
     | python3 -c 'import sys; [print(x,end="") for x in sys.stdin]' \
     | od -tx1
  0000000 61
  0000001

  $ printf a | uniq-gnu | od -t x1
  0000000 61 0a
  0000002

  $ printf a | uniq-freebsd-11 | od -t x1
  0000000    61
  0000001

  $ printf a | cut-gnu -f1 | od -tx1
  0000000 61 0a
  0000002

  $ printf a | cut-freebsd-11 -f1 | od -tx1
  0000000    61
  0000001

  $ printf a | sort | od -t x1
  0000000 61 0a
  0000002


And this reinforces what Eric wrote: there is simply no single
correct (or agreed-upon) way to deal with files whose last line
lacks a newline.
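
For anyone who needs the same output from all of these implementations,
one option is to complete the last line before the data reaches sed.
The following is only a sketch, assuming a POSIX sh with tail, od and
tr available. The function names ends_with_newline and
complete_last_line are made up for illustration and are not part of
sed or of any tool mentioned above.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): succeed if the last byte
# of FILE is a newline (0x0a). Fails for an empty file.
ends_with_newline() {
    # od prints the final byte in hex; tr strips od's padding.
    last=$(tail -c 1 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$last" = "0a" ]
}

# Hypothetical helper (illustration only): copy FILE to stdout and
# append a newline only when a non-empty FILE lacks one.
complete_last_line() {
    cat "$1"
    if [ -s "$1" ] && ! ends_with_newline "$1"; then
        printf '\n'
    fi
}
```

With these defined, something like `complete_last_line input.txt | sed ''`
always hands sed a complete last line, so the differences listed above
should not come into play. An empty file is passed through unchanged.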


regards,
- assaf




This bug report was last modified 8 years and 36 days ago.
