GNU bug report logs - #26574
v4.4: POSIX violation with respect to output of a trailing newline, even with --posix

Package: sed

Reported by: Michael Klement <michael.klement <at> usa.net>

Date: Thu, 20 Apr 2017 04:00:02 UTC

Severity: normal

Tags: notabug

Done: Eric Blake <eblake <at> redhat.com>

Bug is archived. No further changes may be made.

Message #27 received at 26574-done <at> debbugs.gnu.org (full text, mbox):

From: Eric Blake <eblake <at> redhat.com>
To: Michael Klement <michael.klement <at> usa.net>,
 Assaf Gordon <assafgordon <at> gmail.com>
Cc: 26574-done <at> debbugs.gnu.org
Subject: Re: bug#26574: v4.4: POSIX violation with respect to output of a
 trailing newline, even with --posix
Date: Thu, 20 Apr 2017 14:36:39 -0500
On 04/20/2017 02:32 PM, Michael Klement wrote:

> On macOS 10.12.4 (but not FreeBSD 10.1.2), Sed chokes on bytes that aren't valid in UTF-8 encoding, when using regex-based functionality:
> 
> $ printf '\xfc\n' | sed  -n '/./p'
> sed: RE error: illegal byte sequence
> 

That's locale-dependent (it should not happen with LC_ALL=C), but it
illustrates another nice point about POSIX text files: a text file may
not contain encoding errors, and as a corollary of that fact, there exist
files that are text files in some locales but binary files in others!

The behavior of sed is only specified when the input has no encoding
errors, so your choice of locale can indeed affect whether you get the
output you wanted.
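
As a minimal sketch of the LC_ALL=C point above: in the C locale every
byte is a valid character, so the same input should pass through the
regex unharmed. The function name match_in_c_locale is purely
illustrative, and the octal escape \374 (the same byte as \xfc) is an
assumption made for portability, since not every printf accepts \x
escapes.

```shell
# Feed a lone 0xFC byte (not valid UTF-8 on its own) plus a newline to sed,
# forcing the C locale for the sed process only.
match_in_c_locale() {
    printf '\374\n' | LC_ALL=C sed -n '/./p'
}

# Dump the result as hex so the raw byte is visible whatever the
# terminal's own encoding happens to be.
match_in_c_locale | od -An -tx1
```

Setting LC_ALL on the sed command alone, rather than exporting it,
leaves the rest of the session's locale untouched.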

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
