GNU bug report logs - #22001
Is it possible to tab separate concatenated files?


Package: coreutils;

Reported by: "Macdonald, Kim - BCCDC" <kim.macdonald <at> bccdc.ca>

Date: Mon, 23 Nov 2015 21:03:02 UTC

Severity: normal

Tags: notabug

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.



Message #35 received at 22001 <at> debbugs.gnu.org (full text, mbox):

From: Erik Auerswald <auerswal <at> unix-ag.uni-kl.de>
To: Eric Blake <eblake <at> redhat.com>
Cc: 22001 <at> debbugs.gnu.org, kim.macdonald <at> bccdc.ca,
 Linda Walsh <coreutils <at> tlinx.org>, Bob Proulx <bob <at> proulx.com>
Subject: Re: bug#22001: Is it possible to tab separate concatenated files?
Date: Fri, 27 Nov 2015 09:22:05 +0100
Hi,

On Thu, Nov 26, 2015 at 08:28:13PM -0700, Eric Blake wrote:
> On 11/26/2015 04:52 PM, Linda Walsh wrote:
> 
> >> Because every plain
> >> text line in a file must be terminated with a newline.
> > ----
> >    That's only a recent POSIX definition.  It's not related to
> > real life.  When I looked for a text file definition on google, nothing
> > was mentioned about needing a newline on the last line -- except on
> > 1 site -- and that site was clearly not talking about 'text' files, but
> > Unix-text-record files w/each record terminated by a NL char.
> > 
> 
> Quit spreading FUD about POSIX.  That definition of text file is NOT a
> recent invention; even back in POSIX 2001 the definition read:
> 
> 3.392 Text File
> 
> A file that contains characters organized into one or more lines. The
> lines do not contain NUL characters and none can exceed {LINE_MAX} bytes
> in length, including the <newline>. Although IEEE Std 1003.1-2001 does
> not distinguish between text files and binary files (see the ISO C
> standard), many utilities only produce predictable or meaningful output
> when operating on text files. The standard utilities that have such
> restrictions always specify "text files" in their STDIN or INPUT FILES
> sections.
> http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap03.html

At least the definition of a "line" is needed as well to understand the
above (from the same URL):

 3.205 Line

 A sequence of zero or more non-<newline>s plus a terminating <newline>.
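
Taken together, the two quoted definitions amount to a mechanical check. The following is only a rough sketch of that check in Python, not anything taken from POSIX or coreutils: the function name is_posix_text is made up for illustration, and the LINE_MAX value of 2048 is an assumption (the real limit is system-dependent).

```python
# Assumed limit for illustration only; the real {LINE_MAX} varies by system.
LINE_MAX = 2048


def is_posix_text(data: bytes, line_max: int = LINE_MAX) -> bool:
    """Check data against the quoted 3.392 (Text File) and 3.205 (Line) wording.

    Hypothetical helper, not part of any standard tool.
    """
    # "one or more lines": under the 2001 wording an empty file has no lines.
    if not data:
        return False
    # Lines must not contain NUL characters.
    if b"\0" in data:
        return False
    # Every line needs a terminating <newline>, so the last byte must be one.
    if not data.endswith(b"\n"):
        return False
    # No line may exceed line_max bytes, counting its <newline>.
    lines = data[:-1].split(b"\n")
    return all(len(line) + 1 <= line_max for line in lines)


if __name__ == "__main__":
    print(is_posix_text(b"first\nsecond\n"))  # last line is terminated
    print(is_posix_text(b"first\nsecond"))    # trailing text without a newline
```

Under this reading, a file whose last bytes are not followed by a newline fails the check, because those bytes do not form a "line" in the 3.205 sense.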

[...]
> 
> No, it has ALWAYS been a problem.  Even 40 years ago, before POSIX was
> invented, the only PORTABLE way to use programs like sed was to use it
> on text files [...]

The sed of Solaris 10 ignores trailing text after the last line, that
is, any text after the last newline. I am quite sure this behavior was
present in older Solaris and SunOS versions as well.
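
To make the effect concrete, here is a small Python sketch of a reader that only accepts newline-terminated lines. It merely illustrates the behavior described above and is not how Solaris sed is implemented; the name split_complete_lines is hypothetical.

```python
def split_complete_lines(data: bytes):
    """Split data into newline-terminated lines plus any unterminated tail.

    Hypothetical helper: returns (lines, trailing), where every entry in
    lines ends with a newline and trailing holds whatever follows the
    last newline (empty if the input ends with a newline).
    """
    end = data.rfind(b"\n") + 1  # 0 when there is no newline at all
    complete, trailing = data[:end], data[end:]
    # Re-attach the newline that split() removed from each complete line.
    lines = [part + b"\n" for part in complete.split(b"\n")[:-1]]
    return lines, trailing


if __name__ == "__main__":
    lines, trailing = split_complete_lines(b"one\ntwo\nthree")
    print(lines)     # the two terminated lines
    print(trailing)  # the fragment a strict line reader would never process
```

A tool that processes only the first element of the returned pair silently drops the final fragment, which matches the behavior reported for the Solaris 10 sed.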

Best regards,
Erik
-- 
http://www.unix-ag.uni-kl.de/~auerswal/




This bug report was last modified 6 years and 213 days ago.
