GNU bug report logs - #9680
fmt: -f <unlimited> should be supported


Package: coreutils;

Reported by: Jim Meyering <jim <at> meyering.net>

Date: Thu, 6 Oct 2011 06:40:02 UTC

Severity: wishlist



Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Bug 868747 <868747 <at> bugs.launchpad.net>
Cc: bug-coreutils <at> gnu.org
Subject: Re: [Bug 868747] Re: fmt -f <unlimited> should be supported
Date: Thu, 06 Oct 2011 08:39:37 +0200

jimav wrote:
>   Strictly speaking this is an enhancement request.
>
>   fmt imposes an artificial limit on the maximum output line length
>   controlled by the -f option, which prevents using this tool to "join"

You meant -w, not -f, throughout.

Thanks for the suggestion.  Note that the code has this:

    /* Size of paragraph buffer, in words and characters.  Longer paragraphs
       are handled neatly (cf. flush_paragraph()), so long as these values
       are considerably greater than required by the width.  These values
       cannot be extended indefinitely: doing so would run into size limits
       and/or cause more overflows in cost calculations.  FIXME: Remove these
       arbitrary limits.  */

    #define MAXWORDS	1000
    #define MAXCHARS	5000

where MAXCHARS/2 specifies the largest width.
I.e., fmt -w 2500 works, but not 2501.
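
To make the arithmetic concrete, here is a minimal Python sketch of the bound described above. It is not fmt's actual code: the helper names are hypothetical, and it assumes the check is a plain upper bound of MAXCHARS/2, as the message states.

```python
# Value quoted from the fmt source excerpt above.
MAXCHARS = 5000


def largest_width():
    """Largest width accepted, per the message: MAXCHARS/2 (hypothetical helper)."""
    return MAXCHARS // 2


def width_is_accepted(width):
    """Assumed check: a width passes only if it does not exceed MAXCHARS/2."""
    return width <= largest_width()
```

With MAXCHARS at 5000, largest_width() is 2500, which matches the observation that -w 2500 is accepted while -w 2501 is rejected.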

We agree that there should not be such a limit.
But the internals of fmt are not pretty -- significantly less
so than most other parts of the coreutils -- and, as the comment says,
we cannot easily increase those limits arbitrarily.

In the meantime, what can you do if you want truly unlimited-length
paragraphs?  It's not trivial since you want to retain paragraph delimiters.
This perl command should do the trick.
It processes your input a paragraph at a time, replacing each newline
(and spaces before/after) with a single space:

    perl -00ple 's/\s*\n\s*/ /g'

E.g., given this input,

    1
    2
    3
    4

    1
    2
    3
    4
    5

it prints this:

    $ (seq 4; echo; seq 5) | perl -00ple 's/\s*\n\s*/ /g'
    1 2 3 4

    1 2 3 4 5

It doesn't preserve indentation, but if you're just going to
paste it into LibreOffice, that should be fine.
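
If perl is not at hand, roughly the same transformation can be sketched in Python. This is an illustrative equivalent, not part of fmt or coreutils. The function name join_paragraphs is made up for the example, and it assumes paragraphs are separated by truly empty lines, as perl's -00 paragraph mode does. It reuses the same regular expression as the perl command.

```python
import re


def join_paragraphs(text):
    """Join each paragraph into one line, keeping a blank line between paragraphs."""
    body = text.strip("\n")
    if not body:
        return ""
    # Assumption: paragraphs are separated by one or more completely empty lines.
    paragraphs = re.split(r"\n{2,}", body)
    # Same substitution as the perl one-liner: each newline, together with
    # any whitespace around it, becomes a single space.
    joined = [re.sub(r"\s*\n\s*", " ", p) for p in paragraphs]
    return "\n\n".join(joined) + "\n"
```

Calling join_paragraphs on the sample input above returns "1 2 3 4", a blank line, then "1 2 3 4 5". Like the perl command, it drops the indentation of continuation lines. Unlike it, this sketch does not emit a blank line after the final paragraph.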

I've Cc'd the upstream bug-tracker, so we'll have a bug number there, too.




This bug report was last modified 6 years and 247 days ago.


