GNU bug report logs - #7085: fmt (GNU coreutils) 6.10
Report forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#7085; Package coreutils. (Wed, 22 Sep 2010 20:23:02 GMT)
Full text and
rfc822 format available.
Acknowledgement sent to "Denis M. Wilson" <dmw <at> oxytropis.plus.com>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Wed, 22 Sep 2010 20:23:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
This program does not deal properly with CRLF terminators.
Gratuitous CRs are left in joined lines; they should be
removed. The user may want to keep the CRLF style or change
to Unix (LF). There should be an option for this.
Denis M. Wilson
--
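The behavior described in the report can be reproduced with a short pipeline. This is a minimal sketch, assuming a POSIX shell with fmt and od available; the helper name show_fmt_bytes is made up for illustration. od -c prints each CR as \r, so any CR that fmt carries into a joined line shows up in the dump.

```shell
# Hypothetical helper: feed two CRLF-terminated lines to fmt and
# dump the bytes it emits, one character per column.
show_fmt_bytes() {
    printf 'first line\r\nsecond line\r\n' | fmt | od -c
}

show_fmt_bytes
```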
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#7085; Package coreutils. (Wed, 22 Sep 2010 21:06:02 GMT)
Message #8 received at 7085 <at> debbugs.gnu.org (full text, mbox):
On 09/22/2010 01:59 PM, Denis M. Wilson wrote:
> This program does not deal properly with CRLF terminators.
> Gratuitous CRs are left in joined lines; they should be
> removed. The user may want to keep the CRLF style or change
> to Unix (LF). There should be an option for this.
Thanks for the report.
POSIX is clear that CR is data and not a line terminator, so the
existing behavior (in the absence of any new command-line option) is
correct.
Now, we have a philosophy question - is it better to teach fmt a new
command line option, and then have to wait for a new enough coreutils
installation to propagate to various machines where you plan on using
that extension, or is it better to use existing tools that are already
portable according to POSIX and do the job now? If you answered this
question the same as me, then a solution that works now is favorable, so
it becomes a question of massaging your files to get rid of the CR
characters prior to using fmt on the file. Something as simple as:
tr -d '\r' < file > file.out && mv file.out file
will do the trick. Other common wrappers, like dos2unix, exist to make
this sanitization process even easier. So, I'm reluctant to add such a
feature myself, and it will take some pretty strong arguments (such as
existing practice in other fmt implementations) to convince me that a
new option is worthwhile.
Meanwhile, you may want to consider upgrading to a newer coreutils; the
latest stable version is 8.5, with a number of bug fixes in various tools.
--
Eric Blake eblake <at> redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org
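The clean-up step suggested above can be sketched as a small shell function. This is an illustration under stated assumptions, not something from the thread: the name strip_crlf is hypothetical, mktemp is assumed to be available (it is common but not required by POSIX), and unlike tr -d '\r' the sketch removes a CR only where it directly precedes the end of a line.

```shell
# strip_crlf FILE: remove one trailing CR from each line of FILE, in place.
# Hypothetical helper, not part of coreutils. The body runs in a subshell
# (note the parentheses) so that cr, tmp and status stay local.
strip_crlf() (
    cr=$(printf '\r')
    tmp=$(mktemp) || exit 1
    # Write the cleaned text to a temporary file first, then copy it back,
    # so FILE is only overwritten if sed succeeded.
    sed "s/${cr}\$//" "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    exit "$status"
)
```

After `strip_crlf file`, running `fmt file` sees plain LF line endings. Copying the result back with cat rather than mv keeps the original file's permissions.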
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#7085; Package coreutils. (Mon, 27 Sep 2010 22:24:01 GMT)
Message #11 received at 7085 <at> debbugs.gnu.org (full text, mbox):
On Wed, Sep 22, 2010 at 2:59 PM, Denis M. Wilson <dmw <at> oxytropis.plus.com> wrote:
> This program does not deal properly with CRLF terminators.
> Gratuitous CRs are left in joined lines; they should be
> removed. The user may want to keep the CRLF style or change
> to Unix (LF). There should be an option for this.
>
> Denis M. Wilson
>
> --
>
>
>
Here is a patch that I wrote because I was bored today (I finished all my
homework this weekend :^p) that will remove CRs from lines ending in CRLF. I
must warn you that I didn't add documentation to the --help (and by
implication the man page) or to the texi files. I did this to minimize the
number of changes, so as to become less likely to conflict with future
commits, since this is most likely not going into the mainstream repository.
You invoke it like this:
fmt -d [file]
or alternatively:
fmt --dos [file]
The reason I sent this to everyone and not just Denis Wilson is that I
wanted to be able to point to it in the future if people want this feature
and are willing to risk future incompatibility.
I believe it works, though I haven't tested it too thoroughly. Valgrind
doesn't complain and it seems to do the job. (I downloaded a file
written on MS-DOS, ran it through 'fmt -d file.txt', and it works
on it.)
Hope this helps someone,
William
From 12a2bee879e3c803f872fe1960a1dedaed485d10 Mon Sep 17 00:00:00 2001
From: Patrick W. Plusnick II <pwplusnick2 <at> gmail.com>
Date: Mon, 27 Sep 2010 16:57:06 -0500
Subject: [PATCH] fmt: added the -d option so that it removes Carriage
Returns from MS-DOS files
* src/fmt.c: simply removes the Carriage Returns from lines ending with
CRLF.
---
src/fmt.c | 13 +++++++++++--
1 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/src/fmt.c b/src/fmt.c
index 8a5d8bd..9150f43 100644
--- a/src/fmt.c
+++ b/src/fmt.c
@@ -173,6 +173,8 @@ static void put_space (int space);
/* If true, first 2 lines may have different indent (default false). */
static bool crown;
+/* If true, Removes the CR out of CRLFs. Mainly for MS-DOS files */
+static bool trunc_crlf;
/* If true, first 2 lines _must_ have different indent (default false). */
static bool tagged;
@@ -304,6 +306,7 @@ With no FILE, or when FILE is -, read standard input.\n"),
static struct option const long_options[] =
{
{"crown-margin", no_argument, NULL, 'c'},
+ {"dos", no_argument, NULL, 'd'},
{"prefix", required_argument, NULL, 'p'},
{"split-only", no_argument, NULL, 's'},
{"tagged-paragraph", no_argument, NULL, 't'},
@@ -329,7 +332,7 @@ main (int argc, char **argv)
atexit (close_stdout);
- crown = tagged = split = uniform = false;
+ crown = trunc_crlf = tagged = split = uniform = false;
max_width = WIDTH;
prefix = "";
prefix_length = prefix_lead_space = prefix_full_length = 0;
@@ -345,7 +348,7 @@ main (int argc, char **argv)
argc--;
}
- while ((optchar = getopt_long (argc, argv, "0123456789cstuw:p:",
+ while ((optchar = getopt_long (argc, argv, "0123456789cdstuw:p:",
long_options, NULL))
!= -1)
switch (optchar)
@@ -361,6 +364,10 @@ main (int argc, char **argv)
crown = true;
break;
+ case 'd':
+ trunc_crlf = true;
+ break;
+
case 's':
split = true;
break;
@@ -691,6 +698,8 @@ get_line (FILE *f, int c)
word_limit++;
}
while (c != '\n' && c != EOF);
+ if (c == '\n' && *(wptr-1) == '\r' && trunc_crlf)
+ *--wptr = '\n';
return get_prefix (f);
}
--
1.7.0.4
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#7085; Package coreutils. (Mon, 27 Sep 2010 23:21:02 GMT)
Message #14 received at 7085 <at> debbugs.gnu.org (full text, mbox):
>
> I believe it works, though I haven't tested it too thoroughly. Valgrind
> doesn't complain and it seems to do the job. (I downloaded a file
> written on MS-DOS, ran it through 'fmt -d file.txt', and it works
> on it.)
>
Well, this is a bit embarrassing: it appears that my patch does indeed change
the output, in a way that was subtle in the particular file I was working
with, and it is obvious now why. Oh well, that is what I get for not testing
much.
So it is indeed a broken, unofficial feature.
I'll try to fix it,
William
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#7085; Package coreutils. (Tue, 28 Sep 2010 03:26:01 GMT)
Message #17 received at 7085 <at> debbugs.gnu.org (full text, mbox):
William Plusnick wrote:
> The reason I sent this to everyone and not just Denis Wilson is that I
> wanted to be able to point to it in the future if people want this feature
> and are willing to risk future incompatibility.
Isn't the more modular flow that Eric suggested more in keeping with
the Unix philosophy and the better solution? If you want to do both
stripping of carriage returns and word wrapping together then you can
pipe them together.
tr -d '\r' < somefile | fmt
Or if you are calling it as a stdin filter such as within an editor:
tr -d '\r' | fmt
Bob
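For the case raised in the original report, where the CRLF style should be kept rather than converted, the same pipeline can be extended to put the CR back after reflowing. A hedged sketch: fmt_crlf is a hypothetical wrapper name, and any arguments are simply passed through to fmt.

```shell
# fmt_crlf [FMT-OPTIONS]: reflow CRLF text from stdin, emit CRLF text.
# Hypothetical wrapper, not part of coreutils. The body runs in a
# subshell (note the parentheses) so that cr stays local.
fmt_crlf() (
    cr=$(printf '\r')
    # 1. drop every CR, 2. reflow with fmt, 3. append a CR to each line.
    tr -d '\r' | fmt "$@" | sed "s/\$/${cr}/"
)
```

Used as a filter, for example `fmt_crlf -w 60 < somefile`, this keeps DOS line endings on output without any change to fmt itself. Like the tr command it is built on, it removes every CR in the input, not only those at line ends.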
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#7085; Package coreutils. (Wed, 29 Sep 2010 20:18:02 GMT)
Message #20 received at 7085 <at> debbugs.gnu.org (full text, mbox):
> Isn't the more modular flow that Eric suggested more in keeping with
> the Unix philosophy and the better solution? If you want to do both
> stripping of carriage returns and word wrapping together then you can
> pipe them together.
>
> tr -d '\r' < somefile | fmt
>
> Or if you are calling it as a stdin filter such as within an editor:
>
> tr -d '\r' | fmt
>
> Bob
>
Yes, it is indeed. I don't know exactly why I chose to try (and
ultimately fail) to write that patch. I think the koan ESR wrote for
The Rootless Root: The Unix Koans Of Master Foo has some truth to it:
“There is more Unix-nature in one line of shell script than there is
in ten thousand lines of C.”
Finally in true koan fashion, "I am enlightened." :^)
William
Reply sent to Jim Meyering <jim <at> meyering.net>: You have taken responsibility. (Fri, 22 Jul 2011 22:13:01 GMT)
Notification sent to "Denis M. Wilson" <dmw <at> oxytropis.plus.com>: bug acknowledged by developer. (Fri, 22 Jul 2011 22:13:01 GMT)
Message #25 received at 7085-done <at> debbugs.gnu.org (full text, mbox):
tags 7085 + notabug
close 7085
thanks
Denis M. Wilson wrote:
> This program does not deal properly with CRLF terminators.
> Gratuitous CRs are left in joined lines; they should be
> removed. The user may want to keep the CRLF style or change
> to Unix (LF). There should be an option for this.
As seen in the rest of this thread, this is not a bug.
bug archived.
Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 20 Aug 2011 11:24:06 GMT)
This bug report was last modified 13 years and 308 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.