GNU bug report logs -
#49873
Replacing all \n with spaces doesn't work in GNU sed as expected
tag 49873 notabug
close 49873
stop
Hello,
On 2021-08-04 4:27 a.m., AlvinSeville7cf wrote:
> Hello! I want to read entire file and then replace all *\n* with space.
For that I would recommend using 'tr' - it'll be much faster:
tr '\n' ' ' < input > output
> My sed script is (I know that it is not optimal but it demonstrates
> problem):
>
> |:a $! { N; ta } s/\n/ /g p |
The above script isn't valid as-is (perhaps line breaks were lost in the
email?).
I'm going to assume you meant the following script, and used "sed -n":
sed -n ':a $! { N; ta } ; s/\n/ /g ; p' < input > output
or with line breaks:
sed -n ':a
$! { N; ta }
s/\n/ /g
p' < input > output
> So why even with *g* flag *s* command replaces only first *\n* in
> pattern space? For instance I have the following file:
Your script is almost correct :)
I assume that with the "$!{N;ta}" command you meant to accumulate all
lines except the last in the pattern space, and then replace all
the newlines and print the pattern space.
The only 'bug': "t" is a *conditional* jump.
It branches to label "a" only if an "s///" has succeeded since the last
input line was read (or since the last "t" jump was taken). When "ta"
runs, no substitution has happened yet, so it does not jump: "N" has
appended just one line, the "s///" is executed and the two lines are
printed (with the one newline replaced by a space). The next cycle
starts with a fresh line (the 3rd line in the input file), which resets
the "t" condition, so the same thing happens again.
Observe:
$ seq 10 | sed -n ':a $! { N; ta } ; s/\n/ /g ; p'
1 2
3 4
5 6
7 8
9 10
If you replace the "t" with a "b" command (b = always jump),
it behaves as you expected:
$ seq 10 | sed -n ':a $! { N; ba } ; s/\n/ /g ; p'
1 2 3 4 5 6 7 8 9 10
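To see the two branch commands side by side, here is a minimal sketch
(the sample inputs and variable names are made up for illustration).
The first command is a case where "t" does loop, because every pass
makes a successful substitution. The second is the join script spelled
with one -e per command, which should not depend on how a particular
sed terminates labels:

```shell
# "t" jumps only while s/// keeps succeeding: squeeze runs of spaces.
squeezed=$(printf 'a    b\n' | sed -e ':a' -e 's/  / /' -e 'ta')
echo "$squeezed"

# "b" jumps unconditionally: keep appending lines until the last one,
# then replace every embedded newline in a single pass.
joined=$(seq 10 | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/ /g')
echo "$joined"
```

In the first command the loop ends by itself once "s///" no longer
matches. In the second, the "$!" address is what stops the loop.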
Note that even with this script, the last newline is preserved and
printed.
As a work-around, you can instruct GNU "sed" to use NUL as the line
separator ("-z"), causing "\n" characters to be treated like any other
character:
$ seq 10 | sed -z 's/\n/ /g'
1 2 3 4 5 6 7 8 9 10
But this won't be as efficient as using 'tr'.
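The three approaches differ in what they do with the final newline. The
sketch below makes that visible, assuming GNU sed (for "-z") and a
made-up three-line input. The "x" printed after each command is only a
sentinel, there so the shell's command substitution does not strip the
trailing bytes:

```shell
input='1\n2\n3\n'

# tr turns every newline, including the last, into a space.
with_tr=$(printf "$input" | tr '\n' ' '; printf x)

# sed -z sees the whole input as one "line", so the last newline is
# replaced as well.
with_z=$(printf "$input" | sed -z 's/\n/ /g'; printf x)

# The N/b loop only rewrites newlines inside the pattern space; sed
# adds the final newline back when it prints.
with_loop=$(printf "$input" | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/ /g'; printf x)

printf '[%s]\n' "$with_tr" "$with_z" "$with_loop"
```

So "tr" and "sed -z" both leave a trailing space and no final newline,
while the loop version ends with a normal newline.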
> |It was the best of times, it was the worst of times, it was the age of
> wisdom, it was the age of foolishness, |
>
> The result of script execution is:
>
> |It was the best of times, it was the worst of times, it was the age of
> wisdom, it was |
> I use GNU sed 4.8. It seems to be a bug.
Without line breaks it's a bit hard to reproduce your case,
but I hope the explanation above was sufficient.
As such I'm closing this as "not a bug",
but discussion can continue by replying to this thread.
regards,
- assaf
This bug report was last modified 3 years and 290 days ago.