GNU bug report logs - #9116: Bug in unexpand --all of <spaces><tab>
Report forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#9116; Package coreutils. (Mon, 18 Jul 2011 12:43:03 GMT)
Acknowledgement sent to Hallvard B Furuseth <h.b.furuseth <at> usit.uio.no>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Mon, 18 Jul 2011 12:43:03 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
Unexpand --all of <7 printables, 2-8 spaces, tab, word> loses a tab.
perl -lwe 'print 1234567, " " x $_, "\t$_" for (1..9)' | unexpand --all
-->
1234567 1
1234567 2
1234567 3
1234567 4
1234567 5
1234567 6
1234567 7
1234567 8
1234567 9
Coreutils-8.12. Old bug, has existed at least since version 6.8.
--
Hallvard
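Since tabs and runs of spaces are hard to tell apart on a web page, here is a small Python sketch that rebuilds the nine input lines of the perl one-liner and prints them with the whitespace spelled out. The helper names (report_inputs, visible, same_layout) are made up for this illustration, and same_layout assumes tab stops every 8 columns. It can be used to check whether a candidate unexpand output still fills the same columns as its input.

```python
def report_inputs():
    """Rebuild the lines printed by the perl one-liner in the report:
    "1234567", then n spaces, then a tab, then n, for n = 1..9."""
    return [f"1234567{' ' * n}\t{n}" for n in range(1, 10)]


def visible(text):
    """Spell out blanks so that spaces and tabs can be told apart."""
    return text.replace("\t", "<tab>").replace(" ", "<space>")


def same_layout(before, after, tabsize=8):
    """True if both strings occupy the same columns once tabs are expanded.

    Assumes tab stops every `tabsize` columns and one column per character.
    """
    return before.expandtabs(tabsize) == after.expandtabs(tabsize)


if __name__ == "__main__":
    for line in report_inputs():
        print(visible(line))
```

An output that has dropped a tab fails the same_layout check against its input, while an output that merely swaps blanks for equivalent tabs passes it.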
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#9116; Package coreutils. (Mon, 18 Jul 2011 15:28:02 GMT)
Message #8 received at 9116 <at> debbugs.gnu.org:
On 18/07/11 12:18, Hallvard B Furuseth wrote:
> Unexpand --all of <7 printables, 2-8 spaces, tab, word> loses a tab.
>
> perl -lwe 'print 1234567, " " x $_, "\t$_" for (1..9)' | unexpand --all
> -->
> 1234567 1
> 1234567 2
> 1234567 3
> 1234567 4
> 1234567 5
> 1234567 6
> 1234567 7
> 1234567 8
> 1234567 9
>
> Coreutils-8.12. Old bug, has existed at least since version 6.8.
Yep, 5.97 has the same issue at least.
Interestingly the i18n patch gets it right:
$ printf "1234567 \t8\n" | unexpand -a
1234567 8
$ printf "1234567 \t8\n" | LANG=C unexpand -a
1234567 8
Looking at this for a few minutes suggests the following patch.
Though it's probably wrong, as I'm not sure why the current
code is not converting the trailing space in a field to a tab,
which is even enforced with test misc/unexpand::infloop-3.
Note the i18n patch does not maintain this trailing space,
nor does freebsd, which is what I'd expect.
$ printf "[ \t\t ]\n" | unexpand -t 2,3 | tr '\t ' ts
[ttts]
$ printf "[ \t\t ]\n" | LANG=C unexpand -t 2,3 | tr '\t ' ts
[stts]
I'll look at this later this evening.
cheers,
Pádraig.
diff --git a/src/unexpand.c b/src/unexpand.c
index 0014375..1489c4b 100644
--- a/src/unexpand.c
+++ b/src/unexpand.c
@@ -381,11 +381,14 @@ unexpand (void)
/* Discard pending blanks, unless it was a single
blank just before the previous tab stop. */
- if (! (pending == 1 && one_blank_before_tab_stop))
+ if (one_blank_before_tab_stop)
{
- pending = 0;
+ pending = 1;
+ pending_blank[0] = '\t';
one_blank_before_tab_stop = false;
}
+ else
+ pending = 0;
}
else
{
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#9116; Package coreutils. (Mon, 18 Jul 2011 23:56:01 GMT)
Message #11 received at 9116 <at> debbugs.gnu.org:
On 18/07/11 16:25, Pádraig Brady wrote:
> On 18/07/11 12:18, Hallvard B Furuseth wrote:
>> Unexpand --all of <7 printables, 2-8 spaces, tab, word> loses a tab.
>>
>> perl -lwe 'print 1234567, " " x $_, "\t$_" for (1..9)' | unexpand --all
>> -->
>> 1234567 1
>> 1234567 2
>> 1234567 3
>> 1234567 4
>> 1234567 5
>> 1234567 6
>> 1234567 7
>> 1234567 8
>> 1234567 9
>>
>> Coreutils-8.12. Old bug, has existed at least since version 6.8.
>
> Yep, 5.97 has the same issue at least.
> Interestingly the i18n patch gets it right:
>
> $ printf "1234567 \t8\n" | unexpand -a
> 1234567 8
> $ printf "1234567 \t8\n" | LANG=C unexpand -a
> 1234567 8
>
> Looking at this for a few minutes suggests the following patch.
> Though it's probably wrong, as I'm not sure why the current
> code is not converting the trailing space in a field to a tab,
> which is even enforced with test misc/unexpand::infloop-3.
> Note the i18n patch does not maintain this trailing space,
> nor does freebsd, which is what I'd expect.
>
> $ printf "[ \t\t ]\n" | unexpand -t 2,3 | tr '\t ' ts
> [ttts]
> $ printf "[ \t\t ]\n" | LANG=C unexpand -t 2,3 | tr '\t ' ts
> [stts]
>
> I'll look at this later this evening.
Actually POSIX is quite specific and my reading
is that a space before tabstop should be preserved
iff it's the only blank before tabstop and it
isn't followed by another blank.
In that sense, both i18n patched unexpand
and current coreutils get this wrong.
The following seems to conform to POSIX
and will need tests/misc/unexpand tweaked.
I'll clean this up and add some tests tomorrow.
Note the change that seemed to introduce
this issue, was to adjust as per POSIX, and
was added in 5.3.0
diff --git a/src/unexpand.c b/src/unexpand.c
index 0014375..53b5a18 100644
--- a/src/unexpand.c
+++ b/src/unexpand.c
@@ -379,13 +379,8 @@ unexpand (void)
{
column = next_tab_column;
- /* Discard pending blanks, unless it was a single
- blank just before the previous tab stop. */
- if (! (pending == 1 && one_blank_before_tab_stop))
- {
- pending = 0;
- one_blank_before_tab_stop = false;
- }
+ if (pending)
+ pending_blank[0] = '\t';
}
else
{
@@ -404,8 +399,11 @@ unexpand (void)
/* Replace the pending blanks by a tab or two. */
pending_blank[0] = c = '\t';
- pending = one_blank_before_tab_stop;
}
+
+ /* Discard pending blanks, unless it was a single
+ blank just before the previous tab stop. */
+ pending = one_blank_before_tab_stop;
}
}
else if (c == '\b')
cheers,
Pádraig.
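The reading of POSIX given in this message (keep a blank before a tab stop only if it is the sole blank there and no other blank follows it) can be sketched as a rough Python model. This is not the coreutils code and not the patch above. The function name unexpand_all is invented for the example, and it assumes tab stops every 8 columns, one column per character, a single line without its newline, and no backspace handling.

```python
def unexpand_all(line, tabsize=8):
    """Simplified model of `unexpand -a` for one line (assumptions above)."""
    out = []
    column = 0        # display column of the next character
    run_start = None  # column where the current run of blanks began
    run_text = ""     # the blanks of the current run, exactly as read

    def flush(end_column):
        nonlocal run_start, run_text
        if run_start is None:
            return
        if len(run_text) == 1:
            # A lone blank is kept as it is, even just before a tab stop.
            out.append(run_text)
        else:
            # Two or more blanks: one tab per tab stop crossed, then
            # spaces for whatever is left after the last stop.
            col = run_start
            next_stop = (col // tabsize + 1) * tabsize
            while next_stop <= end_column:
                out.append("\t")
                col = next_stop
                next_stop += tabsize
            out.append(" " * (end_column - col))
        run_start, run_text = None, ""

    for ch in line:
        if ch in " \t":
            if run_start is None:
                run_start = column
            run_text += ch
            if ch == " ":
                column += 1
            else:
                column = (column // tabsize + 1) * tabsize
        else:
            flush(column)
            out.append(ch)
            column += 1
    flush(column)
    return "".join(out)


if __name__ == "__main__":
    for n in range(1, 10):
        print(repr(unexpand_all(f"1234567{' ' * n}\t{n}")))
```

Under these assumptions the model turns the space in "1234567<space><tab>1" into a tab, because it is followed by another blank, and leaves the space in "1234567<space>x" alone. For the nine inputs from the original report it emits two tabs for 1 to 8 spaces and three tabs for 9 spaces.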
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#9116; Package coreutils. (Tue, 19 Jul 2011 07:33:01 GMT)
Message #14 received at 9116 <at> debbugs.gnu.org:
Pádraig Brady writes:
> Actually POSIX is quite specific and my reading
> is that a space before tabstop should be preserved
> iff it's the only blank before tabstop and it
> isn't followed by another blank.
>
> In that sense, both i18n patched unexpand
> and current coreutils get this wrong.
Coreutils 5.12 gets that right in my test:
1st output line is "1234567<space><tab>1".
But now that you mention it, an option to never
output the sequence <space><tab> would be nice.
--
Hallvard
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#9116; Package coreutils. (Tue, 19 Jul 2011 07:33:02 GMT)
Message #17 received at 9116 <at> debbugs.gnu.org:
Hallvard B Furuseth writes:
> Coreutils 5.12 gets that right in my test:
> 1st output line is "1234567<space><tab>1".
Oops, coreutils 8.12.
> But now that you mention it, an option to never
> output the sequence <space><tab> would be nice.
--
Hallvard
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#9116; Package coreutils. (Tue, 19 Jul 2011 09:44:01 GMT)
Message #20 received at 9116 <at> debbugs.gnu.org:
On 19/07/11 08:13, Hallvard B Furuseth wrote:
> Pádraig Brady writes:
>> Actually POSIX is quite specific and my reading
>> is that a space before tabstop should be preserved
>> iff it's the only blank before tabstop and it
>> isn't followed by another blank.
>>
>> In that sense, both i18n patched unexpand
>> and current coreutils get this wrong.
>
> Coreutils 5.12 gets that right in my test:
> 1st output line is "1234567<space><tab>1".
That's incorrect according to POSIX.
The space should be converted to tab as
it's followed by a blank.
> But now that you mention it, an option to never
> output the sequence <space><tab> would be nice.
From my reading of POSIX, that's what it specifies.
cheers,
Pádraig.
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#9116; Package coreutils. (Tue, 19 Jul 2011 10:43:02 GMT)
Message #23 received at 9116 <at> debbugs.gnu.org:
Pádraig Brady writes:
>On 19/07/11 08:13, Hallvard B Furuseth wrote:
>> Coreutils 5.12 gets that right in my test:
>> 1st output line is "1234567<space><tab>1".
>
> That's incorrect according to POSIX.
> The space should be converted to tab as
> it's followed by a blank.
Duh, sorry. I read "...immediately preceding a tab stop"
as "...immediately preceding a <tab>".
The manpage about this is wrong too:
"-a, --all convert all blanks, instead of just initial blanks"
No, not single spaces that happen to be just before a tab stop.
--
Hallvard
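The difference between "immediately preceding a tab stop" and "immediately preceding a <tab>" comes down to column arithmetic, which the following Python sketch tries to make concrete. The function name space_positions is hypothetical, and it assumes tab stops every 8 columns, 0-based columns, and one column per character.

```python
def space_positions(line, tabsize=8):
    """Yield (index, column, before_stop, before_tab) for each space.

    before_stop: the space fills the last column before a tab stop.
    before_tab:  the next character in the line is a literal tab.
    """
    column = 0
    for i, ch in enumerate(line):
        if ch == "\t":
            # A tab advances to the next multiple of tabsize.
            column = (column // tabsize + 1) * tabsize
            continue
        if ch == " ":
            before_stop = (column + 1) % tabsize == 0
            before_tab = line[i + 1 : i + 2] == "\t"
            yield i, column, before_stop, before_tab
        column += 1


if __name__ == "__main__":
    for sample in ["1234567 \t1", "1234567 x", "ab \tc"]:
        print(repr(sample), list(space_positions(sample)))
```

In "1234567 \t1" the space sits in column 7, so it is both just before the tab stop at column 8 and just before a tab character. In "1234567 x" it is before the tab stop only, and in "ab \tc" it is before a tab character only.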
bug marked as fixed in version 8.13, send any further explanations to 9116 <at> debbugs.gnu.org and Hallvard B Furuseth <h.b.furuseth <at> usit.uio.no>. Request was from Pádraig Brady <P <at> draigBrady.com> to control <at> debbugs.gnu.org. (Wed, 20 Jul 2011 13:03:02 GMT)
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 18 Aug 2011 11:24:03 GMT)