GNU bug report logs - #9116
Bug in unexpand --all of <spaces><tab>

Package: coreutils

Reported by: Hallvard B Furuseth <h.b.furuseth <at> usit.uio.no>

Date: Mon, 18 Jul 2011 12:43:03 UTC

Severity: normal

Fixed in version 8.13

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

From: Pádraig Brady <P <at> draigBrady.com>
To: Hallvard B Furuseth <h.b.furuseth <at> usit.uio.no>
Cc: 9116 <at> debbugs.gnu.org
Subject: bug#9116: Bug in unexpand --all of <spaces><tab>
Date: Tue, 19 Jul 2011 00:53:12 +0100

On 18/07/11 16:25, Pádraig Brady wrote:
> On 18/07/11 12:18, Hallvard B Furuseth wrote:
>> Unexpand --all of <7 printables, 2-8 spaces, tab, word> loses a tab.
>>
>> perl -lwe 'print 1234567, " " x $_, "\t$_" for (1..9)' | unexpand --all
>> -->
>> 1234567 	1
>> 1234567	2
>> 1234567	3
>> 1234567	4
>> 1234567	5
>> 1234567	6
>> 1234567	7
>> 1234567	8
>> 1234567			9
>>
>> Coreutils-8.12.  Old bug, has existed at least since version 6.8.
> 
> Yep, 5.97 has the same issue at least.
> Interestingly the i18n patch gets it right:
> 
> $ printf "1234567        \t8\n" | unexpand -a
> 1234567         8
> $ printf "1234567        \t8\n" | LANG=C unexpand -a
> 1234567 8
> 
> Looking at this for a few minutes suggests the following patch.
> Though it's probably wrong, as I'm not sure why the current
> code is not converting the trailing space in a field to a tab,
> which is even enforced with test misc/unexpand::infloop-3.
> Note the i18n patch does not maintain this trailing space,
> nor does freebsd, which is what I'd expect.
> 
> $ printf "[ \t\t ]\n" | unexpand -t 2,3 | tr '\t ' ts
> [ttts]
> $ printf "[ \t\t ]\n" | LANG=C unexpand -t 2,3 | tr '\t ' ts
> [stts]
> 
> I'll look at this later this evening.

Actually POSIX is quite specific, and my reading
is that a space before a tab stop should be preserved
iff it's the only blank before that tab stop and it
isn't followed by another blank.
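
A minimal sketch of that reading, for illustration only. It is not the
coreutils code: unexpand_all_sketch is a hypothetical helper, and it
assumes tab stops every 8 columns, single-byte characters, no backspaces,
and --all behaviour. Each maximal run of blanks becomes one tab per tab
stop crossed plus trailing spaces, except that a lone blank is copied
through unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TAB_WIDTH = 8 };  /* assumption: default tab stops every 8 columns */

/* Hypothetical helper, not part of coreutils.  Converts one line IN into
   OUT and returns the number of bytes written, excluding the NUL.  */
static size_t
unexpand_all_sketch (const char *in, char *out, size_t outsize)
{
  size_t column = 0;
  size_t n = 0;

  /* The output is never longer than the input, so this bound suffices.  */
  assert (strlen (in) < outsize);

  while (*in)
    {
      if (*in != ' ' && *in != '\t')
        {
          out[n++] = *in++;
          column++;
          continue;
        }

      /* Measure the maximal run of blanks starting here.  */
      const char *run = in;
      size_t end = column;
      for (; *in == ' ' || *in == '\t'; in++)
        end = (*in == '\t') ? (end / TAB_WIDTH + 1) * TAB_WIDTH : end + 1;

      if (in - run == 1)
        {
          /* A lone blank not followed by another blank is preserved,
             even when it sits just before a tab stop.  */
          out[n++] = *run;
        }
      else
        {
          /* One tab per tab stop crossed, then spaces for the remainder.  */
          size_t col = column;
          while ((col / TAB_WIDTH + 1) * TAB_WIDTH <= end)
            {
              out[n++] = '\t';
              col = (col / TAB_WIDTH + 1) * TAB_WIDTH;
            }
          while (col < end)
            {
              out[n++] = ' ';
              col++;
            }
        }
      column = end;
    }

  out[n] = '\0';
  return n;
}
```

With this sketch, the reported input of 7 printables, one space, a tab
and a word comes out with two tabs, so no tab is lost, while
"1234567 x" keeps its single space.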

In that sense, both the i18n-patched unexpand
and current coreutils get this wrong.

The following patch seems to conform to POSIX,
though tests/misc/unexpand will need tweaking to match.

I'll clean this up and add some tests tomorrow.

Note that the change that seems to have introduced
this issue was itself an adjustment to follow POSIX,
and was added in 5.3.0.

diff --git a/src/unexpand.c b/src/unexpand.c
index 0014375..53b5a18 100644
--- a/src/unexpand.c
+++ b/src/unexpand.c
@@ -379,13 +379,8 @@ unexpand (void)
                         {
                           column = next_tab_column;

-                          /* Discard pending blanks, unless it was a single
-                             blank just before the previous tab stop.  */
-                          if (! (pending == 1 && one_blank_before_tab_stop))
-                            {
-                              pending = 0;
-                              one_blank_before_tab_stop = false;
-                            }
+                          if (pending)
+                            pending_blank[0] = '\t';
                         }
                       else
                         {
@@ -404,8 +399,11 @@ unexpand (void)

                           /* Replace the pending blanks by a tab or two.  */
                           pending_blank[0] = c = '\t';
-                          pending = one_blank_before_tab_stop;
                         }
+
+                      /* Discard pending blanks, unless it was a single
+                         blank just before the previous tab stop.  */
+                      pending = one_blank_before_tab_stop;
                     }
                 }
               else if (c == '\b')

cheers,
Pádraig.




This bug report was last modified 13 years and 364 days ago.
