GNU bug report logs - #11631
Head command does not position file pointer correctly for negative line count

Previous Next

Package: coreutils;

Reported by: Anoop Sharma <sendtoanoop <at> gmail.com>

Date: Tue, 5 Jun 2012 09:41:01 UTC

Severity: normal

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 11631 in the body.
You can then email your comments to 11631 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-coreutils <at> gnu.org:
bug#11631; Package coreutils. (Tue, 05 Jun 2012 09:41:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Anoop Sharma <sendtoanoop <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Tue, 05 Jun 2012 09:41:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Anoop Sharma <sendtoanoop <at> gmail.com>
To: bug-coreutils <at> gnu.org
Subject: Head command does not position file pointer correctly for negative
	line count
Date: Tue, 5 Jun 2012 15:07:19 +0530
[Message part 1 (text/plain, inline)]
Head command does not position file pointer correctly for negative line
count. Here is a demonstration of the problem.

Step 1 - Create a file with 10 lines in it.
$ yes "ABC" | head -c 40 >ip.txt
$

Step 2 - If head behaves correctly, then 2 lines should get printed after
"------------" but nothing gets printed!
$ (head -n -2; echo "------------------------"; cat) <ip.txt
ABC
ABC
ABC
ABC
ABC
ABC
ABC
ABC
------------------------
$

Step 3 - Another try fails. If head behaves correctly, then 8 lines should
get printed after "------------" but nothing gets printed!
$ (head -n -8; echo "------------------------"; cat) <ip.txt
ABC
ABC
------------------------
$



/*****************************************************************************************************************************/
Possible cause of the defect -> Following snippet is copied from head.c
(Function - elide_tail_lines_seekable ). Perhaps, there should be a lseek
after fwrite there...:

              /* Output the initial portion of the buffer
                 in which we found the desired newline byte.
                 Don't bother testing for failure for such a small amount.
                 Any failure will be detected upon close.  */
              fwrite (buffer, 1, n + 1, stdout);
/*****************************************************************************************************************************/
[Message part 2 (text/html, inline)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#11631; Package coreutils. (Tue, 05 Jun 2012 10:07:02 GMT) Full text and rfc822 format available.

Message #8 received at 11631 <at> debbugs.gnu.org (full text, mbox):

From: "Voelker, Bernhard" <bernhard.voelker <at> siemens-enterprise.com>
To: Anoop Sharma <sendtoanoop <at> gmail.com>, "11631 <at> debbugs.gnu.org"
	<11631 <at> debbugs.gnu.org>
Subject: RE: bug#11631: Head command does not position file pointer
	correctly for	negative line count
Date: Tue, 5 Jun 2012 10:03:33 +0000
Anoop Sharma wrote:

> Head command does not position file pointer correctly for negative line
> count. Here is a demonstration of the problem.

The problem doesn't seem to be limited to negative
line counts. I replaced the 10 ABC lines by a number
sequence to demonstrate this issue clearer.

  $ seq 10 | ( head -n -2 ; echo xxx ; cat )
  1
  2
  3
  4
  5
  6
  7
  8
  xxx
  $ seq 10 | ( head -n 2 ; echo xxx ; cat )
  1
  2
  xxx

So head eats all of the input. The info page is silent
about this.

Have a nice day,
Berny




Information forwarded to bug-coreutils <at> gnu.org:
bug#11631; Package coreutils. (Tue, 05 Jun 2012 10:29:01 GMT) Full text and rfc822 format available.

Message #11 received at 11631 <at> debbugs.gnu.org (full text, mbox):

From: Anoop Sharma <sendtoanoop <at> gmail.com>
To: "Voelker, Bernhard" <bernhard.voelker <at> siemens-enterprise.com>
Cc: "11631 <at> debbugs.gnu.org" <11631 <at> debbugs.gnu.org>
Subject: Re: bug#11631: Head command does not position file pointer correctly
	for negative line count
Date: Tue, 5 Jun 2012 15:56:18 +0530
[Message part 1 (text/plain, inline)]
Head command behaves differently with seekable and un-seekable input-data
sources.
Pipes are un-seekable. Head's behavior with input provided using pipes will
be different from its behavior when input is provided from a regular file
(which are seekable).

Therefore, the defect that was raised for regular files can not be
evaluated by examples with pipes.

Here's why head behaves differently depending on whether the file allows
lseek():
-------------------------------------------------------------------------------------------------
1. Head command first reads data into a buffer using read() system call and
then operates upon that buffer.
2. Size of the buffer used by head in read() is 8192 bytes on my machine.
3. It is not an error if read() gets lesser number of bytes than requested;
this may happen for example because fewer bytes are actually available when
read accessed the pipe or because read() was interrupted by a signal.
4. Therefore, only the upper bound of the data read in the buffer is fixed,
not the lower bound.
5. Since read() tries to read as much data as it can (upto buffer size),
therefore, in most cases it reads more data into the buffer than actually
needed by head command's algorithm.
6. When head discovers that it does not need all the data in the buffer,
then head tries to return the extra data back to the file descriptor by
using lseek().
7. However, data can not be returned back for un-seekable files. Therefore,
head has to discard extra data in for un-seekable files. This creates
situations that look as if head has eaten some part of the data.

Head's problem with unseekable files - Commands waiting to execute after
head will never get the extra data that was read by head.
Bigger Problem – How much data will be lost is not fixed because how much
data read() actually reads is not fixed (See point 4 above). It is also
possible that no data is lost!!

I tried the following example and it worked as expected:
$ seq 10 >p
$ ( head -n 2 ; echo xxx ; cat )<p
1
2
xxx
3
4
5
6
7
8
9
10
$


On Tue, Jun 5, 2012 at 3:33 PM, Voelker, Bernhard <
bernhard.voelker <at> siemens-enterprise.com> wrote:

> Anoop Sharma wrote:
>
> > Head command does not position file pointer correctly for negative line
> > count. Here is a demonstration of the problem.
>
> The problem doesn't seem to be limited to negative
> line counts. I replaced the 10 ABC lines by a number
> sequence to demonstrate this issue clearer.
>
>  $ seq 10 | ( head -n -2 ; echo xxx ; cat )
>  1
>  2
>  3
>  4
>  5
>  6
>  7
>  8
>  xxx
>  $ seq 10 | ( head -n 2 ; echo xxx ; cat )
>  1
>  2
>  xxx
>
> So head eats all of the input. The info page is silent
> about this.
>
> Have a nice day,
> Berny
>
[Message part 2 (text/html, inline)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#11631; Package coreutils. (Tue, 05 Jun 2012 10:33:02 GMT) Full text and rfc822 format available.

Message #14 received at 11631 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Anoop Sharma <sendtoanoop <at> gmail.com>
Cc: 11631 <at> debbugs.gnu.org
Subject: Re: bug#11631: Head command does not position file pointer correctly
	for negative line count
Date: Tue, 05 Jun 2012 12:29:38 +0200
Anoop Sharma wrote:
> Head command does not position file pointer correctly for negative line
> count. Here is a demonstration of the problem.
>
> Step 1 - Create a file with 10 lines in it.
> $ yes "ABC" | head -c 40 >ip.txt
> $

Thank you for the report.  That is indeed a bug.
Here's a quick example of how head -n-3 should work:

    $ seq 10 > k; (./head -n-3; echo foo; cat) < k
    1
    2
    3
    4
    5
    6
    7
    foo
    8
    9
    10

Before your suggested change, it did this:

    $ seq 10 > k; (head -n-3; echo foo; cat) < k
    1
    2
    3
    4
    5
    6
    7
    foo

I note that a similar change is *not* required for the end-relative
byte-seekable case:

    $ seq 3 > k; (head -c-2; echo foo; cat) < k
    1
    2
    foo
    3

Here's the start of a proper patch.
To come: mention this in NEWS and add a test.

From 0c156fb347dba3f499ed7b922af1ea357f5558c0 Mon Sep 17 00:00:00 2001
From: Jim Meyering <meyering <at> redhat.com>
Date: Tue, 5 Jun 2012 12:24:49 +0200
Subject: [PATCH] head: with --lines=-N (-n-N) reset file pointer on seekable
 input

* src/head.c (elide_tail_lines_seekable): Reset file pointer
after printing up to an end-relative line-counted offset.
Anoop Sharma reported the problem and suggested the fix.
---
 src/head.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/head.c b/src/head.c
index d7e83b7..75a69d8 100644
--- a/src/head.c
+++ b/src/head.c
@@ -667,6 +667,14 @@ elide_tail_lines_seekable (const char *pretty_filename, int fd,
                  Don't bother testing for failure for such a small amount.
                  Any failure will be detected upon close.  */
               fwrite (buffer, 1, n + 1, stdout);
+
+              /* Set file pointer to the byte after what we've output.  */
+              if (lseek (fd, start_pos + n + 1, SEEK_SET) < 0)
+                {
+                  error (0, errno, "%s: failed to reset file pointer",
+                         quote (pretty_filename));
+                  return false;
+                }
               return true;
             }
         }
--
1.7.11.rc1




Information forwarded to bug-coreutils <at> gnu.org:
bug#11631; Package coreutils. (Tue, 05 Jun 2012 10:34:02 GMT) Full text and rfc822 format available.

Message #17 received at 11631 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: "Voelker, Bernhard" <bernhard.voelker <at> siemens-enterprise.com>
Cc: Anoop Sharma <sendtoanoop <at> gmail.com>,
	"11631 <at> debbugs.gnu.org" <11631 <at> debbugs.gnu.org>
Subject: Re: bug#11631: Head command does not position file pointer correctly
	for	negative line count
Date: Tue, 05 Jun 2012 11:30:59 +0100
On 06/05/2012 11:03 AM, Voelker, Bernhard wrote:
> Anoop Sharma wrote:
> 
>> Head command does not position file pointer correctly for negative line
>> count. Here is a demonstration of the problem.
> 
> The problem doesn't seem to be limited to negative
> line counts. I replaced the 10 ABC lines by a number
> sequence to demonstrate this issue clearer.
> 
>   $ seq 10 | ( head -n -2 ; echo xxx ; cat )
>   1
>   2
>   3
>   4
>   5
>   6
>   7
>   8
>   xxx
>   $ seq 10 | ( head -n 2 ; echo xxx ; cat )
>   1
>   2
>   xxx
> 
> So head eats all of the input. The info page is silent
> about this.
> 
> Have a nice day,
> Berny

Well reading from files should be more flexible
than pipes as we can reset positions in files with seek.
This works for -c but not for -n.
I'll look at fixing this issue this evening.

For pipes we need to consume the data to determine
what to throw away, and there is no facility to
replace that. I'll try and document this edge
case in the info too.

cheers,
Pádraig.




Information forwarded to bug-coreutils <at> gnu.org:
bug#11631; Package coreutils. (Tue, 05 Jun 2012 11:33:02 GMT) Full text and rfc822 format available.

Message #20 received at 11631 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 11631 <at> debbugs.gnu.org
Subject: Re: bug#11631: Head command does not position file pointer correctly
	for negative line count
Date: Tue, 05 Jun 2012 13:30:15 +0200
Pádraig Brady wrote:
> On 06/05/2012 11:29 AM, Jim Meyering wrote:
>> Anoop Sharma wrote:
>>> Head command does not position file pointer correctly for negative line
>>> count. Here is a demonstration of the problem.
>>>
>>> Step 1 - Create a file with 10 lines in it.
>>> $ yes "ABC" | head -c 40 >ip.txt
>>> $
>>
>> Thank you for the report.  That is indeed a bug.
>> Here's a quick example of how head -n-3 should work:
>>
>>     $ seq 10 > k; (./head -n-3; echo foo; cat) < k
>>     1
>>     2
>>     3
>>     4
>>     5
>>     6
>>     7
>>     foo
>>     8
>>     9
>>     10
>>
>> Before your suggested change, it did this:
>>
>>     $ seq 10 > k; (head -n-3; echo foo; cat) < k
>>     1
>>     2
>>     3
>>     4
>>     5
>>     6
>>     7
>>     foo
>>
>> I note that a similar change is *not* required for the end-relative
>> byte-seekable case:
>>
>>     $ seq 3 > k; (head -c-2; echo foo; cat) < k
>>     1
>>     2
>>     foo
>>     3
>>
>> Here's the start of a proper patch.
>> To come: mention this in NEWS and add a test.
>>
>>>From 0c156fb347dba3f499ed7b922af1ea357f5558c0 Mon Sep 17 00:00:00 2001
>> From: Jim Meyering <meyering <at> redhat.com>
>> Date: Tue, 5 Jun 2012 12:24:49 +0200
>> Subject: [PATCH] head: with --lines=-N (-n-N) reset file pointer on seekable
>>  input
>>
>> * src/head.c (elide_tail_lines_seekable): Reset file pointer
>> after printing up to an end-relative line-counted offset.
>> Anoop Sharma reported the problem and suggested the fix.
>> ---
>>  src/head.c | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/src/head.c b/src/head.c
>> index d7e83b7..75a69d8 100644
>> --- a/src/head.c
>> +++ b/src/head.c
>> @@ -667,6 +667,14 @@ elide_tail_lines_seekable (const char *pretty_filename, int fd,
>>                   Don't bother testing for failure for such a small amount.
>>                   Any failure will be detected upon close.  */
>>                fwrite (buffer, 1, n + 1, stdout);
>> +
>> +              /* Set file pointer to the byte after what we've output.  */
>> +              if (lseek (fd, start_pos + n + 1, SEEK_SET) < 0)
>
> s/start_pos/pos/ ?

Oh!  Good catch.  Thanks!  You're right.
Now I'll have to construct a test that exercises that bug, too.




Information forwarded to bug-coreutils <at> gnu.org:
bug#11631; Package coreutils. (Tue, 05 Jun 2012 18:15:02 GMT) Full text and rfc822 format available.

Message #23 received at 11631 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 11631 <at> debbugs.gnu.org
Subject: Re: bug#11631: Head command does not position file pointer correctly
	for negative line count
Date: Tue, 05 Jun 2012 20:12:43 +0200
Jim Meyering wrote:
>>> Here's the start of a proper patch.
>>> To come: mention this in NEWS and add a test.

Here's the complete patch.
Figuring out how to trigger the bug in my patch
(s/start_pos/pos/) that Pádraig spotted was interesting.
No prior test case exercised any of the related code,
at least not with the BUFSIZ value of 8KiB that I have here.

As for when the bug was introduced, I confirmed that it's present
at least as far back as textutils-2.1 with this:

  printf '\nok\n' > k; (head -n-1>/dev/null; grep o || echo FAIL) < k

and there was no significant seek-related change prior to that.


From 295ee521bc1a4f473ee8b7b5a4be32c5b5c7386f Mon Sep 17 00:00:00 2001
From: Jim Meyering <meyering <at> redhat.com>
Date: Tue, 5 Jun 2012 12:24:49 +0200
Subject: [PATCH] head: with --lines=-N (-n-N) reset file pointer on seekable
 input
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* src/head.c (elide_tail_lines_seekable): Reset file pointer
after printing up to an end-relative line-counted offset.
Anoop Sharma reported the problem and suggested the fix.
* tests/misc/head-pos: Add coverage via a very similar, existing test.
Also add coverage for a previously untested block of code.
* tests/misc/head-elide-tail ($READ_BUFSIZE): Update to 8192, to
match the value of BUFSIZ I see today on Fedora 17/x86_64 (unrelated
to this fix).
* NEWS (Bug fixes): Mention it.

Improved-by: Pádraig Brady
---
 NEWS                       |  7 +++++++
 src/head.c                 |  8 ++++++++
 tests/misc/head-elide-tail |  2 +-
 tests/misc/head-pos        | 26 +++++++++++++++++++-------
 4 files changed, 35 insertions(+), 8 deletions(-)

diff --git a/NEWS b/NEWS
index 7c7c2c3..c0e5fdb 100644
--- a/NEWS
+++ b/NEWS
@@ -8,6 +8,13 @@ GNU coreutils NEWS                                    -*- outline -*-
   processes will not intersperse their output.
   [the bug dates back to the initial implementation]

+  head --lines=-N (-n-N) now resets the read pointer of a seekable input file.
+  This means that "head -n-3" no longer consumes all of its input, and lines
+  not output by head may be processed by other programs.  For example, this
+  command now prints the final line, 2, while before it would print nothing:
+    seq 2 > k; (head -n-1 > /dev/null; cat) < k
+  [This bug was present in "the beginning".]
+
   ls --color would mis-color relative-named symlinks in /
   [bug introduced in coreutils-8.17]

diff --git a/src/head.c b/src/head.c
index d7e83b7..0d5e1b2 100644
--- a/src/head.c
+++ b/src/head.c
@@ -667,6 +667,14 @@ elide_tail_lines_seekable (const char *pretty_filename, int fd,
                  Don't bother testing for failure for such a small amount.
                  Any failure will be detected upon close.  */
               fwrite (buffer, 1, n + 1, stdout);
+
+              /* Set file pointer to the byte after what we've output.  */
+              if (lseek (fd, pos + n + 1, SEEK_SET) < 0)
+                {
+                  error (0, errno, "%s: failed to reset file pointer",
+                         quote (pretty_filename));
+                  return false;
+                }
               return true;
             }
         }
diff --git a/tests/misc/head-elide-tail b/tests/misc/head-elide-tail
index de4896b..85f509d 100755
--- a/tests/misc/head-elide-tail
+++ b/tests/misc/head-elide-tail
@@ -26,7 +26,7 @@ $ENV{PROG} = 'head';
 @ENV{qw(LANGUAGE LANG LC_ALL)} = ('C') x 3;

 # This should match the definition in head.c.
-my $READ_BUFSIZE = 4096;
+my $READ_BUFSIZE = 8192;

 my @Tests =
   (
diff --git a/tests/misc/head-pos b/tests/misc/head-pos
index 3d96261..fa3284e 100755
--- a/tests/misc/head-pos
+++ b/tests/misc/head-pos
@@ -21,12 +21,24 @@
 print_ver_ head

 (echo a; echo b) > in || framework_failure_
-
-(head -n 1 >/dev/null; cat) < in > out || fail=1
-cat <<EOF > exp
-b
-EOF
-
-compare exp out || fail=1
+echo b > exp || framework_failure_
+
+for i in -1 1; do
+  (head -n $i >/dev/null; cat) < in > out || fail=1
+  compare exp out || fail=1
+done
+
+# Exercise the (start_pos < pos) block in elide_tail_lines_seekable.
+# So far, this is the only test to do that.
+# Do that by creating a file larger than BUFSIZ (I've seen 128K) and
+# elide a suffix of it (by line count) that is also larger than BUFSIZ.
+# 50000 lines times 6 bytes per line gives us enough leeway even on a
+# system with a BUFSIZ of 256K.
+n_lines=50000
+seq 70000 > in2 || framework_failure_
+echo $n_lines > exp-n || framework_failure_
+
+(head -n-$n_lines>/dev/null; wc -l) < in2 > n
+compare exp-n n || fail=1

 Exit $fail
--
1.7.11.rc1




Information forwarded to bug-coreutils <at> gnu.org:
bug#11631; Package coreutils. (Tue, 05 Jun 2012 20:28:01 GMT) Full text and rfc822 format available.

Message #26 received at 11631 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: 11631 <at> debbugs.gnu.org
Subject: Re: bug#11631: Head command does not position file pointer correctly
	for negative line count
Date: Tue, 05 Jun 2012 21:24:58 +0100
On 06/05/2012 07:12 PM, Jim Meyering wrote:
> Jim Meyering wrote:
>>>> Here's the start of a proper patch.
>>>> To come: mention this in NEWS and add a test.
> 
> Here's the complete patch.

Nice tests.
+1

cheers,
Pádraig.




Information forwarded to bug-coreutils <at> gnu.org:
bug#11631; Package coreutils. (Tue, 05 Jun 2012 20:36:01 GMT) Full text and rfc822 format available.

Message #29 received at 11631 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 11631 <at> debbugs.gnu.org
Subject: Re: bug#11631: Head command does not position file pointer correctly
	for negative line count
Date: Tue, 05 Jun 2012 22:32:52 +0200
Pádraig Brady wrote:
> On 06/05/2012 07:12 PM, Jim Meyering wrote:
>> Jim Meyering wrote:
>>>>> Here's the start of a proper patch.
>>>>> To come: mention this in NEWS and add a test.
>>
>> Here's the complete patch.
>
> Nice tests.
> +1

Thanks, and thanks for the review.  Pushed.




Reply sent to Jim Meyering <jim <at> meyering.net>:
You have taken responsibility. (Tue, 05 Jun 2012 20:38:01 GMT) Full text and rfc822 format available.

Notification sent to Anoop Sharma <sendtoanoop <at> gmail.com>:
bug acknowledged by developer. (Tue, 05 Jun 2012 20:38:02 GMT) Full text and rfc822 format available.

Message #34 received at 11631-done <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 11631-done <at> debbugs.gnu.org
Subject: Re: bug#11631: Head command does not position file pointer correctly
	for negative line count
Date: Tue, 05 Jun 2012 22:35:13 +0200
Jim Meyering wrote:
> Thanks, and thanks for the review.  Pushed.

And with this message, I've closed the issue.




Information forwarded to bug-coreutils <at> gnu.org:
bug#11631; Package coreutils. (Wed, 06 Jun 2012 04:03:02 GMT) Full text and rfc822 format available.

Message #37 received at 11631 <at> debbugs.gnu.org (full text, mbox):

From: Anoop Sharma <sendtoanoop <at> gmail.com>
To: 11631 <at> debbugs.gnu.org
Subject: Re: bug#11631: closed (Re: bug#11631: Head command does not position
	file pointer correctly for negative line count)
Date: Wed, 6 Jun 2012 09:30:11 +0530
1. The comment in code - "Don't bother testing for failure for such a
small amount. Any failure will be detected upon close." may be
re-looked too, since we are now lseeking after it.

What if we change plain fwrite to:
      if (fwrite (buffer, 1, n + 1, stdout) < (n + 1))
            error (EXIT_FAILURE, errno, _("write error"));

2. Maybe using lseek with SEEK_CUR in place of SEEK_SET would reflect
logic better.

With Thanks for quick responses,
Anoop

On Wed, Jun 6, 2012 at 2:08 AM, GNU bug Tracking System
<help-debbugs <at> gnu.org> wrote:
>
> Your bug report
>
> #11631: Head command does not position file pointer correctly for negative line count
>
> which was filed against the coreutils package, has been closed.
>
> The explanation is attached below, along with your original report.
> If you require more details, please reply to 11631 <at> debbugs.gnu.org.
>
> --
> 11631: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=11631
> GNU Bug Tracking System
> Contact help-debbugs <at> gnu.org with problems
>
>
> ---------- Forwarded message ----------
> From: Jim Meyering <jim <at> meyering.net>
> To: Pádraig Brady <P <at> draigBrady.com>
> Cc: 11631-done <at> debbugs.gnu.org
> Date: Tue, 05 Jun 2012 22:35:13 +0200
> Subject: Re: bug#11631: Head command does not position file pointer correctly for negative line count
> Jim Meyering wrote:
> > Thanks, and thanks for the review.  Pushed.
>
> And with this message, I've closed the issue.
>
>
>
> ---------- Forwarded message ----------
> From: Anoop Sharma <sendtoanoop <at> gmail.com>
> To: bug-coreutils <at> gnu.org
> Cc:
> Date: Tue, 5 Jun 2012 15:07:19 +0530
> Subject: Head command does not position file pointer correctly for negative line count
> Head command does not position file pointer correctly for negative line count. Here is a demonstration of the problem.
>
> Step 1 - Create a file with 10 lines in it.
> $ yes "ABC" | head -c 40 >ip.txt
> $
>
> Step 2 - If head behaves correctly, then 2 lines should get printed after "------------" but nothing gets printed!
> $ (head -n -2; echo "------------------------"; cat) <ip.txt
> ABC
> ABC
> ABC
> ABC
> ABC
> ABC
> ABC
> ABC
> ------------------------
> $
>
> Step 3 - Another try fails. If head behaves correctly, then 8 lines should get printed after "------------" but nothing gets printed!
> $ (head -n -8; echo "------------------------"; cat) <ip.txt
> ABC
> ABC
> ------------------------
> $
>
>
>
> /*****************************************************************************************************************************/
> Possible cause of the defect -> Following snippet is copied from head.c (Function - elide_tail_lines_seekable ). Perhaps, there should be a lseek after fwrite there...:
>
>               /* Output the initial portion of the buffer
>                  in which we found the desired newline byte.
>                  Don't bother testing for failure for such a small amount.
>                  Any failure will be detected upon close.  */
>               fwrite (buffer, 1, n + 1, stdout);
> /*****************************************************************************************************************************/
>
>




Information forwarded to bug-coreutils <at> gnu.org:
bug#11631; Package coreutils. (Wed, 06 Jun 2012 08:05:02 GMT) Full text and rfc822 format available.

Message #40 received at 11631 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Anoop Sharma <sendtoanoop <at> gmail.com>
Cc: 11631 <at> debbugs.gnu.org
Subject: Re: bug#11631: closed (Re: bug#11631: Head command does not position
	file pointer correctly for negative line count)
Date: Wed, 06 Jun 2012 10:02:18 +0200
Anoop Sharma wrote:
> 1. The comment in code - "Don't bother testing for failure for such a
> small amount. Any failure will be detected upon close." may be
> re-looked too, since we are now lseeking after it.
>
> What if we change plain fwrite to:
>       if (fwrite (buffer, 1, n + 1, stdout) < (n + 1))
>             error (EXIT_FAILURE, errno, _("write error"));
>
> 2. Maybe using lseek with SEEK_CUR in place of SEEK_SET would reflect
> logic better.

Regarding #1, thanks for the suggestion.  You're right.

I've written a complete patch in your name, below, adding a commit
log and removing the unnecessary parentheses around "n + 1".
In general, I much prefer a patch to a suggestion like that.
The advantage to you is that if you describe precisely enough
(usually requires a patch), then you're listed as the author
of the commit.  If you also take the time to write a commit
log entry like I did below, then that's even better.
See the file, HACKING, for instructions on how to do that
as well as general contribution guidelines.

Regarding #2, please be precise and write/post a patch.


From 7fcfa754f1ff92d4fed1495a05574b8e47c3b4fb Mon Sep 17 00:00:00 2001
From: Anoop Sharma <sendtoanoop <at> gmail.com>
Date: Wed, 6 Jun 2012 09:42:09 +0200
Subject: [PATCH] head: diagnose write failure immediately, given new lseek
 use

Inserting the lseek call (commit v8.17-13-g295ee52) rendered
inaccurate the comment/code that refrained from diagnosing the
failure of the just-preceding fwrite.
* src/head.c (elide_tail_lines_seekable): Remove now-erroneous
comment, and diagnose fwrite failure.
---
 src/head.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/head.c b/src/head.c
index c13c064..ffe7c74 100644
--- a/src/head.c
+++ b/src/head.c
@@ -663,10 +663,9 @@ elide_tail_lines_seekable (const char *pretty_filename, int fd,
                 }

               /* Output the initial portion of the buffer
-                 in which we found the desired newline byte.
-                 Don't bother testing for failure for such a small amount.
-                 Any failure will be detected upon close.  */
-              fwrite (buffer, 1, n + 1, stdout);
+                 in which we found the desired newline byte.  */
+              if (fwrite (buffer, 1, n + 1, stdout) < n + 1)
+                error (EXIT_FAILURE, errno, _("write error"));

               /* Set file pointer to the byte after what we've output.  */
               if (lseek (fd, pos + n + 1, SEEK_SET) < 0)
--
1.7.11.rc1




Information forwarded to bug-coreutils <at> gnu.org:
bug#11631; Package coreutils. (Wed, 06 Jun 2012 11:55:02 GMT) Full text and rfc822 format available.

Message #43 received at 11631 <at> debbugs.gnu.org (full text, mbox):

From: Eric Blake <eblake <at> redhat.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: Anoop Sharma <sendtoanoop <at> gmail.com>, 11631 <at> debbugs.gnu.org
Subject: Re: bug#11631: closed (Re: bug#11631: Head command does not position
	file pointer correctly for negative line count)
Date: Wed, 06 Jun 2012 05:52:23 -0600
[Message part 1 (text/plain, inline)]
On 06/06/2012 02:02 AM, Jim Meyering wrote:

> +++ b/src/head.c
> @@ -663,10 +663,9 @@ elide_tail_lines_seekable (const char *pretty_filename, int fd,
>                  }
> 
>                /* Output the initial portion of the buffer
> -                 in which we found the desired newline byte.
> -                 Don't bother testing for failure for such a small amount.
> -                 Any failure will be detected upon close.  */
> -              fwrite (buffer, 1, n + 1, stdout);
> +                 in which we found the desired newline byte.  */
> +              if (fwrite (buffer, 1, n + 1, stdout) < n + 1)
> +                error (EXIT_FAILURE, errno, _("write error"));

Is testing for fwrite() sufficient?  Shouldn't you actually be testing
for fflush() errors, since fwrite() buffers the output and might not
actually encounter an error until it flushes?  Or even on NFS, where
fflush() may succeed but fclose() fails?  In other words, our atexit()
handler for detecting fclose() failure will already catch things; we may
still be in a situation where fwrite() succeeds, we then do lseek(), but
fclose() fails, in spite of our efforts.  I don't see how this patch
improves anything, other than earlier error reporting.

-- 
Eric Blake   eblake <at> redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

[signature.asc (application/pgp-signature, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#11631; Package coreutils. (Wed, 06 Jun 2012 13:04:01 GMT) Full text and rfc822 format available.

Message #46 received at 11631 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Anoop Sharma <sendtoanoop <at> gmail.com>
Cc: 11631 <at> debbugs.gnu.org
Subject: Re: bug#11631: closed (Re: bug#11631: Head command does not position
	file pointer correctly for negative line count)
Date: Wed, 06 Jun 2012 15:01:25 +0200
Jim Meyering wrote:
> Anoop Sharma wrote:
>> 1. The comment in code - "Don't bother testing for failure for such a
>> small amount. Any failure will be detected upon close." may be
>> re-looked too, since we are now lseeking after it.
>>
>> What if we change plain fwrite to:
>>       if (fwrite (buffer, 1, n + 1, stdout) < (n + 1))
>>             error (EXIT_FAILURE, errno, _("write error"));
...
> Regarding #1, thanks for the suggestion.  You're right.

Actually, the existing code is fine.
I'm glad I didn't push that change in your name.
It's not that it would introduce a bug or anything,
but the rationale was incorrect:

The fwrite affects only the output stream, stdout,
while the lseek operates on the input file descriptor,
so they are independent, and the comment is still valid.
The only possible overlap is errno, but if lseek fails,
we now exit immediately, so whether close_stdout has
an errno or not is irrelevant.




Information forwarded to bug-coreutils <at> gnu.org:
bug#11631; Package coreutils. (Wed, 06 Jun 2012 13:12:02 GMT) Full text and rfc822 format available.

Message #49 received at 11631 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Eric Blake <eblake <at> redhat.com>
Cc: Anoop Sharma <sendtoanoop <at> gmail.com>, 11631 <at> debbugs.gnu.org
Subject: Re: bug#11631: closed (Re: bug#11631: Head command does not position
	file pointer correctly for negative line count)
Date: Wed, 06 Jun 2012 15:09:42 +0200
Eric Blake wrote:
> On 06/06/2012 02:02 AM, Jim Meyering wrote:
>
>> +++ b/src/head.c
>> @@ -663,10 +663,9 @@ elide_tail_lines_seekable (const char *pretty_filename, int fd,
>>                  }
>>
>>                /* Output the initial portion of the buffer
>> -                 in which we found the desired newline byte.
>> -                 Don't bother testing for failure for such a small amount.
>> -                 Any failure will be detected upon close.  */
>> -              fwrite (buffer, 1, n + 1, stdout);
>> +                 in which we found the desired newline byte.  */
>> +              if (fwrite (buffer, 1, n + 1, stdout) < n + 1)
>> +                error (EXIT_FAILURE, errno, _("write error"));
>
> Is testing for fwrite() sufficient?  Shouldn't you actually be testing
> for fflush() errors, since fwrite() buffers the output and might not

Hi Eric,

There is no need (or desire) for explicit fflush here.
Relying on our atexit-invoked close_stdout is enough.

> actually encounter an error until it flushes?  Or even on NFS, where
> fflush() may succeed but fclose() fails?  In other words, our atexit()
> handler for detecting fclose() failure will already catch things; we may
> still be in a situation where fwrite() succeeds, we then do lseek(), but
> fclose() fails, in spite of our efforts.  I don't see how this patch
> improves anything, other than earlier error reporting.

Adding the fwrite test would be important solely if we were required
to diagnose-with-errno a failure that occurs when this fwrite actually
happens to perform a write syscall.  Without the new test, the fwrite
can fail, leading the eventual close_stdout call to emit the dumbed-down
diagnostic (no strerror part) that it must emit when ferror indicates
a previous error yet fclose does not fail.

It's a hard call.  This fwrite is not in a loop (it's printing only
a fraction of a read buffer), so the cost of the test is negligible,
but similarly, there's a relatively small risk that, assuming we'll get
a write failure, it will happen while printing this relatively small
amount of data.

Thanks for making me think more about it.
There are many other unchecked uses of fwrite in head.c,
and I cannot justify adding tests for all of them.

As you probably saw, I have retracted the proposed change.




Information forwarded to bug-coreutils <at> gnu.org:
bug#11631; Package coreutils. (Thu, 07 Jun 2012 14:40:08 GMT) Full text and rfc822 format available.

Message #52 received at 11631 <at> debbugs.gnu.org (full text, mbox):

From: Anoop Sharma <sendtoanoop <at> gmail.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: Eric Blake <eblake <at> redhat.com>, 11631 <at> debbugs.gnu.org
Subject: Re: bug#11631: closed (Re: bug#11631: Head command does not position
	file pointer correctly for negative line count)
Date: Thu, 7 Jun 2012 20:07:14 +0530
The thought behind the proposed change was that lseek should reflect
the amount of data that head has actually been able to print.

For example, how do we want head to behave in a situation like the
following where files more than a particular size are not allowed
(with bash shell on a machine with block size of 1024 bytes)? This
situation can be handled by applying this patch. I agree this example
is custom designed to illustrate my point but what do we gain by not
making the check?:

ulimit -f 1; trap '' SIGXFSZ
(stdbuf -o0 head -n -1025 >someOutFile; cat) <someIpFile

What should cat print now?

By detecting fwrite failure, we can increment file pointer by the
amount that was written successfully.
That was what I originally wanted to accomplish. However, I looked at
the existing implementation of head.c and found that a stock behavior
on fwrite failures was to exit and afraid to rock the boat too much, I
proposed that.

I agree that the checking for fwrite failure is not fool-proof. But it
looks better than ignoring the return value.


On Wed, Jun 6, 2012 at 6:39 PM, Jim Meyering <jim <at> meyering.net> wrote:
> Eric Blake wrote:
>> On 06/06/2012 02:02 AM, Jim Meyering wrote:
>>
>>> +++ b/src/head.c
>>> @@ -663,10 +663,9 @@ elide_tail_lines_seekable (const char *pretty_filename, int fd,
>>>                  }
>>>
>>>                /* Output the initial portion of the buffer
>>> -                 in which we found the desired newline byte.
>>> -                 Don't bother testing for failure for such a small amount.
>>> -                 Any failure will be detected upon close.  */
>>> -              fwrite (buffer, 1, n + 1, stdout);
>>> +                 in which we found the desired newline byte.  */
>>> +              if (fwrite (buffer, 1, n + 1, stdout) < n + 1)
>>> +                error (EXIT_FAILURE, errno, _("write error"));
>>
>> Is testing for fwrite() sufficient?  Shouldn't you actually be testing
>> for fflush() errors, since fwrite() buffers the output and might not
>
> Hi Eric,
>
> There is no need (or desire) for explicit fflush here.
> Relying on our atexit-invoked close_stdout is enough.
>
>> actually encounter an error until it flushes?  Or even on NFS, where
>> fflush() may succeed but fclose() fails?  In other words, our atexit()
>> handler for detecting fclose() failure will already catch things; we may
>> still be in a situation where fwrite() succeeds, we then do lseek(), but
>> fclose() fails, in spite of our efforts.  I don't see how this patch
>> improves anything, other than earlier error reporting.
>
> Adding the fwrite test would be important solely if we were required
> to diagnose-with-errno a failure that occurs when this fwrite actually
> happens to perform a write syscall.  Without the new test, the fwrite
> can fail, leading the eventual close_stdout call to emit the dumbed-down
> diagnostic (no strerror part) that it must emit when ferror indicates
> a previous error yet fclose does not fail.
>
> It's a hard call.  This fwrite is not in a loop (it's printing only
> a fraction of a read buffer), so the cost of the test is negligible,
> but similarly, there's a relatively small risk that, assuming we'll get
> a write failure, it will happen while printing this relatively small
> amount of data.
>
> Thanks for making me think more about it.
> There are many other unchecked uses of fwrite in head.c,
> and I cannot justify adding tests for all of them.
>
> As you probably saw, I have retracted the proposed change.




Information forwarded to bug-coreutils <at> gnu.org:
bug#11631; Package coreutils. (Thu, 07 Jun 2012 17:42:02 GMT) Full text and rfc822 format available.

Message #55 received at 11631 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Anoop Sharma <sendtoanoop <at> gmail.com>
Cc: Eric Blake <eblake <at> redhat.com>, 11631 <at> debbugs.gnu.org
Subject: Re: bug#11631: closed (Re: bug#11631: Head command does not position
	file pointer correctly for negative line count)
Date: Thu, 07 Jun 2012 19:38:58 +0200
Anoop Sharma wrote:
> The thought behind the proposed change was that lseek should reflect
> the amount of data that head has actually been able to print.
>
> For example, how do we want head to behave in a situation like the
> following where files more than a particular size are not allowed
> (with bash shell on a machine with block size of 1024 bytes)? This
> situation can be handled by applying this patch. I agree this example
> is custom designed to illustrate my point but what do we gain by not
> making the check?:
>
> ulimit -f 1; trap '' SIGXFSZ
> (stdbuf -o0 head -n -1025 >someOutFile; cat) <someIpFile
>
> What should cat print now?
>
> By detecting fwrite failure, we can increment file pointer by the
> amount that was written successfully.
> That was what I originally wanted to accomplish. However, I looked at
> the existing implementation of head.c and found that a stock behavior
> on fwrite failures was to exit and afraid to rock the boat too much, I
> proposed that.
>
> I agree that the checking for fwrite failure is not fool-proof. But it
> looks better than ignoring the return value.

While head is ignoring that return value,
it is not really ignoring the failure.  That would be a bug.

Rather, head is relying on the fact that the stream records the failure,
and that our atexit-invoked close_stdout function will detect the prior
failure (via ferror(stdout)) and diagnose it.

In practice, testing for fwrite failure will make no difference,
other than adding a small amount to the size of "head".

Regarding your example, what you've done above (turning off buffering)
is very unusual.  That doesn't seem like a case worth catering to.
And besides, since in general we don't know how much a failing fwrite
function has actually written, there can be no guarantee that the input
stream position somehow reflects what was written.




Information forwarded to bug-coreutils <at> gnu.org:
bug#11631; Package coreutils. (Thu, 07 Jun 2012 17:48:02 GMT) Full text and rfc822 format available.

Message #58 received at 11631 <at> debbugs.gnu.org (full text, mbox):

From: Eric Blake <eblake <at> redhat.com>
To: Anoop Sharma <sendtoanoop <at> gmail.com>
Cc: Jim Meyering <jim <at> meyering.net>, 11631 <at> debbugs.gnu.org
Subject: Re: bug#11631: closed (Re: bug#11631: Head command does not position
	file pointer correctly for negative line count)
Date: Thu, 07 Jun 2012 11:45:27 -0600
[Message part 1 (text/plain, inline)]
[please don't top-post on technical lists]

On 06/07/2012 08:37 AM, Anoop Sharma wrote:
> The thought behind the proposed change was that lseek should reflect
> the amount of data that head has actually been able to print.

But that's not generically possible to know.

> 
> For example, how do we want head to behave in a situation like the
> following where files more than a particular size are not allowed
> (with bash shell on a machine with block size of 1024 bytes)? This
> situation can be handled by applying this patch. I agree this example
> is custom designed to illustrate my point but what do we gain by not
> making the check?:
> 
> ulimit -f 1; trap '' SIGXFSZ
> (stdbuf -o0 head -n -1025 >someOutFile; cat) <someIpFile
> 
> What should cat print now?

Bogus question.  That's a bug in your shell scripting - if you are
worried about partial processing errors, then you must check all
intermediate steps:

ulimit -f 1; trap '' SIGXFSZ
(stdbuf -o0 head -n -1025 >someOutFile && cat) <someIpFile

That is, you should have used && rather than ;, so that cat prints
nothing on error.

-- 
Eric Blake   eblake <at> redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

[signature.asc (application/pgp-signature, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#11631; Package coreutils. (Fri, 08 Jun 2012 20:09:01 GMT) Full text and rfc822 format available.

Message #61 received at 11631 <at> debbugs.gnu.org (full text, mbox):

From: Anoop Sharma <sendtoanoop <at> gmail.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: Eric Blake <eblake <at> redhat.com>, 11631 <at> debbugs.gnu.org
Subject: Re: bug#11631: closed (Re: bug#11631: Head command does not position
	file pointer correctly for negative line count)
Date: Sat, 9 Jun 2012 01:35:46 +0530
On Thu, Jun 7, 2012 at 11:08 PM, Jim Meyering <jim <at> meyering.net> wrote:
> Anoop Sharma wrote:
>> The thought behind the proposed change was that lseek should reflect
>> the amount of data that head has actually been able to print.
>>
>> For example, how do we want head to behave in a situation like the
>> following where files more than a particular size are not allowed
>> (with bash shell on a machine with block size of 1024 bytes)? This
>> situation can be handled by applying this patch. I agree this example
>> is custom designed to illustrate my point but what do we gain by not
>> making the check?:
>>
>> ulimit -f 1; trap '' SIGXFSZ
>> (stdbuf -o0 head -n -1025 >someOutFile; cat) <someIpFile
>>
>> What should cat print now?
>>
>> By detecting fwrite failure, we can increment file pointer by the
>> amount that was written successfully.
>> That was what I originally wanted to accomplish. However, I looked at
>> the existing implementation of head.c and found that a stock behavior
>> on fwrite failures was to exit and afraid to rock the boat too much, I
>> proposed that.
>>
>> I agree that the checking for fwrite failure is not fool-proof. But it
>> looks better than ignoring the return value.
>
> While head is ignoring that return value,
> it is not really ignoring the failure.  That would be a bug.
>
> Rather, head is relying on the fact that the stream records the failure,
> and that our atexit-invoked close_stdout function will detect the prior
> failure (via ferror(stdout)) and diagnose it.
>
> In practice, testing for fwrite failure will make no difference,
> other than adding a small amount to the size of "head".
>
> Regarding your example, what you've done above (turning off buffering)
> is very unusual.  That doesn't seem like a case worth catering to.
> And besides, since in general we don't know how much a failing fwrite
> function has actually written, there can be no guarantee that the input
> stream position somehow reflects what was written.

Thank you. I get it now.




Information forwarded to bug-coreutils <at> gnu.org:
bug#11631; Package coreutils. (Fri, 08 Jun 2012 20:12:01 GMT) Full text and rfc822 format available.

Message #64 received at 11631 <at> debbugs.gnu.org (full text, mbox):

From: Anoop Sharma <sendtoanoop <at> gmail.com>
To: Eric Blake <eblake <at> redhat.com>
Cc: Jim Meyering <jim <at> meyering.net>, 11631 <at> debbugs.gnu.org
Subject: Re: bug#11631: closed (Re: bug#11631: Head command does not position
	file pointer correctly for negative line count)
Date: Sat, 9 Jun 2012 01:38:57 +0530
On Thu, Jun 7, 2012 at 11:15 PM, Eric Blake <eblake <at> redhat.com> wrote:
> [please don't top-post on technical lists]
>
> On 06/07/2012 08:37 AM, Anoop Sharma wrote:
>> The thought behind the proposed change was that lseek should reflect
>> the amount of data that head has actually been able to print.
>
> But that's not generically possible to know.
>
>>
>> For example, how do we want head to behave in a situation like the
>> following where files more than a particular size are not allowed
>> (with bash shell on a machine with block size of 1024 bytes)? This
>> situation can be handled by applying this patch. I agree this example
>> is custom designed to illustrate my point but what do we gain by not
>> making the check?:
>>
>> ulimit -f 1; trap '' SIGXFSZ
>> (stdbuf -o0 head -n -1025 >someOutFile; cat) <someIpFile
>>
>> What should cat print now?
>
> Bogus question.  That's a bug in your shell scripting - if you are
> worried about partial processing errors, then you must check all
> intermediate steps:
>
> ulimit -f 1; trap '' SIGXFSZ
> (stdbuf -o0 head -n -1025 >someOutFile && cat) <someIpFile
>
> That is, you should have used && rather than ;, so that cat prints
> nothing on error.
>
> --
> Eric Blake   eblake <at> redhat.com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>

Thank you. I understand your perspective now.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 07 Jul 2012 11:24:03 GMT) Full text and rfc822 format available.

This bug report was last modified 13 years and 45 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.