GNU bug report logs - #61300
wc -c doesn't advance stdin position when it's a regular file

Package: coreutils

Reported by: Stephane Chazelas <stephane <at> chazelas.org>

Date: Sun, 5 Feb 2023 18:28:02 UTC

Severity: normal

From: Pádraig Brady <P <at> draigBrady.com>
To: Stephane Chazelas <stephane <at> chazelas.org>, 61300 <at> debbugs.gnu.org
Subject: bug#61300: wc -c doesn't advance stdin position when it's a regular file
Date: Sun, 5 Feb 2023 19:59:58 +0000
On 05/02/2023 18:27, Stephane Chazelas wrote:
> "wc -c" without filename arguments is meant to read stdin until
> EOF and report the number of bytes it has read.
> 
> When stdin is on a regular file, GNU wc has that optimisation
> whereby it skips the reading, does a pos = lseek(0,0,SEEK_CUR)
> to find out its current position within the file, fstat(0) and
> reports st_size - pos (assuming st_size > pos).
> 
> However, it does not move the position to the end of the file.
> That means for instance that:
> 
> $ echo test > file
> $ { wc -c; wc -c; } < file
> 5
> 5
> 
> instead of 5, then 0. Similarly, a following cat still sees the data:
> 
> $ { wc -c; cat; } < file
> 5
> test
> 
> So the optimisation is incomplete.
> 
> It also reports the size of the file even if it could not possibly read it
> because it's not open in read mode:
> 
> $ { wc -c; } 0>> file
> 5
> 
> IMO, it should only do the optimisation if:
> - fcntl(F_GETFL) shows that the file is open in O_RDONLY or O_RDWR mode,
> - the current checks for /proc- and /sys-like filesystems pass,
> - pos < st_size, and
> - lseek(0, st_size, SEEK_SET) is successful.
> 
> (that leaves a race window above where it could move the cursor
> backward, but I would think that can be ignored as if something
> else reads at the same time, there's not much we can expect
> anyway).
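
For illustration, the conditions listed above could be sketched in C roughly as follows. This is a minimal sketch and not the code from wc.c or from the attached patch: the function name count_bytes_by_seeking is hypothetical, and the existing /proc and /sys check is deliberately not reproduced.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not taken from wc.c.
   Try to count the bytes remaining on FD without reading them.
   On success, store the count in *COUNT, leave the file offset at
   st_size and return true.  Return false when the caller should
   fall back to counting with read().  */
bool
count_bytes_by_seeking (int fd, off_t *count)
{
  assert (count != NULL);

  /* The descriptor must be open for reading.  */
  int flags = fcntl (fd, F_GETFL);
  if (flags < 0)
    return false;
  int mode = flags & O_ACCMODE;
  if (mode != O_RDONLY && mode != O_RDWR)
    return false;

  /* st_size is only meaningful here for regular files.
     (The /proc and /sys special-casing is omitted in this sketch.)  */
  struct stat st;
  if (fstat (fd, &st) != 0 || !S_ISREG (st.st_mode))
    return false;

  /* Only shortcut when there is something left before st_size.  */
  off_t pos = lseek (fd, 0, SEEK_CUR);
  if (pos < 0 || pos >= st.st_size)
    return false;

  /* Advance the offset as a real read-to-EOF would have done.  */
  if (lseek (fd, st.st_size, SEEK_SET) < 0)
    return false;

  *count = st.st_size - pos;
  return true;
}
```

With a helper shaped like this, a second call on the same descriptor returns false, so the caller falls back to read(), which reports 0 bytes at EOF.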

Yes, I agree.

Adjusting the offset would also avoid the following inconsistencies:

$ { wc -c; wc -c; } < file
5
5

$ { wc -l; wc -l; } < file
1
0

$ truncate -s $(getconf PAGESIZE) file
$ { wc -c; wc -c; } < file
4096
0

Hopefully the attached addresses this.
Note it doesn't add the constraint on the input being readable,
which I'll think a bit more about.
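
As background for the examples above: commands grouped as { cmd1; cmd2; } < file inherit the same open file description, so whatever offset the first command leaves behind is where the second one starts. The following is a minimal C sketch of that sharing, assuming a POSIX system; the function name offset_seen_after_child is hypothetical and is not part of coreutils.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo helper: open PATH, let a child process seek to
   the end of the file, and return the offset that the parent then
   observes on its own descriptor (-1 on error).  */
off_t
offset_seen_after_child (const char *path)
{
  assert (path != NULL);

  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return -1;

  off_t seen = -1;
  pid_t pid = fork ();
  if (pid == 0)
    {
      /* Child: the inherited descriptor refers to the same open file
         description, so this seek moves the parent's offset too.  */
      _exit (lseek (fd, 0, SEEK_END) < 0 ? 1 : 0);
    }
  if (pid > 0)
    {
      int status;
      if (waitpid (pid, &status, 0) == pid
          && WIFEXITED (status) && WEXITSTATUS (status) == 0)
        seen = lseek (fd, 0, SEEK_CUR);
    }

  close (fd);
  return seen;
}
```

For a 5-byte file the parent observes offset 5 after the child exits, which is why a command that counts bytes without moving the offset leaves the next command in the group starting from the old position.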

cheers,
Pádraig
[wc-update-offset.patch (text/x-patch, attachment)]

This bug report was last modified 2 years and 135 days ago.
