GNU bug report logs - #18621
[BUG] wc -c incorrectly counts bytes in /sys


Package: coreutils;

Reported by: George Shuklin <george.shuklin <at> gmail.com>

Date: Fri, 3 Oct 2014 15:13:01 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


From: Pádraig Brady <P <at> draigBrady.com>
To: George Shuklin <george.shuklin <at> gmail.com>
Cc: 18621 <at> debbugs.gnu.org
Subject: bug#18621: [BUG] wc -c incorrectly counts bytes in /sys
Date: Fri, 03 Oct 2014 17:48:20 +0100
On 10/03/2014 03:47 PM, George Shuklin wrote:
> There are many sysfs (Linux) attributes that are reported as '4k files' but contain just a few bytes.
> 
> wc file and wc -c show different sizes.
> 
> Example:
> 
> $cat /sys/kernel/vmcoreinfo
> 1b74c00 1024
> 
> $hexdump -Cv /sys/kernel/vmcoreinfo
> 00000000  31 62 37 34 63 30 30 20  31 30 32 34 0a           |1b74c00 1024.|
> 0000000d
> 
> $ls -la /sys/kernel/vmcoreinfo
> -r--r--r-- 1 root root 4096 Oct  3 17:40 /sys/kernel/vmcoreinfo
> 
> 
> Here is the wc output:
> 
> $ wc /sys/kernel/vmcoreinfo
>    1    2   13 /sys/kernel/vmcoreinfo
> 
> and wc -c:
> 
> $ wc -c /sys/kernel/vmcoreinfo
> 4096 /sys/kernel/vmcoreinfo
> 
> 4096 is not 13, and the manual page for wc says that the third number is the byte count.
> 
> I think the problem is in the cnt(const char *file) function:
> 
>     if (dochar || domulti) {
>         if (fstat(fd, &sb)) {
>             warn("%s: fstat", file);
>             (void)close(fd);
>             return (1);
>         }
>         if (S_ISREG(sb.st_mode)) {
>             (void)printf(" %7lld", (long long)sb.st_size);
>             tcharct += sb.st_size;
>             (void)close(fd);
>             return (0);
>         }
>     }

I'm not sure where the above code comes from,
but coreutils trunk has the same behavior with these files.
We could avoid it with the following patch, which trusts st_size
only when the file has allocated blocks (st_blocks is nonzero).
Note that "real" small files that take up no blocks in the file
system will now incur a redundant read; however, that only happens
for small files, so it shouldn't be problematic.

thanks,
Pádraig.

diff --git a/src/wc.c b/src/wc.c
index 1ff007d..bf1ce76 100644
--- a/src/wc.c
+++ b/src/wc.c
@@ -235,6 +235,7 @@ wc (int fd, char const *file_x, struct fstatus *fstatus)
         fstatus->failed = fstat (fd, &fstatus->st);

       if (! fstatus->failed && S_ISREG (fstatus->st.st_mode)
+          && fstatus->st.st_blocks
           && (current_pos = lseek (fd, 0, SEEK_CUR)) != -1
           && (end_pos = lseek (fd, 0, SEEK_END)) != -1)
         {
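
For readers who want to try the idea outside of wc.c, here is a minimal standalone sketch of the same heuristic. It is not the coreutils code: count_bytes and count_by_reading are hypothetical helper names, and the sketch uses st_size directly on a freshly opened file instead of the lseek arithmetic that wc.c performs. It trusts st_size only for a regular file with a nonzero st_blocks and otherwise reads to EOF.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Count bytes by reading FD to EOF.  Return -1 on a read error.  */
static long long
count_by_reading (int fd)
{
  char buf[65536];
  long long total = 0;
  ssize_t n;

  while ((n = read (fd, buf, sizeof buf)) > 0)
    total += n;
  return n < 0 ? -1 : total;
}

/* Hypothetical helper (an assumption for illustration, not part of
   coreutils): return the byte count of PATH, or -1 on error.
   st_size is trusted only for regular files that have allocated
   blocks; sysfs-style files report st_blocks == 0, so they fall
   through to an actual read.  */
long long
count_bytes (char const *path)
{
  struct stat st;
  long long result;
  int fd = open (path, O_RDONLY);

  if (fd < 0)
    return -1;

  if (fstat (fd, &st) == 0 && S_ISREG (st.st_mode) && st.st_blocks != 0)
    result = st.st_size;
  else
    result = count_by_reading (fd);

  close (fd);
  return result;
}
```

Calling count_bytes ("/sys/kernel/vmcoreinfo") on a system like the reporter's would be expected to take the read path, assuming sysfs reports zero allocated blocks for that file, as the patch presumes.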





This bug report was last modified 10 years and 289 days ago.


