From unknown Fri Aug 15 03:56:42 2025 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Mailer: MIME-tools 5.509 (Entity 5.509) Content-Type: text/plain; charset=utf-8 From: bug#37093 <37093@debbugs.gnu.org> To: bug#37093 <37093@debbugs.gnu.org> Subject: Status: wc runs 100% cpu when in pipeline or tee >(wc) Reply-To: bug#37093 <37093@debbugs.gnu.org> Date: Fri, 15 Aug 2025 10:56:42 +0000 retitle 37093 wc runs 100% cpu when in pipeline or tee >(wc) reassign 37093 coreutils submitter 37093 Edward Huff severity 37093 normal tag 37093 notabug thanks From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 20 00:44:49 2019 Received: (at submit) by debbugs.gnu.org; 20 Aug 2019 04:44:49 +0000 Received: from localhost ([127.0.0.1]:60760 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hzw0m-0007Fz-PC for submit@debbugs.gnu.org; Tue, 20 Aug 2019 00:44:49 -0400 Received: from lists.gnu.org ([209.51.188.17]:35134) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hzw0l-0007Fp-GP for submit@debbugs.gnu.org; Tue, 20 Aug 2019 00:44:48 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54751) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1hzw0k-00082f-1v for bug-coreutils@gnu.org; Tue, 20 Aug 2019 00:44:47 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,FREEMAIL_FROM, HTML_MESSAGE autolearn=disabled version=3.3.2 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hzw0i-0001xV-RM for bug-coreutils@gnu.org; Tue, 20 Aug 2019 00:44:45 -0400 Received: from mail-io1-xd29.google.com ([2607:f8b0:4864:20::d29]:45586) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hzw0i-0001x3-Kw for bug-coreutils@gnu.org; Tue, 20 Aug 2019 00:44:44 -0400 Received: by mail-io1-xd29.google.com with SMTP id 
t3so9364771ioj.12 for ; Mon, 19 Aug 2019 21:44:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:from:date:message-id:subject:to; bh=KOCKzhJdE7u+8N1E0js48G/YlCO4LtPDac+VAmHB9PE=; b=QHHE8PSDLxhD6T5keyPUBY5dXNCvSAEt75wEb7+PV57GgQ8hOplJo6rQGaNxa8vVzR 2vZKhpyU8qznjFPecOgcthYAJm4Vd88nmiNwIBDTiDe5IafCAkumVUMwUIEEaQcSe6nh 7qGkKJ8wy4KI/tF494/jsGnzAXxxUT0gITxhsu6omExwAPusycUwgExO8VAZB3AV/ysw 2DDJRoJrrIi0UMQvcQipnvW5o+3F9HsVqtaTu1xvmB+w0sC4EGx4Z/0e/M81lS8+HlfS X0IKOuQn5kFeqp7Qf8bGa3CK9YmbpvHL8g8S2I+s4BX02XZFEBlOxWok4H86NUAZIuEw V1Ug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=KOCKzhJdE7u+8N1E0js48G/YlCO4LtPDac+VAmHB9PE=; b=HMGx9L3PZ9X65Cs5gGrv/htabGdTZGY5LZT0zGIJaBqj7r7VGJsgkHundtGRDokglg hthHqBxOwdv+WS5MTqCQkMyMTaXF0Rwc1c0ImgYaEqD2kHtsBeCGrENUrRI/hftnmWLS yMtacq43Yl8x3L+jSj6uuKvQi6rM/VPh68una3Fd7nXRSNQQgULUNs8u2nXQPYFbJ2Cw rvQAb/EH8CjMDQVdng11wemg9vNRo72cZqQvl7FlcJPXnJL/etC9M3eiRtkdBOXyWDSh fPSCCVOqmU3J1tMUuFDPAKj+WqJyEkCPs2MAUr2Twdx1zPG7C4FjJaVijvJeH2AwixbP JghA== X-Gm-Message-State: APjAAAUTbdR9/1Y/gOAItsHw4U+NLkpuwUNY66Y9wZoUD4UvYRap7btN tQnxMFKNP49pkuhJzos0x9759p914toDS7yhFoQr7cE= X-Google-Smtp-Source: APXvYqxBeshwAnmNQZmRDsomv+hgJOu41BldDlB1bPCuUzETQmezkCxFn6VCGliNSa2vxQUJVo5MHbumAvzISy9gozU= X-Received: by 2002:a02:77d0:: with SMTP id g199mr1534764jac.140.1566276282512; Mon, 19 Aug 2019 21:44:42 -0700 (PDT) MIME-Version: 1.0 From: Edward Huff Date: Tue, 20 Aug 2019 00:44:30 -0400 Message-ID: Subject: wc runs 100% cpu when in pipeline or tee >(wc) To: bug-coreutils@gnu.org Content-Type: multipart/alternative; boundary="000000000000707aa40590851dac" X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:4864:20::d29 X-Spam-Score: -1.3 (-) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.3 (--) --000000000000707aa40590851dac Content-Type: text/plain; charset="UTF-8" In the demo below, dd uses 0.665s to write 1GiB of zeros. sha256sum uses 4.285s to calculate the sha256 of 1GiB of zeros. wc uses 32.160s to count 1GiB of zeros. Linux localhost 5.2.8-200.fc30.x86_64 #1 SMP Sat Aug 10 13:21:39 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux coreutils-8.31-2.fc30.x86_64 dd (coreutils) 8.31 wc (GNU coreutils) 8.31 sha256sum (GNU coreutils) 8.31 baseline results: $ dd if=/dev/zero count=$((1024*1024)) bs=1024 | tee >(sha256sum>&2) | wc 1048576+0 records in 1048576+0 records out 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 32.5007 s, 33.0 MB/s 49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14 - 0 0 1073741824 $ Demo script: $ cat wc-at-100pct #!/bin/bash set -xv rm pipe* tee* { { time dd if=/dev/zero count=$((1024*1024)) bs=1024 } 2>>pipedd } | { tee >( { time sha256sum } >teesha256 2>&1 ) } | { { time wc } > pipewc 2>&1 } { { time dd if=/dev/zero count=$((1024*1024)) bs=1024 } 2>>pipedd } | { tee >( { time wc } >teewc 2>&1 ) } | { { time sha256sum } > pipesha256sum 2>&1 } head pipe* tee* $ Results: ./wc-at-100pct rm pipe* tee* + rm pipedd pipesha256sum pipewc teesha256 teewc { { time dd if=/dev/zero count=$((1024*1024)) bs=1024 } 2>>pipedd } | { tee >( { time sha256sum } >teesha256 2>&1 ) } | { { time wc } > pipewc 2>&1 } + tee /dev/fd/63 { { time dd if=/dev/zero count=$((1024*1024)) bs=1024 } 2>>pipedd } | { tee >( { time wc } >teewc 2>&1 ) } | { { time sha256sum } > pipesha256sum 2>&1 } + tee /dev/fd/63 head pipe* tee* + head pipedd pipesha256sum pipewc teesha256 teewc ==> pipedd <== + dd if=/dev/zero 
count=1048576 bs=1024 1048576+0 records in 1048576+0 records out 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 32.5495 s, 33.0 MB/s real 0m32.550s user 0m0.665s sys 0m1.503s + dd if=/dev/zero count=1048576 bs=1024 1048576+0 records in ==> pipesha256sum <== + sha256sum 49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14 - real 0m31.457s user 0m4.285s sys 0m0.562s ==> pipewc <== + wc 0 0 1073741824 real 0m32.555s user 0m32.160s sys 0m0.247s ==> teesha256 <== ++ sha256sum 49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14 - real 0m32.553s user 0m4.333s sys 0m0.704s ==> teewc <== ++ wc 0 0 1073741824 real 0m31.456s user 0m31.121s sys 0m0.221s --000000000000707aa40590851dac Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
--000000000000707aa40590851dac-- From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 20 02:34:54 2019 Received: (at 37093) by debbugs.gnu.org; 20 Aug 2019 06:34:54 +0000 Received: from localhost ([127.0.0.1]:60804 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hzxjK-0001aj-Ac for submit@debbugs.gnu.org; Tue, 20 Aug 2019 02:34:54 -0400 Received: from mout.kundenserver.de ([212.227.126.187]:55919) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hzxjH-0001aT-TX for 37093@debbugs.gnu.org; Tue, 20 Aug 2019 02:34:52 -0400 Received: from [192.168.101.10] ([91.1.217.117]) by mrelayeu.kundenserver.de (mreue011 [212.227.15.167]) with ESMTPSA (Nemesis) id 1MTzve-1hqicD0qIs-00R0Oa; Tue, 20 Aug 2019 08:34:43 +0200 Subject: Re: bug#37093: wc runs 100% cpu when in pipeline or tee >(wc) To: Edward Huff , 37093@debbugs.gnu.org References: From: Bernhard Voelker Openpgp: preference=signencrypt Autocrypt: addr=mail@bernhard-voelker.de; prefer-encrypt=mutual; keydata= mQENBFPirzMBCACyzYldTjQ4ufFOkByY5Nn5USb5GFoL48nWBwNHjd9KUbtRRNlQiPNKd6hK Gvd3BGi5aoFKA4ytfRk6jbAbW3jVb3R8wYaV08mOy4KVEKxqN4bxsXlMjNChXVR+rtKDmfI+ oPTL+cPH2X6gW4W02IRbVw0uUhNm6zEedC/gNrY/mTlf1enZ46jxZ7BTUZaG+kx38UMISIMB zSzLRtdkwgmHj4jS3p1fF2cwRqLclIfMjKGpbNFPEXeXKWrCLcqHw78795eAR9q0YvrDkfIn GdDBwfb3VM4NdulwIFzvYZMSXvSbbyPLB5YkHU5aAWQHUse4WlfT5ccDpbzUYldRAvF9ABEB AAG0K0Jlcm5oYXJkIFZvZWxrZXIgPG1haWxAYmVybmhhcmQtdm9lbGtlci5kZT6JATkEEwEC ACMFAlPirzMCGwMHCwkIBwMCAQYVCAIJCgsEFgIDAQIeAQIXgAAKCRBGUC73lpFxle5wCACC dbs0QaJ0vR3Sff2cKdTk41rUq3YfWngsR///IOU0C5DdkePmCnJE/lUsUy0LRTxcUDLxQR+x QHU8ssRT0JUO9726dI3miy36UdsgmBYaOtLvQcidGmW1R7o0PYYf04+TFtyqKgngOUBPpMgR 6o4UsQxy/OD4bN1WDqOgIjL+D/qJpkKmgp6L6+hhaBCpiOFKRmmV7YyQ3SqVlfQNiHs5ZtkR nXpIjgZARV+GllKucI17bO0CGmTJZ1tstVy0+W3DQT1lbBkTTc++5LONM99D3jjn23l1ocOp folR53F7I4cb2RNfT23v1I59RH37lB9wMOqrKj0UjYAC2YoPGQ3BuQENBFPirzMBCADXLWWp QihBldY6reca8ZKdc3T9qXEOa3akE3DWKztIBmNJhtYOjmpLYajQTkGa7UoJTnbmZE2Rn6ZE 
oNnvb0gcFNAIcY95KOI+bjOR8HEgh4cx2REXh6L6olIgyXqt/KFusE4wtVZAFxZl+30HzN6n D+1HvrjXxPJRX6MsIYOYyyX9/6OofwJK6QHODYGp8WL2olHDnmsXg4AT6Wlr7qKpKrQELlcF R4xkvdmgL/Ghw/tK0yJTxMIcewCCZWLPOXRmFRbvAadZWPAgVsJ63siNyUlVnVMSzDgTJl+s l/DMabXpqrJQx3/1Yy6mTaDs3XZT/wmBKaTLXx/LByaPxQQ7ABEBAAGJAR8EGAECAAkFAlPi rzMCGwwACgkQRlAu95aRcZWVPwgAqZT6iTXkoP37wYb41323RzhBcJ8JSk4cyBDBUXX0lMrM 3qhiClKG7phpxVdu817Gwc6Hsecg7FfjQAV8MHQ0ZFeEFdk3b2rKBqfsStc+h49/xF3Fb+if CzR9qeQF82fMSxkg18++7hMcHCMO/hPZ/Q0xRi+lrSr2QKDJQuLzSyVU14TxrCkevZjEhtma VNvcJlJzCbiBXee9Fpc5jITUXPFG8E8dxqo1n+duOyIMgozrAnzP7X5V/Ob/Ozf/aGGX9+Jd inyfCX18nWcHALKMU/36Eua/ylalf/2c2YkBp9KCLVmGgPkUgW52EeRPgroIsiwu+rwCSV6Z UyCJ+OymCg== Message-ID: <30e1d649-a044-d408-3509-71dc792e863c@bernhard-voelker.de> Date: Tue, 20 Aug 2019 08:34:41 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-Provags-ID: V03:K1:0kH/7aspCc7rteg3gG3OFbJ3/x+VK0VlCIbpBW/bIhB1Shdzz0M eM1u13RutuALPAsrFZ3b5Y9S/tE5bxgnYGZaRzZwK4kmY6VFPKHSI78ioowCLEO63V1I2es n5ylDteALbteoqn4G7I7D+1TKef+FtPn3SKbtvPqy9O3plwTJXycZ5ZUMtouu92sd/tMx6O XFOQFoFYkYVJif3mYWlgA== X-Spam-Flag: NO X-UI-Out-Filterresults: notjunk:1;V03:K0:jSpdKk+dmIA=:x0QGMfk59t5DelRZy1BgLx 4C5Xbpx42ITrZqK51v+FfGLv/BfDzkWCqL+2vidCXwL6Exk/deKEOw1d8z6J9oW7PiLHh5+/o f+JKGdogipx5JMZZZQvO+B8KjQJY77mp+w/AFeWGmMfM1fOj2xZ+fmJQo3KGOqeiyW2/2jji8 xhisf52Tx8ynNYzt++p/GZ6f45BciCYQgs42C/dN3Nv0UB5yfci547vF7m9cErXc9CXrzu8f6 bt6rN+iG/Z/U6LZw5S5mj3eDaf6Eswpb8CdAQX6/0e1vGlwxn/UOwLIQjpsyB6wCdkea4eAFT 4VmV+AXnrafoXogcUy8tO3SqI4FqNas3dYS8HJhQvDs1gt40FR+OygPgx0YzcRR5/gnqcirzE J6m87P79fAokZhiHllTRmpB7zg85SJoyylNI+9wZa+waAqrTfBzNCtsJ089D1v7jcqsTgROoU 0gL3HL9oSz1pCUyznRnWAj/h3aYOoTVb7nyeCNBEHGxYCHbrKdVt0oiezA3MAzQGjsXYmlujT pABo/r+jo/dnPdkcsFYnQk7dCkFHv+y/Y7cNSU2o5YQUnEAS4w3OmCuB8teFBU1jSkjfkGzpF pks0+80b6EKlodtrTe7UIn/DA5Cv9TS/HdkLgCwc8mNhGSN3uvtbHntKU5LzEH4xyyErYmlYU 
VWR0ZCtLGd1xsiszJ9mzFr1edleee8ca6LoCq2o9ygINgayKoq2jA/JFT/4WOOex5a0nF7TDQ hT80+X+FCOirYG5XgV5W7B3x/VmMVJMK7Vu30boYi57hD6wH3fe/DFYqYK/Ra5xVP3yDPveIC 6wb6KS1 X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 37093 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) On 8/20/19 6:44 AM, Edward Huff wrote: > In the demo below, dd uses 0.665s to write 1GiB of zeros. > sha256sum uses 4.285s to calculate the sha256 of 1GiB of zeros. > wc uses 32.160s to count 1GiB of zeros. > > Linux localhost 5.2.8-200.fc30.x86_64 #1 SMP Sat Aug 10 13:21:39 UTC 2019 > x86_64 x86_64 x86_64 GNU/Linux > coreutils-8.31-2.fc30.x86_64 > dd (coreutils) 8.31 > wc (GNU coreutils) 8.31 > sha256sum (GNU coreutils) 8.31 > > baseline results: > $ dd if=/dev/zero count=$((1024*1024)) bs=1024 | tee >(sha256sum>&2) | wc > 1048576+0 records in > 1048576+0 records out > 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 32.5007 s, 33.0 MB/s > 49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14 - > 0 0 1073741824 > $ > > Demo script: > $ cat wc-at-100pct > #!/bin/bash > set -xv > rm pipe* tee* > { > { > time dd if=/dev/zero count=$((1024*1024)) bs=1024 > } 2>>pipedd > } | { > tee >( > { > time sha256sum > } >teesha256 2>&1 > ) > } | { > { > time wc > } > pipewc 2>&1 > } > > { > { > time dd if=/dev/zero count=$((1024*1024)) bs=1024 > } 2>>pipedd > } | { > tee >( > { > time wc > } >teewc 2>&1 > ) > } | { > { > time sha256sum > } > pipesha256sum 2>&1 > } > > head pipe* tee* > $ > > Results: > ./wc-at-100pct > rm pipe* tee* > + rm pipedd pipesha256sum pipewc teesha256 teewc > { > { > time dd if=/dev/zero count=$((1024*1024)) bs=1024 > } 2>>pipedd > } | { > tee >( > { > time sha256sum > } >teesha256 2>&1 > ) > } | { > { > time wc > } > pipewc 2>&1 > } > + tee /dev/fd/63 > > { > { > time dd 
if=/dev/zero count=$((1024*1024)) bs=1024 > } 2>>pipedd > } | { > tee >( > { > time wc > } >teewc 2>&1 > ) > } | { > { > time sha256sum > } > pipesha256sum 2>&1 > } > + tee /dev/fd/63 > > head pipe* tee* > + head pipedd pipesha256sum pipewc teesha256 teewc > ==> pipedd <== > + dd if=/dev/zero count=1048576 bs=1024 > 1048576+0 records in > 1048576+0 records out > 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 32.5495 s, 33.0 MB/s > > real 0m32.550s > user 0m0.665s > sys 0m1.503s > + dd if=/dev/zero count=1048576 bs=1024 > 1048576+0 records in > > ==> pipesha256sum <== > + sha256sum > 49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14 - > > real 0m31.457s > user 0m4.285s > sys 0m0.562s > > ==> pipewc <== > + wc > 0 0 1073741824 > > real 0m32.555s > user 0m32.160s > sys 0m0.247s > > ==> teesha256 <== > ++ sha256sum > 49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14 - > > real 0m32.553s > user 0m4.333s > sys 0m0.704s > > ==> teewc <== > ++ wc > 0 0 1073741824 > > real 0m31.456s > user 0m31.121s > sys 0m0.221s I'm not sure what your report is trying to demonstrate exactly. Here, your script just takes ~5-6 seconds for each pass on a ~5 year-old "Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz". Furthermore, replacing 'time' by 'env time -v' shows that wc(1) is taking less CPU than sha256sum - exactly as one would expect: $ grep CPU * | column -t pipedd: Percent of CPU this job got: 27% pipedd: Percent of CPU this job got: 29% pipesha256sum: Percent of CPU this job got: 99% pipewc: Percent of CPU this job got: 65% teesha256: Percent of CPU this job got: 99% teewc: Percent of CPU this job got: 66% > coreutils-8.31-2.fc30.x86_64 It looks like you are using the Fedora package of coreutils ... which may have some additional patches compared to upstream GNU coreutils. Did you try the upstream version as well? 
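The per-stage CPU percentages above can be collected roughly as follows. This is only a sketch of one way to do it, not necessarily the exact commands used for the numbers above: it assumes an external 'time' binary that supports -v and -o (GNU time does), and the stage-*.log file names and the 64 MiB input size are made up for illustration.

```shell
#!/bin/bash
# Sketch: run each pipeline stage under 'env time -v' and log its report
# to a separate file, then extract the CPU percentage lines.
# Assumptions: a 'time' binary supporting -v and -o (GNU time does) is on
# PATH; the stage-*.log names and the 64 MiB size are hypothetical.
run_stage() {
    log=$1; shift               # first argument: log file for this stage
    env time -v -o "$log" "$@"  # remaining arguments: the command to time
}

if env time -v true 2>/dev/null; then
    run_stage stage-dd.log dd if=/dev/zero bs=1M count=64 2>/dev/null \
        | run_stage stage-wc.log wc
    grep 'Percent of CPU' stage-*.log
else
    echo "no external 'time' supporting -v found" >&2
fi
```

Using 'env time' rather than plain 'time' bypasses the shell keyword, so each stage gets its own report instead of one figure for the whole pipeline.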
Have a nice day, Berny From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 20 02:41:27 2019 Received: (at 37093) by debbugs.gnu.org; 20 Aug 2019 06:41:27 +0000 Received: from localhost ([127.0.0.1]:60812 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hzxpf-0001k2-GI for submit@debbugs.gnu.org; Tue, 20 Aug 2019 02:41:27 -0400 Received: from mail-pg1-f195.google.com ([209.85.215.195]:39758) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1hzxpc-0001ji-Nu; Tue, 20 Aug 2019 02:41:25 -0400 Received: by mail-pg1-f195.google.com with SMTP id u17so2625937pgi.6; Mon, 19 Aug 2019 23:41:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:references:from:message-id:date:user-agent:mime-version :in-reply-to:content-language:content-transfer-encoding; bh=8oIiIroIHuKPZKW6xjzusQfAfzo8ePXC2dsqSwCDYmM=; b=HYamOIMRnTGxEXTbyBJyTMhZ8DyUGjKQvJzYk7ABjKylmwt/pbsPGegz0Hiy+Z+eqt AWd+lnuStg35tc6AMo1NmfVBKBOqZ1OXMYHk6XB5oR0JYuLM5DV5IgeMv4nqPFQt5smk AAyoMEoCiXyN9FnNWUVbHas4p6Ls0Y5j9L30aC6T4YJWdQCKTC5DZvWHhlQZBRt6qSFE E8aATeoScpi463UDOTuuJpq0wU5a6qTK8Pw6pXsKS3sLh7dgDsg1c8omFYgCcuri04mN P9jy8GWufDGhwd5/Kea/XFhPCrPVMjR/c3yPiXoEcPsoryseODm01WTYaUAJ8+GAlTBX MeAg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=8oIiIroIHuKPZKW6xjzusQfAfzo8ePXC2dsqSwCDYmM=; b=GVwscFmRw0OPq/jz6Ni9LCbqdQB6vsNiyg0a0xyKYGol1ocwSyRd3Xpr8ArydXqS3h CmS8Cdu4D+TJLotssEiMQgJYdSgwExlZhgJ24H+DlFd/85S9karM5uPFEovAXVF6m7Jc OXsJs73Nd8EWGLohM1GO4aUSm7Fz5DwL5LSu+KYElhlSP1yRSo7wKi+M5+4e+f26UYAl bL2WdNIIQBCotKEk+GVMSTsHNCJ9Bq6eDUxGPoaNb4a2lwdRGNqde20pTrMsEREhp9p/ rQkulOQvEj3J4l23d2Y5Fr4K/QNkLA+NsMCSa71SjZ4zAWsVsdDiAXniJlauuVSqjMO8 uiZg== X-Gm-Message-State: APjAAAUrvHuchcS1ehMCsXmD2XIU+YqAaj5sOAnXVgkML/02IzMBL9qr 
kKDSaYmZuUAO5xt2mPNbiPmuG5Hr X-Google-Smtp-Source: APXvYqzVa0hx3KTSxdXSK3OeWiKJ2lcdbU7Dx723KvaUZvNv5yg2L979vEHaJNeYIunE7p0ilBYEwA== X-Received: by 2002:a63:1d2:: with SMTP id 201mr23991718pgb.307.1566283277319; Mon, 19 Aug 2019 23:41:17 -0700 (PDT) Received: from tomato.moose.housegordon.com (moose.housegordon.com. [184.68.105.38]) by smtp.googlemail.com with ESMTPSA id m37sm1824501pjb.0.2019.08.19.23.41.15 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 19 Aug 2019 23:41:16 -0700 (PDT) Subject: Re: bug#37093: wc runs 100% cpu when in pipeline or tee >(wc) To: Edward Huff , 37093@debbugs.gnu.org References: From: Assaf Gordon Message-ID: Date: Tue, 20 Aug 2019 00:41:15 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 37093 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) tag 37093 notabug close 37093 stop Hello, On 2019-08-19 10:44 p.m., Edward Huff wrote: > In the demo below, dd uses 0.665s to write 1GiB of zeros. > sha256sum uses 4.285s to calculate the sha256 of 1GiB of zeros. > wc uses 32.160s to count 1GiB of zeros. [...] > baseline results: > $ dd if=/dev/zero count=$((1024*1024)) bs=1024 | tee >(sha256sum>&2) | wc > 1048576+0 records in > 1048576+0 records out > 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 32.5007 s, 33.0 MB/s > 49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14 - > 0 0 1073741824 > $ First, Try to avoid UTF8 locales (i.e., force a C/POSIX locale with LC_ALL=C) which makes 'wc' much faster. 
On my computer, with a UTF-8 locale:

  $ dd if=/dev/zero count=$((1024*1024)) bs=1024 \
      | tee >(sha256sum>&2) | time --portability wc
  1048576+0 records in
  1048576+0 records out
  1073741824 bytes (1.1 GB, 1.0 GiB) copied, 46.5928 s, 23.0 MB/s
  49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14  -
        0       0 1073741824
  real 46.59
  user 46.37
  sys 0.19

With the C locale:

  $ dd if=/dev/zero count=$((1024*1024)) bs=1024 \
      | tee >(sha256sum>&2) | LC_ALL=C time --portability wc
  1048576+0 records in
  1048576+0 records out
  1073741824 bytes (1.1 GB, 1.0 GiB) copied, 8.60285 s, 125 MB/s
  49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14  -
        0       0 1073741824
  real 8.60
  user 5.22
  sys 0.26

Second,
The "word counting" feature in 'wc' is the main CPU hog.
If you avoid it (i.e. count only lines with -l, or only bytes with -c),
'wc' is even faster (and it sidesteps the UTF-8 issue automatically):

  $ dd if=/dev/zero count=$((1024*1024)) bs=1024 \
      | tee >(sha256sum>&2) \
      | \time --portability wc -c
  1048576+0 records in
  1048576+0 records out
  1073741824 bytes (1.1 GB, 1.0 GiB) copied, 7.59429 s, 141 MB/s
  49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14  -
  1073741824
  real 7.59
  user 0.10
  sys 0.71

Notice that the "real" time did not change much (from 8.60s to 7.59s),
but the actual work performed by 'wc' (measured as "user" time) dropped
drastically, from 5.22s to 0.10s.

Third,
If you are comfortable with compiling Coreutils from source,
you can build it with the optimized hashing functions from OpenSSL, like so:

  ./configure --with-openssl
  make

Then "sha256sum" will be faster (about 2x faster on my computer).
If you don't want to recompile it, consider using "openssl" directly
to calculate the checksum, like so:

  dd if=/dev/zero count=1K bs=1M | tee >(openssl sha256>&2) | wc -c

Fourth,
To save a little more time, consider using dd with a larger block size
(bs=) and fewer blocks (count=), e.g.:

  $ time dd if=/dev/zero of=/dev/null count=1M bs=1K
  1048576+0 records in
  1048576+0 records out
  1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.865853 s, 1.2 GB/s

  real 0m0.868s
  user 0m0.288s
  sys 0m0.579s

  $ time dd if=/dev/zero of=/dev/null count=1K bs=1M
  1024+0 records in
  1024+0 records out
  1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.0998688 s, 10.8 GB/s

  real 0m0.102s
  user 0m0.000s
  sys 0m0.102s

This won't reduce the total time by much, but it results in fewer
system calls and less kernel CPU time (at least by a tiny bit).
The effect is more noticeable when reading from or writing to a physical disk.

----

Lastly,
If you use GNU time instead of the shell's built-in 'time' keyword,
you can specify a custom output format and easily show the timing of each
program in the pipeline.

Example:

  $ FMT="\n=== CMD: %C ===\nreal %e\tuser %U\tsys %S\n"
  $ \time -f "$FMT" dd if=/dev/zero count=1M bs=1K \
      | \time -f "$FMT" tee >(\time -f "$FMT" sha256sum>&2) \
      | \time -f "$FMT" wc -c
  1048576+0 records in
  1048576+0 records out
  1073741824 bytes (1.1 GB, 1.0 GiB) copied, 7.77339 s, 138 MB/s

  === CMD: dd if=/dev/zero count=1048576 bs=1024 ===
  real 7.77    user 0.36    sys 1.65

  === CMD: tee /dev/fd/63 ===
  real 7.77    user 0.10    sys 1.30
  49bc20df15e412a64472421e13fe86ff1c5165e18b2afccf160d4dc19fe68a14  -

  === CMD: sha256sum ===
  real 7.77    user 7.47    sys 0.27
  1073741824

  === CMD: wc -c ===
  real 7.77    user 0.05    sys 0.76

As such, I'm closing this as "not a bug", but discussion can continue
by replying to this thread.
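To try the locale and word-counting comparison on your own machine, something like the sketch below may help. It is a minimal illustration, not a benchmark harness: it assumes bash and GNU dd, SIZE_MIB is a hypothetical knob (kept small so the run finishes quickly), and the function name zeros is made up here.

```shell
#!/bin/bash
# Sketch: time three wc variants on the same zero-filled stream.
# SIZE_MIB is a hypothetical knob; raise it (e.g. to 1024) to approach
# the 1 GiB used in this thread.
SIZE_MIB=${SIZE_MIB:-8}

zeros() {
    # Large blocks and few of them, per the dd block-size advice.
    dd if=/dev/zero bs=1M count="$SIZE_MIB" 2>/dev/null
}

# TIMEFORMAT controls the report of bash's 'time' keyword, which times
# the whole pipeline (dd included, though dd's share is small).
TIMEFORMAT='real %R  user %U  sys %S'

echo "wc, current locale:"; time zeros | wc
echo "wc, C locale:";       time zeros | LC_ALL=C wc
echo "wc -c (bytes only):"; time zeros | wc -c
```

If the first run shows much more user time than the second, the multibyte locale is the likely cause; if only the third run is fast, word counting itself is the dominant cost.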
regards, - assaf From unknown Fri Aug 15 03:56:42 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Tue, 17 Sep 2019 11:24:08 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator