From debbugs-submit-bounces@debbugs.gnu.org Sun Apr 24 11:59:01 2022
From: Adam Holt
Date: Sun, 24 Apr 2022 10:40:01 -0400
Subject: "split -n K/N " BUG: Last Chunk incomplete if input file >= 262144 bytes
To: bug-coreutils@gnu.org, Torbjörn Granlund, Richard Stallman

Hello!

Where do I report a serious data loss bug with GNU's split command?

Example:

$ dd if=/dev/random of=file bs=262144 count=1    # Create file containing 262144 bytes

$ split -n 1/2 file | wc -c
131072
$ split -n 2/2 file | wc -c
0    # SHOULD BE 131072

$ split -n 1/3 file | wc -c
87381
$ split -n 2/3 file | wc -c
87381
$ split -n 3/3 file | wc -c
0    # SHOULD BE 87382

The Last Chunk is completely missing, as you can see in both above examples.

Additionally, if the input file is larger than 2^18 = 262144 bytes, the Last
Chunk generated by "split -n K/N file" is then truncated (i.e. many bytes are
missing, from the beginning of the Last Chunk).

Here's the version number I'm running:

$ split --version
split (GNU coreutils) 8.32
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Torbjorn Granlund and Richard M. Stallman.

Thanks so much for your help forwarding this to anybody who might be able to
confirm and ideally resolve this for all!

Regards,
Adam

--
https://internet-in-a-box.org
https://twitter.com/internet_in_box
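
The byte counts quoted in the report follow from integer division: each chunk holds size / N bytes, and the last chunk also takes the remainder. Below is a minimal sketch of that arithmetic, assuming the layout implied by the reporter's "SHOULD BE" figures. The function names `expected_chunk_size` and `check_chunks` are hypothetical, and other coreutils releases may distribute the remainder differently.

```shell
#!/bin/sh
# Hypothetical helper: print the byte count that chunk K of N should hold
# for an input of SIZE bytes, assuming every chunk gets SIZE / N bytes
# (integer division) and the last chunk also takes the remainder.
expected_chunk_size() {
    _size=$1
    _k=$2
    _n=$3
    _base=$(( _size / _n ))
    if [ "$_k" -eq "$_n" ]; then
        echo $(( _size - _base * (_n - 1) ))
    else
        echo "$_base"
    fi
}

# Hypothetical helper: compare what `split -n K/N FILE` writes to stdout
# with the expected size, for every K from 1 to N. Returns non-zero and
# prints a line on stderr for each chunk whose size does not match.
check_chunks() {
    file=$1
    n=$2
    size=$(( $(wc -c < "$file") ))
    k=1
    status=0
    while [ "$k" -le "$n" ]; do
        actual=$(( $(split -n "$k/$n" "$file" | wc -c) ))
        expected=$(expected_chunk_size "$size" "$k" "$n")
        if [ "$actual" -ne "$expected" ]; then
            echo "chunk $k/$n: got $actual bytes, expected $expected" >&2
            status=1
        fi
        k=$(( k + 1 ))
    done
    return "$status"
}
```

Going by the figures in the report, `check_chunks file 2` and `check_chunks file 3` on the 262144-byte test file would be expected to flag the last chunk with coreutils 8.32. The arithmetic helper does not call `split`, so it can be used on its own to sanity-check the expected sizes.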

From debbugs-submit-bounces@debbugs.gnu.org Sun Apr 24 19:14:34 2022
Date: Sun, 24 Apr 2022 16:14:24 -0700
Subject: Re: bug#55093: "split -n K/N " BUG: Last Chunk incomplete if input file >= 262144 bytes
To: Adam Holt
From: Paul Eggert
Organization: UCLA Computer Science Department
Cc: 55093@debbugs.gnu.org

On 4/24/22 07:40, Adam Holt wrote:

> split (GNU coreutils) 8.32

That's an old version, dated 2020. Please try the current version
coreutils 9.1, which has bug fixes in this area.

Also, there's no need to cc. rms and tg; they're not working on 'split'
any more. Thanks.

From debbugs-submit-bounces@debbugs.gnu.org Sun Apr 24 19:14:55 2022
Message-ID: <788942ec-e439-e34b-23a5-ff5e38c300ce@cs.ucla.edu>
Date: Sun, 24 Apr 2022 16:14:48 -0700
To: control@debbugs.gnu.org
From: Paul Eggert
Subject: 55093 needs more info
Organization: UCLA Computer Science Department

tags 55093 moreinfo

From debbugs-submit-bounces@debbugs.gnu.org Sun Apr 24 21:07:07 2022
From: Adam Holt
Date: Sun, 24 Apr 2022 19:59:11 -0400
Subject: Re: bug#55093: "split -n K/N " BUG: Last Chunk incomplete if input file >= 262144 bytes
To: Paul Eggert
Cc: 55093@debbugs.gnu.org

On Sun, Apr 24, 2022 at 7:14 PM Paul Eggert <eggert@cs.ucla.edu> wrote:

> On 4/24/22 07:40, Adam Holt wrote:
>
> > split (GNU coreutils) 8.32
>
> That's an old version, dated 2020. Please try the current version
> coreutils 9.1, which has bug fixes in this area.

Wow, coreutils 9.1 indeed fixes this data loss issue with "split -n K/N",
Thanks Paul!

--
https://internet-in-a-box.org
https://twitter.com/internet_in_box