From unknown Sat Sep 06 19:26:09 2025 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Mailer: MIME-tools 5.509 (Entity 5.509) Content-Type: text/plain; charset=utf-8 From: bug#69748 <69748@debbugs.gnu.org> To: bug#69748 <69748@debbugs.gnu.org> Subject: Status: Does diff not work on big enough files? Reply-To: bug#69748 <69748@debbugs.gnu.org> Date: Sun, 07 Sep 2025 02:26:09 +0000 retitle 69748 Does diff not work on big enough files? reassign 69748 diffutils submitter 69748 Robert Boyer severity 69748 normal tag 69748 notabug thanks From debbugs-submit-bounces@debbugs.gnu.org Tue Mar 12 11:19:03 2024 Received: (at submit) by debbugs.gnu.org; 12 Mar 2024 15:19:03 +0000 Received: from localhost ([127.0.0.1]:43400 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rk3uA-0003mg-Pt for submit@debbugs.gnu.org; Tue, 12 Mar 2024 11:19:03 -0400 Received: from lists.gnu.org ([209.51.188.17]:55672) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rk3u6-0003mF-HA for submit@debbugs.gnu.org; Tue, 12 Mar 2024 11:19:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rk3tX-0008Up-Jq for bug-diffutils@gnu.org; Tue, 12 Mar 2024 11:18:23 -0400 Received: from mail-lf1-x12e.google.com ([2a00:1450:4864:20::12e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rk3tU-0006OS-HT; Tue, 12 Mar 2024 11:18:23 -0400 Received: by mail-lf1-x12e.google.com with SMTP id 2adb3069b0e04-513af1a29b1so2171009e87.1; Tue, 12 Mar 2024 08:18:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1710256698; x=1710861498; darn=gnu.org; h=cc:to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=MoKaOSPHypfcMJMWBaim+oLlcqqb1zk6veKJSxtQBaY=; 
b=lds2/KfF+JOJ/kG7Q7dtFbxk7fFQTJ8QYQP9ZMoYPkvzKVa9C4tdhT0me1Lkvp1fRV kgRzBxMf1vg5DJwPMNEfRI1LqUdD6niX7KVYwJDUC6kCoMu8jaeyY2xed4lNjAXTVujs PMVCopwRwuboyftDGeBBegaetulWvWyEYIE8n1dQMaax/vPqNk222wMrKZsnWtdAYdvT nhiIQhxu3Ph9oMaECvkIrW+0/K5MBcV5xNhQ20zdfKd4ik5sBmIRyYwStNFqsi5OYslt 2+lYvv/CaHYsG8qM03lnZEtK3K2/WDAILGfMxdFQnd7ezDcFi6rESik3Vf7IJcj0y54Z 2vPQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710256698; x=1710861498; h=cc:to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=MoKaOSPHypfcMJMWBaim+oLlcqqb1zk6veKJSxtQBaY=; b=NVKEGi8SuGaE3+/pi9xzRMu85lfNReVvQEVa+HHZpr190273+oGTOxp9eUifA2jj0N QVCnQF+AkOcZjeHh0gYbjxBaMqEss8WO9hNM4EqDzJ62AV/poJTJ4SV3ICi4EadxHal7 Q45NCTAGu+awofRLjUuyqEtGFlJLCSTpo2PvnkMrgzP8ZniLjGqRVHxOqUwBcjm8NWCZ XUUP7/0ALFQY4wLGimUBXMsTbGMsiumh9/+myQECwWf96CpnkY1zepcgO5zaWxw75olu G1KDqL6+LTFKvH1A4TCXPP9nUV2dSqImE+bg8NmlvSYPJSq2vg/9fo54ULXbwXq85vLF dLNg== X-Gm-Message-State: AOJu0YyQ7zKxDkXyhpL1qbTVyrYSxkAuRYdHR4YbgMkd1WCXZdD2VYIC jfk84iDIixj8B3oARywhNDimncsviqRadrRTaKbFIkHexs01EZq/qvPCYY3GCkB7vUX8WvRO9JH SI3oDILiYdFFmYD6JZ0g87zLZbOKi4QvwrNE= X-Google-Smtp-Source: AGHT+IF5YqaFCXrsKZ7x6PlUCW2dzqxXJAEPMBBFCMrm+wR7vpShhoOA+xVeb4SNCKRi0iQxGFbzrNVJu35F733glOQ= X-Received: by 2002:a05:6512:742:b0:512:dfa1:6a1c with SMTP id c2-20020a056512074200b00512dfa16a1cmr5872695lfs.10.1710256697529; Tue, 12 Mar 2024 08:18:17 -0700 (PDT) MIME-Version: 1.0 From: Robert Boyer Date: Tue, 12 Mar 2024 10:17:35 -0500 Message-ID: Subject: Does diff not work on big enough files? 
To: bug-diffutils@gnu.org Content-Type: multipart/alternative; boundary="000000000000ee61f40613782ad4" Received-SPF: pass client-ip=2a00:1450:4864:20::12e; envelope-from=robertstephenboyer@gmail.com; helo=mail-lf1-x12e.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Score: -1.3 (-) X-Debbugs-Envelope-To: submit Cc: rms@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.3 (--) --000000000000ee61f40613782ad4 Content-Type: text/plain; charset="UTF-8"

I am not sure whether to call this a bug, but it is a difficulty for me. It is simply incredible to me that diff might not work!

If one cannot count on diff to work, is there anything one can count on? Does diff just not work on big enough files? Apparently yes.

> diff the-primes-below-10000000000.lisp billion-primes.txt
diff: the-primes-below-10000000000.lisp: Cannot allocate memory
> ls -l the-primes-below-10000000000.lisp billion-primes.txt
-rw-r----- 1 bob chronos-access  501959790 Mar 10 14:08 billion-primes.txt
-rw-r----- 1 bob chronos-access 5403267048 Mar 12 09:55 the-primes-below-10000000000.lisp
>

> free
               total        used        free      shared  buff/cache   available
Mem:         6736088     1458180     5060628       16568      217280     5277908
Swap:              0           0           0
>

I am running on a $300 Lenovo Chromebook using their default Gnu Linux.
Bob

> cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 156
model name : Intel(R) Celeron(R) N4500 @ 1.10GHz
stepping : 0
microcode : 0x1
cpu MHz : 1113.600
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 27
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq vmx ssse3 cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave rdrand hypervisor lahf_lm 3dnowprefetch cpuid_fault ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust smep erms rdseed smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves arat umip gfni rdpid movdiri movdir64b md_clear arch_capabilities
vmx flags : vnmi preemption_timer posted_intr invvpid ept_x_only ept_ad ept_1gb flexpriority apicv tsc_offset vtpr mtf vapic ept vpid unrestricted_guest vapic_reg vid shadow_vmcs pml
bugs : spectre_v1 spectre_v2 spec_store_bypass swapgs srbds mmio_stale_data
bogomips : 2227.20
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 156
model name : Intel(R) Celeron(R) N4500 @ 1.10GHz
stepping : 0
microcode : 0x1
cpu MHz : 1113.600
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 27
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq vmx ssse3 cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave rdrand hypervisor lahf_lm 3dnowprefetch cpuid_fault ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust smep erms rdseed smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves arat umip gfni rdpid movdiri movdir64b md_clear arch_capabilities
vmx flags : vnmi preemption_timer posted_intr invvpid ept_x_only ept_ad ept_1gb flexpriority apicv tsc_offset vtpr mtf vapic ept vpid unrestricted_guest vapic_reg vid shadow_vmcs pml
bugs : spectre_v1 spectre_v2 spec_store_bypass swapgs srbds mmio_stale_data
bogomips : 2227.20
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:

>

--000000000000ee61f40613782ad4 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
--000000000000ee61f40613782ad4-- From debbugs-submit-bounces@debbugs.gnu.org Tue Mar 12 15:59:26 2024 Received: (at 69748) by debbugs.gnu.org; 12 Mar 2024 19:59:27 +0000 Received: from localhost ([127.0.0.1]:43889 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rk8HW-0008Sn-Kn for submit@debbugs.gnu.org; Tue, 12 Mar 2024 15:59:26 -0400 Received: from mail.cs.ucla.edu ([131.179.128.66]:45170) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rk8HR-0008SV-SI for 69748@debbugs.gnu.org; Tue, 12 Mar 2024 15:59:25 -0400 Received: from localhost (localhost [127.0.0.1]) by mail.cs.ucla.edu (Postfix) with ESMTP id 4FA263C011BD4; Tue, 12 Mar 2024 12:58:41 -0700 (PDT) Received: from mail.cs.ucla.edu ([127.0.0.1]) by localhost (mail.cs.ucla.edu [127.0.0.1]) (amavis, port 10032) with ESMTP id 1a6Vet23Rutt; Tue, 12 Mar 2024 12:58:41 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by mail.cs.ucla.edu (Postfix) with ESMTP id E40693C011BD9; Tue, 12 Mar 2024 12:58:40 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.10.3 mail.cs.ucla.edu E40693C011BD9 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cs.ucla.edu; s=9D0B346E-2AEB-11ED-9476-E14B719DCE6C; t=1710273520; bh=oPUt8QJAZciCWvSITMeukHR4ibMXUxbg6NYKAdz4I28=; h=Message-ID:Date:MIME-Version:To:From; b=Ecrx4IiWUI7fUAKSozxNzk2oCW1uYmw3DklaATSx8z3ImFMbhEL9EiAxSe4MCMkHt ntE2Rk7rZHtrZLaOsrjnuQvQmzVSJl+N5pGfS2xLX9ElsNyCpEHtzafPTZdiYYiPOf U6J1/VwLOoZHrFTxqx13owN+tEAt3nEVohIlmndL18vAz7VzlBgs3a8aO96XaV9k+o wP73vYBm1/e7S6K6NL4OMfqb7l7iZ08geXkXppovAZDdXK6vcwaV+/4rkpZIR8BD19 rOYKWUn3/DnLUuazxTATz5IV/igfcoOhDWb33K8ecPACRCpshLsz3kWZnW6i/DHh01 xGLnh+izdnJbQ== X-Virus-Scanned: amavis at mail.cs.ucla.edu Received: from mail.cs.ucla.edu ([127.0.0.1]) by localhost (mail.cs.ucla.edu [127.0.0.1]) (amavis, port 10026) with ESMTP id bZXrPgPSQj7J; Tue, 12 Mar 2024 12:58:40 -0700 (PDT) Received: from [131.179.64.200] (Penguin.CS.UCLA.EDU [131.179.64.200]) by mail.cs.ucla.edu (Postfix) 
with ESMTPSA id B58B33C011BD4; Tue, 12 Mar 2024 12:58:40 -0700 (PDT) Message-ID: Date: Tue, 12 Mar 2024 12:58:39 -0700 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [bug-diffutils] bug#69748: Does diff not work on big enough files? Content-Language: en-US To: Robert Boyer , 69748@debbugs.gnu.org References: From: Paul Eggert Organization: UCLA Computer Science Department In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -0.0 (/) X-Debbugs-Envelope-To: 69748 Cc: rms@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

On 3/12/24 08:17, Robert Boyer wrote:

> It is simply incredible to me that diff might not work!

Like any other program, 'diff' needs enough resources to run. You're
trying to compare a 5 GiB file on a Chromebook that has (let me guess) 4
GiB of RAM and 32 GB of flash, most of which is occupied by ChromeOS and
other stuff. If so, there isn't enough room for 'diff' to do its job
with its current algorithm and you'll have to either use a bigger
machine or solve a smaller problem.

It's possible to imagine a different 'diff' algorithm that would take
less RAM but a lot more time, presumably because it would do more I/O to
a temporary file. But if the available flash is small enough, even that
wouldn't work. I doubt whether it'd be worth the time to develop the
code for this alternative approach.
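The resource argument above can be checked against the numbers in the original report. The following is only a rough sketch: it assumes that 'diff' needs to hold at least both input files in memory at once (an assumption made here, not something stated in the thread), and it takes the byte counts from the reported `ls -l` output and the "available" figure, in KiB, from the reported `free` output. The variable names are hypothetical.

```shell
# File sizes in bytes, copied from the `ls -l` output in the report.
big=5403267048
small=501959790

# "available" memory in KiB, copied from the `free` output in the report.
avail_kib=5277908

# Assumption: diff must keep at least both inputs in memory at the same time.
need_kib=$(( (big + small) / 1024 ))

echo "needed:    ${need_kib} KiB"
echo "available: ${avail_kib} KiB"
if [ "$need_kib" -gt "$avail_kib" ]; then
  echo "short by:  $(( need_kib - avail_kib )) KiB"
fi
```

Under that assumption the two files alone come to 5766823 KiB, which is 488915 KiB more than the reported available memory, before counting any working storage, and the report shows no swap to fall back on.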
From debbugs-submit-bounces@debbugs.gnu.org Tue Mar 12 16:20:39 2024 Received: (at control) by debbugs.gnu.org; 12 Mar 2024 20:20:39 +0000 Received: from localhost ([127.0.0.1]:43914 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rk8c3-0000YW-Gg for submit@debbugs.gnu.org; Tue, 12 Mar 2024 16:20:39 -0400 Received: from mail.cs.ucla.edu ([131.179.128.66]:50376) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rk8c2-0000YJ-1Y for control@debbugs.gnu.org; Tue, 12 Mar 2024 16:20:38 -0400 Received: from localhost (localhost [127.0.0.1]) by mail.cs.ucla.edu (Postfix) with ESMTP id AFC963C011BD9 for ; Tue, 12 Mar 2024 13:19:57 -0700 (PDT) Received: from mail.cs.ucla.edu ([127.0.0.1]) by localhost (mail.cs.ucla.edu [127.0.0.1]) (amavis, port 10032) with ESMTP id pu6g7gcxwghO for ; Tue, 12 Mar 2024 13:19:57 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by mail.cs.ucla.edu (Postfix) with ESMTP id 6C2E03C011BD4 for ; Tue, 12 Mar 2024 13:19:57 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.10.3 mail.cs.ucla.edu 6C2E03C011BD4 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cs.ucla.edu; s=9D0B346E-2AEB-11ED-9476-E14B719DCE6C; t=1710274797; bh=LhodsTZMy7ERWAk5+CPPgo0WUc0A7cGOEBFvozZ8N1Y=; h=Message-ID:Date:MIME-Version:To:From; b=VYOVxY3CH5aICfBGYPJdtcRmEU86/CFsX4oUaDu8RNVx6jSmus54MlUkL3N/Bd8Wu uasCShVBLDkZjBVT/f3E3yXsnGVCpsli9oA9tyn0hbuI1dtH978npdTdM3LBlrqaHi rJuybUScjiRn21iWjXbKLmcwd95LeLJ4aFz+L2m0WOoihcgpJyA5QXctpIo0ws4F8J xRbOVf2kv04kCxNC0YkTTz/cY1/7BsWM2C+05+5CzpuTf0nckzCF+APxKF/jg+d8aB awKisvHPKPTBxWGEU9EAD0+g2Jjo2oT95xmW06m81nmtjs5Hn8Zo27CCPAZKBQ34iW DkZ3qehm1D5JA== X-Virus-Scanned: amavis at mail.cs.ucla.edu Received: from mail.cs.ucla.edu ([127.0.0.1]) by localhost (mail.cs.ucla.edu [127.0.0.1]) (amavis, port 10026) with ESMTP id J5VJQnDzDGMB for ; Tue, 12 Mar 2024 13:19:57 -0700 (PDT) Received: from [131.179.64.200] (Penguin.CS.UCLA.EDU [131.179.64.200]) by mail.cs.ucla.edu (Postfix) with 
ESMTPSA id 4B28C3C011BD9 for ; Tue, 12 Mar 2024 13:19:57 -0700 (PDT) Message-ID: <868e50e1-b17d-4c7e-8027-c03504462de4@cs.ucla.edu> Date: Tue, 12 Mar 2024 13:19:57 -0700 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Content-Language: en-US To: GNU bug control From: Paul Eggert Subject: diffutils bug maintenance Organization: UCLA Computer Science Department Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -0.0 (/) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) tags 69748 notabug tags 69752 notabug close 69748 close 69752 From debbugs-submit-bounces@debbugs.gnu.org Tue Mar 12 16:24:36 2024 Received: (at 69748) by debbugs.gnu.org; 12 Mar 2024 20:24:36 +0000 Received: from localhost ([127.0.0.1]:43923 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rk8fs-0000fe-2T for submit@debbugs.gnu.org; Tue, 12 Mar 2024 16:24:36 -0400 Received: from mail-ed1-f54.google.com ([209.85.208.54]:52244) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rk8fm-0000fO-NR for 69748@debbugs.gnu.org; Tue, 12 Mar 2024 16:24:34 -0400 Received: by mail-ed1-f54.google.com with SMTP id 4fb4d7f45d1cf-56840d872aeso4632644a12.0 for <69748@debbugs.gnu.org>; Tue, 12 Mar 2024 13:23:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1710274970; x=1710879770; darn=debbugs.gnu.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=pQfU8svWY3QcWEpO/Dug8yyit5SfSi6jALYBuEmXjoc=; b=QUD6gZnp1B38Tbw4uljnrp3Bzmk/DGd0v8b5VIq5U8mRgbmrBEP94f7l6+NOljB9Km LoHPbRTLorQ0HcZAu4hF84C2maNOaIZ0xWJlrr7oK5VnShXMP0KXBSbSbK7kCAEj70+p 
GSGVXsp4ARMa7wzSsyRtuRKd4x/Qo/JiBNYksUJlzKUsOLW1bvB1TQMhqg2ySAgVOuxv CMIs3rPEUzlLOeFjz0CA5c2aWMrx4TkqKMV1JPnlc9qAUu5f6UzDPCmJ72VbZ0xDjEa1 lRYMk6O3MS5k3+gWgCdAuBWMOUWBS5pVYBwvijYyqOHEyQAxdboFIEQP83vm69F9e+6q fICg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710274970; x=1710879770; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=pQfU8svWY3QcWEpO/Dug8yyit5SfSi6jALYBuEmXjoc=; b=kMyWKF+FyVwlGL8xqe7owHBEiL1UVeJIuza2AkGQ520tSQe4z45oOQU4gI7P8kwzPT IixKMPGbhQ82f4Nry5TqvTXlYk1iVp/EmA0z1zZz/vkU/alLGqEh0osHoy0Y9+wt11zo js0AIefhv5tcrd9FZyZCKXF0plB7urwN7exaFF/RJkVUT8MiccsglC6Rx9wrGdKLbvOt MjAVoBW4C10WZR6HbhDPvfaRmtirUAdwm8+JrJexvbp3Yt/i0pZCrjgDj1LPzHClYfBa L3rOWatg0Ky6NlXmdFvPKi31pvutfiMIIMUfnyerUtuBHzDkGo18LMKaXzEaN4WwUOPP LFoQ== X-Gm-Message-State: AOJu0YxqVQl2JzcrAUxiEX088M0Rzsb78wSeF+A0Fz8iEydyK/I7NxFK xScMFCJNh4iZG5CK35SvBLXuWeDVdC1VaZ4RI9IHbdTi4B+OOQ64MFibaRr3pYSi7ZQZr8wsSBI JP2MBaqDfJmxgDITXI4EVoz2uoyU= X-Google-Smtp-Source: AGHT+IEd81ji9+32h+XRAcOsShd+Xo+6zJHcoTgMuzcn9a4A9aL1HmXceQtppJqwlaHzMg/6Z8G6AxJCHqcBH0GUUM4= X-Received: by 2002:a50:9b58:0:b0:568:18a8:2291 with SMTP id a24-20020a509b58000000b0056818a82291mr7007768edj.23.1710274969741; Tue, 12 Mar 2024 13:22:49 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Robert Boyer Date: Tue, 12 Mar 2024 15:22:12 -0500 Message-ID: Subject: Re: [bug-diffutils] bug#69748: Does diff not work on big enough files? 
To: Paul Eggert Content-Type: multipart/alternative; boundary="0000000000000a374706137c6c8d" X-Spam-Score: -0.0 (/) X-Debbugs-Envelope-To: 69748 Cc: 69748@debbugs.gnu.org, rms@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) --0000000000000a374706137c6c8d Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

Are you trying to be funny? Or are you simply stupid? You are much too brilliant and famous
to be stupid, so I am assuming you were trying to be funny, a parody of the overworked bug fixer.

In an almost immediate follow up message, I already solved the problem, and it
worked perfectly for me trying to compare an old file of the primes below a billion with a new
file of the primes below ten billion. Fortunately, this little gem of a program helped me
believe that I had computed at least the primes below a billion correctly. What a relief!

> there isn't enough room for 'diff' to do its job with its current algorithm

Probably very sadly true, so you must improve your algorithm, and here is how. It won't hurt, I promise.

From my previous message:

Here is a better version of diff, better only in the sense that it works on all files. But what do I know? Nothing.

This is Common Lisp. I was running in SBCL.

(defun my-diff (file1 file2)
  (let ((s1 (open file1 :element-type '(integer 0 255)))
        (s2 (open file2 :element-type '(integer 0 255)))
        (c1 0)
        (c2 0))
    (declare (fixnum c1 c2))
    (loop
     (setq c1 (read-byte s1 nil 256))
     (setq c2 (read-byte s2 nil 256))
     (cond ((and (eql c1 256) (eql c2 256)) (return "no difference")))
     (cond ((eql c1 256) (return "file1 hit eof first")))
     (cond ((eql c2 256) (return "file2 hit eof first")))
     (cond ((eql c1 c2))
           (t (return (format nil
                              "difference at position ~s; c1 = ~s, c2 = ~s."
                              (file-position s1) c1 c2)))))))

On Tue, Mar 12, 2024 at 2:58 PM Paul Eggert <eggert@cs.ucla.edu> wrote:

> On 3/12/24 08:17, Robert Boyer wrote:
>
> > It is simply incredible to me that diff might not work!
>
> Like any other program, 'diff' needs enough resources to run. You're
> trying to compare a 5 GiB file on a Chromebook that has (let me guess) 4
> GiB of RAM and 32 GB of flash, most of which is occupied by ChromeOS and
> other stuff. If so, there isn't enough room for 'diff' to do its job
> with its current algorithm and you'll have to either use a bigger
> machine or solve a smaller problem.
>
> It's possible to imagine a different 'diff' algorithm that would take
> less RAM but a lot more time, presumably because it would do more I/O to
> a temporary file. But if the available flash is small enough, even that
> wouldn't work. I doubt whether it'd be worth the time to develop the
> code for this alternative approach.
>

--0000000000000a374706137c6c8d Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
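The my-diff function quoted in this message reads both files in lockstep and stops at the first mismatch, so it needs very little memory regardless of file size. It does not produce a line-oriented list of changes the way 'diff' does. The standard `cmp` utility, which ships with GNU diffutils, performs the same kind of byte-by-byte comparison. As a minimal sketch, the wrapper below (the name `byte_compare` is hypothetical, chosen only for illustration) mimics the "no difference" message of my-diff on top of `cmp`:

```shell
# byte_compare FILE1 FILE2
# Hypothetical wrapper around the standard cmp utility.
byte_compare() {
  # cmp exits 0 when the files are identical, 1 when they differ,
  # and with a status greater than 1 on trouble (e.g. an unreadable file).
  # When the files differ, cmp itself reports the first differing byte,
  # or reports that one file ended before the other.
  cmp -- "$1" "$2"
  status=$?
  if [ "$status" -eq 0 ]; then
    echo "no difference"
  fi
  return "$status"
}
```

For the files in the report this would be invoked as `byte_compare billion-primes.txt the-primes-below-10000000000.lisp`. Whether the answer is useful depends on the two files sharing a byte-identical prefix, as the my-diff approach also assumes.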
--0000000000000a374706137c6c8d-- From unknown Sat Sep 06 19:26:09 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Wed, 10 Apr 2024 11:24:09 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator