From unknown Sat Jun 14 05:06:25 2025
From: Lars-Dominik Braun
To: 61055@debbugs.gnu.org
Cc: ludo@gnu.org
Subject: bug#61055: file-needed/recurive does not canonicalize paths
Date: Wed, 25 Jan 2023 11:23:58 +0100

Hi,

(CC-ing Ludo, who wrote the code according to git logs)

during testing of wip-haskell I observed the make-dynamic-linker-cache
phase is taking a lot of time (up to two minutes on a fast machine with
SSD).
Looking at ghc-hindent for example [1]:

    starting phase `make-dynamic-linker-cache'
    created '/gnu/store/2nrzbaxmqs2rq9yv52bpyn2azb3qj6h1-ghc-hindent-5.3.4/etc/ld.so.cache' from 10085 library search path entries
    phase `make-dynamic-linker-cache' succeeded after 119.5 seconds

And while Haskell packages link to a pretty large number of dynamic
libraries (116 in this case), 10000 search path entries seems wrong.
Running just

    (file-needed/recursive "/gnu/store/2nrzbaxmqs2rq9yv52bpyn2azb3qj6h1-ghc-hindent-5.3.4/bin/hindent")

takes a long time and reveals entries like
/gnu/store/1cyk8j2nd6r0cvm6kx1408kd763yf8h5-ghc-9.2.5/lib/ghc-9.2.5/Cabal-3.6.3.0/../directory-1.3.6.2/../unix-2.7.2.2/../bytestring-0.11.3.1/../template-haskell-2.18.0.0/../pretty-1.1.3.6/../array-0.5.4.0/../base-4.16.4.0/../ghc-bignum-1.2/../ghc-prim-0.8.0/libHSghc-prim-0.8.0-ghc9.2.5.so
so it looks like it deduplicates values, but does not canonicalize
paths. A relatively straightforward fix could be the following change,
but I don’t know if that would cause any issues, since canonicalize-path
throws an exception if the resulting path does not exist. It’s also
a world rebuild since pretty much any package uses this phase (which is
the reason I cannot test it on a larger scale).

---snip---
diff --git a/guix/build/gremlin.scm b/guix/build/gremlin.scm
index 2a74d51dd9..6eb8f688ea 100644
--- a/guix/build/gremlin.scm
+++ b/guix/build/gremlin.scm
@@ -285,8 +285,8 @@ (define (file-needed/recursive file)
           (if (and runpath needed)
               (let* ((runpath  (map (cute expand-origin <> (dirname file))
                                     runpath))
-                     (resolved (map (cut search-path runpath <>)
-                                    needed))
+                     (resolved (map (lambda (x) (and=> x canonicalize-path)) (map (cut search-path runpath <>)
+                                    needed)))
                      (failed   (filter-map (lambda (needed resolved)
                                              (and (not resolved)
                                                   (not (libc-library? needed))
---snap---

Cheers,
Lars

[1] https://ci.guix.gnu.org/build/366156/log/raw

From: Ludovic Courtès
To: Lars-Dominik Braun
Cc: 61055@debbugs.gnu.org
Subject: bug#61055: file-needed/recurive does not canonicalize paths
Date: Mon, 30 Jan 2023 17:31:49 +0100
Message-ID: <87ilgoyoei.fsf@gnu.org>
In-Reply-To: (Lars-Dominik Braun's message of "Wed, 25 Jan 2023 11:23:58 +0100")

Hi Lars,

Lars-Dominik Braun wrote:

> during testing of wip-haskell I observed the make-dynamic-linker-cache
> phase is taking a lot of time (up to two minutes on a fast machine with
> SSD). Looking at ghc-hindent for example [1]:
>
>     starting phase `make-dynamic-linker-cache'
>     created '/gnu/store/2nrzbaxmqs2rq9yv52bpyn2azb3qj6h1-ghc-hindent-5.3.4/etc/ld.so.cache' from 10085 library search path entries
>     phase `make-dynamic-linker-cache' succeeded after 119.5 seconds
>
> And while Haskell packages link to a pretty large number of dynamic
> libraries (116 in this case), 10000 search path entries seems wrong.
> Running just
>
>     (file-needed/recursive "/gnu/store/2nrzbaxmqs2rq9yv52bpyn2azb3qj6h1-ghc-hindent-5.3.4/bin/hindent")
>
> takes a long time and reveals entries like
> /gnu/store/1cyk8j2nd6r0cvm6kx1408kd763yf8h5-ghc-9.2.5/lib/ghc-9.2.5/Cabal-3.6.3.0/../directory-1.3.6.2/../unix-2.7.2.2/../bytestring-0.11.3.1/../template-haskell-2.18.0.0/../pretty-1.1.3.6/../array-0.5.4.0/../base-4.16.4.0/../ghc-bignum-1.2/../ghc-prim-0.8.0/libHSghc-prim-0.8.0-ghc9.2.5.so
> so it looks like it deduplicates values, but does not canonicalize
> paths.
> A relatively straightforward fix could be the following change,
> but I don’t know if that would cause any issues, since canonicalize-path
> throws an exception if the resulting path does not exist. It’s also
> a world rebuild since pretty much any package uses this phase (which is
> the reason I cannot test it on a larger scale).

Right. Other arguments against systematic canonicalization:
(1) ‘canonicalize-path’ is costly, (2) developers and tools might choose
to write ‘x/y/../z’ for a good reason and changing that could break
their expectations.

Can you see how we end up with those entries? These are DT_NEEDED
entries, not DT_RUNPATH, right? If so, that probably means that ghc at
some point invokes the linker along the lines of:

  ld -o hindent ../foo/../bar/../baz/libbaz.so

Could you check in build logs exactly how that executable gets linked?
Is there a way we could canonicalize there, or, better, get the build
system to do something like:

  ld -o hindent -L ../foo/../bar/../baz -lbaz

? That way DT_NEEDED would be just “libbaz.so” instead of the
complete file name. DT_RUNPATH would contain the weird file name, but
that’s probably okay.

HTH,
Ludo’.
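
The trade-off raised above (cost, failure on missing files, and changed
meaning of ‘x/y/../z’) can be illustrated outside Guix. The following is
only a sketch in Python, not Guile and not Guix code: `lexical`,
`canonical` and `demo` are made-up names, `os.path.normpath` stands in
for a purely textual clean-up, and `pathlib.Path.resolve(strict=True)`
stands in for a file-system lookup that behaves roughly like
canonicalize-path (it follows symlinks and fails if the file is
missing). It assumes a POSIX system where symlinks can be created.

```python
import os
import pathlib
import tempfile


def lexical(path):
    """Collapse '.' and '..' components as pure string work (no file system access)."""
    return os.path.normpath(path)


def canonical(path):
    """Resolve PATH on the file system, following symlinks.

    Raises FileNotFoundError when PATH does not exist.
    """
    return str(pathlib.Path(path).resolve(strict=True))


def demo():
    """Return (lexical, canonical) results for 'link/../lib.so', relative to a temp dir."""
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.realpath(tmp)
        os.makedirs(os.path.join(base, "real", "sub"))
        # 'link' is a symlink to 'real/sub', so on disk 'link/..' is 'real'.
        os.symlink(os.path.join(base, "real", "sub"), os.path.join(base, "link"))
        open(os.path.join(base, "real", "lib.so"), "w").close()
        path = os.path.join(base, "link", "..", "lib.so")
        return (os.path.relpath(lexical(path), base),
                os.path.relpath(canonical(path), base))


if __name__ == "__main__":
    print(demo())
```

On a POSIX system this prints ('lib.so', 'real/lib.so'): the textual
clean-up is cheap and never fails, but it names a different (here
nonexistent) file once a symlink is involved, while the file-system
lookup gives the real file at the price of system calls and an error
for missing paths.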

From: Lars-Dominik Braun
To: Ludovic Courtès
Cc: 61055@debbugs.gnu.org
Subject: bug#61055: file-needed/recurive does not canonicalize paths
Date: Wed, 1 Feb 2023 09:35:21 +0100
References: <87ilgoyoei.fsf@gnu.org>
In-Reply-To: <87ilgoyoei.fsf@gnu.org>

Hi Ludo,

> Can you see how we end up with those entries? These are DT_NEEDED
> entries, not DT_RUNPATH, right?

they definitely do not come from the hindent binary. `readelf -d` looks
like this

---snip---
 0x0000000000000001 (NEEDED)             Shared library: [libHSpath-io-1.7.0-Y7nuUr9RcCC0rgElOk2Zd-ghc9.2.5.so]
 0x0000000000000001 (NEEDED)             Shared library: [libHSunix-compat-0.5.4-2Fa5hW7FPv81iCQxP5gtt5-ghc9.2.5.so]
 0x0000000000000001 (NEEDED)             Shared library: [libHStemporary-1.3-K9lHfUrZ43CE7BhnE8cSVB-ghc9.2.5.so]
 …
 0x000000000000001d (RUNPATH)            Library runpath: [/gnu/store/2nrzbaxmqs2rq9yv52bpyn2azb3qj6h1-ghc-hindent-5.3.4/lib/x86_64-linux-ghc-9.2.5:/gnu/store/1cyk8j2nd6r0cvm6kx1408kd763yf8h5-ghc-9.2.5/lib/ghc-9.2.5/Cabal-3.6.3.0:/gnu/store/lk1mgbds7rf9ilv425msz840ky8cfn79-ghc-diff-0.4.1/lib/x86_64-linux-ghc-9.2.5:/gnu/store/866s8qcghallds7azzcx075a06mr64h4-ghc-hunit-1.6.2.0/lib/x86_64-linux-ghc-9.2.5:/gnu/store/rgm9h2a4v1zgwm1mbfb36ycjbkywlxj6-ghc-onetuple-0.3.1/lib/x86_64-linux-ghc-9.2.5:…
---snap---

and I believe GHC adds `-L/gnu/store/…` and `-lHSXXX` to the linker
invocation only, which should be correct.

However GHC’s bundled libraries are linked differently and result in
`readelf -d /gnu/store/1cyk8j2nd6r0cvm6kx1408kd763yf8h5-ghc-9.2.5/lib/ghc-9.2.5/Cabal-3.6.3.0/libHSCabal-3.6.3.0-ghc9.2.5.so`
output like this:

---snip---
 0x0000000000000001 (NEEDED)             Shared library: [libHSprocess-1.6.16.0-ghc9.2.5.so]
 0x0000000000000001 (NEEDED)             Shared library: [libHSparsec-3.1.15.0-ghc9.2.5.so]
 0x0000000000000001 (NEEDED)             Shared library: [libHStext-1.2.5.0-ghc9.2.5.so]
 …
 0x000000000000001d (RUNPATH)            Library runpath: [$ORIGIN/../process-1.6.16.0:$ORIGIN/../parsec-3.1.15.0:$ORIGIN/../text-1.2.5.0:$ORIGIN/../mtl-2.2.2:$ORIGIN/../transformers-0.5.6.2:$ORIGIN/../directory-1.3.6.2:$ORIGIN/../unix-2.7.2.2:$ORIGIN/../time-1.11.1.1:$ORIGIN/../filepath-1.4.2.2:$ORIGIN/../binary-0.8.9.0:$ORIGIN/../containers-0.6.5.1:$ORIGIN/../bytestring-0.11.3.1:$ORIGIN/../template-haskell-2.18.0.0:$ORIGIN/../pretty-1.1.3.6:$ORIGIN/../ghc-boot-th-9.2.5:$ORIGIN/../deepseq-1.4.6.1:$ORIGIN/../array-0.5.4.0:$ORIGIN/../base-4.16.4.0:$ORIGIN/../ghc-bignum-1.2:$ORIGIN/../ghc-prim-0.8.0:$ORIGIN/../rts:/gnu/store/5h2w4qi9hk1qzzgi1w83220ydslinr4s-glibc-2.33/lib:/gnu/store/094bbaq6glba86h1d4cj16xhdi6fk2jl-gcc-10.3.0-lib/lib:/gnu/store/fwbiihd2sbhai63y1pvvdh0f2bakfzrf-gmp-6.2.1/lib:/gnu/store/094bbaq6glba86h1d4cj16xhdi6fk2jl-gcc-10.3.0-lib/lib/gcc/x86_64-unknown-linux-gnu/10.3.0/../../..]
---snap---

These obviously will get expanded to

/gnu/store/1cyk8j2nd6r0cvm6kx1408kd763yf8h5-ghc-9.2.5/lib/ghc-9.2.5/Cabal-3.6.3.0/../process-1.6.16.0

And there are about 40 of these bundled libraries referencing each
other, which is why the problem is amplified.

I still believe this is a bug in file-needed/recursive, because it
recurses, but does not correctly keep track of “visited” shared
libraries (by not canonicalizing paths). Its documentation also says
that it returns a “list of absolute .so file names” – which I would
expect not to have relative path elements in. The output of `ldd`
does so too.

So yes, canonicalize-path may be expensive, but evaluating 10000 shared
libraries is too.

Cheers,
Lars

PS: This is not a problem for wip-haskell any more, since I switched to
static linking, but it’s going to bite someone else.
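
The amplification described in this thread can be reproduced in
miniature. What follows is a hedged Python sketch, not Guix code: `DEPS`,
`ROOT` and `needed_recursive` are made-up names, and the four-library
graph only imitates libraries that reference each other through
`$ORIGIN/../<dep>` RUNPATH entries. It uses textual normalization
(`posixpath.normpath`) as a stand-in for canonicalization; that is an
assumption for the sake of a self-contained example and differs from
canonicalize-path, which resolves symlinks on the file system.

```python
import posixpath

# Hypothetical toy graph: library name -> names of the libraries it needs.
DEPS = {
    "Cabal": ["directory", "unix", "base"],
    "directory": ["unix", "base"],
    "unix": ["base"],
    "base": [],
}
ROOT = "/ghc/lib"  # made-up prefix; each library sits in ROOT/<name>/


def needed_recursive(name, normalize):
    """Return the set of file names visited when starting from library NAME."""
    visited = set()
    pending = [(f"{ROOT}/{name}/lib{name}.so", name)]
    while pending:
        path, lib = pending.pop()
        if path in visited:
            continue
        visited.add(path)
        for dep in DEPS[lib]:
            # Mimic a "$ORIGIN/../<dep>" RUNPATH entry: $ORIGIN expands to the
            # directory of the file currently being inspected.
            resolved = f"{posixpath.dirname(path)}/../{dep}/lib{dep}.so"
            if normalize:
                resolved = posixpath.normpath(resolved)
            pending.append((resolved, dep))
    return visited


if __name__ == "__main__":
    print(len(needed_recursive("Cabal", normalize=False)))  # 8 distinct spellings
    print(len(needed_recursive("Cabal", normalize=True)))   # 4 actual files
```

Without normalization every route through the graph yields its own
spelling of the same file, so the "visited" check never fires and the
count grows with the number of routes rather than the number of
libraries (8 versus 4 here, and far more with about 40 libraries).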