From: bug#75796 <75796@debbugs.gnu.org>
To: bug#75796 <75796@debbugs.gnu.org>
Subject: Status: Very inefficient use of --color escape sequences
Date: Fri, 15 Aug 2025 21:15:36 +0000

retitle 75796 Very inefficient use of --color escape sequences
reassign 75796 grep
submitter 75796 Peter White
severity 75796 normal
thanks


Date: Fri, 24 Jan 2025 03:20:28 +0000
From: Peter White
To: bug-grep@gnu.org
Subject: Very inefficient use of --color escape sequences

Hi there,

I just stumbled on this by accidentally redirecting the output of a
colored grep invocation to a file like so:

    $ grep --color=always . /sys/kernel/mm/transparent_hugepage/* >thp-status.txt 2>/dev/null
    # for convenience the 1st line of actual uncolored output
    /sys/kernel/mm/transparent_hugepage/defrag:always defer defer+madvise [madvise] never

Subsequently opening said file in vim I was overwhelmed with escape
sequences. From the looks of it, every character in the match gets its
very own color escape sequence, as opposed to the string indicating the
path to the file (everything before ':'), which is actually kind of
readable b/c it starts with only one escape sequence, which gets reset
just before ':'.

Now I know that this is not really how one is supposed to use the
program and its output, but it does show that the way grep works when
coloring the match is rather inefficient. As I said, I only discovered
this by accident; I hadn't otherwise noticed any performance issues or
other ill effects.

The interesting part is that this way the color escapes outweigh the
actual payload text quite substantially:

    $ du -b grep*
    2471    grep-color-always.txt
    359     grep-color-auto.txt

This is the same output as above; the file names should say it all.
That's just shy of a factor of 7! I think that is quite some overhead
for what the actual purpose is.

Now I don't necessarily agree with "benchmarking" terminal emulators,
but I did read a little about it. And maybe I just don't have a brutal
enough use case to make this an actual performance issue on my hardware.
But I believe fixing this might just result in some improvements in that
area, FWIW.

I am using GNU grep 3.11, which seems to be the current stable release,
on the current Ubuntu 24.04 LTS release. So maybe this has been
addressed in a later devel version? A cursory search for the issue
turned up empty, though. That's why I would rather avoid compiling from
source for now, unless of course for testing a possible fix. Again, I
don't *need* a fix and would be happy to wait another year for the next
Ubuntu LTS, given the non-noticeable impact on my end.

So is this something that cannot be done any other way b/c of the way
matching works - as in: "if it can't be done elegantly, use brute
force"? Or could this be classified as an actual bug, albeit a
low-priority one?

Peter White


Date: Fri, 24 Jan 2025 04:39:23 +0000
From: Peter White
To: 75796@debbugs.gnu.org
Subject: Re: bug#75796: Very inefficient use of --color escape sequences

On Fri, Jan 24, 2025 at 03:20:28AM +0000, Peter White wrote:
> Hi there,
>
> I just stumbled on this by accidentally redirecting the output of a
> colored grep invocation to a file like so:
>
> $ grep --color=always . /sys/kernel/mm/transparent_hugepage/* >thp-status.txt 2>/dev/null
> # for convenience the 1st line of actual uncolored output
> /sys/kernel/mm/transparent_hugepage/defrag:always defer defer+madvise [madvise] never
>
> Subsequently opening said file in vim I was overwhelmed with escape
> sequences. From the looks of it every character in the match gets its
> very own color escape sequence as opposed to the string indicating the
> path to the file (everything before ':')

I did have a look at the code, after all, and even tried a premature
"fix", which broke the foad1 test in the Debian source package. But that
made me realize that this is not a bug at all: grep behaved as designed
by highlighting every single match, and every match just happened to be
a single character because the pattern was '.'.

I am sorry for wasting anyone's time and can only hope that this
retraction reaches them before they might go chasing ghosts.

Peter White
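
The conclusion above (one escape pair per match, and with '.' every character is its own match) can be illustrated with a small simulation. This is only a sketch, not grep's code: `colorize` is a hypothetical helper, it uses Python's `re` module rather than grep's matcher, and the start/end sequences are assumed to resemble grep's default bold-red match color; the actual bytes depend on the grep version and on GREP_COLORS.

```python
import re

# Assumed escape sequences, modeled on a bold-red match color ("01;31").
# The real bytes emitted by grep depend on its version and GREP_COLORS.
SGR_START = "\033[01;31m\033[K"
SGR_END = "\033[m\033[K"


def colorize(pattern: str, line: str) -> str:
    """Wrap every non-empty match of `pattern` in `line` in its own escape pair."""
    out = []
    pos = 0
    for m in re.finditer(pattern, line):
        if m.end() == m.start():
            continue  # empty match: nothing to highlight
        out.append(line[pos:m.start()])               # unmatched text, left plain
        out.append(SGR_START + m.group(0) + SGR_END)  # one escape pair per match
        pos = m.end()
    out.append(line[pos:])
    return "".join(out)


line = "always defer defer+madvise [madvise] never"
per_char = colorize(r".", line)    # every character is a separate match
per_line = colorize(r".*", line)   # the whole line is a single match

# Payload size, size with one match per line, size with one match per character.
print(len(line), len(per_line), len(per_char))
```

With '.' the escape overhead grows with the number of characters, while with '.*' it is paid once per line, which is consistent with the size difference reported in the first message.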