From unknown Sun Jun 22 11:38:09 2025 X-Loop: help-debbugs@gnu.org Subject: bug#32228: sed -i does not buffer, causing excessive writes Resent-From: Vidar Holen Original-Sender: "Debbugs-submit" Resent-CC: bug-sed@gnu.org Resent-Date: Fri, 20 Jul 2018 20:25:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 32228 X-GNU-PR-Package: sed X-GNU-PR-Keywords: To: 32228@debbugs.gnu.org X-Debbugs-Original-To: bug-sed@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.153211826925590 (code B ref -1); Fri, 20 Jul 2018 20:25:02 +0000 Received: (at submit) by debbugs.gnu.org; 20 Jul 2018 20:24:29 +0000 Received: from localhost ([127.0.0.1]:50007 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fgbwx-0006ee-A8 for submit@debbugs.gnu.org; Fri, 20 Jul 2018 16:24:27 -0400 Received: from eggs.gnu.org ([208.118.235.92]:38038) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fgbww-0006eQ-0N for submit@debbugs.gnu.org; Fri, 20 Jul 2018 16:24:26 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fgbwq-0006ML-5c for submit@debbugs.gnu.org; Fri, 20 Jul 2018 16:24:20 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=BAYES_40,T_DKIM_INVALID autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:38439) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fgbwq-0006MB-20 for submit@debbugs.gnu.org; Fri, 20 Jul 2018 16:24:20 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57241) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fgbwp-00020L-3S for bug-sed@gnu.org; Fri, 20 Jul 2018 16:24:19 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fgbwo-0006KD-9l for bug-sed@gnu.org; Fri, 20 Jul 2018 16:24:19 -0400 Received: 
from mail-ed1-x52f.google.com ([2a00:1450:4864:20::52f]:36843) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fgbwn-0006Ig-Vn for bug-sed@gnu.org; Fri, 20 Jul 2018 16:24:18 -0400 Received: by mail-ed1-x52f.google.com with SMTP id t3-v6so10664419eds.3 for ; Fri, 20 Jul 2018 13:24:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vidarholen-net.20150623.gappssmtp.com; s=20150623; h=mime-version:from:date:message-id:subject:to; bh=ZZqGalZgG+4CjUajSZqUtOU0NKwp6D1RQiC1xNn0V48=; b=zGW7R9RYevbG+i0Cv/0b8iQ/Tkwty6kT55kg0nVZLxTfxBk4RPuA36PsAjbMRAGpRc Qvhn7En4mFiY/3EW7fGp23UPkWDrkk6Rtka2w7gJKJw2gyH+dVZmxwyoyfdrEIwZoMKX SUmo1Ln1b3EgCJkgU79SV694PKs4LGkMybHq9gxvcAoNcpwbRywnTUOypFGsTnqPGtc7 +jDxTjiV+L563iAVTNyPtz0gXTB/clqE7HVsgNgHgN/yvO1HUj9+mh8sO33Yz8FHuKG5 w2lXX+9rusZXccNvAcahq4ElwiZ5dMiWxdC1h49+dqiWA6exejdxaW5et+tknMCv3fNc d7Ig== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=ZZqGalZgG+4CjUajSZqUtOU0NKwp6D1RQiC1xNn0V48=; b=hAjLG1CtxyiK8OXLVkiaMqpjlEFjubt2Y8m/1asuDMtamz4vJ+7ptnnIfxrcDtaEGe Ga+n1oa2U0tYqsVreWtHWq/+jTzPBBejs83wR+BinTJff1pDr5fjZznQ7NmxPvKPPmuQ +RKrZYWbg+xYnEgJ9QOeUGG6HfTnb5+kPFCmrvJ4IIiOnzfn3m2z/EOaTHx0ss19c8Yi ItOkoNiSUxWrSnV9KjwTq6OMxHHoQghXdPoSXmgI5xmjXuh6hwsXQmKk5G/1lPQbjz6t YqF9uG3LwlkldAkzXY53YTlvO1r8eAuBrU1ktYc1iaPExJkyapMn+DOO1Rz51x7E90zg Y7mA== X-Gm-Message-State: AOUpUlGeYpzi7bcimM8n0sqGPdPyqrYgALF/Bt6Mr7K9N266Hb9Wpt3I y2rg7Mg1UIR15q/y8YZ0LHeiPwxFrr0imPi+Yjp8NBzA/Fs= X-Google-Smtp-Source: AAOMgpfRpi2quktdr0Zpsg2mD6VwvRSP65EbjM3t5rG296BByG+CepBmUDSrdrrNSX14pw4COHbTv6JcEaqH00SHgig= X-Received: by 2002:aa7:c396:: with SMTP id k22-v6mr4017817edq.149.1532118256058; Fri, 20 Jul 2018 13:24:16 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:aa7:d443:0:0:0:0:0 with HTTP; Fri, 20 Jul 2018 13:23:35 -0700 (PDT) X-Originating-IP: [2620:10d:c090:200::4:65a7] From: Vidar Holen Date: Fri, 20 
Jul 2018 13:23:35 -0700 Message-ID: Content-Type: text/plain; charset="UTF-8" X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -5.0 (-----) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -6.0 (------)

Hi,

I'm noticing that `sed -i` does not do any kind of output buffering.
While the behavior is correct, this causes an excessive number of small
writes:

$ strace -e write ./sed -i 's/foo/bar/g' file.txt
write(4, " GNU GENERAL "..., 47) = 47
write(4, " Version 3"..., 47) = 47
write(4, "\n", 1) = 1

Compare this to redirection:

$ strace -e write ./sed 's/foo/bar/g' file.txt > tmpfile
write(1, " GNU GENERAL "..., 4096) = 4096
write(1, "om or adapt all or part of the w"..., 4096) = 4096
write(1, ".\n\n You may make, run and propa"..., 4096) = 4096

On my system, this makes `sed -i` take 7x longer than redirection+mv:
3.7s vs 0.5s for 100MB, for 675 vs 10 writes on the same input.

I'm seeing this on Debian with the latest sed commit (c52a676) and libc
2.27-2. The root cause appears to be `ck_mkstemp` using `fdopen`, which
unlike `fopen` does not buffer by default.

Enabling buffering should be simple and effective, resulting in a nice
speedup.
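The difference between the two traces can be reproduced outside sed with a
small C sketch. It assumes the short writes come from a flush after every
output line, which is what the follow-up in this thread identifies as the
cause. `emit_lines` and its sample line are hypothetical names made up for
illustration, not sed code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of sed: write `nlines` copies of a fixed
   line to `fp`.  When `flush_each_line` is non-zero, fflush() is called
   after every line, so stdio hands each line to the kernel separately
   instead of collecting lines in its buffer.
   Returns the number of bytes written, or -1 on error.  */
static long
emit_lines (FILE *fp, long nlines, int flush_each_line)
{
  static const char line[] = "a line of text standing in for sed output\n";
  const size_t len = strlen (line);
  long total = 0;

  assert (fp != NULL && nlines >= 0);

  for (long i = 0; i < nlines; i++)
    {
      if (fwrite (line, 1, len, fp) != len)
        return -1;
      if (flush_each_line && fflush (fp) != 0)
        return -1;
      total += (long) len;
    }

  /* Flush once at the end so both modes leave the same data in the file.  */
  if (fflush (fp) != 0)
    return -1;
  return total;
}
```

Calling `emit_lines (fp, 100000, 1)` from a small main() and running it
under `strace -e write` should show roughly one write per line, while
passing 0 as the last argument should produce far fewer, buffer-sized
writes (the exact size depends on the C library). Both modes write the same
bytes; only the number of system calls differs.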
Regards, Vidar Holen From unknown Sun Jun 22 11:38:09 2025 X-Loop: help-debbugs@gnu.org Subject: bug#32228: sed -i does not buffer, causing excessive writes Resent-From: Assaf Gordon Original-Sender: "Debbugs-submit" Resent-CC: bug-sed@gnu.org Resent-Date: Sat, 21 Jul 2018 01:49:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 32228 X-GNU-PR-Package: sed X-GNU-PR-Keywords: To: Vidar Holen , 32228@debbugs.gnu.org Received: via spool by 32228-submit@debbugs.gnu.org id=B32228.153213771930423 (code B ref 32228); Sat, 21 Jul 2018 01:49:02 +0000 Received: (at 32228) by debbugs.gnu.org; 21 Jul 2018 01:48:39 +0000 Received: from localhost ([127.0.0.1]:50091 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fgh0g-0007uc-Ik for submit@debbugs.gnu.org; Fri, 20 Jul 2018 21:48:38 -0400 Received: from mail-pf1-f193.google.com ([209.85.210.193]:37186) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fgh0e-0007uN-Td for 32228@debbugs.gnu.org; Fri, 20 Jul 2018 21:48:37 -0400 Received: by mail-pf1-f193.google.com with SMTP id a26-v6so150338pfo.4 for <32228@debbugs.gnu.org>; Fri, 20 Jul 2018 18:48:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:references:from:message-id:date:user-agent:mime-version :in-reply-to:content-language; bh=JASeJtCL2zTzHM0OKJGi2YzkhkXxi55sgzL1UKS5TjU=; b=a8S1JJ23Z18NihpZQEhqzxEyXCJRuVPZ5O4QyJZB6eiBUgXZd3CFUpzIxqANVEdtYf tsMJhg9TMbnEv4Sx5sPPG6AgKJlkE2yLsBAh5tZoxhBSPujW8kZRjfUe6ws73jF4Vj/N l2AFICTpui5OQ6Lh9Ox3IOo+Cc2Ee8qaP7oqO82sGK7OCqVqE/pKZc/ZfNRrLVkB8dry poIShhU08k/x/b8IVRF2Wg1JKFh0oOg6aE7FPRepX8u7r3YWoTFZUKj1Cu9AlV8psGsR LQO8CWMIX8vbbOqZJtqGGl21j5nWJpHeCZiOj6GMaxHsXbgbAFQHCp2pW5VdeyhRFj+q UuZQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language; 
bh=JASeJtCL2zTzHM0OKJGi2YzkhkXxi55sgzL1UKS5TjU=; b=COBJ5Zftu6G7+mRTSqncgFqT4i7/G0ZFUojoc2/na3g4sQAdDZArEX2PxCivaCuGzg FHhK41wYuIClK0qYzpMTHW9XmGmF6VAUA4nmEf1EyEaG3UGujkCFlYPVaP8boaTSibPL UIIN9b3E1rPuVNCyvsmcdDoNcHI9aG4e3duR3tbOtgzwuoGEJ737YplsR95+GHkflVxr 9NGpw+6DXiL1iJ9/6LjVL5ssienvVA577OSHM7c5+GYtPMsPkfFbLrTiViYwsDhQzxK3 eGJCOYBBf5ImKaXI5W/gV5tW9kMER5fkfxBbiKWdLURarFN5qPClSgU0WA9StzKkScGe YhLw== X-Gm-Message-State: AOUpUlHtVu0Xw8UMYO/zhmixn9RCMmVl94AmJQwQCVaZMdXXwrpLCIKq SQILXN3ATnr3/39f60/rd00Rh4f0 X-Google-Smtp-Source: AAOMgpdc5YJWjhKBQpvCLAbA7Ej5XPGiTJNmmD8dST2lwyCnS85zkkmxFip6zT9CKw5oTALQ5Xa1ag== X-Received: by 2002:a62:3e1a:: with SMTP id l26-v6mr4403054pfa.214.1532137710455; Fri, 20 Jul 2018 18:48:30 -0700 (PDT) Received: from tomato.housegordon.com (moose.housegordon.com. [184.68.105.38]) by smtp.googlemail.com with ESMTPSA id p11-v6sm5337745pfj.72.2018.07.20.18.48.28 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 20 Jul 2018 18:48:29 -0700 (PDT) References: From: Assaf Gordon Message-ID: Date: Fri, 20 Jul 2018 19:48:28 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: Content-Type: multipart/mixed; boundary="------------BCAF6B841317A3EBC01DEB8F" Content-Language: en-US X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) This is a multi-part message in MIME format. --------------BCAF6B841317A3EBC01DEB8F Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Hello, On 20/07/18 02:23 PM, Vidar Holen wrote: > I'm using noticing that `sed -i` does not do any kind of output > buffering. 
> While behavior is correct, this causes an excessive number of
> small writes:
>
> I'm seeing this on Debian with the latest sed commit (c52a676) and libc
> 2.27-2. The root cause appears to be `ck_mkstemp` using `fdopen`, which
> unlike `fopen` does not buffer by default.

Thank you for reporting this issue, and providing a clear way to reproduce
it. I can confirm seeing the same behavior on Debian 9.4 with glibc 2.24
(and latest sed).

I think the technical culprit is slightly different:
It is not due to 'ck_mkstemp/fdopen', but because sed explicitly flushes
the output after every line, except if the output is STDOUT.

This happens in execute.c:flush_output():
https://opengrok.housegordon.com/source/xref/sed/sed/execute.c#flush_output

Changing this function enables default buffering with "sed -i"
(using whatever the buffering default is for glibc).

This change does have a side-effect:
It also changes the buffering of other write commands such as 'w', 'W'
and 's///w'.
It might have other side effects I haven't spotted.
Based on the ChangeLog-2014 file, I see this function was added in 2003.

I wonder if there are good reasons to explicitly flush all output -
ideas anyone?

If you can give this a quick test and let me know if you also notice
improvements on your system - would be appreciated.

Comments and suggestions welcomed,
 - assaf

--------------BCAF6B841317A3EBC01DEB8F
Content-Type: text/x-patch;
 name="0001-sed-do-not-flush-output-stream-unless-in-unbuffered-.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename*0="0001-sed-do-not-flush-output-stream-unless-in-unbuffered-.pa";
 filename*1="tch"

>From 993d7fc501788abeef762626d297516698bb09f8 Mon Sep 17 00:00:00 2001
From: Assaf Gordon
Date: Fri, 20 Jul 2018 19:24:12 -0600
Subject: [PATCH] sed: do not flush output stream unless in unbuffered mode

Previously sed would explicitly flush the output after every output line,
except if the output was stdout and sed was not in unbuffered mode.
In practice this was equivalent to forcing line-buffering, and was
especially noticeable with "sed -i" (where the output is a temporary file).

With this change, explicit flushing only happens with "sed -u", regardless
of the type of output file, making "sed -i" much faster.
This change also affects other write commands such as 'w'/'W' and 's///w'.

Reported by Vidar Holen in
https://lists.gnu.org/r/bug-sed/2018-07/msg00014.html .

* NEWS: Mention this.
* sed/execute.c (flush_output): Never flush output unless in unbuffered
mode, regardless of which file it is.
---
 NEWS          | 7 +++++++
 sed/execute.c | 2 +-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index ac166dd..de4b8ee 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,13 @@ GNU sed NEWS -*- outline -*-
 
 * Noteworthy changes in release ?.? (????-??-??) [?]
 
+** Improvements
+
+  sed now uses fully-buffered output (instead of line-buffered) when
+  writing to files. This should noticeably improve performance of "sed -i"
+  and other write commands.
+  Buffering can be disabled (as before) with "sed -u".
+
 ** Bug fixes
 
   sed no longer accesses invalid memory (heap overflow) when given invalid
diff --git a/sed/execute.c b/sed/execute.c
index 7a4850f..1cc1d3f 100644
--- a/sed/execute.c
+++ b/sed/execute.c
@@ -415,7 +415,7 @@ output_missing_newline(struct output *outf)
 static inline void
 flush_output(FILE *fp)
 {
-  if (fp != stdout || unbuffered)
+  if (unbuffered)
     ck_fflush(fp);
 }
 
-- 
2.11.0

--------------BCAF6B841317A3EBC01DEB8F--

From unknown Sun Jun 22 11:38:09 2025 X-Loop: help-debbugs@gnu.org Subject: bug#32228: sed -i does not buffer, causing excessive writes Resent-From: Vidar Holen Original-Sender: "Debbugs-submit" Resent-CC: bug-sed@gnu.org Resent-Date: Sat, 21 Jul 2018 02:28:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 32228 X-GNU-PR-Package: sed X-GNU-PR-Keywords: To: Assaf Gordon Cc: 32228@debbugs.gnu.org Received: via spool by 32228-submit@debbugs.gnu.org id=B32228.15321400521397 (code B ref 32228); Sat, 21 Jul 2018 02:28:02 +0000 Received: (at 32228) by debbugs.gnu.org; 21 Jul 2018 02:27:32 +0000 Received: from localhost ([127.0.0.1]:50097 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fghcJ-0000MT-Re for submit@debbugs.gnu.org; Fri, 20 Jul 2018 22:27:32 -0400 Received: from mail-ed1-f41.google.com ([209.85.208.41]:44878) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fghcG-0000MF-PJ for 32228@debbugs.gnu.org; Fri, 20 Jul 2018 22:27:30 -0400 Received: by mail-ed1-f41.google.com with SMTP id f23-v6so11142364edr.11 for <32228@debbugs.gnu.org>; Fri, 20 Jul 2018 19:27:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vidarholen-net.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=wMIBp6jQMFFcPoZZhV1QW3EHEh0D7fd9RSvMjdvlNFA=; b=JvybtjV0aSFt6URCA2H/c8XBr3Woa/62ySiSNeSdVXuC/9d3e+lkSeKPrNM0Xafq1S oHpZSf6Vnt3AdByPJwT5iYkjXxVfd1o2F2o1dwuxqUG6Gj5hdYzZEytaMeM9fAuechIW
Ris5dnbJoernIT8R5c2UBpp/lHRm6XP2YrxKoJdbFjZ8186aj64SPLJlVg55V1ph608U mqdOpsoeK+/gNv4bvNJb8A+cUCgehL8U1cWdwinrpDNO45eGcEhdg2mxdTfOAniQ+0yO QtbxNUgrlGUaZpy4438qUMs3IduT6uXTN2OjS3jjp+IGi0VlMvkj3co+uyf520/Ayrhv 2L6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=wMIBp6jQMFFcPoZZhV1QW3EHEh0D7fd9RSvMjdvlNFA=; b=t7IKQBEdwKpcynV9GsTCN1dQwYbQlH5IOn8ne9I5wNrgkLQLLQvis+VuUc3GmN6+sk l0475hw/DYvzecCPx1r2MxskrQUsfMPCr8dIjkDvN98AZLV48AcAgaXM2R0pTzN6LAsR 30ndmIhaj+94h7d69+RcD+fs7UoejAZwQafcc9MqYFOfKaWfWMky6twOBfXSHWIaNYKv MWWaLa22XhG4m0KHegdDP2AnRAyUYsXnZHrOKmGSEsO8pKzSPk9AW3EHAx/AzalaZJMj SRhut3EHhei+44SJLsLha7ovfUTMOBRKxgD/f0a0OexS/fUGoCqBON4OX0tohEgDuTjL IOsA== X-Gm-Message-State: AOUpUlGx7lhw5eYEu2w+NEIVI7HIeQHgAIyHs1RpTys22XAMpM5vYv5e cuU7Xe9PEIZ89RyubS5bB7jsCHi6xtiHp24aULeucw== X-Google-Smtp-Source: AAOMgpdovp6kx73zr+3kTBoGDPax71onYioXmurZuPwtz4yL5StXsv6YbLjOt9cwCSt2A9zyv/dGCPsH0ajKFHojvbQ= X-Received: by 2002:a50:ae98:: with SMTP id e24-v6mr4594262edd.16.1532140042958; Fri, 20 Jul 2018 19:27:22 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a50:c018:0:0:0:0:0 with HTTP; Fri, 20 Jul 2018 19:26:42 -0700 (PDT) X-Originating-IP: [2601:647:5801:491d:1d30:7001:cda:42f2] In-Reply-To: References: From: Vidar Holen Date: Fri, 20 Jul 2018 19:26:42 -0700 Message-ID: Content-Type: multipart/alternative; boundary="00000000000001cec2057179275f" X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) --00000000000001cec2057179275f Content-Type: text/plain; charset="UTF-8" Thanks for looking into it. I can confirm that this patch significantly improves the performance of `sed -i` for me. 
On Fri, Jul 20, 2018 at 6:48 PM, Assaf Gordon wrote:

> Hello,
>
> On 20/07/18 02:23 PM, Vidar Holen wrote:
>
>> I'm noticing that `sed -i` does not do any kind of output
>> buffering. While behavior is correct, this causes an excessive number of
>> small writes:
>>
>> I'm seeing this on Debian with the latest sed commit (c52a676) and libc
>> 2.27-2. The root cause appears to be `ck_mkstemp` using `fdopen`, which
>> unlike `fopen` does not buffer by default.
>
> Thank you for reporting this issue, and providing a clear way to reproduce
> it. I can confirm seeing the same behavior on Debian 9.4 with glibc 2.24
> (and latest sed).
>
> I think the technical culprit is slightly different:
> It is not due to 'ck_mkstemp/fdopen', but because sed explicitly flushes
> the output after every line, except if the output is STDOUT.
>
> This happens in execute.c:flush_output():
> https://opengrok.housegordon.com/source/xref/sed/sed/execute.c#flush_output
>
> Changing this function enables default buffering with "sed -i"
> (using whatever the buffering default is for glibc).
>
> This change does have a side-effect:
> It also changes the buffering of other write commands such as 'w', 'W'
> and 's///w'.
> It might have other side effects I haven't spotted.
> Based on the ChangeLog-2014 file, I see this function was added in 2003.
>
> I wonder if there are good reasons to explicitly flush all output -
> ideas anyone?
>
> If you can give this a quick test and let me know if you also notice
> improvements on your system - would be appreciated.
>
> Comments and suggestions welcomed,
> - assaf

--00000000000001cec2057179275f
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--00000000000001cec2057179275f-- From unknown Sun Jun 22 11:38:09 2025 X-Loop: help-debbugs@gnu.org Subject: bug#32228: sed -i does not buffer, causing excessive writes Resent-From: Jim Meyering Original-Sender: "Debbugs-submit" Resent-CC: bug-sed@gnu.org Resent-Date: Thu, 02 Aug 2018 14:42:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 32228 X-GNU-PR-Package: sed X-GNU-PR-Keywords: To: Assaf Gordon Cc: 32228@debbugs.gnu.org, Vidar Holen Received: via spool by 32228-submit@debbugs.gnu.org id=B32228.153322086714943 (code B ref 32228); Thu, 02 Aug 2018 14:42:02 +0000 Received: (at 32228) by debbugs.gnu.org; 2 Aug 2018 14:41:07 +0000 Received: from localhost ([127.0.0.1]:39090 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1flEmo-0003sx-RJ for submit@debbugs.gnu.org; Thu, 02 Aug 2018 10:41:07 -0400 Received: from mail-wr1-f65.google.com ([209.85.221.65]:40708) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1flEmm-0003sK-W6 for 32228@debbugs.gnu.org; Thu, 02 Aug 2018 10:41:05 -0400 Received: by mail-wr1-f65.google.com with SMTP id h15-v6so2388532wrs.7 for <32228@debbugs.gnu.org>; Thu, 02 Aug 2018 07:41:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc; bh=Fud/660uCH7E+GF97c305xz5EQheQZa7893UJSlgOVQ=; b=orr19KzV8XDJ/ntYYxIXPOde4CCRibwwKwrW3weLgqBeEwc8QvkL2pTIVig5erUh7v eRO0sk2AWIQWm64IOV8g9ufSN4gyO/OlUmKETajcnFrz4+v+KB+XXwqcNlwf20NM+tIG D6DpluZTQiya/J+CVtYtJFYFXviy6rUkCKrBVcsu5YbE7zqFO2I/tOSpjtZq5CAqD48t tVw6xrmO1EnNaJjTFKK4kg5QsasoEkvNJ3brVMmNBjNEPNX+lEUAr1WUvpRPzrGAsJvZ tgd3JnMjhqqg3y+tdOHmlJk7OLhCzo4V8kkm9pSRiq9YMowN9eh9PF4ivjJ5YBqm23Yu f+yw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:sender:in-reply-to:references:from :date:message-id:subject:to:cc; 
bh=Fud/660uCH7E+GF97c305xz5EQheQZa7893UJSlgOVQ=; b=fbd4QaVXNoSrOVixahuntS+bYENnCpA5LzIgdYgi54/jiGpYIJPqJG/UArvdfAzLHG ovjJDwA8VVIZb3yGXgbZpuTdZ8CAiqmo0cpG/YcYrGI8iYQrIod8CSwCDJK3vs/ntqIu XYeeQyRHFMs58HAxQP1qJ4aut+L+/kDgH0Q9H9TNt4Ukqs5ytQsp7vBpoTnVpeYMmYNB lFlRGAIvsb+l1eYel7EKtk4EXR9xf/EcFw0qWrointCa7/6SAy4kfbvDLzsWUIExYje3 5nADnn9fS9HWS2bAIOVmcLV3GOog3JERYa9xKUCCK/y601CCzV+hN6lveRk2JFHay4lr ZpNw== X-Gm-Message-State: AOUpUlGVMSdhi38ImhWn9cDtguOlvsW45T1lf47INf/h0Cq/Wbx6X2jP DhzmVWRsofzKptAQGVfwzaZGWMnJ+qTqFVWGXu0= X-Google-Smtp-Source: AAOMgpenASxChWEKPqNZ4ZAQno5BHkSEvdGSY9DEmqaXc3UmEyqB1Om6Woq9H1Hua5hO6D4/vEHwHhSSd4MoICIX9Ic= X-Received: by 2002:adf:fdcd:: with SMTP id i13-v6mr2144698wrs.276.1533220859034; Thu, 02 Aug 2018 07:40:59 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:adf:ec4e:0:0:0:0:0 with HTTP; Thu, 2 Aug 2018 07:40:38 -0700 (PDT) In-Reply-To: References: From: Jim Meyering Date: Thu, 2 Aug 2018 16:40:38 +0200 X-Google-Sender-Auth: X2MKxtVaFXA29coSUkIO5RxnLNk Message-ID: Content-Type: text/plain; charset="UTF-8" X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) On Sat, Jul 21, 2018 at 3:48 AM, Assaf Gordon wrote: > Hello, > > On 20/07/18 02:23 PM, Vidar Holen wrote: >> >> I'm using noticing that `sed -i` does not do any kind of output >> buffering. While behavior is correct, this causes an excessive number of >> small writes: > > >> I'm seeing this on Debian with the latest sed commit (c52a676) and libc >> 2.27-2. The root cause appears to be `ck_mkstemp` using `fdopen`, which >> unlike `fopen` does not buffer by default. > > > Thank you for reporting this issue, and providing clear way to reproduce > it. I can confirm seeing the same behavior on Debian 9.4 with glibc 2.24 > (and latest sed). 
>
> I think the technical culprit is slightly different:
> It is not due to 'ck_mkstemp/fdopen', but because sed explicitly flushes
> the output after every line, except if the output is STDOUT.
>
> This happens in execute.c:flush_output():
> https://opengrok.housegordon.com/source/xref/sed/sed/execute.c#flush_output
>
> Changing this function enables default buffering with "sed -i"
> (using whatever the buffering default is for glibc).
>
> This change does have a side-effect:
> It also changes the buffering of other write commands such as 'w', 'W'
> and 's///w'.
> It might have other side effects I haven't spotted.
> Based on the ChangeLog-2014 file, I see this function was added in 2003.
>
> I wonder if there are good reasons to explicitly flush all output -
> ideas anyone?

Thank you, Assaf. This change looks fine.
I too tried to determine why that code was written that way, to no avail.
I have no idea what the motivation might have been.

From unknown Sun Jun 22 11:38:09 2025 X-Loop: help-debbugs@gnu.org Subject: bug#32228: sed -i does not buffer, causing excessive writes Resent-From: Assaf Gordon Original-Sender: "Debbugs-submit" Resent-CC: bug-sed@gnu.org Resent-Date: Sat, 04 Aug 2018 01:19:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 32228 X-GNU-PR-Package: sed X-GNU-PR-Keywords: To: Jim Meyering Cc: 32228@debbugs.gnu.org, Vidar Holen Received: via spool by 32228-submit@debbugs.gnu.org id=B32228.153334552429413 (code B ref 32228); Sat, 04 Aug 2018 01:19:02 +0000 Received: (at 32228) by debbugs.gnu.org; 4 Aug 2018 01:18:44 +0000 Received: from localhost ([127.0.0.1]:40312 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fllDQ-0007eG-5o for submit@debbugs.gnu.org; Fri, 03 Aug 2018 21:18:44 -0400 Received: from mail-pl0-f47.google.com ([209.85.160.47]:42491) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fllDO-0007dt-3D; Fri, 03 Aug 2018 21:18:42 -0400 Received: 
by mail-pl0-f47.google.com with SMTP id z7-v6so3263934plo.9; Fri, 03 Aug 2018 18:18:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=2/ITsCPQd5wXSVa0gkFHU+sa0aD/GTEJRdQuNZ3Bf+A=; b=TCpWp8r8BhD3lV0mfGmTX9gm3aHNPu7W0ynUBTvnHCQ8xBgrln9zAYR/Mp3brSm8Ob aU1f0HyUMP4qC/FyDev/jSOTVGUU5GAtFeZVcHRbJNZ9Bxo3WFaegl9Y1aVct/eZ30EX U0ELjYihTMstb7t1nhJrBWtJypReJ44dh4udQheTdr+I32iBTveD4r12wywM+RdpS4kw x+jC87hTcoObuQlSTGzbUU2AAyAS/lOp8YxM5s+N9hhDak9rcDSkmq0o6F42DgHbYYW0 V2wht2uYmMtCf+Fmm1p86omi+z1KpJautkhVJgvmeWcJSaUG4EMrisE+1Eqpjnd8rkbt hqkg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=2/ITsCPQd5wXSVa0gkFHU+sa0aD/GTEJRdQuNZ3Bf+A=; b=mAZE2E+Bdb+9Skx5FRL0BTxJ70OMrnbjEmGrfs4CkZ/GHFniODhKfaYpNCaMDShJ9A NSXEBVIHAUX9WdXl2qP+CXo6i6R0V0+Re3xR7CtHxBBK4NJg7Hic8+hh9QfUaZ3EPTlS 8wwhjniM6gzyFKOVs7L728s0sy+q9qfnTmzBrGL8yLhNyCHxtGTZHQbaHG1ggHQKcEgX 2vgYvx8DWxUokwImjkimt77iFlI6Vb8KV7bz5SZvRtWSV7t2IC1V98drty2L5b77R0+N OTICurjFYk9roVrrIOF51RbwSTbvZwZ3brKLf+RnXyi2bWDHbB+B3rK0Jp14Gnzb7ZeH R95g== X-Gm-Message-State: AOUpUlEM9MpiBpjAViOgiNI4FGJ2AZnpXzD65RsTwYqObj4dcla0XRbM EhM3z/wEnDN3HAiPddhV/sk2WRFz X-Google-Smtp-Source: AAOMgpcyYQf+oDwVOt4r/crLBSqFtrnkVzqGpj2eSUWK3AIFo6WQBxRvSScnpu6baVo+s7OD0A7iIg== X-Received: by 2002:a17:902:4d46:: with SMTP id o6-v6mr5631410plh.59.1533345515874; Fri, 03 Aug 2018 18:18:35 -0700 (PDT) Received: from tomato.housegordon.com (moose.housegordon.com. 
[184.68.105.38]) by smtp.googlemail.com with ESMTPSA id h10-v6sm11386668pfj.78.2018.08.03.18.18.33 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 03 Aug 2018 18:18:34 -0700 (PDT) References: From: Assaf Gordon Message-ID: Date: Fri, 3 Aug 2018 19:18:33 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

tags 32228 fixed
close 32228
stop

On 02/08/18 08:40 AM, Jim Meyering wrote:
> On Sat, Jul 21, 2018 at 3:48 AM, Assaf Gordon wrote:
>>
>> On 20/07/18 02:23 PM, Vidar Holen wrote:
>>>
>>> I'm noticing that `sed -i` does not do any kind of output
>>> buffering.
>>
>> I wonder if there are good reasons to explicitly flush all output -
>> ideas anyone?
>
> Thank you, Assaf. This change looks fine.
> I too tried to determine why that code was written that way to no
> avail. I have no idea what the motivation might have been.

Thanks for the review.

Pushed here:
https://git.savannah.gnu.org/cgit/sed.git/commit/?id=0144eeeb

regards,
 - assaf
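The one-line change to flush_output() may be easier to follow as a truth
table. The sketch below restates the two conditions from the patch as
standalone predicates. The names `flushes_before_patch` and
`flushes_after_patch` are made up for illustration and do not exist in sed,
and `unbuffered` is passed as a parameter here instead of being read from a
variable outside the function as in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical restatement of the pre-patch condition
   `fp != stdout || unbuffered`: every stream other than stdout was
   flushed after each line, and stdout was flushed only with "sed -u".  */
static bool
flushes_before_patch (FILE *fp, bool unbuffered)
{
  assert (fp != NULL);
  return fp != stdout || unbuffered;
}

/* Hypothetical restatement of the post-patch condition `unbuffered`:
   the kind of stream no longer matters, only "sed -u" forces a flush.  */
static bool
flushes_after_patch (FILE *fp, bool unbuffered)
{
  assert (fp != NULL);
  return unbuffered;
}
```

Under these assumptions the only case that changes is a stream other than
stdout when -u is not given, which covers the `sed -i` temporary file and
the files opened by 'w', 'W' and 's///w'. It was flushed after every line
before the patch and is left to stdio's default buffering afterwards.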