From unknown Sun Jul 27 06:47:10 2025 X-Loop: help-debbugs@gnu.org Subject: bug#49698: Search for URL containing certain word Resent-From: Julius Hamilton Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Thu, 22 Jul 2021 19:41:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 49698 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: 49698@debbugs.gnu.org X-Debbugs-Original-To: bug-grep@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.162698282131461 (code B ref -1); Thu, 22 Jul 2021 19:41:01 +0000 Received: (at submit) by debbugs.gnu.org; 22 Jul 2021 19:40:21 +0000 Received: from localhost ([127.0.0.1]:41661 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1m6eYL-0008BJ-Pj for submit@debbugs.gnu.org; Thu, 22 Jul 2021 15:40:21 -0400 Received: from lists.gnu.org ([209.51.188.17]:48904) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1m6e3r-0005DD-Ad for submit@debbugs.gnu.org; Thu, 22 Jul 2021 15:08:48 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:44076) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m6e3r-0002xU-5K for bug-grep@gnu.org; Thu, 22 Jul 2021 15:08:47 -0400 Received: from mail-oi1-x230.google.com ([2607:f8b0:4864:20::230]:44009) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m6e3p-0002Jc-CX for bug-grep@gnu.org; Thu, 22 Jul 2021 15:08:46 -0400 Received: by mail-oi1-x230.google.com with SMTP id w188so7676614oif.10 for ; Thu, 22 Jul 2021 12:08:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:from:date:message-id:subject:to; bh=DerQlYshUM1K92AWVG9xJXItJ+JD9r0Nfxz2Hj+66rQ=; b=WdNlTbmRiBqCkD/vIAoeX2+CRzepguw7ePOGgaEp8Q19Jj/YLyi4NWIsPZenf14Z7F aml9E++upbt23PuM3EGDuTqXKhNownPVZDrpJXz/uT7nBfagW0kNNDon3mx+9qYt8VFv 
LnU85R+fEapDIsbz3HV+J4XkKHYkL9lyPVGDDKPji9PktTec7ueJA/Es38yiuJWQnrxE hoiTAbcvlwvo3wPH+oQ0KJWocCJDPyFs50IIxrhZVEz7i/Odgr+YzYzds3+gB+vqBewZ QlhuxorDNK/K0QJsd2u2Vk7E7P/fTFsWmhZplHOkOpAt6bFLixx0s6AQYOawCpbAOqNx hjHg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=DerQlYshUM1K92AWVG9xJXItJ+JD9r0Nfxz2Hj+66rQ=; b=m+VXFeJY2AQDr5v6EbutK9HtTY2O+Va7/SZQmzyUtI4CTuA9iO4g/rF3IY+ETdYjdS cXzmQALHwW0Q5VDTbPJsCY0cjuGC+MAP6un9yOQuXIyB9YLU8AqZJN9R4LfWLH2Dwxc1 JSu8oqxs0zi7Su+5c5WjyHnvdk1II6FIoapBcKWoxiUQVLErXw8cd/LsBTr9XBorTrp4 E/JDAdTr2a887JLvPiNqr6+VyJQrRp6veRdrh082N0me/zjYQqHZ60NWzYFqQiAK41S+ OZm5L+Xfcr9Jt1oSW9KcDHD14nZvgmuQ0PXyEdWdu+NQ1Pcy/S3QQBr4VATxd1bA2rV7 jzWA== X-Gm-Message-State: AOAM533YUX2zDI824DdnFtHudkU3BKCczpJpLC5Il8dZw8u+zCuhnNda dncwnykVMzZSHjfMS3EtrHD34TgzsE0jpdWiC8ZMgqAa X-Google-Smtp-Source: ABdhPJwJLzMJnvsQwVfrCbkvTNhOJnRpenpevQd1yBY4b1Kkr9H2HH5ixNG7VXcQDOFlR6H3KHdumy4/wZVzoNoSfqQ= X-Received: by 2002:a05:6808:3c5:: with SMTP id o5mr1029937oie.112.1626980923502; Thu, 22 Jul 2021 12:08:43 -0700 (PDT) MIME-Version: 1.0 From: Julius Hamilton Date: Thu, 22 Jul 2021 21:08:32 +0200 Message-ID: Content-Type: multipart/alternative; boundary="0000000000000074c105c7bb0449" Received-SPF: pass client-ip=2607:f8b0:4864:20::230; envelope-from=julkhami@gmail.com; helo=mail-oi1-x230.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Score: -1.3 (-) X-Mailman-Approved-At: Thu, 22 Jul 2021 15:40:16 -0400 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: 
, Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.3 (--) --0000000000000074c105c7bb0449 Content-Type: text/plain; charset="UTF-8"

Hey,

I'm new to grep so I'd love any tips on how to search for text in the following way.

I'd like to find a certain URL that is somewhere in a large text file. I would like to find it by specifying "a URL which contains word X somewhere within it", or even "a URL which is located within 3 lines of the word X".

I'd like to copy that URL and then write it to the top of the file.

I am considering doing this with Vim search commands, yet the underlying regex would be the same, so I think this would be a good place to ask.

How would you do this with grep? Or a similar tool?

Thanks very much,
Julius

--0000000000000074c105c7bb0449 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
--0000000000000074c105c7bb0449-- From unknown Sun Jul 27 06:47:10 2025 X-Loop: help-debbugs@gnu.org Subject: bug#49698: Search for URL containing certain word Resent-From: Seth David Schoen Original-Sender: "Debbugs-submit" Resent-CC: bug-grep@gnu.org Resent-Date: Thu, 22 Jul 2021 23:49:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 49698 X-GNU-PR-Package: grep X-GNU-PR-Keywords: To: Julius Hamilton Cc: 49698@debbugs.gnu.org Received: via spool by 49698-submit@debbugs.gnu.org id=B49698.16269976856062 (code B ref 49698); Thu, 22 Jul 2021 23:49:01 +0000 Received: (at 49698) by debbugs.gnu.org; 22 Jul 2021 23:48:05 +0000 Received: from localhost ([127.0.0.1]:41858 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1m6iQ8-0001Zh-MI for submit@debbugs.gnu.org; Thu, 22 Jul 2021 19:48:04 -0400 Received: from frotz.zork.net ([69.164.197.204]:41702) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1m6i8C-0007Oh-Mu for 49698@debbugs.gnu.org; Thu, 22 Jul 2021 19:29:33 -0400 Received: by frotz.zork.net (Postfix, from userid 1008) id D11AB11992; Thu, 22 Jul 2021 23:29:31 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 frotz.zork.net D11AB11992 Date: Thu, 22 Jul 2021 16:29:31 -0700 From: Seth David Schoen Message-ID: <20210722232931.GR3621002@frotz.zork.net> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Score: 0.3 (/) X-Mailman-Approved-At: Thu, 22 Jul 2021 19:48:03 -0400 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/)

Julius Hamilton writes:

> Hey,
>
> I'm new to grep so I'd love any tips on how to search for text in the
> following way.
>
> I'd like to find a certain URL that is somewhere in a large text file. I
> would like to find it by specifying "a URL which contains word X somewhere
> within it", or even "a URL which is located within 3 lines of the word X".
>
> I'd like to copy that URL and then write it to the top of the file.
>
> I am considering doing this with Vim search commands, yet the underlying
> regex would be the same, so I think this would be a good place to ask.
>
> How would you do this with grep? Or a similar tool?

Hi Julius,

I'm not sure this is quite what the grep bug interface is intended for. :-)

    grep -E -C 3 X largefile | grep -E -o "$URL_REGEX"

where URL_REGEX is a regular expression matching URLs with any particular
level of specificity that you want, with a very simple case being
something like

    https?://[^, ]+

As we might have recently discussed on help-bash (?), Unix doesn't have a
super-nice built-in notion of "writing to the top of a file" and you
would normally need to write the matches, followed by the original file,
to a temporary file.  Something like

    set -e
    temp=$(mktemp)
    grep -E -C 3 X largefile | grep -E -o "$URL_REGEX" > "$temp"
    cat largefile >> "$temp"
    mv "$temp" largefile

From unknown Sun Jul 27 06:47:10 2025 MIME-Version: 1.0 X-Mailer: MIME-tools 5.505 (Entity 5.505) X-Loop: help-debbugs@gnu.org From: help-debbugs@gnu.org (GNU bug Tracking System) To: Julius Hamilton Subject: bug#49698: closed (Re: bug#49698: Search for URL containing certain word) Message-ID: References: <0b66f3f6-6954-ccb7-9be2-84859e644e62@cs.ucla.edu> X-Gnu-PR-Message: they-closed 49698 X-Gnu-PR-Package: grep Reply-To: 49698@debbugs.gnu.org Date: Fri, 23 Jul 2021 15:25:02 +0000 Content-Type: multipart/mixed; boundary="----------=_1627053902-16203-1" This is a multi-part message in MIME format...
------------=_1627053902-16203-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Your bug report #49698: Search for URL containing certain word which was filed against the grep package, has been closed. The explanation is attached below, along with your original report. If you require more details, please reply to 49698@debbugs.gnu.org. --=20 49698: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=3D49698 GNU Bug Tracking System Contact help-debbugs@gnu.org with problems ------------=_1627053902-16203-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at 49698-done) by debbugs.gnu.org; 23 Jul 2021 15:24:13 +0000 Received: from localhost ([127.0.0.1]:44321 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1m6x25-0004CE-B0 for submit@debbugs.gnu.org; Fri, 23 Jul 2021 11:24:13 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:59492) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1m6x23-0004By-In for 49698-done@debbugs.gnu.org; Fri, 23 Jul 2021 11:24:12 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 775521600BB; Fri, 23 Jul 2021 08:24:05 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id A4qa26zsQ6Ea; Fri, 23 Jul 2021 08:24:04 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id DCE451600C2; Fri, 23 Jul 2021 08:24:04 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id Hz8sLRHVQEWg; Fri, 23 Jul 2021 08:24:04 -0700 (PDT) Received: from [192.168.1.9] (cpe-172-91-119-151.socal.res.rr.com [172.91.119.151]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 
BA12E1600BB; Fri, 23 Jul 2021 08:24:04 -0700 (PDT) Subject: Re: bug#49698: Search for URL containing certain word To: Julius Hamilton References: From: Paul Eggert Organization: UCLA Computer Science Department Message-ID: <0b66f3f6-6954-ccb7-9be2-84859e644e62@cs.ucla.edu> Date: Fri, 23 Jul 2021 08:24:04 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Score: -2.4 (--) X-Debbugs-Envelope-To: 49698-done Cc: 49698-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.4 (---)

Not a bug, so closing the bug report.

------------=_1627053902-16203-1--
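
For anyone landing on this thread later, here is a minimal sketch of the approach from the followup, covering both variants of the question ("a URL which contains word X" and "a URL within 3 lines of word X"). It assumes GNU grep (for -o and -C) and mktemp are available. The function names urls_containing, urls_near and prepend_urls are made up for illustration, and the URL pattern is the deliberately loose one from the reply, not a complete URL grammar.

```shell
#!/bin/sh
# Sketch only: function names and the URL pattern are illustrative
# assumptions, not something grep provides.
set -eu

# Very loose URL pattern taken from the reply; tighten it as needed.
URL_REGEX='https?://[^, ]+'

# Print every URL in FILE that contains WORD as a fixed substring.
# usage: urls_containing WORD FILE
urls_containing() {
    grep -E -o "$URL_REGEX" "$2" | grep -F -- "$1" || true
}

# Print every URL found within 3 lines of a line that mentions WORD.
# usage: urls_near WORD FILE
urls_near() {
    grep -F -C 3 -- "$1" "$2" | grep -E -o "$URL_REGEX" || true
}

# Write the nearby URLs, then the original contents, to a temporary
# file and move it over the original. The result takes on the
# temporary file's owner and mode, since mv replaces the file.
# usage: prepend_urls WORD FILE
prepend_urls() {
    tmp=$(mktemp)
    urls_near "$1" "$2" > "$tmp"
    cat "$2" >> "$tmp"
    mv "$tmp" "$2"
}
```

After sourcing or pasting the functions, something like `prepend_urls X largefile` would put the matching URLs at the top of largefile. The `|| true` keeps `set -e` from aborting when nothing matches, in which case the file is rewritten unchanged.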