From debbugs-submit-bounces@debbugs.gnu.org Thu Jul 22 15:40:21 2021 Received: (at submit) by debbugs.gnu.org; 22 Jul 2021 19:40:21 +0000 Received: from localhost ([127.0.0.1]:41661 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1m6eYL-0008BJ-Pj for submit@debbugs.gnu.org; Thu, 22 Jul 2021 15:40:21 -0400 Received: from lists.gnu.org ([209.51.188.17]:48904) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1m6e3r-0005DD-Ad for submit@debbugs.gnu.org; Thu, 22 Jul 2021 15:08:48 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:44076) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m6e3r-0002xU-5K for bug-grep@gnu.org; Thu, 22 Jul 2021 15:08:47 -0400 Received: from mail-oi1-x230.google.com ([2607:f8b0:4864:20::230]:44009) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m6e3p-0002Jc-CX for bug-grep@gnu.org; Thu, 22 Jul 2021 15:08:46 -0400 Received: by mail-oi1-x230.google.com with SMTP id w188so7676614oif.10 for ; Thu, 22 Jul 2021 12:08:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:from:date:message-id:subject:to; bh=DerQlYshUM1K92AWVG9xJXItJ+JD9r0Nfxz2Hj+66rQ=; b=WdNlTbmRiBqCkD/vIAoeX2+CRzepguw7ePOGgaEp8Q19Jj/YLyi4NWIsPZenf14Z7F aml9E++upbt23PuM3EGDuTqXKhNownPVZDrpJXz/uT7nBfagW0kNNDon3mx+9qYt8VFv LnU85R+fEapDIsbz3HV+J4XkKHYkL9lyPVGDDKPji9PktTec7ueJA/Es38yiuJWQnrxE hoiTAbcvlwvo3wPH+oQ0KJWocCJDPyFs50IIxrhZVEz7i/Odgr+YzYzds3+gB+vqBewZ QlhuxorDNK/K0QJsd2u2Vk7E7P/fTFsWmhZplHOkOpAt6bFLixx0s6AQYOawCpbAOqNx hjHg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=DerQlYshUM1K92AWVG9xJXItJ+JD9r0Nfxz2Hj+66rQ=; b=m+VXFeJY2AQDr5v6EbutK9HtTY2O+Va7/SZQmzyUtI4CTuA9iO4g/rF3IY+ETdYjdS 
cXzmQALHwW0Q5VDTbPJsCY0cjuGC+MAP6un9yOQuXIyB9YLU8AqZJN9R4LfWLH2Dwxc1 JSu8oqxs0zi7Su+5c5WjyHnvdk1II6FIoapBcKWoxiUQVLErXw8cd/LsBTr9XBorTrp4 E/JDAdTr2a887JLvPiNqr6+VyJQrRp6veRdrh082N0me/zjYQqHZ60NWzYFqQiAK41S+ OZm5L+Xfcr9Jt1oSW9KcDHD14nZvgmuQ0PXyEdWdu+NQ1Pcy/S3QQBr4VATxd1bA2rV7 jzWA== X-Gm-Message-State: AOAM533YUX2zDI824DdnFtHudkU3BKCczpJpLC5Il8dZw8u+zCuhnNda dncwnykVMzZSHjfMS3EtrHD34TgzsE0jpdWiC8ZMgqAa X-Google-Smtp-Source: ABdhPJwJLzMJnvsQwVfrCbkvTNhOJnRpenpevQd1yBY4b1Kkr9H2HH5ixNG7VXcQDOFlR6H3KHdumy4/wZVzoNoSfqQ= X-Received: by 2002:a05:6808:3c5:: with SMTP id o5mr1029937oie.112.1626980923502; Thu, 22 Jul 2021 12:08:43 -0700 (PDT) MIME-Version: 1.0 From: Julius Hamilton Date: Thu, 22 Jul 2021 21:08:32 +0200 Message-ID: Subject: Search for URL containing certain word To: bug-grep@gnu.org Content-Type: multipart/alternative; boundary="0000000000000074c105c7bb0449" Received-SPF: pass client-ip=2607:f8b0:4864:20::230; envelope-from=julkhami@gmail.com; helo=mail-oi1-x230.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Score: -1.3 (-) X-Debbugs-Envelope-To: submit X-Mailman-Approved-At: Thu, 22 Jul 2021 15:40:16 -0400 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.3 (--) --0000000000000074c105c7bb0449 Content-Type: text/plain; charset="UTF-8" Hey, I'm new to grep so I'd love any tips on how to search for text in the following way. I'd like to find a certain URL that is somewhere in a large text file. 
I would like to find it by specifying "a URL which contains word X somewhere within it", or even "a URL which is located within 3 lines of the word X". I'd like to copy that URL and then write it to the top of the file. I am considering doing this with Vim search commands, yet the underlying regex would be the same, so I think this would be a good place to ask. How would you do this with grep? Or a similar tool? Thanks very much, Julius --0000000000000074c105c7bb0449 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
--0000000000000074c105c7bb0449--

From debbugs-submit-bounces@debbugs.gnu.org Thu Jul 22 19:48:05 2021 Received: (at 49698) by debbugs.gnu.org; 22 Jul 2021 23:48:05 +0000 Received: from localhost ([127.0.0.1]:41858 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1m6iQ8-0001Zh-MI for submit@debbugs.gnu.org; Thu, 22 Jul 2021 19:48:04 -0400 Received: from frotz.zork.net ([69.164.197.204]:41702) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1m6i8C-0007Oh-Mu for 49698@debbugs.gnu.org; Thu, 22 Jul 2021 19:29:33 -0400 Received: by frotz.zork.net (Postfix, from userid 1008) id D11AB11992; Thu, 22 Jul 2021 23:29:31 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 frotz.zork.net D11AB11992 Date: Thu, 22 Jul 2021 16:29:31 -0700 From: Seth David Schoen To: Julius Hamilton Subject: Re: bug#49698: Search for URL containing certain word Message-ID: <20210722232931.GR3621002@frotz.zork.net> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Score: 0.3 (/) X-Debbugs-Envelope-To: 49698 X-Mailman-Approved-At: Thu, 22 Jul 2021 19:48:03 -0400 Cc: 49698@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/)

Julius Hamilton writes:

> Hey,
>
> I'm new to grep so I'd love any tips on how to search for text in the
> following way.
>
> I'd like to find a certain URL that is somewhere in a large text file. I
> would like to find it by specifying "a URL which contains word X somewhere
> within it", or even "a URL which is located within 3 lines of the word X".
>
> I'd like to copy that URL and then write it to the top of the file.
>
> I am considering doing this with Vim search commands, yet the underlying
> regex would be the same, so I think this would be a good place to ask.
>
> How would you do this with grep? Or a similar tool?

Hi Julius,

I'm not sure this is quite what the grep bug interface is intended for. :-)

    egrep -C 3 X largefile | egrep -o "$URL_REGEX"

where URL_REGEX is a regular expression matching URLs with any particular
level of specificity that you want, with a very simple case being
something like

    https?://[^, ]+

As we might have recently discussed on help-bash (?), Unix doesn't have
a super-nice built-in notion of "writing to the top of a file" and you
would normally need to write the matches, followed by the original file,
to a temporary file. Something like

    set -e
    temp=$(mktemp)
    egrep -C 3 X largefile | egrep -o "$URL_REGEX" > $temp
    cat largefile >> $temp
    mv $temp largefile

From debbugs-submit-bounces@debbugs.gnu.org Fri Jul 23 11:24:13 2021 Received: (at 49698-done) by debbugs.gnu.org; 23 Jul 2021 15:24:13 +0000 Received: from localhost ([127.0.0.1]:44321 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1m6x25-0004CE-B0 for submit@debbugs.gnu.org; Fri, 23 Jul 2021 11:24:13 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:59492) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1m6x23-0004By-In for 49698-done@debbugs.gnu.org; Fri, 23 Jul 2021 11:24:12 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 775521600BB; Fri, 23 Jul 2021 08:24:05 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id A4qa26zsQ6Ea; Fri, 23 Jul 2021 08:24:04 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id DCE451600C2; Fri, 23 Jul 2021 08:24:04 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu
([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id Hz8sLRHVQEWg; Fri, 23 Jul 2021 08:24:04 -0700 (PDT) Received: from [192.168.1.9] (cpe-172-91-119-151.socal.res.rr.com [172.91.119.151]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id BA12E1600BB; Fri, 23 Jul 2021 08:24:04 -0700 (PDT) Subject: Re: bug#49698: Search for URL containing certain word To: Julius Hamilton References: From: Paul Eggert Organization: UCLA Computer Science Department Message-ID: <0b66f3f6-6954-ccb7-9be2-84859e644e62@cs.ucla.edu> Date: Fri, 23 Jul 2021 08:24:04 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Score: -2.4 (--) X-Debbugs-Envelope-To: 49698-done Cc: 49698-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.4 (---) Not a bug, so closing the bug report. From unknown Sun Jul 27 06:46:11 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Sat, 21 Aug 2021 11:24:08 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator
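
For reference, the approach in Seth David Schoen's reply (context search, then URL extraction, then prepending via a temporary file) could be wrapped up roughly as follows. This is only a sketch under some assumptions: the function name prepend_urls and its two arguments are made up for illustration, the URL pattern is the deliberately simple one from the reply, and it relies on a grep that supports -C and -o (GNU and BSD grep do) and on a mktemp that accepts a template argument.

```shell
#!/bin/sh
# prepend_urls WORD FILE  (hypothetical helper, not part of grep)
# Finds URLs within 3 lines of WORD in FILE and writes them to the top of FILE.
prepend_urls() {
    word=$1
    file=$2
    # Very simple URL pattern from the thread; tighten it as needed.
    url_regex='https?://[^, ]+'

    # Create the temporary file next to FILE so the final mv is a rename
    # on the same filesystem.
    temp=$(mktemp "${file}.XXXXXX") || return 1

    # First grep: lines matching WORD plus 3 lines of context on each side.
    # Second grep: keep only the parts of those lines that look like URLs.
    # The pipeline's status is that of the second grep, so it is non-zero
    # when no URL was found.
    if grep -E -C 3 -e "$word" -- "$file" | grep -E -o -e "$url_regex" > "$temp"; then
        # Append the original contents after the matches, then replace FILE.
        cat -- "$file" >> "$temp" && mv -- "$temp" "$file"
    else
        # Nothing found: leave FILE untouched and clean up.
        rm -f -- "$temp"
        return 1
    fi
}
```

It would be called as, for example, prepend_urls X largefile. Note that mktemp creates the temporary file readable and writable by its owner only, so after the mv the file carries those permissions rather than the original ones; if that matters, copying the temporary file's contents back over the original instead of renaming is one alternative.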