GNU bug report logs - #31979
csplit: a regexp pattern does not consider the negative offset of a previous regexp pattern

Package: coreutils

Reported by: Stéphane Campinas <stephane.campinas <at> gmail.com>

Date: Tue, 26 Jun 2018 15:12:02 UTC

Severity: normal

Tags: notabug

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.


Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Stéphane Campinas <stephane.campinas <at> gmail.com>
To: bug-coreutils <at> gnu.org
Subject: csplit: a regexp pattern does not consider the negative offset of a
 previous regexp pattern
Date: Tue, 26 Jun 2018 10:24:29 +0200
Hi,

When using two consecutive regexp patterns with a negative offset
applied to the first one, the second pattern does not begin its search
at the line where the offset ended the previous section.

According to the csplit invocation page [0], it should:

	> [...] If it is given, the input up to (but not including) the
	  matching line plus or minus offset is put into the output file,
	  and the line after that begins the next section of input.

Here is an example of the problem, in which I split a 50-line file
that has one number per line, ranging from 1 to 50.
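
For reference, an input file of this shape can be generated as shown below. This is a minimal sketch that assumes `seq` is available; it is not necessarily how numbers50.txt was originally created.

```shell
# Assumption: numbers50.txt holds the integers 1..50, one per line.
seq 1 50 > numbers50.txt

# Sanity check: print the number of lines in the file.
wc -l < numbers50.txt
```

The resulting file is 141 bytes long (9 two-byte lines followed by 41 three-byte lines), which is useful when reading the byte counts that csplit prints.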

# My environment:

	- Linux mars 4.17.2-1-ARCH #1 SMP PREEMPT Sat Jun 16 11:08:59 UTC 2018 x86_64 GNU/Linux
	- csplit (GNU coreutils) 8.29

# A failing example with the unexpected behavior

	$ csplit numbers50.txt /15/-5 /12/
	18
	csplit: ‘/12/’: match not found
	123

# A working example when using a regexp pattern followed by a linenum pattern

	$ csplit numbers50.txt /15/-5 12
	18
	6
	117
	
	$ head xx*
	==> xx00 <==
	1
	2
	3
	4
	5
	6
	7
	8
	9
	
	==> xx01 <==
	10
	11
	
	==> xx02 <==
	12
	13
	14
	15
	16
	17
	18
	19
	20
	21
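
The numbers printed by csplit are the byte counts of each output file, so they can be cross-checked against the line ranges. The sketch below assumes `seq` is available and uses a hypothetical helper named `bytes` (not part of csplit) to compute the size of a range of lines.

```shell
# Hypothetical helper: size in bytes of the lines FIRST..LAST,
# one number per line, each terminated by a newline.
bytes() {
    seq "$1" "$2" | wc -c | tr -d ' '
}

bytes 1 9     # 18  -> xx00 in both runs (lines 1 to 9)
bytes 10 11   # 6   -> xx01 in the working run (lines 10 and 11)
bytes 12 50   # 117 -> xx02 in the working run (lines 12 to 50)
bytes 10 50   # 123 -> remainder reported in the failing run (lines 10 to 50)
```

This shows that in the failing run the first section ends at line 9 as expected, and everything from line 10 onward stays in one piece because `/12/` is not matched.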

I think that both should work and produce the same output. I found
this while porting csplit to Rust; see [1] for more information, where
I tried to understand the cause of this behavior in the code.

Cheers,

[0] https://www.gnu.org/software/coreutils/manual/html_node/csplit-invocation.html#csplit-invocation
[1] https://github.com/uutils/coreutils/issues/501#issuecomment-399569870

-- 
Stephane Campinas

This bug report was last modified 6 years and 235 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.