Pádraig,

Thank you for the alternative suggestions.
Actually, I just found yet another way to solve my problem:
perl -0002 -F"\001" -an -e "print((join \"\001\", @F[0..2,14..46]), \"\002\");" data.dat >new_data.dat
It works fine, but I am a little concerned about the speed. I have over three hundred such files, from 3 MB to 30 MB each, and this process has to run every day... I thought that by using cut (which just looks for delimiters) I could gain a few minutes on the whole process.
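
In case the one-liner looks cryptic, here is the same command with the flags annotated (per perlrun):

# -0002      set the input record separator ($/) to the octal character \002
# -F"\001"   the pattern that -a splits each record on (the \001 field separator)
# -a -n      loop over input records (-n), autosplitting each into @F (-a)
# The -e body then just re-joins fields 0-2 and 14-46 with \001 and
# re-appends the \002 record terminator.
perl -0002 -F"\001" -an -e "print((join \"\001\", @F[0..2,14..46]), \"\002\");" data.dat >new_data.dat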

Originally I thought of adding "-r, --record-delimiter=DELIM" and "--output-record-delimiter=DELIM" options to cut.
Then the example above could be done with
cut -d$'\001' -r$'\002' --output-delimiter=$'\001' --output-record-delimiter=$'\002' -f1-3,15-47 data.dat >new_data.dat
I think it is feasible and would be more convenient (and hopefully faster) than pulling in the whole of perl or making two calls to tr.
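
For the record, the "two calls to tr" route for this data would look something like the following (assuming bash, since $'\001' is the ANSI-C quoting needed to hand the control character to cut -d): swap \002 with \n so cut sees ordinary newline-terminated records, then swap them back afterwards.

tr '\002\n' '\n\002' <data.dat | cut -d$'\001' -f1-3,15-47 | tr '\n\002' '\002\n' >new_data.dat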




Bob,
I understand your desire that feature discussions not take place on the bug-related mailing list, but here is an extract from the README:
> Mail suggestions and bug reports for these programs to
> the address on the last line of --help output.
And guess what: `cut --help` does have the bug-coreutils address on its last line! The coreutils address is not mentioned in the README at all, while bug-coreutils is mentioned several times in different contexts.
I apologize for using this mailing list inappropriately, but I did not know about any other lists.



On Wed, Apr 17, 2013 at 9:13 PM, Pádraig Brady <P@draigbrady.com> wrote:
On 04/17/2013 02:26 PM, George Brink wrote:
> Hello,
>
> I have a task of extracting several "fields" from a text file. The
> standard `cut` tool could be the perfect tool for the job, but...
> In my file the '\n' character is a legal symbol inside fields, and
> therefore the file uses another symbol as the record separator. `cut`
> has a hard-coded '\n' as its record separator (I just checked the
> source from the coreutils-8.21 package).

The patch would be simple but not without compatibility cost.
I.e. scripts using this would immediately become incompatible
with any systems without this feature.

So you'd like something like tac's -s, --separator option.
However cut -s is already taken, so we'd have to avoid the short -s at least.
Also, tac -s takes a string rather than a single character, which
gives some extra credence (and complexity) to that option.
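
For example, with the separator attached to the end of each record (tac's default):

printf 'one||two||three||' | tac -s '||'    # -> three||two||one||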

Also related would be to support the -z, --zero-terminated option.
join, sort and uniq all have this option to use NUL as the record separator;
however, they're all closely related, sort-dependent utilities,
and we're trying to unify options between them.
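
For example, the shared NUL-record convention in action:

printf 'b\0a\0a\0' | sort -z | uniq -z | tr '\0' '\n'    # -> a and b on separate lines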

If it is just a character you want to separate on,
then you can always use tr to convert before processing,
albeit with associated data copying overhead.

SEP=^
tr "$SEP"'\n' '\n'"$SEP" | cut ... | tr "$SEP"'\n' '\n'"$SEP"

So given that cut is not special here among the text filters,
and there is a workaround available, I'm 60:40 against
adding this feature.

thanks,
Pádraig.