GNU bug report logs - #13127
[PATCH] cut: use only one data structure

Package: coreutils;

Reported by: xojoc <at> gmx.com

Date: Sun, 9 Dec 2012 10:29:01 UTC

Severity: normal

Tags: patch

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

From: Cojocaru Alexandru <xojoc <at> gmx.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: 13127 <at> debbugs.gnu.org
Subject: bug#13127: [PATCH] cut: use only one data structure
Date: Tue, 11 Dec 2012 15:24:36 +0100
On Sun, 09 Dec 2012 21:45:03 +0100
Jim Meyering <jim <at> meyering.net> wrote:

> Thanks for the patch.
> This is large enough that you'll have to file a copyright assignment.
> For details, see the "Copyright assignment" section in the file
> named HACKING.

Fine.


> Have you considered performance in the common case?
> I suspect that a byte or field number larger than 1000 is
> not common.  That is why, in the FIXME comment above,
> I suggested to use an adaptive approach.  I had the feeling
> (don't remember if I profiled it) that testing a bit per
> input field would be more efficient than an in-range test.

Yes, that was the first thing I checked, and there is no performance loss.
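
To make the comparison in the quoted paragraph concrete, here is a minimal
sketch of the two kinds of test: one bit per selectable index versus an
in-range test against sorted range pairs. All names (`bit_set',
`range_cursor', and so on) are hypothetical and are not taken from cut.c or
from the patch. The in-range variant assumes the ranges are sorted and
non-overlapping and that indices are queried in increasing order.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical types for illustration only; not the ones in cut.c.  */
struct range_pair { size_t lo; size_t hi; };   /* inclusive, 1-based */

/* Approach A: one bit per index, memory proportional to max_index.
   max_index is assumed to be smaller than SIZE_MAX.  */
struct bit_set { unsigned char *bits; size_t max_index; };

static bool
bit_set_init (struct bit_set *s, size_t max_index)
{
  s->bits = calloc (max_index / 8 + 1, 1);
  s->max_index = max_index;
  return s->bits != NULL;
}

static void
bit_set_mark (struct bit_set *s, size_t lo, size_t hi)
{
  for (size_t i = lo; i <= hi && i <= s->max_index; i++)
    s->bits[i / 8] |= (unsigned char) (1u << (i % 8));
}

static bool
bit_set_selected (const struct bit_set *s, size_t k)
{
  return k <= s->max_index && ((s->bits[k / 8] >> (k % 8)) & 1) != 0;
}

/* Approach B: in-range test, memory proportional to the number of ranges.
   Reset `cur' to 0 at the start of every input line.  */
struct range_cursor { const struct range_pair *rp; size_t n; size_t cur; };

static bool
range_selected (struct range_cursor *c, size_t k)
{
  /* Skip ranges that end before k; k never decreases within a line.  */
  while (c->cur < c->n && c->rp[c->cur].hi < k)
    c->cur++;
  return c->cur < c->n && c->rp[c->cur].lo <= k;
}
```

Both tests answer the same question for each byte or field index. The bit
test is a single lookup but needs storage up to the largest index, while the
in-range test needs only the range list plus a cursor.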


> If you construct test cases and gather timing data, please do so
> in a reproducible manner and include details when you report back,
> so we can compare on different types of systems.

Here are my benchmarks:

OS:       Parabola GNU/linux-libre (linux-libre v3.6.8-1)
Compiler: GCC 4.7.2
Cflags:   -O2
LANG:     C
CPU:      Intel Core Duo  (1.86 GHz) (L1 Cache 64KiB) (L2 Cache 2MiB)
Main memory:
 - Bank 0: DIMM DRAM Synchronous (1GiB) (width 64 bits)
 - Bank 1: DIMM DRAM Synchronous (1GiB) (width 64 bits)

NOTE: information gathered with `lshw'.


Summary (see the attached file for complete data):

### small ranges
cut-pre: 0:01.84
cut-post: 0:01.36
cut-split: 0:01.25

### bigger ranges
cut-pre: 0:11.74
cut-post: 0:09.20
cut-split: 0:07.91 ***

### fields
cut-pre: 0:02.90
cut-post: 0:02.68
cut-split: 0:02.85

### --output-delimiter
cut-pre: 0:02.90
cut-post: 0:02.74
cut-split: 0:02.80


NOTES:
 cut-pre is the current implementation and was compiled from commit ec48beadf.
 cut-post was compiled after applying the above patch to commit ec48beadf.
 cut-split was compiled after applying the `split-print_kth' patch to commit ec48beadf.
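
For comparing these figures with other systems, the relative saving is
easier to read than the raw times. A small sketch of that arithmetic
follows, reading each entry such as 0:01.84 as 1.84 seconds of elapsed time
(an assumption about the format). The struct and function names are
hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Elapsed seconds copied from the summary; 0:01.84 is read as 1.84 s.  */
struct timing { const char *label; double pre, post, split; };

static const struct timing timings[] = {
  { "small ranges",        1.84, 1.36, 1.25 },
  { "bigger ranges",      11.74, 9.20, 7.91 },
  { "fields",              2.90, 2.68, 2.85 },
  { "--output-delimiter",  2.90, 2.74, 2.80 },
};

/* Percentage of the `before' time that `after' saves.  */
static double
percent_saved (double before, double after)
{
  return 100.0 * (before - after) / before;
}

/* Print one line per benchmark with the saving relative to cut-pre.  */
static void
print_savings (void)
{
  for (size_t i = 0; i < sizeof timings / sizeof timings[0]; i++)
    printf ("%-20s post: %5.1f%%  split: %5.1f%%\n", timings[i].label,
            percent_saved (timings[i].pre, timings[i].post),
            percent_saved (timings[i].pre, timings[i].split));
}
```

For the "bigger ranges" case this works out to a saving of roughly 22% for
cut-post and roughly 33% for cut-split relative to cut-pre.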


The main advantage comes from splitting `print_kth' into two
separate functions, so that `print_kth' now does fewer checks.
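
As an illustration of why such a split can help, here is a minimal sketch.
This message does not say how the function was divided, so the sketch
assumes that the combined version also reported whether an index starts a
range. All names and the fixed range table are hypothetical, and the
attached `split-print_kth' patch may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical selection, equivalent to a list such as 2-4,7-9.  */
struct range_pair { size_t lo; size_t hi; };
static const struct range_pair ranges[] = { {2, 4}, {7, 9} };
static const size_t n_ranges = sizeof ranges / sizeof ranges[0];

/* Combined form: every call also handles the optional range-start
   output, even when the caller only wants a yes/no answer.  */
static bool
print_kth_combined (size_t k, bool *range_start)
{
  if (range_start)
    *range_start = false;
  for (size_t i = 0; i < n_ranges; i++)
    if (ranges[i].lo <= k && k <= ranges[i].hi)
      {
        if (range_start)
          *range_start = (k == ranges[i].lo);
        return true;
      }
  return false;
}

/* Split form, part 1: only the membership test, used on the hot path.  */
static bool
print_kth_only (size_t k)
{
  for (size_t i = 0; i < n_ranges; i++)
    if (ranges[i].lo <= k && k <= ranges[i].hi)
      return true;
  return false;
}

/* Split form, part 2: called only by code that needs to know whether
   k begins a range, for example when emitting an output delimiter.  */
static bool
is_range_start (size_t k)
{
  for (size_t i = 0; i < n_ranges; i++)
    if (ranges[i].lo == k)
      return true;
  return false;
}
```

Callers that never need the range-start information use only the first
function and skip the extra branches on each byte or field.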


Best regards,
Cojocaru Alexandru
[full-data.txt (text/plain, attachment)]
[split-print_kth (application/octet-stream, attachment)]
