GNU bug report logs -
#10953
Potential logical bug in readtokens.c
Message #14 received at 10953 <at> debbugs.gnu.org (full text, mbox):
On 03/06/2012 03:32 PM, Eric Blake wrote:
> Why not just strchr instead of building up an isdelim bitmap?
strchr would not be right, since '\0' is valid in data and
as a delimiter.
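To illustrate the point about '\0', here is a minimal sketch comparing a string-based membership test with a length-based one. The helper names `is_delim_strchr` and `is_delim_memchr` are hypothetical and do not appear in readtokens.c; only `strchr` and `memchr` are real library calls.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: treats DELIM as a NUL-terminated string.
   This goes wrong in two ways: strchr always matches the
   terminating NUL, and it never looks past an embedded NUL.  */
int
is_delim_strchr (char const *delim, int c)
{
  return strchr (delim, c) != NULL;
}

/* Hypothetical helper: treats DELIM as N_DELIM raw bytes, so a
   NUL byte is handled like any other byte.  */
int
is_delim_memchr (char const *delim, size_t n_delim, int c)
{
  return memchr (delim, c, n_delim) != NULL;
}
```

With the delimiter set " \t\n" (length 3), the strchr version wrongly reports '\0' as a delimiter because it matches the string terminator. With the two-byte set { '\0', ' ' }, it fails to find ' ' because the search stops at the leading NUL. The memchr version gives the expected answer in both cases.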
No doubt you meant 'memchr'; but using 'memchr' would slow
down readtoken by about a factor of two. I got this result by
timing the following benchmark on gcc-4.6.1.tar (uncompressed)
on Fedora 15 x86-64 with GCC 4.6.2:
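#include <stdio.h>
#include <readtokens.h>

/* Static storage, so the token buffer starts out zeroed.  */
struct tokenbuffer t;

int main (void)
{
  /* Read whitespace-delimited tokens from stdin until end of input.  */
  for (;;)
    {
      size_t s = readtoken (stdin, " \t\n", 3, &t);
      if (s == (size_t) -1)
        return 0;
    }
}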
On this benchmark, the relative speeds (user+sys CPU time ratios,
bigger numbers are better) are:
  0.54  readtoken with memchr
  1.00  current readtoken (with non-thread-safe byte array)
  1.13  proposed readtoken (with thread-safe bitset)
So the proposed patch is a performance win even in non-thread-safe use.
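For readers unfamiliar with the technique, the following is a rough sketch of a caller-owned delimiter bitset. It is not the text of the proposed patch: the names `delim_set`, `delim_set_init` and `delim_set_has` are made up for illustration, and the word type is an assumption. The set needs one bit per possible byte value (32 bytes when `unsigned char` is 8 bits wide) and can live on the caller's stack, so no state is shared between threads.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Assumed layout: one bit per possible unsigned char value.  */
enum { BITS_PER_WORD = sizeof (unsigned long) * CHAR_BIT };
enum { N_WORDS = (UCHAR_MAX + BITS_PER_WORD) / BITS_PER_WORD };

typedef struct { unsigned long w[N_WORDS]; } delim_set;

/* Build the set from N_DELIM raw bytes; NUL bytes are allowed.  */
void
delim_set_init (delim_set *s, char const *delim, size_t n_delim)
{
  memset (s, 0, sizeof *s);
  for (size_t i = 0; i < n_delim; i++)
    {
      unsigned char c = delim[i];
      s->w[c / BITS_PER_WORD] |= 1UL << (c % BITS_PER_WORD);
    }
}

/* Return 1 if byte C is in the set, else 0.  The caller must
   check for EOF before calling, since C is reduced to a byte.  */
int
delim_set_has (delim_set const *s, int c)
{
  unsigned char uc = c;
  return (s->w[uc / BITS_PER_WORD] >> (uc % BITS_PER_WORD)) & 1;
}
```

A reader loop would call `delim_set_init` once per call and then `delim_set_has` once per input byte, which replaces a linear `memchr` scan with a shift and a mask.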
> And why
> are we calling getc() one character at a time, instead of using tricks
> like freadahead() to operate on a larger buffer?
>
> Also, is readtoken() intended to be a more powerful interface than
> strtok, in which case we _do_ want to be non-threadsafe, and to have a
> readtoken_r interface that is the underlying threadsafe variant that can
> benefit from caching?
I haven't thought about these issues, but surely they are
independent of the proposed patch.
This bug report was last modified 13 years and 157 days ago.