On Wed, Mar 22, 2017 at 2:58 PM, John P. Linderman wrote:
> I used to use LC_ALL=C, but, as I vaguely recall, it got in the way of
> dealing with UNICODE. I tried a couple LC values aimed at UNICODE and the
> US, but something always went pear-shaped. I finally give up. I am perfectly
> happy to suffer a tiny bit of performance, to have most things work without
> thinking. A factor of 6, or 35, is not tiny, since I use grep and friends
> intensely. That's how I discovered the performance problem to begin with.
> Anyway, thank you for fixing my problem. I suspect that many of us pioneers
> (using UNIX since 1973) have '[0-9]' wired into our fingers.
>
> On Wed, Mar 22, 2017 at 2:01 PM, Paul Eggert wrote:
>>
>> On 03/22/2017 05:44 AM, John P. Linderman wrote:
>>>
>>> That puts the runtimes on equal footing:
>>>
>> In my measurements, P[0-9] is still a tiny bit slower if one is using
>> glibc regex, due to a performance problem in glibc. You can work around it
>> by configuring --with-included-regex. It's probably not worth worrying
>> about, though.
>>
>> By the way, using LC_ALL=C should help avoid performance problems like
>> these in the future, if all you're doing is something where single-byte
>> pattern matching suffices.

I've just pulled that gnulib change into grep's repository with the
attached, along with a NEWS update: