GNU bug report logs - #50093
djb2 correction

Previous Next

Package: grep;

Reported by: Jim Meyering <jim <at> meyering.net>

Date: Tue, 17 Aug 2021 10:33:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#50093: closed (djb2 correction)
Date: Wed, 18 Aug 2021 14:49:02 +0000
[Message part 1 (text/plain, inline)]
Your message dated Wed, 18 Aug 2021 07:48:09 -0700
with message-id <91534c5b-8bef-0a7d-b510-178b4441fcba <at> cs.ucla.edu>
and subject line Re: bug#50093: djb2 correction
has caused the debbugs.gnu.org bug report #50093,
regarding djb2 correction
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
50093: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=50093
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Jim Meyering <jim <at> meyering.net>
To: bug-grep <at> gnu.org, Alex Murray <alex.murray <at> canonical.com>
Subject: djb2 correction
Date: Tue, 17 Aug 2021 12:32:39 +0200
Alex Murray noticed that my djb2 implementation mistakenly initialized
to 0, rather than to 5381. Corrected with this:

>From 54590ca833dba62041af045e7bc7c09b90b54b71 Mon Sep 17 00:00:00 2001
From: Alex Murray <alex.murray <at> canonical.com>
Date: Tue, 17 Aug 2021 03:24:37 -0700
Subject: [PATCH] grep: correct DJB2 initialization

* src/grep.c (hash_pattern): DJB2 starts with 5381, not 0.
---
 src/grep.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/grep.c b/src/grep.c
index 271b6b9..8fed550 100644
--- a/src/grep.c
+++ b/src/grep.c
@@ -126,7 +126,7 @@ static Hash_table *pattern_table;
 static size_t _GL_ATTRIBUTE_PURE
 hash_pattern (void const *pat, size_t n_buckets)
 {
-  size_t h = 0;
+  size_t h = 5381;
   intptr_t pat_offset = (intptr_t) pat - 1;
   unsigned char const *s = (unsigned char const *) pattern_array + pat_offset;
   for ( ; *s != '\n'; s++)
--


[Message part 3 (message/rfc822, inline)]
From: Paul Eggert <eggert <at> cs.ucla.edu>
To: alex.murray <at> canonical.com
Cc: Jim Meyering <jim <at> meyering.net>, 50093-done <at> debbugs.gnu.org
Subject: Re: bug#50093: djb2 correction
Date: Wed, 18 Aug 2021 07:48:09 -0700
On 8/17/21 7:50 PM, Alex Murray wrote:
> How about letting the compiler choose which prime to use? - something like the following:
> 
> #define H15_PRIME (unsigned long long)5381
> #define H32_PRIME (unsigned long long)3657500101
> #define H64_PRIME (unsigned long long)4123221751654370051
> size_t h = H64_PRIME <= SIZE_MAX ? H64_PRIME : H32_PRIME <= SIZE_MAX ? H32_PRIME : H15_PRIME;

That's equivalent to the patch I suggested; on my platform it even 
generates the same machine code. It's better to avoid macros when that's 
easy, as is the case here.

I installed the patch, except with uint_fast64_t instead of unsigned 
long long int as that's a better type for the 64-bit constants in 
question. (In case you're wondering what that "_fast" is doing there, 
POSIX and the C standard guarantee the existence of uint_fast64_t but 
not of uint64_t, as a concession to Unisys mainframes where long long is 
72 bits.)


This bug report was last modified 3 years and 333 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.