GNU bug report logs - #50093
djb2 correction

Previous Next

Package: grep;

Reported by: Jim Meyering <jim <at> meyering.net>

Date: Tue, 17 Aug 2021 10:33:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Jim Meyering <jim <at> meyering.net>
Subject: bug#50093: closed (Re: bug#50093: djb2 correction)
Date: Wed, 18 Aug 2021 14:49:02 +0000
[Message part 1 (text/plain, inline)]
Your bug report

#50093: djb2 correction

which was filed against the grep package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 50093 <at> debbugs.gnu.org.

-- 
50093: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=50093
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Paul Eggert <eggert <at> cs.ucla.edu>
To: alex.murray <at> canonical.com
Cc: Jim Meyering <jim <at> meyering.net>, 50093-done <at> debbugs.gnu.org
Subject: Re: bug#50093: djb2 correction
Date: Wed, 18 Aug 2021 07:48:09 -0700
On 8/17/21 7:50 PM, Alex Murray wrote:
> How about letting the compiler choose which prime to use? - something like the following:
> 
> #define H15_PRIME (unsigned long long)5381
> #define H32_PRIME (unsigned long long)3657500101
> #define H64_PRIME (unsigned long long)4123221751654370051
> size_t h = H64_PRIME <= SIZE_MAX ? H64_PRIME : H32_PRIME <= SIZE_MAX ? H32_PRIME : H15_PRIME;

That's equivalent to the patch I suggested; on my platform it even 
generates the same machine code. It's better to avoid macros when that's 
easy, as is the case here.

I installed the patch, except with uint_fast64_t instead of unsigned 
long long int as that's a better type for the 64-bit constants in 
question. (In case you're wondering what that "_fast" is doing there, 
POSIX and the C standard guarantee the existence of uint_fast64_t but 
not of uint64_t, as a concession to Unisys mainframes where long long is 
72 bits.)

[Message part 3 (message/rfc822, inline)]
From: Jim Meyering <jim <at> meyering.net>
To: bug-grep <at> gnu.org, Alex Murray <alex.murray <at> canonical.com>
Subject: djb2 correction
Date: Tue, 17 Aug 2021 12:32:39 +0200
Alex Murray noticed that my djb2 implementation mistakenly initialized
to 0, rather than to 5381. Corrected with this:

>From 54590ca833dba62041af045e7bc7c09b90b54b71 Mon Sep 17 00:00:00 2001
From: Alex Murray <alex.murray <at> canonical.com>
Date: Tue, 17 Aug 2021 03:24:37 -0700
Subject: [PATCH] grep: correct DJB2 initialization

* src/grep.c (hash_pattern): DJB2 starts with 5381, not 0.
---
 src/grep.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/grep.c b/src/grep.c
index 271b6b9..8fed550 100644
--- a/src/grep.c
+++ b/src/grep.c
@@ -126,7 +126,7 @@ static Hash_table *pattern_table;
 static size_t _GL_ATTRIBUTE_PURE
 hash_pattern (void const *pat, size_t n_buckets)
 {
-  size_t h = 0;
+  size_t h = 5381;
   intptr_t pat_offset = (intptr_t) pat - 1;
   unsigned char const *s = (unsigned char const *) pattern_array + pat_offset;
   for ( ; *s != '\n'; s++)
--



This bug report was last modified 3 years and 332 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.