GNU bug report logs - #22689
25.1.50; implement logcount

Previous Next

Package: emacs;

Reported by: Mark Oteiza <mvoteiza <at> udel.edu>

Date: Mon, 15 Feb 2016 23:20:01 UTC

Severity: wishlist

Found in version 25.1.50

Done: Mark Oteiza <mvoteiza <at> udel.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 22689 in the body.
You can then email your comments to 22689 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#22689; Package emacs. (Mon, 15 Feb 2016 23:20:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Mark Oteiza <mvoteiza <at> udel.edu>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Mon, 15 Feb 2016 23:20:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Mark Oteiza <mvoteiza <at> udel.edu>
To: bug-gnu-emacs <at> gnu.org
Subject: 25.1.50; implement logcount
Date: Mon, 15 Feb 2016 18:18:56 -0500
Wishlist:

Logcount, AKA popcount, Hamming weight, sideways sum.  Basically,
counting the number of 1s in a number's binary representation. The
concept is similar to that in `bool-vector-count-population', except the
argument would be a number, presumably a non-negative integer.

There are a number of ways to go about it, and beyond that I'm not sure
how to write it into data.c:

- Some compilers have a __builtin_popcount (gcc since 2004 and llvm
  since 2005)
- gmp has a popcount function
- lots of algorithms implementing it

Many are given here:

https://en.wikipedia.org/wiki/Hamming_weight

another resource:

http://rosettacode.org/wiki/Population_count

Guile's implementation uses table lookup:

http://git.savannah.gnu.org/cgit/guile.git/tree/libguile/numbers.c#n5234

When I needed it, I just naïvely ported an algorithm to elisp (in
particular ignoring the case of negative numbers):

  (let ((m1 #x555555555555555)
        (m2 #x333333333333333)
        (m4 #x0f0f0f0f0f0f0f0f)
        (h01 #x0101010101010101))
    (setq x (- x (logand (lsh x -1) m1)))
    (setq x (+ (logand x m2) (logand (lsh x -2) m2)))
    (setq x (logand (+ x (lsh x -4)) m4))
    (lsh (* x h01) -56))

The table lookup isn't so different

  (defvar logtab [0 1 1 2 1 2 2 3 1 2 2 3 2 3 3 4])
  (let ((count 0))
    (while (not (zerop x))
      (setq count (+ count (aref logtab (logand 15 x))))
      (setq x (lsh x -4)))
    count)

I couldn't find any implementation of this in calc either, but such a
function/operation would fit in calc-bin.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22689; Package emacs. (Fri, 29 Sep 2017 17:42:02 GMT) Full text and rfc822 format available.

Message #8 received at 22689 <at> debbugs.gnu.org (full text, mbox):

From: Mark Oteiza <mvoteiza <at> udel.edu>
To: 22689 <at> debbugs.gnu.org
Subject: [PATCH] Add logcount
Date: Fri, 29 Sep 2017 13:41:34 -0400
Hi,

I made an attempt implementing this:


diff --git a/src/data.c b/src/data.c
index e070be6c20..1332173dcd 100644
--- a/src/data.c
+++ b/src/data.c
@@ -3069,6 +3069,57 @@ usage: (logxor &rest INTS-OR-MARKERS)  */)
   return arith_driver (Alogxor, nargs, args);
 }
 
+#if defined (__POPCNT__) && defined (__GNUC__) && (__GNUC__> 4 || (__GNUC__== 4 && __GNUC_MINOR__> 1))
+#define HAVE_BUILTIN_POPCOUNTLL
+#endif
+
+static uint32_t
+bitcount32 (uint32_t b)
+{
+  b -= (b >> 1) & 0x55555555;
+  b = (b & 0x33333333) + ((b >> 2) & 0x33333333);
+  b = (b + (b >> 4)) & 0x0f0f0f0f;
+  return (b * 0x01010101) >> 24;
+}  
+
+static uint64_t
+bitcount64 (uint64_t b)
+{
+  b -= (b >> 1) & 0x5555555555555555;
+  b = (b & 0x3333333333333333) + ((b >> 2) & 0x3333333333333333);
+  b = (b + (b >> 4)) & 0x0f0f0f0f0f0f0f0f;
+  return (b * 0x0101010101010101) >> 56;
+}
+
+DEFUN ("logcount", Flogcount, Slogcount, 1, 1, 0,
+       doc: /* Return population count of VALUE. */)
+  (register Lisp_Object value)
+{
+  Lisp_Object res;
+
+  CHECK_NUMBER (value);
+
+#ifdef HAVE_BUILTIN_POPCOUNTLL
+  XSETINT (res, __builtin_popcount (XUINT (value)));
+#else  /* HAVE_BUILTIN_POPCOUNTLL */
+  if (XINT (value) <= UINT_MAX)
+    XSETINT (res, bitcount32 (XUINT (value)));
+  else if (XINT (value) <= ULONG_MAX)
+    XSETINT (res, bitcount64 (XUINT (value)));
+  else
+    {
+      int count;
+      EMACS_UINT v = XUINT (value);
+      for (count = 0; v; count++)
+        {
+          v &= v - 1;
+        }
+      XSETINT (res, v);
+    }
+#endif /* HAVE_BUILTIN_POPCOUNTLL */
+  return res;
+}
+
 static Lisp_Object
 ash_lsh_impl (Lisp_Object value, Lisp_Object count, bool lsh)
 {
@@ -3856,6 +3907,7 @@ syms_of_data (void)
   defsubr (&Slogand);
   defsubr (&Slogior);
   defsubr (&Slogxor);
+  defsubr (&Slogcount);
   defsubr (&Slsh);
   defsubr (&Sash);
   defsubr (&Sadd1);




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22689; Package emacs. (Sat, 30 Sep 2017 11:44:01 GMT) Full text and rfc822 format available.

Message #11 received at 22689 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mark Oteiza <mvoteiza <at> udel.edu>
Cc: 22689 <at> debbugs.gnu.org
Subject: Re: bug#22689: [PATCH] Add logcount
Date: Sat, 30 Sep 2017 14:42:51 +0300
> Date: Fri, 29 Sep 2017 13:41:34 -0400
> From: Mark Oteiza <mvoteiza <at> udel.edu>
> 
> I made an attempt implementing this:

Thanks.  I'm not an expert on these ops, but otherwise the code looks
OK to me, modulo the minor comments below.

This will eventually need a NEWS entry and some docs in the ELisp
manual.

> +#if defined (__POPCNT__) && defined (__GNUC__) && (__GNUC__> 4 || (__GNUC__== 4 && __GNUC_MINOR__> 1))
> +#define HAVE_BUILTIN_POPCOUNTLL
> +#endif

Where does __POPCNT__ definition come from?  I don't see it in my
GCC's "gcc -dM" output.

As for the rest of the condition, I think you should use GNUC_PREREQ,
like this:

 #if GNUC_PREREQ (4, 1, 0)


> +static uint64_t
> +bitcount64 (uint64_t b)
> +{
> +  b -= (b >> 1) & 0x5555555555555555;
> +  b = (b & 0x3333333333333333) + ((b >> 2) & 0x3333333333333333);
> +  b = (b + (b >> 4)) & 0x0f0f0f0f0f0f0f0f;
> +  return (b * 0x0101010101010101) >> 56;

Shouldn't these constants have a ULL suffix, to make sure they are not
truncated to a 32-bit int?

> +#else  /* HAVE_BUILTIN_POPCOUNTLL */
> +  if (XINT (value) <= UINT_MAX)
> +    XSETINT (res, bitcount32 (XUINT (value)));
> +  else if (XINT (value) <= ULONG_MAX)
> +    XSETINT (res, bitcount64 (XUINT (value)));

The comparisons against Uxxx_MAX seem to assume that VALUE is
unsigned, but that's not guaranteed, right?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22689; Package emacs. (Sat, 30 Sep 2017 13:17:01 GMT) Full text and rfc822 format available.

Message #14 received at 22689 <at> debbugs.gnu.org (full text, mbox):

From: Mark Oteiza <mvoteiza <at> udel.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 22689 <at> debbugs.gnu.org
Subject: Re: bug#22689: [PATCH] Add logcount
Date: Sat, 30 Sep 2017 09:16:36 -0400
On 30/09/17 at 02:42pm, Eli Zaretskii wrote:
> > +#if defined (__POPCNT__) && defined (__GNUC__) && (__GNUC__> 4 || (__GNUC__== 4 && __GNUC_MINOR__> 1))
> > +#define HAVE_BUILTIN_POPCOUNTLL
> > +#endif
> 
> Where does __POPCNT__ definition come from?  I don't see it in my
> GCC's "gcc -dM" output.
> 
> As for the rest of the condition, I think you should use GNUC_PREREQ,
> like this:
> 
>  #if GNUC_PREREQ (4, 1, 0)

I guess it comes from nowhere, that condition and the two helper
functions come from here
https://rosettacode.org/wiki/Population_count#C

> > +static uint64_t
> > +bitcount64 (uint64_t b)
> > +{
> > +  b -= (b >> 1) & 0x5555555555555555;
> > +  b = (b & 0x3333333333333333) + ((b >> 2) & 0x3333333333333333);
> > +  b = (b + (b >> 4)) & 0x0f0f0f0f0f0f0f0f;
> > +  return (b * 0x0101010101010101) >> 56;
> 
> Shouldn't these constants have a ULL suffix, to make sure they are not
> truncated to a 32-bit int?

Probably, I don't know--I'm out of my comfort zone here.

> > +#else  /* HAVE_BUILTIN_POPCOUNTLL */
> > +  if (XINT (value) <= UINT_MAX)
> > +    XSETINT (res, bitcount32 (XUINT (value)));
> > +  else if (XINT (value) <= ULONG_MAX)
> > +    XSETINT (res, bitcount64 (XUINT (value)));
> 
> The comparisons against Uxxx_MAX seem to assume that VALUE is
> unsigned, but that's not guaranteed, right?

True.  Should I instead be doing

  XINT (value) <= xxx_MAX &&
  XINT (value) >= xxx_MIN

or might there be a better check?  For negative values popcount
typically counts ones of the two's complement




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22689; Package emacs. (Sat, 30 Sep 2017 14:01:01 GMT) Full text and rfc822 format available.

Message #17 received at 22689 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mark Oteiza <mvoteiza <at> udel.edu>
Cc: 22689 <at> debbugs.gnu.org
Subject: Re: bug#22689: [PATCH] Add logcount
Date: Sat, 30 Sep 2017 16:59:47 +0300
> Date: Sat, 30 Sep 2017 09:16:36 -0400
> From: Mark Oteiza <mvoteiza <at> udel.edu>
> Cc: 22689 <at> debbugs.gnu.org
> 
> On 30/09/17 at 02:42pm, Eli Zaretskii wrote:
> > > +#if defined (__POPCNT__) && defined (__GNUC__) && (__GNUC__> 4 || (__GNUC__== 4 && __GNUC_MINOR__> 1))
> > > +#define HAVE_BUILTIN_POPCOUNTLL
> > > +#endif
> > 
> > Where does __POPCNT__ definition come from?  I don't see it in my
> > GCC's "gcc -dM" output.
> > 
> > As for the rest of the condition, I think you should use GNUC_PREREQ,
> > like this:
> > 
> >  #if GNUC_PREREQ (4, 1, 0)
> 
> I guess it comes from nowhere, that condition and the two helper
> functions come from here
> https://rosettacode.org/wiki/Population_count#C

I only see that symbol in the GCC sources, which I don't think are
relevant here.

If you drop the __POPCNT__ part, does the code still work for you?

> > > +static uint64_t
> > > +bitcount64 (uint64_t b)
> > > +{
> > > +  b -= (b >> 1) & 0x5555555555555555;
> > > +  b = (b & 0x3333333333333333) + ((b >> 2) & 0x3333333333333333);
> > > +  b = (b + (b >> 4)) & 0x0f0f0f0f0f0f0f0f;
> > > +  return (b * 0x0101010101010101) >> 56;
> > 
> > Shouldn't these constants have a ULL suffix, to make sure they are not
> > truncated to a 32-bit int?
> 
> Probably, I don't know--I'm out of my comfort zone here.

I'd use the suffixes.

> > > +#else  /* HAVE_BUILTIN_POPCOUNTLL */
> > > +  if (XINT (value) <= UINT_MAX)
> > > +    XSETINT (res, bitcount32 (XUINT (value)));
> > > +  else if (XINT (value) <= ULONG_MAX)
> > > +    XSETINT (res, bitcount64 (XUINT (value)));
> > 
> > The comparisons against Uxxx_MAX seem to assume that VALUE is
> > unsigned, but that's not guaranteed, right?
> 
> True.  Should I instead be doing
> 
>   XINT (value) <= xxx_MAX &&
>   XINT (value) >= xxx_MIN
> 
> or might there be a better check?  For negative values popcount
> typically counts ones of the two's complement

I'd just assign to an unsigned temporary, IIUC the semantics.

Btw, on Windows, a long is a 32-bit type, so I think we need
to check against ULONG_LONG_MAX as well.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22689; Package emacs. (Sat, 30 Sep 2017 14:56:01 GMT) Full text and rfc822 format available.

Message #20 received at 22689 <at> debbugs.gnu.org (full text, mbox):

From: Mark Oteiza <mvoteiza <at> udel.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 22689 <at> debbugs.gnu.org
Subject: Re: bug#22689: [PATCH] Add logcount
Date: Sat, 30 Sep 2017 10:55:25 -0400
On 30/09/17 at 04:59pm, Eli Zaretskii wrote:
> I only see that symbol in the GCC sources, which I don't think are
> relevant here.
> 
> If you drop the __POPCNT__ part, does the code still work for you?

Yes.

> > > > +#else  /* HAVE_BUILTIN_POPCOUNTLL */
> > > > +  if (XINT (value) <= UINT_MAX)
> > > > +    XSETINT (res, bitcount32 (XUINT (value)));
> > > > +  else if (XINT (value) <= ULONG_MAX)
> > > > +    XSETINT (res, bitcount64 (XUINT (value)));
> > > 
> > > The comparisons against Uxxx_MAX seem to assume that VALUE is
> > > unsigned, but that's not guaranteed, right?
> > 
> > True.  Should I instead be doing
> > 
> >   XINT (value) <= xxx_MAX &&
> >   XINT (value) >= xxx_MIN
> > 
> > or might there be a better check?  For negative values popcount
> > typically counts ones of the two's complement
> 
> I'd just assign to an unsigned temporary, IIUC the semantics.
> 
> Btw, on Windows, a long is a 32-bit type, so I think we need
> to check against ULONG_LONG_MAX as well.

I tried implementing your comments (and added a test).  I don't know
what to do in the #else for ULONG_LONG_MAX.

diff --git a/src/data.c b/src/data.c
index e070be6c20..8b3866151d 100644
--- a/src/data.c
+++ b/src/data.c
@@ -3069,6 +3069,66 @@ usage: (logxor &rest INTS-OR-MARKERS)  */)
   return arith_driver (Alogxor, nargs, args);
 }
 
+#if GNUC_PREREQ (4, 1, 0)
+#define HAVE_BUILTIN_POPCOUNTLL
+#endif
+
+#ifndef HAVE_BUILTIN_POPCOUNTLL
+static uint32_t
+logcount32 (uint32_t b)
+{
+  b -= (b >> 1) & 0x55555555;
+  b = (b & 0x33333333) + ((b >> 2) & 0x33333333);
+  b = (b + (b >> 4)) & 0x0f0f0f0f;
+  return (b * 0x01010101) >> 24;
+}  
+
+static uint64_t
+logcount64 (uint64_t b)
+{
+  b -= (b >> 1) & 0x5555555555555555ULL;
+  b = (b & 0x3333333333333333ULL) + ((b >> 2) & 0x3333333333333333ULL);
+  b = (b + (b >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
+  return (b * 0x0101010101010101ULL) >> 56;
+}
+#endif /* HAVE_BUILTIN_POPCOUNTLL */
+
+DEFUN ("logcount", Flogcount, Slogcount, 1, 1, 0,
+       doc: /* Return population count of VALUE.
+If VALUE is negative, the count is of its two's complement representation.  */)
+  (register Lisp_Object value)
+{
+  Lisp_Object res;
+  EMACS_UINT v;
+
+  CHECK_NUMBER (value);
+
+  v = XUINT (value);
+#ifdef HAVE_BUILTIN_POPCOUNTLL
+  if (v <= UINT_MAX)
+    XSETINT (res, __builtin_popcount (v));
+  else if (v <= ULONG_MAX)
+    XSETINT (res, __builtin_popcountl (v));
+  else if (v <= ULONG_LONG_MAX)
+    XSETINT (res, __builtin_popcountll (v));
+#else  /* HAVE_BUILTIN_POPCOUNTLL */
+  if (v <= UINT_MAX)
+    XSETINT (res, logcount32 (v));
+  else if (v <= ULONG_MAX)
+    XSETINT (res, logcount64 (v));
+#endif /* HAVE_BUILTIN_POPCOUNTLL */
+  else
+    {
+      unsigned int count;
+      for (count = 0; v; count++)
+        {
+          v &= v - 1;
+        }
+      XSETINT (res, v);
+    }
+  return res;
+}
+
 static Lisp_Object
 ash_lsh_impl (Lisp_Object value, Lisp_Object count, bool lsh)
 {
@@ -3856,6 +3916,7 @@ syms_of_data (void)
   defsubr (&Slogand);
   defsubr (&Slogior);
   defsubr (&Slogxor);
+  defsubr (&Slogcount);
   defsubr (&Slsh);
   defsubr (&Sash);
   defsubr (&Sadd1);
diff --git a/test/src/data-tests.el b/test/src/data-tests.el
index 8de8c145d4..d1154cc5c4 100644
--- a/test/src/data-tests.el
+++ b/test/src/data-tests.el
@@ -107,6 +107,19 @@
   (should (isnan (min 1.0 0.0e+NaN)))
   (should (isnan (min 1.0 0.0e+NaN 1.1))))
 
+(defun data-tests-popcnt (byte)
+  "Calculate the Hamming weight of BYTE."
+  (setq byte (- byte (logand (lsh byte -1) #x55555555)))
+  (setq byte (+ (logand byte #x33333333) (logand (lsh byte -2) #x33333333)))
+  (lsh (* (logand (+ byte (lsh byte -4)) #x0f0f0f0f) #x01010101) -24))
+
+(ert-deftest data-tests-logcount ()
+  (should (cl-loop for n in (number-sequence 0 255)
+                   always (= (logcount n) (data-tests-popcnt n))))
+  ;; https://oeis.org/A000120
+  (should (= 11 (logcount 9727)))
+  (should (= 8 (logcount 9999))))
+
 ;; Bool vector tests.  Compactly represent bool vectors as hex
 ;; strings.
 




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22689; Package emacs. (Sat, 30 Sep 2017 15:39:02 GMT) Full text and rfc822 format available.

Message #23 received at 22689 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mark Oteiza <mvoteiza <at> udel.edu>
Cc: 22689 <at> debbugs.gnu.org
Subject: Re: bug#22689: [PATCH] Add logcount
Date: Sat, 30 Sep 2017 18:38:18 +0300
> Date: Sat, 30 Sep 2017 10:55:25 -0400
> From: Mark Oteiza <mvoteiza <at> udel.edu>
> Cc: 22689 <at> debbugs.gnu.org
> 
> > Btw, on Windows, a long is a 32-bit type, so I think we need
> > to check against ULONG_LONG_MAX as well.
> 
> I tried implementing your comments (and added a test).  I don't know
> what to do in the #else for ULONG_LONG_MAX.

Same as you do for ULONG_MAX, I think.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22689; Package emacs. (Sat, 30 Sep 2017 16:04:01 GMT) Full text and rfc822 format available.

Message #26 received at 22689 <at> debbugs.gnu.org (full text, mbox):

From: Mark Oteiza <mvoteiza <at> udel.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 22689 <at> debbugs.gnu.org
Subject: Re: bug#22689: [PATCH] Add logcount
Date: Sat, 30 Sep 2017 12:03:31 -0400
On 30/09/17 at 06:38pm, Eli Zaretskii wrote:
> > Date: Sat, 30 Sep 2017 10:55:25 -0400
> > From: Mark Oteiza <mvoteiza <at> udel.edu>
> > Cc: 22689 <at> debbugs.gnu.org
> > 
> > > Btw, on Windows, a long is a 32-bit type, so I think we need
> > > to check against ULONG_LONG_MAX as well.
> > 
> > I tried implementing your comments (and added a test).  I don't know
> > what to do in the #else for ULONG_LONG_MAX.
> 
> Same as you do for ULONG_MAX, I think.

so the following?

  else if (v <= ULONG_MAX || v <= ULONG_LONG_MAX)
      XSETINT (res, logcount64 (v));





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22689; Package emacs. (Sat, 30 Sep 2017 16:19:01 GMT) Full text and rfc822 format available.

Message #29 received at 22689 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mark Oteiza <mvoteiza <at> udel.edu>
Cc: 22689 <at> debbugs.gnu.org
Subject: Re: bug#22689: [PATCH] Add logcount
Date: Sat, 30 Sep 2017 19:17:50 +0300
> Date: Sat, 30 Sep 2017 12:03:31 -0400
> From: Mark Oteiza <mvoteiza <at> udel.edu>
> Cc: 22689 <at> debbugs.gnu.org
> 
> > > I tried implementing your comments (and added a test).  I don't know
> > > what to do in the #else for ULONG_LONG_MAX.
> > 
> > Same as you do for ULONG_MAX, I think.
> 
> so the following?
> 
>   else if (v <= ULONG_MAX || v <= ULONG_LONG_MAX)
>       XSETINT (res, logcount64 (v));

Yes.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22689; Package emacs. (Sat, 30 Sep 2017 16:51:02 GMT) Full text and rfc822 format available.

Message #32 received at 22689 <at> debbugs.gnu.org (full text, mbox):

From: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
To: Mark Oteiza <mvoteiza <at> udel.edu>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 22689 <at> debbugs.gnu.org
Subject: Re: bug#22689: [PATCH] Add logcount
Date: Sat, 30 Sep 2017 18:50:42 +0200
Hi Mark,

Just a drive-by note:

Mark Oteiza writes:
> +    {
> +      unsigned int count;
> +      for (count = 0; v; count++)
> +        {
> +          v &= v - 1;
> +        }
> +      XSETINT (res, v);
> +    }

Isn't this a fancy way of just saying "XSETINT (res, 0)"?  The variable
"count" is incremented but never used, so without it the loop
degenerates to

      while (v)
        {
          v &= v - 1;
        }

I other words, this just loops until "v == 0".

Maybe that assignment should be "XSETINT (res, count)"?  That would
actually use the variable "count".

benny




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22689; Package emacs. (Sat, 30 Sep 2017 17:00:02 GMT) Full text and rfc822 format available.

Message #35 received at 22689 <at> debbugs.gnu.org (full text, mbox):

From: Mark Oteiza <mvoteiza <at> udel.edu>
To: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 22689 <at> debbugs.gnu.org
Subject: Re: bug#22689: [PATCH] Add logcount
Date: Sat, 30 Sep 2017 12:59:04 -0400
On 30/09/17 at 06:50pm, Benjamin Riefenstahl wrote:
> Hi Mark,
> 
> Just a drive-by note:
> 
> Mark Oteiza writes:
> > +    {
> > +      unsigned int count;
> > +      for (count = 0; v; count++)
> > +        {
> > +          v &= v - 1;
> > +        }
> > +      XSETINT (res, v);
> > +    }
> 
> Isn't this a fancy way of just saying "XSETINT (res, 0)"?  The variable
> "count" is incremented but never used, so without it the loop
> degenerates to
> 
>       while (v)
>         {
>           v &= v - 1;
>         }
> 
> I other words, this just loops until "v == 0".
> 
> Maybe that assignment should be "XSETINT (res, count)"?  That would
> actually use the variable "count".

I think you're right, thank you.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22689; Package emacs. (Sat, 30 Sep 2017 17:09:01 GMT) Full text and rfc822 format available.

Message #38 received at 22689 <at> debbugs.gnu.org (full text, mbox):

From: Mark Oteiza <mvoteiza <at> udel.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 22689 <at> debbugs.gnu.org
Subject: Re: bug#22689: [PATCH] Add logcount
Date: Sat, 30 Sep 2017 13:07:58 -0400
On 30/09/17 at 07:17pm, Eli Zaretskii wrote:
> > Date: Sat, 30 Sep 2017 12:03:31 -0400
> > From: Mark Oteiza <mvoteiza <at> udel.edu>
> > Cc: 22689 <at> debbugs.gnu.org
> > 
> > > > I tried implementing your comments (and added a test).  I don't know
> > > > what to do in the #else for ULONG_LONG_MAX.
> > > 
> > > Same as you do for ULONG_MAX, I think.
> > 
> > so the following?
> > 
> >   else if (v <= ULONG_MAX || v <= ULONG_LONG_MAX)
> >       XSETINT (res, logcount64 (v));
> 
> Yes.

Ok.  Patch including this, Benjamin's catch, and documentation.


 doc/lispref/numbers.texi | 10 ++++++++
 etc/NEWS                 |  3 +++
 src/data.c               | 61 ++++++++++++++++++++++++++++++++++++++++++++++++
 test/src/data-tests.el   | 13 +++++++++++
 4 files changed, 87 insertions(+)

diff --git a/doc/lispref/numbers.texi b/doc/lispref/numbers.texi
index 3fdc94169b..29c6429eda 100644
--- a/doc/lispref/numbers.texi
+++ b/doc/lispref/numbers.texi
@@ -1107,6 +1107,16 @@ Bitwise Operations
 @end example
 @end defun
 
+@defun logcount integer
+This function returns the population count of @var{integer}: the
+number of ones in the binary representation of @var{integer}.
+
+@example
+(logcount 42)     ; 42 = #b101010
+     @result{} 3
+@end example
+@end defun
+
 @node Math Functions
 @section Standard Mathematical Functions
 @cindex transcendental functions
diff --git a/etc/NEWS b/etc/NEWS
index 238c7b7ea4..8fbc354fc0 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -31,6 +31,9 @@ When you add a new item, use the appropriate mark if you are sure it applies,
 
 * Changes in Emacs 27.1
 
++++
+** New function 'logcount' calculates an integer's Hamming weight.
+
 
 * Editing Changes in Emacs 27.1
 
diff --git a/src/data.c b/src/data.c
index e070be6c20..f470d6ffaa 100644
--- a/src/data.c
+++ b/src/data.c
@@ -3069,6 +3069,66 @@ usage: (logxor &rest INTS-OR-MARKERS)  */)
   return arith_driver (Alogxor, nargs, args);
 }
 
+#if GNUC_PREREQ (4, 1, 0)
+#define HAVE_BUILTIN_POPCOUNTLL
+#endif
+
+#ifndef HAVE_BUILTIN_POPCOUNTLL
+static uint32_t
+logcount32 (uint32_t b)
+{
+  b -= (b >> 1) & 0x55555555;
+  b = (b & 0x33333333) + ((b >> 2) & 0x33333333);
+  b = (b + (b >> 4)) & 0x0f0f0f0f;
+  return (b * 0x01010101) >> 24;
+}  
+
+static uint64_t
+logcount64 (uint64_t b)
+{
+  b -= (b >> 1) & 0x5555555555555555ULL;
+  b = (b & 0x3333333333333333ULL) + ((b >> 2) & 0x3333333333333333ULL);
+  b = (b + (b >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
+  return (b * 0x0101010101010101ULL) >> 56;
+}
+#endif /* HAVE_BUILTIN_POPCOUNTLL */
+
+DEFUN ("logcount", Flogcount, Slogcount, 1, 1, 0,
+       doc: /* Return population count of VALUE.
+If VALUE is negative, the count is of its two's complement representation.  */)
+  (register Lisp_Object value)
+{
+  Lisp_Object res;
+  EMACS_UINT v;
+
+  CHECK_NUMBER (value);
+
+  v = XUINT (value);
+#ifdef HAVE_BUILTIN_POPCOUNTLL
+  if (v <= UINT_MAX)
+    XSETINT (res, __builtin_popcount (v));
+  else if (v <= ULONG_MAX)
+    XSETINT (res, __builtin_popcountl (v));
+  else if (v <= ULONG_LONG_MAX)
+    XSETINT (res, __builtin_popcountll (v));
+#else  /* HAVE_BUILTIN_POPCOUNTLL */
+  if (v <= UINT_MAX)
+    XSETINT (res, logcount32 (v));
+  else if (v <= ULONG_MAX || v <= ULONG_LONG_MAX)
+    XSETINT (res, logcount64 (v));
+#endif /* HAVE_BUILTIN_POPCOUNTLL */
+  else
+    {
+      unsigned int count;
+      for (count = 0; v; count++)
+        {
+          v &= v - 1;
+        }
+      XSETINT (res, count);
+    }
+  return res;
+}
+
 static Lisp_Object
 ash_lsh_impl (Lisp_Object value, Lisp_Object count, bool lsh)
 {
@@ -3856,6 +3916,7 @@ syms_of_data (void)
   defsubr (&Slogand);
   defsubr (&Slogior);
   defsubr (&Slogxor);
+  defsubr (&Slogcount);
   defsubr (&Slsh);
   defsubr (&Sash);
   defsubr (&Sadd1);
diff --git a/test/src/data-tests.el b/test/src/data-tests.el
index 8de8c145d4..d1154cc5c4 100644
--- a/test/src/data-tests.el
+++ b/test/src/data-tests.el
@@ -107,6 +107,19 @@
   (should (isnan (min 1.0 0.0e+NaN)))
   (should (isnan (min 1.0 0.0e+NaN 1.1))))
 
+(defun data-tests-popcnt (byte)
+  "Calculate the Hamming weight of BYTE."
+  (setq byte (- byte (logand (lsh byte -1) #x55555555)))
+  (setq byte (+ (logand byte #x33333333) (logand (lsh byte -2) #x33333333)))
+  (lsh (* (logand (+ byte (lsh byte -4)) #x0f0f0f0f) #x01010101) -24))
+
+(ert-deftest data-tests-logcount ()
+  (should (cl-loop for n in (number-sequence 0 255)
+                   always (= (logcount n) (data-tests-popcnt n))))
+  ;; https://oeis.org/A000120
+  (should (= 11 (logcount 9727)))
+  (should (= 8 (logcount 9999))))
+
 ;; Bool vector tests.  Compactly represent bool vectors as hex
 ;; strings.
 




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22689; Package emacs. (Sat, 30 Sep 2017 17:54:01 GMT) Full text and rfc822 format available.

Message #41 received at 22689 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mark Oteiza <mvoteiza <at> udel.edu>
Cc: 22689 <at> debbugs.gnu.org
Subject: Re: bug#22689: [PATCH] Add logcount
Date: Sat, 30 Sep 2017 20:53:26 +0300
> Date: Sat, 30 Sep 2017 13:07:58 -0400
> From: Mark Oteiza <mvoteiza <at> udel.edu>
> Cc: 22689 <at> debbugs.gnu.org
> 
> Ok.  Patch including this, Benjamin's catch, and documentation.

Thanks.  A minor comment on the docs:

> +@defun logcount integer
> +This function returns the population count of @var{integer}: the
> +number of ones in the binary representation of @var{integer}.

I would mention here the "Hamming weight" term (in @dfn) that is
referenced in NEWS, and I would also add a few index entries:

 @cindex popcount
 @cindex Hamming weight
 @cindex counting set bits




Reply sent to Mark Oteiza <mvoteiza <at> udel.edu>:
You have taken responsibility. (Sat, 30 Sep 2017 18:19:01 GMT) Full text and rfc822 format available.

Notification sent to Mark Oteiza <mvoteiza <at> udel.edu>:
bug acknowledged by developer. (Sat, 30 Sep 2017 18:19:02 GMT) Full text and rfc822 format available.

Message #46 received at 22689-done <at> debbugs.gnu.org (full text, mbox):

From: Mark Oteiza <mvoteiza <at> udel.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 22689-done <at> debbugs.gnu.org
Subject: Re: bug#22689: [PATCH] Add logcount
Date: Sat, 30 Sep 2017 14:18:07 -0400
On 30/09/17 at 08:53pm, Eli Zaretskii wrote:
> > Date: Sat, 30 Sep 2017 13:07:58 -0400
> > From: Mark Oteiza <mvoteiza <at> udel.edu>
> > Cc: 22689 <at> debbugs.gnu.org
> > 
> > Ok.  Patch including this, Benjamin's catch, and documentation.
> 
> Thanks.  A minor comment on the docs:
> 
> > +@defun logcount integer
> > +This function returns the population count of @var{integer}: the
> > +number of ones in the binary representation of @var{integer}.
> 
> I would mention here the "Hamming weight" term (in @dfn) that is
> referenced in NEWS, and I would also add a few index entries:
> 
>  @cindex popcount
>  @cindex Hamming weight
>  @cindex counting set bits

Done, pushed as 185f3334.  Thank you.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22689; Package emacs. (Sat, 30 Sep 2017 22:50:01 GMT) Full text and rfc822 format available.

Message #49 received at 22689 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Mark Oteiza <mvoteiza <at> udel.edu>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 22689 <at> debbugs.gnu.org
Subject: Re: bug#22689: [PATCH] Add logcount
Date: Sat, 30 Sep 2017 15:48:50 -0700
[Message part 1 (text/plain, inline)]
I looked at this patch after it landed in 'master'. Thanks for contributing it.

A couple of minor improvements. First, the implementation of logcount can use 
the count_one_bits functions already used by Emacs, rather than reinventing that 
wheel; these functions should be just as fast as __builtin_popcount etc. when 
using GCC. Second, logcount should count zero bits in negative numbers, for 
compatibility with Common Lisp. I installed the attached two patches to 
implement these improvements.
[0001-Simplify-logcount-implementation.patch (text/x-patch, attachment)]
[0002-Make-logcount-act-like-CL-on-negative-arg.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22689; Package emacs. (Sat, 30 Sep 2017 23:23:02 GMT) Full text and rfc822 format available.

Message #52 received at 22689 <at> debbugs.gnu.org (full text, mbox):

From: Mark Oteiza <mvoteiza <at> udel.edu>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 22689 <at> debbugs.gnu.org
Subject: Re: bug#22689: [PATCH] Add logcount
Date: Sat, 30 Sep 2017 19:22:33 -0400
On 30/09/17 at 03:48pm, Paul Eggert wrote:
> I looked at this patch after it landed in 'master'. Thanks for contributing it.
> 
> A couple of minor improvements. First, the implementation of logcount can
> use the count_one_bits functions already used by Emacs, rather than
> reinventing that wheel; these functions should be just as fast as
> __builtin_popcount etc. when using GCC. Second, logcount should count zero
> bits in negative numbers, for compatibility with Common Lisp. I installed
> the attached two patches to implement these improvements.

Thank you for the improvements--I was not aware of count-one-bits.h but
am pleasantly surprised.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 29 Oct 2017 11:24:06 GMT) Full text and rfc822 format available.

This bug report was last modified 7 years and 291 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.