GNU bug report logs - #62443
LLaMA.cpp


Package: guix-patches;

Reported by: Nicolas Graves <ngraves <at> ngraves.fr>

Date: Sat, 25 Mar 2023 15:06:02 UTC

Severity: normal

Tags: patch

Done: Nicolas Goaziou <mail <at> nicolasgoaziou.fr>

Bug is archived. No further changes may be made.




Report forwarded to guix-patches <at> gnu.org:
bug#62443; Package guix-patches. (Sat, 25 Mar 2023 15:06:02 GMT)

Acknowledgement sent to Nicolas Graves <ngraves <at> ngraves.fr>:
New bug report received and forwarded. Copy sent to guix-patches <at> gnu.org. (Sat, 25 Mar 2023 15:06:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Nicolas Graves <ngraves <at> ngraves.fr>
To: guix-patches <at> gnu.org
Subject: LLaMA.cpp
Date: Sat, 25 Mar 2023 16:05:04 +0100
Here are 3 patches introducing the LLaMA CPP implementation. Since
weights are available as torrent download, this makes the whole model
usable with a local config. 

Basic information for preparing the model is available in the
README.

-- 
Best regards,
Nicolas Graves




Information forwarded to guix-patches <at> gnu.org:
bug#62443; Package guix-patches. (Sat, 25 Mar 2023 15:33:01 GMT)

Message #8 received at 62443 <at> debbugs.gnu.org:

From: Nicolas Graves <ngraves <at> ngraves.fr>
To: 62443 <at> debbugs.gnu.org
Cc: ngraves <at> ngraves.fr
Subject: [PATCH 1/3] gnu: Add sentencepiece.
Date: Sat, 25 Mar 2023 16:32:18 +0100
* gnu/packages/machine-learning.scm (sentencepiece): New variable.
---
 gnu/packages/machine-learning.scm | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index 37d4ef78ad..f6996af77b 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -583,6 +583,33 @@ (define openfst-for-vosk
        '("--enable-shared" "--enable-far" "--enable-ngram-fsts"
          "--enable-lookahead-fsts" "--with-pic" "--disable-bin")))))
 
+(define-public sentencepiece
+  (package
+    (name "sentencepiece")
+    (version "0.1.97")
+    (source
+     (origin
+       (method git-fetch)
+       (uri (git-reference
+             (url "https://github.com/google/sentencepiece")
+             (commit (string-append "v" version))))
+       (file-name (git-file-name name version))
+       (sha256
+        (base32 "1kzfkp2pk0vabyw3wmkh16h11chzq63mzc20ddhsag5fp6s91ajg"))))
+    (build-system cmake-build-system)
+    (arguments '(#:tests? #f))
+    (native-inputs (list gperftools))
+    (home-page "https://github.com/google/sentencepiece")
+    (synopsis "Unsupervised tokenizer for Neural Network-based text generation")
+    (description "SentencePiece is an unsupervised text tokenizer and
+detokenizer mainly for Neural Network-based text generation systems where the
+vocabulary size is predetermined prior to the neural model training.
+SentencePiece implements subword units (e.g., byte-pair-encoding
+(BPE) and unigram language model) with the extension of direct training from
+raw sentences.  SentencePiece allows us to make a purely end-to-end system
+that does not depend on language-specific pre/postprocessing.")
+    (license license:asl2.0)))
+
 (define-public shogun
   (package
     (name "shogun")
-- 
2.39.2





Information forwarded to guix-patches <at> gnu.org:
bug#62443; Package guix-patches. (Sat, 25 Mar 2023 15:33:02 GMT)

Message #11 received at 62443 <at> debbugs.gnu.org:

From: Nicolas Graves <ngraves <at> ngraves.fr>
To: 62443 <at> debbugs.gnu.org
Cc: ngraves <at> ngraves.fr
Subject: [PATCH 2/3] gnu: Add python-sentencepiece.
Date: Sat, 25 Mar 2023 16:32:19 +0100
* gnu/packages/machine-learning.scm (python-sentencepiece): New variable.
---
 gnu/packages/machine-learning.scm | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index f6996af77b..df1989d316 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -610,6 +610,25 @@ (define-public sentencepiece
 that does not depend on language-specific pre/postprocessing.")
     (license license:asl2.0)))
 
+(define-public python-sentencepiece
+  (package
+    (name "python-sentencepiece")
+    (version "0.1.97")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (pypi-uri "sentencepiece" version))
+       (sha256
+        (base32 "0v0z9ryl66432zajp099bcbnwkkldzlpjvgnjv9bq2vi19g300f9"))))
+    (build-system python-build-system)
+    (propagated-inputs (list sentencepiece))
+    (native-inputs (list pkg-config))
+    (home-page "https://github.com/google/sentencepiece")
+    (synopsis "SentencePiece python wrapper")
+    (description "This package provides a python wrapper for the SentencePiece
+unsupervised text tokenizer.")
+    (license license:asl2.0)))
+
 (define-public shogun
   (package
     (name "shogun")
-- 
2.39.2





Information forwarded to guix-patches <at> gnu.org:
bug#62443; Package guix-patches. (Sat, 25 Mar 2023 15:33:02 GMT)

Message #14 received at 62443 <at> debbugs.gnu.org:

From: Nicolas Graves <ngraves <at> ngraves.fr>
To: 62443 <at> debbugs.gnu.org
Cc: ngraves <at> ngraves.fr
Subject: [PATCH 3/3] gnu: Add llama-cpp.
Date: Sat, 25 Mar 2023 16:32:20 +0100
* gnu/packages/machine-learning.scm (llama-cpp): New variable.
---
 gnu/packages/machine-learning.scm | 64 +++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

diff --git a/gnu/packages/machine-learning.scm b/gnu/packages/machine-learning.scm
index df1989d316..6c78b14fc6 100644
--- a/gnu/packages/machine-learning.scm
+++ b/gnu/packages/machine-learning.scm
@@ -400,6 +400,70 @@ (define-public guile-aiscm
 (define-public guile-aiscm-next
   (deprecated-package "guile-aiscm-next" guile-aiscm))
 
+(define-public llama-cpp
+  (let ((commit "3cd8dde0d1357b7f11bdd25c45d5bf5e97e284a0")
+        (revision "0"))
+    (package
+      (name "llama-cpp")
+      (version (git-version "0.0.0" revision commit))
+      (source
+       (origin
+         (method git-fetch)
+         (uri (git-reference
+               (url "https://github.com/ggerganov/llama.cpp")
+               (commit (string-append "master-" (string-take commit 7)))))
+         (file-name (git-file-name name version))
+         (sha256
+          (base32 "0i7c92cxqs31xklrn688978kk29agivgxjgvsb45wzm65gc6hm5c"))))
+      (build-system cmake-build-system)
+      (arguments
+       (list
+        #:modules '((ice-9 textual-ports)
+                    (guix build utils)
+                    ((guix build python-build-system) #:prefix python:)
+                    (guix build cmake-build-system))
+        #:imported-modules `(,@%cmake-build-system-modules
+                             (guix build python-build-system))
+        #:phases
+        #~(modify-phases %standard-phases
+            (add-before 'install 'install-python-scripts
+              (lambda _
+                (let ((bin (string-append #$output "/bin/")))
+                  (define (make-script script)
+                    (let ((suffix (if (string-suffix? ".py" script) "" ".py")))
+                      (call-with-input-file
+                          (string-append "../source/" script suffix)
+                        (lambda (input)
+                          (call-with-output-file (string-append bin script)
+                            (lambda (output)
+                              (format output "#!~a/bin/python3\n~a"
+                                      #$(this-package-input "python")
+                                      (get-string-all input))))))
+                      (chmod (string-append bin script) #o555)))
+                  (mkdir-p bin)
+                  (make-script "convert-pth-to-ggml")
+                  (make-script "convert-gptq-to-ggml")
+                  (make-script "quantize.py")
+                  (substitute* (string-append bin "quantize.py")
+                    (("os\\.getcwd\\(\\), quantize_script_binary")
+                     (string-append "\"" bin "\", quantize_script_binary"))))))
+            (add-after 'install-python-scripts 'wrap-python-scripts
+              (assoc-ref python:%standard-phases 'wrap))
+            (replace 'install
+              (lambda _
+                (let ((bin (string-append #$output "/bin/")))
+                  (install-file "bin/quantize" bin)
+                  (copy-file "bin/main" (string-append bin "llama"))))))))
+      (propagated-inputs
+       (list python-pytorch python-sentencepiece python-numpy))
+      (inputs (list python))
+      (home-page "https://github.com/ggerganov/llama.cpp")
+      (synopsis "Port of Facebook's LLaMA model in C/C++")
+      (description "This package provides a port to Facebook's LLaMA collection
+of foundation language models.  It requires models parameters to be downloaded
+independently to be able to run a LLaMA model.")
+      (license license:expat))))
+
 (define-public mcl
   (package
     (name "mcl")
-- 
2.39.2





Added tag(s) patch. Request was from Bruno Victal <mirai <at> makinata.eu> to control <at> debbugs.gnu.org. (Thu, 30 Mar 2023 23:01:02 GMT)

Information forwarded to guix-patches <at> gnu.org:
bug#62443; Package guix-patches. (Sat, 08 Apr 2023 12:08:02 GMT)

Message #19 received at submit <at> debbugs.gnu.org:

From: Nicolas Goaziou <mail <at> nicolasgoaziou.fr>
To: Nicolas Graves via Guix-patches via <guix-patches <at> gnu.org>
Cc: Nicolas Graves <ngraves <at> ngraves.fr>, 62443-done <at> debbugs.gnu.org
Subject: Re: [bug#62443] LLaMA.cpp
Date: Sat, 08 Apr 2023 14:07:22 +0200
Hello,

Nicolas Graves via Guix-patches via <guix-patches <at> gnu.org> writes:

> Here are 3 patches introducing the LLaMA CPP implementation. Since
> weights are available as torrent download, this makes the whole model
> usable with a local config.

Applied. Thank you.

Regards,
-- 
Nicolas Goaziou




Reply sent to Nicolas Goaziou <mail <at> nicolasgoaziou.fr>:
You have taken responsibility. (Sat, 08 Apr 2023 12:08:03 GMT)

Notification sent to Nicolas Graves <ngraves <at> ngraves.fr>:
bug acknowledged by developer. (Sat, 08 Apr 2023 12:08:03 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 07 May 2023 11:24:06 GMT)


