GNU bug report logs - #31949
[PATCH] gnu: Add docx2txt.
Reported by: Pierre Neidhardt <ambrevar <at> gmail.com>
Date: Sat, 23 Jun 2018 13:33:01 UTC
Severity: normal
Tags: patch
Done: ludo <at> gnu.org (Ludovic Courtès)
Bug is archived. No further changes may be made.
[Message part 1 (text/plain, inline)]
Your message dated Sat, 07 Jul 2018 17:53:28 +0200
with message-id <87o9fj3uvb.fsf <at> gnu.org>
and subject line Re: [bug#31949] [PATCH] gnu: Add docx2txt.
has caused the debbugs.gnu.org bug report #31949,
regarding [PATCH] gnu: Add docx2txt.
to be marked as done.
(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)
--
31949: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=31949
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
* gnu/packages/textutils.scm (docx2txt): New variable.
---
gnu/packages/textutils.scm | 65 ++++++++++++++++++++++++++++++++++++++
1 file changed, 65 insertions(+)
diff --git a/gnu/packages/textutils.scm b/gnu/packages/textutils.scm
index 5734bf62d..8eec045a6 100644
--- a/gnu/packages/textutils.scm
+++ b/gnu/packages/textutils.scm
@@ -14,6 +14,7 @@
;;; Copyright © 2017 Kei Kebreau <kkebreau <at> posteo.net>
;;; Copyright © 2017 Alex Vong <alexvong1995 <at> gmail.com>
;;; Copyright © 2018 Tobias Geerinckx-Rice <me <at> tobias.gr>
+;;; Copyright © 2018 Pierre Neidhardt <ambrevar <at> gmail.com>
;;;
;;; This file is part of GNU Guix.
;;;
@@ -675,3 +676,67 @@ and Cython.")
measuring and checking the width of strings, with support east asian text.")
(home-page "https://github.com/jessevdk/go-flags")
(license license:expat)))
+
+(define-public docx2txt
+ (package
+ (name "docx2txt")
+ (version "1.4")
+ (source (origin
+ (method url-fetch)
+ (uri (string-append
+ "http://downloads.sourceforge.net/docx2txt/docx2txt-"
+ version ".tgz"))
+ (sha256
+ (base32
+ "06vdikjvpj6qdb41d8wzfnyj44jpnknmlgbhbr1w215420lpb5xj"))))
+ (build-system gnu-build-system)
+ (inputs
+ `(("unzip" ,unzip)
+ ("perl" ,perl)))
+ (arguments
+ `(#:tests? #f ; No tests.
+ #:make-flags (list (string-append "BINDIR=" (assoc-ref %outputs "out") "/bin")
+ (string-append "CONFIGDIR=" (assoc-ref %outputs "out") "/etc")
+ ;; Makefile seems to be a bit dumb at guessing.
+ (string-append "INSTALL=install")
+ (string-append "PERL=perl"))
+ #:phases
+ (modify-phases %standard-phases
+ (delete 'configure)
+ (add-after 'install 'fix-install
+ (lambda* (#:key outputs inputs #:allow-other-keys)
+ (let* ((out (assoc-ref outputs "out"))
+ (bin (string-append out "/bin"))
+ (config (string-append out "/etc/docx2txt.config"))
+ (unzip (assoc-ref inputs "unzip")))
+ ;; According to INSTALL, the .sh wrapper can be skipped.
+ (delete-file (string-append bin "/docx2txt.sh"))
+ (rename-file (string-append bin "/docx2txt.pl")
+ (string-append bin "/docx2txt"))
+ (substitute* config
+ (("config_unzip => '/usr/bin/unzip',")
+ (string-append "config_unzip => '"
+ unzip
+ "/bin/unzip',")))
+ ;; Makefile is wrong.
+ (chmod config #o644)))))))
+ (synopsis "Recover text from .docx files, with good formatting")
+ (description
+ "@command{docx2txt} is a Perl-based command-line utility to convert
+Microsoft Office™ .docx documents to equivalent text documents. The latest
+version supports the following features during text extraction:
+
+@itemize
+@item Character conversions (\" ' < & > -, fractions and some mathematical
+symbols, etc.); currency characters are converted to respective names like
+Euro.
+@item Capitalisation of text blocks.
+@item Center and right justification of text fitting in a line of
+(configurable) 80 columns.
+@item Horizontal ruler, line breaks, paragraphs separation, tabs.
+@item Indicating hyperlinked text along with the hyperlink (configurable).
+@item Handling (bullet, decimal, letter, roman) lists along with (attempt at)
+indentation.
+@end itemize\n")
+ (home-page "http://docx2txt.sourceforge.net")
+ (license license:gpl3+)))
--
2.17.1
[Message part 3 (message/rfc822, inline)]
Hello Pierre,
Pierre Neidhardt <ambrevar <at> gmail.com> skribis:
> * gnu/packages/textutils.scm (docx2txt): New variable.
Perfect. Applied, thanks!
Ludo’.
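
Note on the patch above: for readers unfamiliar with Guix build phases, the 'fix-install phase can be read as four plain file operations. The sketch below mirrors them in Python rather than Guile, purely as an illustration. The function names, the store path "/gnu/store/example-unzip", and the idea that the Makefile installs the config file with an executable mode are assumptions made for the demo; the patch only says "Makefile is wrong".

```python
import stat
import tempfile
from pathlib import Path

# The line the patch looks for in docx2txt.config.
OLD = "config_unzip => '/usr/bin/unzip',"


def fix_install(out: Path, unzip: str) -> None:
    """Mirror the four steps of the 'fix-install phase (illustrative only)."""
    bin_dir = out / "bin"
    config = out / "etc" / "docx2txt.config"
    # 1. Drop the .sh wrapper, which INSTALL says can be skipped.
    (bin_dir / "docx2txt.sh").unlink()
    # 2. Expose the Perl script under the plain command name.
    (bin_dir / "docx2txt.pl").rename(bin_dir / "docx2txt")
    # 3. Point the config at the unzip input instead of /usr/bin/unzip.
    new = "config_unzip => '" + unzip + "/bin/unzip',"
    config.write_text(config.read_text().replace(OLD, new))
    # 4. Reset the config file mode to rw-r--r--.
    config.chmod(0o644)


def demo():
    """Build a fake install tree, run fix_install, and report the result."""
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp)
        (out / "bin").mkdir()
        (out / "etc").mkdir()
        (out / "bin" / "docx2txt.sh").write_text("#!/bin/sh\n")
        (out / "bin" / "docx2txt.pl").write_text("#!/usr/bin/perl\n")
        config = out / "etc" / "docx2txt.config"
        config.write_text("(\n" + OLD + "\n)\n")
        config.chmod(0o755)  # assumed "wrong" mode, for the demo only
        # Hypothetical store path; real ones contain a hash and version.
        fix_install(out, "/gnu/store/example-unzip")
        text = config.read_text()
        mode = stat.S_IMODE(config.stat().st_mode)
        names = sorted(p.name for p in (out / "bin").iterdir())
        return text, mode, names


if __name__ == "__main__":
    print(demo())
```

In the actual package, Guix's substitute* treats the search string as a regular expression and rewrites the file in place; the plain string replacement here is a simplification that happens to be equivalent for this particular pattern.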
This bug report was last modified 7 years and 16 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.