GNU bug report logs -
#8519
24.0.50; doc-view: allow pdftotext -layout instead of -raw
Reported by: trentbuck <at> gmail.com (Trent W. Buck)
Date: Mon, 18 Apr 2011 09:24:03 UTC
Severity: normal
Tags: fixed
Found in version 24.0.50
Fixed in version 27.1
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 8519 in the body.
You can then email your comments to 8519 AT debbugs.gnu.org in the normal way.
Report forwarded to owner <at> debbugs.gnu.org, rfrancoise <at> debian.org, bug-gnu-emacs <at> gnu.org: bug#8519; Package emacs. (Mon, 18 Apr 2011 09:24:03 GMT)
Acknowledgement sent to trentbuck <at> gmail.com (Trent W. Buck): New bug report received and forwarded. Copy sent to rfrancoise <at> debian.org, bug-gnu-emacs <at> gnu.org. (Mon, 18 Apr 2011 09:24:03 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
doc-view supports using pdftotext on ttys.
Unfortunately it is hard-coded to pass -raw.
I would prefer to pass -layout.
Please modify doc-view to allow me to support something like
(setq doc-view-pdftotext-program-args '("-layout" "-nopgbrk"))
FYI, my pdftotext manpage says -raw is discouraged:
-layout
       Maintain (as best as possible) the original physical
       layout of the text. The default is to 'undo' physical
       layout (columns, hyphenation, etc.) and output the text
       in reading order.
-raw   Keep the text in content stream order. This is a hack
       which often "undoes" column formatting, etc. Use of raw
       mode is no longer recommended.
In GNU Emacs 24.0.50.1 (x86_64-pc-linux-gnu)
of 2010-12-14 on elegiac, modified by Debian
(emacs-snapshot package, version 1:20101212-2)
configured using `configure '--build' 'x86_64-linux-gnu' '--host' 'x86_64-linux-gnu' '--prefix=/usr' '--sharedstatedir=/var/lib' '--libexecdir=/usr/lib' '--localstatedir=/var' '--infodir=/usr/share/info' '--mandir=/usr/share/man' '--with-pop=yes' '--enable-locallisppath=/etc/emacs-snapshot:/etc/emacs:/usr/local/share/emacs/24.0.50/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/24.0.50/site-lisp:/usr/share/emacs/site-lisp' '--without-compress-info' '--with-x=no' '--without-dbus' '--without-sound' 'build_alias=x86_64-linux-gnu' 'host_alias=x86_64-linux-gnu' 'CFLAGS=-DDEBIAN -DSITELOAD_PURESIZE_EXTRA=5000 -g -O2' 'LDFLAGS=-g -Wl,--as-needed' 'CPPFLAGS=''
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: C
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: en_AU.utf8
value of $XMODIFIERS: nil
locale-coding-system: utf-8-unix
default enable-multibyte-characters: t
Major mode: Man
Minor modes in effect:
diff-auto-refine-mode: t
shell-dirtrack-mode: t
rcirc-track-minor-mode: t
xterm-mouse-mode: t
ido-everywhere: t
savehist-mode: t
icomplete-mode: t
show-paren-mode: t
delete-selection-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
column-number-mode: t
line-number-mode: t
transient-mark-mode: t
Recent input:
O A ESC O A ESC O A ESC O A ESC O A ESC O A ESC O A
ESC O A ESC O A ESC O A ESC O A ESC O A ESC O A ESC
O A ESC O A ESC O A ESC O A ESC O A ESC O A ESC O A
ESC O A ESC O A ESC O A ESC O A ESC O A ESC O A ESC
O A ESC O A ESC O A ESC O A ESC O A ESC O A ESC O A
ESC O A ESC O B ESC O B ESC C-b ESC O A ESC O A ESC
O A C-e RET RET ( e v a l - a f t e r - l o a d SPC
" p DEL d o v - DEL DEL c - v i e w " RET TAB ' ( C-y
C-x C-x C-g ESC s ESC O B TAB ESC O A C-e ESC C-k ESC
O B ESC O B ESC O B ESC b ESC O B ESC b ESC b ESC d
l a o u t DEL DEL DEL y o u t ESC O B ESC O A ESC O
A ESC O A ESC O A ESC C-x C-x C-s ESC a ESC a C-x ESC
O D C-x C-k C-x C-k RET C-x C-k RET y C-x 1 C-v C-v
C-v C-v C-v ESC x m a n RET p d f t o t e x t RET C-x
0 C-s r a w ESC O C ESC O C ESC O B C-v ESC x r e p
o r t SPC e m a c s RET b u g RET
Recent messages:
Copying /scpc:soy:/cyber/tmp/split-handshake.pdf to /tmp/tramp.24520Pw.pdf...done
Tramp: Inserting local temp file `/tmp/tramp.24520Pw.pdf'...done
Wrote /tmp/docview1000/split-handshake.pdf
No PNG support is available, or some conversion utility for pdf files is missing.
Unable to render file. View extracted text instead? (y or n) y
Invoking man pdftotext in the background
Please wait: formatting the pdftotext man page...
pdftotext man page formatted
Mark saved where search started
call-interactively: End of buffer [2 times]
Load-path shadows:
/home/twb/.emacs.d/lisp/magit/magit-svn hides /usr/share/emacs/24.0.50/site-lisp/magit/magit-svn
/home/twb/.emacs.d/lisp/magit/magit-key-mode hides /usr/share/emacs/24.0.50/site-lisp/magit/magit-key-mode
/home/twb/.emacs.d/lisp/magit/magit hides /usr/share/emacs/24.0.50/site-lisp/magit/magit
/home/twb/.emacs.d/lisp/magit/magit-topgit hides /usr/share/emacs/24.0.50/site-lisp/magit/magit-topgit
/usr/share/emacs/24.0.50/site-lisp/puppet-el/puppet-mode hides /usr/share/emacs/site-lisp/puppet-mode
/usr/share/emacs/24.0.50/site-lisp/debian-startup hides /usr/share/emacs/site-lisp/debian-startup
Features:
(shadow mail-extr emacsbug eldoc paredit find-func apropos cus-edit
cus-start cus-load ibuf-ext ibuffer sort tramp-cmds noutline outline
w3m-cookie thingatpt w3m-search mule-util w3m-form w3m-symbol
w3m-bookmark w3m-session w3m browse-url doc-view image-mode timezone
w3m-hist w3m-fb bookmark-w3m w3m-ems w3m-ccl ccl w3m-favicon w3m-image
w3m-proc w3m-util cc-mode cc-fonts cc-menus cc-cmds cc-styles cc-align
cc-engine cc-vars cc-defs woman tabify man assoc conf-mode vc-rcs
newcomment rect sh-script executable grep whitespace log-edit pcvs-util
add-log gnus-cite gnus-art mm-uu mml2015 epg-config mm-view smime dig
mailcap nnir gnus-sum macroexp nnoo gnus-group gnus-undo nnmail
mail-source gnus-start gnus-spec gnus-int message sendmail rfc822 mml
mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 rfc2047 rfc2045
ietf-drums mailabbrev gmm-utils mailheader gnus-win gnus-range gnus
gnus-ems nnheader mail-utils mm-util mail-prsvr wid-edit rst compile
tool-bar etags windmove diff-mode vc help-mode easymenu view tramp-sh
shell comint tramp-cache tramp tramp-compat auth-source netrc gnus-util
password-cache format-spec advice help-fns advice-preload tramp-loaddefs
ffap vc-dispatcher vc-darcs cl xml vc-git image wdired multi-isearch
dired-aux dired regexp-opt disp-table rcirc time-date ring server
jka-compr edmacro kmacro xt-mouse ido savehist icomplete paren delsel
saveplace debian-el debian-el-loaddefs w3m-load emacs-goodies-el
emacs-goodies-custom emacs-goodies-loaddefs easy-mmode dpkg-dev-el
dpkg-dev-el-loaddefs ediff-hook vc-hooks lisp-float-type lisp-mode
register page menu-bar rfn-eshadow timer select mouse jit-lock font-lock
syntax facemenu font-core frame cham georgian utf-8-lang misc-lang
vietnamese tibetan thai tai-viet lao korean japanese hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
case-table epa-hook jka-cmpr-hook help simple abbrev loaddefs button
minibuffer faces cus-face files text-properties overlay md5 base64
format env code-pages mule custom widget hashtable-print-readable
backquote make-network-process multi-tty emacs)
Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#8519; Package emacs. (Thu, 30 Jun 2011 22:08:03 GMT)
Message #8 received at 8519 <at> debbugs.gnu.org (full text, mbox):
> doc-view supports using pdftotext on ttys.
> Unfortunately it is hard-coded to pass -raw.
> I would prefer to pass -layout.
>
> Please modify doc-view to allow me to support something like
>
> (setq doc-view-pdftotext-program-args '("-layout" "-nopgbrk"))
I came across the same need and found this bug report.
I think doc-view should also support other free software
that processes PDF files:
1. pdftk
pdftk is able to extract the PDF metadata (title, author, bookmarks, etc.),
e.g.
pdftk file1.pdf dump_data output file1.txt
So for a large PDF document, doc-view could present the
Table of Contents, let the user navigate to the selected page,
and then convert only the displayed pages instead of all pages,
which is terribly slow for a 1000-page document.
pdftk can also prepare the PDF text for editing in Emacs.
From `man pdftk':
-compress useful when you want to edit PDF code
in a text editor like vim or emacs.
Uncompress PDF page streams for editing the PDF
in a text editor (e.g., vim, emacs):
pdftk doc.pdf output doc.unc.pdf uncompress
This feature could be used after typing `C-c C-c'.
Since pdftk is dependent on Java, doc-view should not require it
and should be able to detect the installed PDF processing programs
(with e.g. `(executable-find "pdftk")') and select one of them
according to the user's priority list.
2. A better program is `qpdf'. It has none of the problems mentioned above.
So doc-view should also detect its availability with
`(executable-find "qpdf")' and provide the same option for its
command line arguments (and use all features relevant to doc-view).
3. Using the PDF rendering library `poppler,' it's possible
to implement in Emacs a PDF viewer like `apvlv' for Vim.
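The detection by priority list suggested in this message could look roughly like the sketch below. It is an illustration only, not code from doc-view: the names `my-doc-view-pdf-tool-priority' and `my-doc-view-find-pdf-tool' are hypothetical, and the order of the list is just an example preference.

```elisp
(require 'seq)

;; Hypothetical variable, not part of doc-view: helper programs to
;; look for, most preferred first.
(defvar my-doc-view-pdf-tool-priority '("qpdf" "pdftk")
  "PDF helper programs to look for, most preferred first.")

;; Hypothetical function, not part of doc-view.
(defun my-doc-view-find-pdf-tool ()
  "Return the path of the first installed preferred PDF tool, or nil."
  ;; `seq-some' returns the first non-nil value produced by
  ;; `executable-find', i.e. the full path of the first program
  ;; in the list that is found on `exec-path'.
  (seq-some #'executable-find my-doc-view-pdf-tool-priority))
```

Calling `(my-doc-view-find-pdf-tool)' then yields a path when at least one of the listed programs is installed and nil otherwise, so a caller can fall back gracefully when neither is present.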
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#8519; Package emacs. (Sun, 29 Sep 2019 12:27:01 GMT)
Message #11 received at 8519 <at> debbugs.gnu.org (full text, mbox):
trentbuck <at> gmail.com (Trent W. Buck) writes:
> doc-view supports using pdftotext on ttys.
> Unfortunately it is hard-coded to pass -raw.
> I would prefer to pass -layout.
>
> Please modify doc-view to allow me to support something like
>
> (setq doc-view-pdftotext-program-args '("-layout" "-nopgbrk"))
Makes sense; I've now added this to Emacs 27.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
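For readers on Emacs 27.1 or later, the original request can be written as an init-file setting. This is a minimal sketch, assuming the option was added under the name proposed in the report, `doc-view-pdftotext-program-args', and that it takes a list of argument strings; confirm with `C-h v' in your own Emacs before relying on it.

```elisp
;; Assumption: the option added for this bug is named
;; `doc-view-pdftotext-program-args' and holds a list of strings
;; passed to pdftotext.  The two flags are the ones requested in
;; the original report.
(setq doc-view-pdftotext-program-args '("-layout" "-nopgbrk"))
```

With this in place, the text fallback on a tty should keep the physical layout of the page instead of the content stream order produced by -raw.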
Added tag(s) fixed. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sun, 29 Sep 2019 12:28:02 GMT)
Bug marked as fixed in version 27.1; send any further explanations to 8519 <at> debbugs.gnu.org and trentbuck <at> gmail.com (Trent W. Buck). Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sun, 29 Sep 2019 12:28:02 GMT)
Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 28 Oct 2019 11:24:08 GMT)
This bug report was last modified 5 years and 232 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.