GNU bug report logs - #73530
[PATCH] Add imenu index function for Djvu files in doc-view

Package: emacs

Reported by: Visuwesh <visuweshm <at> gmail.com>

Date: Sat, 28 Sep 2024 15:12:02 UTC

Severity: wishlist

Tags: patch

Done: Tassilo Horn <tsdh <at> gnu.org>

Bug is archived. No further changes may be made.

From: Visuwesh <visuweshm <at> gmail.com>
To: Tassilo Horn <tsdh <at> gnu.org>
Cc: Eli Zaretskii <eliz <at> gnu.org>, "Jose A. Ortega Ruiz" <jao <at> gnu.org>, 73530 <at> debbugs.gnu.org
Subject: bug#73530: [PATCH] Add imenu index function for Djvu files in doc-view
Date: Wed, 02 Oct 2024 13:49:55 +0530
[Wednesday, October 02, 2024] Tassilo Horn wrote:

> Visuwesh <visuweshm <at> gmail.com> writes:
>
> Hi Visuwesh,
>
> [Sorry if this message appears twice but it seems to have bounced
> yesterday.]

[ I did not get the previous mail FYI.  ]

>> Please review the attached.
>
> First of all, the patch doesn't apply on master's NEWS and misc.texi
> here.  If I exclude those, the changes to doc-view.el can be applied.

Oops, I suppose I can no longer be lazy about pulling from remote.

> Unfortunately, I didn't find a PDF nor DjVu document on my computer
> where an index can be built.  I have the relevant tools installed but
> get the message that no index can be built for that document and
> doc-view--outline becomes 'unavailable.
>
> I've tried various PDFs generated by LaTeX with many section,
> subsections, etc.

A PDF generated by LaTeX can have an outline that differs wildly from
what doc-view's regexp matches:

    % mutool show test.pdf outline
    |	"Text"	#nameddest=section.1
    |	"Annotations"	#nameddest=section.2
    |	"Links"	#nameddest=section.3
    |	"Attachments"	#nameddest=section.4
    +	"Outline"	#nameddest=section.5
    +		"subsection"	#nameddest=subsection.5.1
    |			"subsubsection"	#nameddest=subsubsection.5.1.1

Compare it with:

    % mutool show atkins_physical_chemistry.pdf outline
    |	"Cover"	#page=1&view=Fit
    |	"PREFACE"	#page=7&view=Fit
    |	"USING THE BOOK"	#page=8&view=Fit
    |	"ABOUT THE AUTHORS"	#page=12&view=Fit
    |	"ACKNOWLEDGEMENTS"	#page=13&view=Fit
    |	"BRIEF CONTENTS"	#page=15&view=Fit
    |	"FULL CONTENTS"	#page=17&view=Fit
    |	"CONVENTIONS"	#page=27&view=Fit
    |	"LIST OF TABLES"	#page=28&view=Fit
    ...
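
To illustrate the difference, here is a minimal sketch of a line parser
that only understands "#page=N" targets.  The regexp and the names
`my-outline-page-rx' and `my-outline-parse-line' are hypothetical and
are not copied from doc-view.el; the sample lines assume the
tab-separated layout shown in the mutool output above.

```elisp
;; Hypothetical pattern, NOT the one in doc-view.el: a marker character,
;; one or more tabs (nesting depth), a quoted title, a tab, then "#page=N".
(defvar my-outline-page-rx
  "^[|+-]\\(\t+\\)\"\\(.+\\)\"\t#page=\\([0-9]+\\)")

(defun my-outline-parse-line (line)
  "Return (LEVEL TITLE PAGE) for LINE, or nil if it has no #page= target."
  (when (string-match my-outline-page-rx line)
    (list (length (match-string 1 line))
          (match-string 2 line)
          (string-to-number (match-string 3 line)))))

;; An entry from the second file carries a page number and parses:
(my-outline-parse-line "|\t\"Cover\"\t#page=1&view=Fit")
;; => (1 "Cover" 1)

;; An entry from the LaTeX file points at a named destination instead,
;; so a page-only pattern finds nothing:
(my-outline-parse-line "|\t\"Text\"\t#nameddest=section.1")
;; => nil
```

Under these assumptions every entry of the LaTeX-generated file is
skipped, which would leave the outline empty.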


> For DjVu, my sample size is 1, and that's a presentation, so at least
> here I'm not sure if there should be an index available...

I will send the link to the DjVu file that I wrote the feature for
off-list.  I will send a link to a PDF file too.

> That said, I haven't used the imenu feature before so I can't say if it
> ever worked for me...
>
>> diff --git a/doc/emacs/misc.texi b/doc/emacs/misc.texi
>> index e19e554fb26..332d5b1468f 100644
>> --- a/doc/emacs/misc.texi
>> +++ b/doc/emacs/misc.texi
>> @@ -581,17 +581,14 @@ DocView Navigation
>>  default size for DocView, customize the variable
>>  @code{doc-view-resolution}.
>>  
>> -@vindex doc-view-imenu-enabled
>>  @vindex doc-view-imenu-flatten
>>  @vindex doc-view-imenu-format
>> -  When the @command{mutool} program is available, DocView will use it
>> -to generate entries for an outline menu, making it accessible via the
>> -@code{imenu} facility (@pxref{Imenu}).  To disable this functionality
>> -even when @command{mutool} can be found on your system, customize the
>> -variable @code{doc-view-imenu-enabled} to the @code{nil} value.  You
>> -can further customize how @code{imenu} items are formatted and
>> -displayed using the variables @code{doc-view-imenu-format} and
>> -@code{doc-view-imenu-flatten}.
>> +  DocView can generate an outline menu for PDF and Djvu documents using
>
> Didn't Eli say the official spelling was DjVu?  That's at least the
> spelling that the djvused man page also uses and they should know.

Fixed.

>> +the @command{mutool} and the @command{djvused} programs respectively
>> +when they are available.  This is made accessible via the @code{imenu}
>> +facility (@pxref{Imenu}).  You can customize how @code{imenu} items are
>> +formatted and displayed using the variables @code{doc-view-imenu-format}
>> +and @code{doc-view-imenu-flatten}.
>
> I guess you should mention the new defcustom doc-view-djvused-program
> here, too.

Done.

On this note, should we use doc-view-pdfdraw-program in place of mutool
in doc-view--pdf-outline?

>> +(defcustom doc-view-imenu-enabled (and (or (executable-find "mutool")
>> +                                           (executable-find "djvused"))
>> +                                       t)
>> +  "Whether to generate imenu outline for PDF and Djvu files.
>> +This uses \"mutool\" for PDF files and \"djvused\" for Djvu files."
>>    :type 'boolean
>> -  :version "29.1")
>> +  :version "31.1")
>> +(make-obsolete-variable 'doc-view-imenu-enabled
>> +   "Imenu index is generated unconditionally, when available"
>> +   "31.1")
>
> Ah, I thought our last agreement was that we keep that variable (as
> suggested by Jose) as it is used right now but make it possible to have
> a value that tells to index only PDF or DjVu documents.

Ahh, I misunderstood the suggestion.
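
For reference, the variant described above (keeping the variable but
letting its value select document types) could look roughly like the
sketch below.  This is only an illustration: the names
`my-doc-view-imenu-enabled' and `my-doc-view-imenu-enabled-p' and the
symbols `pdf' and `djvu' are hypothetical and are not part of the
attached patch.

```elisp
;; Hypothetical sketch; the `pdf' and `djvu' values are illustrative only.
(defcustom my-doc-view-imenu-enabled t
  "Which document types get an imenu outline.
t means all supported types, nil means none; a list such as
\(pdf) or (djvu) restricts indexing to those types."
  :type '(choice (const :tag "All supported types" t)
                 (const :tag "Disabled" nil)
                 (set :tag "Only these types"
                      (const pdf) (const djvu)))
  :group 'doc-view)

(defun my-doc-view-imenu-enabled-p (type)
  "Return non-nil if an outline should be built for document TYPE."
  (or (eq my-doc-view-imenu-enabled t)
      (and (memq type my-doc-view-imenu-enabled) t)))

;; With the value bound to (djvu), only DjVu documents are indexed:
(let ((my-doc-view-imenu-enabled '(djvu)))
  (list (my-doc-view-imenu-enabled-p 'pdf)
        (my-doc-view-imenu-enabled-p 'djvu)))
;; => (nil t)
```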

> Well, I actually have no strong opinion here.  Technically, I like your
> approach better because of its simplicity.  I would like to test with
> some larger documents to see how long index building takes, though.

I tried the function with a large PDF file:

    % time mutool show atkins_physical_chemistry.pdf outline >/dev/null
        0m00.32s real     0m00.30s user     0m00.02s system
    % time mutool show atkins_physical_chemistry.pdf outline >/dev/null
        0m00.30s real     0m00.26s user     0m00.03s system
    % mutool show atkins_physical_chemistry.pdf outline |wc -l
    925
    % du -h atkins_physical_chemistry.pdf 
    97M	atkins_physical_chemistry.pdf

    (benchmark-run 10
      (doc-view--pdf-outline "~/doc/uni/refb/atkins_physical_chemistry.pdf"))
      ;; => (3.0118861719999996 0 0.0)

    (benchmark-run 1
      (doc-view--pdf-outline "~/doc/uni/refb/atkins_physical_chemistry.pdf"))
      ;; => (0.306343039 0 0.0)

which honestly isn't that long a time to wait for the first time you say
M-g i.

Now for the DjVu file that I was testing on:

    % time djvused -e print-outline Solid_State_Physics_Ashcroft.djvu >/dev/null
        0m00.24s real     0m00.23s user     0m00.01s system
    % djvused -e print-outline Solid_State_Physics_Ashcroft.djvu |wc -l
    115
    % du -sh Solid_State_Physics_Ashcroft.djvu 
    83M	Solid_State_Physics_Ashcroft.djvu

    (benchmark-run 10
      (doc-view--djvu-outline "~/tmp/Solid_State_Physics_Ashcroft.djvu"))
      ;; => (2.2234427809999997 0 0.0)

    (benchmark-run 1
      (doc-view--djvu-outline "~/tmp/Solid_State_Physics_Ashcroft.djvu"))
      ;; => (0.239040117 0 0.0)

IIRC, there's a DjVu file stashed somewhere in my home directory that
had an index.  I can benchmark making the index for that file too if
you want.

For my init.el, where (length imenu--index-alist) is 852:

    (benchmark-run 10
      (setq imenu--index-alist nil)
      (imenu--make-index-alist)) ;; => (7.113529254 0 0.0)

with REPETITIONS=1, I get (0.854962398 0 0.0).

In conclusion, the waiting time is barely an inconvenience.

> Anyhow, please write a complete sentence in the deprecation, so a dot at
> the end.  And remove the comma.

Done.

[0001-Add-imenu-index-function-for-DjVu-files-in-doc-view.patch (text/x-diff, attachment)]
