how to store books


i have the habit of actually storing all my books in DTP. they are searchable PDFs with text, at times 1000 or more pages. now i am wondering what the most efficient way to store them is, especially for retrieval.

the question comes up more and more these days because a search for a specific phrase returns quite a few books, which makes the search useless. if i look for ABC and basically all the books i have on this topic are in the result list, previewing becomes difficult (you can’t pick the right one by name, and opening each document in the preview area is too slow).

i searched the forum and some suggest splitting the books into smaller parts. i tried it but couldn’t get it to work. splitting every 5 pages carries serious risks: most often the more generic term was not present in a given 5 pages, so it might actually happen that an entire book on that subject doesn’t show up because of it…

any advice is highly appreciated

I’ve found dividing books into chapters and storing all the chapters from a given book together in a single folder (excluded from classification) works well. However, I’m working with books that are generally 200-300 pages long, and the chapters are mostly academic essays 15-30 pages long. If the very long documents you’re working with don’t have any sort of internal structure, perhaps 20 or 25 pages would be a good functional length?
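For illustration only, here is a minimal Python sketch of how the page ranges for such a split could be worked out: by chapter when the chapter start pages are known, otherwise in fixed-length parts of about 25 pages. The names `chunk_ranges`, `split_pdf`, `chapter_starts` and `out_pattern` are made up for this example, and the second function assumes the third-party pypdf package is installed. It is a sketch under those assumptions, not a description of how anyone in this thread actually splits their books.

```python
def chunk_ranges(total_pages, chapter_starts=None, chunk_size=25):
    """Return (first_page, last_page) tuples, 1-based and inclusive.

    chapter_starts is a hypothetical list of the pages where chapters
    begin; without it the book is cut into parts of chunk_size pages.
    """
    if total_pages < 1:
        return []
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    if chapter_starts:
        # Keep only valid page numbers, sorted and without duplicates.
        starts = sorted({p for p in chapter_starts if 1 <= p <= total_pages})
        if not starts or starts[0] != 1:
            # Pages before the first chapter (front matter) form their own part.
            starts.insert(0, 1)
    else:
        starts = list(range(1, total_pages + 1, chunk_size))
    # Each part ends one page before the next one starts.
    ends = [s - 1 for s in starts[1:]] + [total_pages]
    return list(zip(starts, ends))


def split_pdf(src_path, ranges, out_pattern="part-{:02d}.pdf"):
    """Write one PDF per page range. Assumes pypdf is installed."""
    from pypdf import PdfReader, PdfWriter  # assumption: pip install pypdf

    reader = PdfReader(src_path)
    for number, (first, last) in enumerate(ranges, start=1):
        writer = PdfWriter()
        for index in range(first - 1, last):  # pypdf pages are 0-based
            writer.add_page(reader.pages[index])
        writer.write(out_pattern.format(number))
```

Splitting on chapter starts keeps a chapter's general terms and its specific passages in the same file, which is the thing very short fixed-length parts tend to lose.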

my ebooks are mostly technical. hence i have chapters and splitting on chapters is possible. my books are generally 600-1200 pages though.

two questions.
a) why do you exclude it from classification?
b) does your approach help with retrieving the correct documents? does it narrow down the search results or is it just showing all the chapters in the result list anyways?


a) I exclude the folder from classification because I want that folder to have only the book chapters in it. Say I have a book by Peter Singer on vegetarianism, stored in the database as chapters in a single folder. (I’m a philosopher, and one of the main things I use DT for is to organize readings for a class I teach on philosophy and food.) Call that the ‘Singer folder’. Now I add a paper by Gary Francione, a legal scholar, on animal rights. If the Singer folder isn’t excluded from classification, the classify pane might suggest that I move the Francione paper to the Singer folder. But I don’t want to move anything else into the Singer folder, so I exclude it from classification. I don’t exclude it from search or see also, so that I can find the chapters by Singer easily.

b) It does seem to help. Especially when I’m searching on a fairly specific topic, I don’t usually get more than a couple chapters from a given book. And the chapters are in different files, so it’s easy to look over all of the relevant passages in each chapter.

thank you for your reply.

a) interesting, so you are basically mixing folders and files?

|_ B
__| a.pdf
__| b.pdf
__| singer
_____| chapter-1.pdf
_____| chapter-2.pdf
_____| chapter-3.pdf
__| francione.pdf

i thought mixing is frowned upon. or is that just the case if you would include the singer folder in the classification process?

b) how often does it happen that you search for something and multiple chapters from a book show up (each in its own file), but the chapters don’t actually contain any relevant information for the search?

on another note, how do you split your books into one-file-per-chapter files?