DEVONnewbie looking for some help

I’ve been using DTP for about a week now. I had some questions after using it, and so spent the time to read through all the documentation and online tutorials. As so often happens when I learn something new, it answered several of my questions and left me asking many more…hoping you folks can help me out. I’m almost positive I’m going to be one of those people who says 6 months or years from now that DT is indispensable. Need some pointers getting started though.

I have several PDF books (~400 pages each) on my computer. I started by importing those into DT. The result wasn’t so hot…when I search and something matches, if I click on the item, it takes several seconds to load. As an example, a 3.3MB PDF with 415 pages took 6 seconds of wall clock time to even display. This is on a brand new MBP with 8 gigs of RAM & fewer than 100 documents in the database (most are much smaller than the PDFs). Another thing I noticed was that the search results weren’t very useful. It showed me that the document matched, but doesn’t really show where the best matches are. In a 400 page document it would be much more useful to get a pointer of where to look. So the conclusion I’ve come to is that I’m misusing DTP by dumping my eBooks into it. Am I correct in believing that it’s better with smaller, more focused documents? If I’m reading an eBook and see a particular paragraph or page that I really want to keep, I should just import that section specifically? I know that DTP can also split documents…so maybe it would work well to split the eBook into chapters, and import the chapters individually? Would appreciate some tips on that.

The next big question I have is how to split up multiple databases. The documentation said that having “topic databases” is the way to go, to maximize speed & AI’s effectiveness. I’m confused as to just how I should break this up. Do I want to split the DBs up based on content, function, or both? I downloaded all the example databases and looked at them…the “Research Archive” covered a fairly broad range of topics - computers, science, politics, and fun. The demo DBs all seemed to be used for very different functions though - research archive, project management, a GTD tool, images, etc. I’m not sure if it’s recommended to break things up based on how I’m using it, or if those were just demos. My basic plan was to have two databases - one for my job, and one for everything else. As an example of the “everything else” db, I will have articles I’ve saved online, archived emails, meal plans & recipes, receipts, etc. So a pretty broad range of stuff…should I be drawing hard lines and creating multiple databases for some of this stuff, or will good group management be sufficient? I won’t have 20 million words in my db for a long time, so my basic plan is to start with one db, and if it looks like one topic is starting to have a disproportionate weight then I can create a new db for it. Seems sensible to me, but there’s a chance I’ll shoot myself in the foot by having unrelated stuff in the same db and thus making the AI less effective. To give an example of something I’m on the fence about, I’m currently working on a programming book. I’ve collected a lot of reference material for the book, and it’s a big project so it can naturally have its own DB. On the flip side though, I collect a lot of info in my day to day work that I might not file directly with the book - and yet I would really like for DTP’s AI system to find some related items that I may not think of. 
So would it be okay to have a single DB, with a top-level group for Book, one for Reference Articles, one for Recipes, etc? I know it’s a bit weird but I’ve seen stories about people finding seemingly unrelated things via DTP and I don’t want to miss out on that.

Okay now that the big questions are out of the way, here are two easy ones:

How should I store web pages? Links apparently aren’t indexed, plus I won’t have the content offline…so this leaves plain html, web archive, and PDF. The docs say that web archive is proprietary to OS X, and that PDF is an open standard. So I guess either plain HTML or PDF would be best, and then it’s just personal preference?

The documentation says that I need Pro Office in order to archive email. As far as I can tell, I can drag messages into DTP just fine. Am I missing something?

Apologies for the length, but I need some help with the fundamentals so I can really take advantage of this software. Thanks for reading.

The next beta will improve this a lot.

That’s right. Documents with up to a few thousand words are optimal.
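
For anyone who wants to plan such a split before doing it, here is a minimal Python sketch, not anything DEVONthink itself provides. It greedily groups consecutive pages so that each chunk stays under a word budget. The function name `chunk_page_ranges`, the 3000-word default, and the per-page word counts are all assumptions for illustration; you would have to obtain real word counts from your own PDF tooling.

```python
def chunk_page_ranges(page_word_counts, max_words=3000):
    """Group consecutive pages into chunks of at most max_words words.

    page_word_counts is a list with one word count per page.
    Returns a list of (first_page, last_page) tuples, 1-indexed and inclusive.
    A single page that exceeds max_words becomes its own chunk.
    """
    ranges = []
    start = 1      # first page of the chunk currently being built
    running = 0    # words accumulated in the current chunk
    for page_number, words in enumerate(page_word_counts, start=1):
        # Close the current chunk if adding this page would exceed the budget.
        if running and running + words > max_words:
            ranges.append((start, page_number - 1))
            start = page_number
            running = 0
        running += words
    if page_word_counts:
        ranges.append((start, len(page_word_counts)))
    return ranges
```

With a hypothetical 415-page book averaging 300 words per page, `chunk_page_ranges([300] * 415)` yields 42 page ranges of at most ten pages each, which could then be split out as separate documents.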

PDF documents retain the layout but are static. HTML pages, on the other hand, are dynamic but depend on online resources (e.g. images).

They’re displayed but not convertible and not searchable.

It’s good to see that I am not the only beginner struggling with this issue:

“The next big question I have is how to split up multiple databases. The documentation said that having “topic databases” is the way to go, to maximize speed & AI’s effectiveness.”

I’m leaning towards 2 - one DB for research, with different groups for the projects that I am working on, and another for business and business-related data.

I assume that at some point my research DB will get broken up into smaller DBs as it grows and as my understanding of my research grows. Right now, I have topics in the realm of internet marketing and idea diffusion that might just have application to my research into somatics and psychology topics, so I will keep them in the same DB for ease of sharing.

  • Ryan

That’s been extensively discussed in other threads. Bill DeVille (DEVONtechnologies Evangelist) has probably written one or more “white paper” posts on that topic that he might be able to refer you to. :)

Reading cj grunenberg’s remark above, I was a little confused about the last sentence (if an image on a webpage is deleted online, will my DTP version stop displaying that image?) and checked the help. There are 2 ways to store an HTML page in DTPro.

What I usually do, which is to select part of a page and clip it to DTPro with Take Rich Note, is the “Capture Page” variety, and yes, it depends on the online image. I checked this by turning off AirPort and then viewing a page I had captured; it lacked images. So when my laptop is out of contact with the net, or when the page owner moves or changes the page, I would lose at least the images.
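
If you would rather not go offline to find out, a rough check is to look at the HTML source of a captured page and list the images it pulls from the network. This is only a sketch using Python's standard `html.parser`; the names `RemoteImageFinder` and `remote_images` are made up for illustration, and it assumes you can get at the page's HTML as text. It says nothing about how DTPro stores the page internally.

```python
from html.parser import HTMLParser


class RemoteImageFinder(HTMLParser):
    """Collect <img> sources that point at an online location."""

    def __init__(self):
        super().__init__()
        self.remote_sources = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        for name, value in attrs:
            # Absolute and protocol-relative URLs need a network connection.
            if name == "src" and value and value.startswith(("http://", "https://", "//")):
                self.remote_sources.append(value)


def remote_images(html_text):
    """Return the list of image URLs in html_text that live online."""
    finder = RemoteImageFinder()
    finder.feed(html_text)
    return finder.remote_sources
```

An empty result suggests the images travel with the document; a non-empty one lists what would go missing offline or if the page owner removes the files.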

I decided to switch to “Web Archive”. At first I assumed this meant I’d have to store the entire page—that is what “Web Archive” (Save Page) means in the Finder. But happily I found that DTP will let me archive just the selected part of a page, so I can choose only a bit of text or choose all the text but not the sidebar, ads, etc.

I’m just posting this in case it is useful to someone else, or maybe there is more I need to know about this and another commenter may add to it. I’m a longtime user of DevonThink, first Pers then Pro, but haven’t explored it as much as I should!

Ryan, if you are just starting out and wish to separate content into two categories as you described, that can be done by creating two top-level groups in a database and filing your documents into subgroups within them.

As your database grows, that will make it easy to split it into two databases, each corresponding to the content of one of those two top-level groups.

Examples of procedures to split a database into two or more separate databases:

  1. Create a new, empty database and give it an appropriate name. In the original database, select the content that’s to be moved to the new database and then Control-click (right click) on the selection. Choose the contextual menu option, ‘Move To’ and select the new database as the destination. (The ‘Move To’ command will transfer the selected content to the other database and remove it from the first, in public beta 7 and later.)

  2. Another way to ‘split’ a database is to make one or more copies of it in the Finder, deleting from each the content that isn’t desired in that database.
    2a) First, Quit the DT Pro/Office application. That’s important, as making a Finder copy of an open database may result in an incomplete or damaged copy!
    2b) Still in the Finder, and with the DEVONthink application still Quit, rename the copied database if the name is to change. Never rename an open database, as DEVONthink would lose track of it.
    2c) Launch DT Pro/Office and open, one at a time, each of the database copies. Select and delete from each the content that isn’t desired in it.
    2d) If the copy was renamed in the Finder, the new database name will not be displayed in view windows until it is changed in File > Database Properties.
    2e) If all of the resulting split databases have a different name than did the original, clear DEVONthink’s memory of the original database by selecting File > Open Recent and choosing the option to clear the menu.
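
For those who prefer a script to dragging in the Finder, the copy-and-rename part of procedure 2 (steps 2a and 2b) could be sketched in Python as below. This is an illustration under assumptions, not an official method: `copy_database` is a hypothetical helper, it assumes the database is an ordinary file or folder on disk, and it cannot tell whether DEVONthink is running, so you must still Quit the application first.

```python
import shutil
from pathlib import Path


def copy_database(source, new_name):
    """Copy a database on disk to a sibling with a new name.

    Assumption: DEVONthink has been Quit before this is called.
    The script does not (and cannot reliably) check that for you.
    """
    source = Path(source)
    if not source.exists():
        raise FileNotFoundError(source)
    destination = source.with_name(new_name)
    # Refuse to overwrite an existing copy rather than merge into it.
    if destination.exists():
        raise FileExistsError(destination)
    if source.is_dir():
        shutil.copytree(source, destination)   # database stored as a folder
    else:
        shutil.copy2(source, destination)      # database stored as a single file
    return destination
```

The original is left untouched, so steps 2c through 2e (opening each copy, deleting unwanted content, fixing the displayed name) still have to be done in the application.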

OK - that handles the question about how to split databases.

Following is an excerpt of a previous post about why I create multiple topically or historically designed databases. Remember that if it’s useful to search across such databases, they can be opened (or closed) like informational Lego blocks to assemble the body of information you require.

Use topically designed databases to reduce their size for optimum performance on the computer.

My ModBook (a custom Mac tablet based on a MacBook) has 4 GB RAM. That amount of RAM helps responsiveness not only for my DTPO2 databases, but also for photo or video editing and so on.

Even so, I would experience poor performance were I to merge all my DTPO2 databases into a single very large one. I would also lose the improved focus of Search and See Also operations that results from good topical design of databases.

My main DTPO2 database holds more than 25 thousand documents and more than 35 million total words. It is a topical database reflecting my professional interests in environmental science and engineering, policy issues and laws and regulations.

This is, of course, a pretty broad topic. Contents range across a number of scientific disciplines, from physics and chemistry to geophysics, hydrology, atmospheric sciences, ecology and toxicology and health effects. Policy issues often involve political science, sociology and economics analyses. I often compare legal and regulatory differences in the USA, European Union and some less-developed areas.

I have a second environmental database that deals with specific topics such as environmental sampling methodologies, chemical analytical methods, environmental data evaluation, quality assurance, risk assessment methods and so on.

I’m spoiled. I expect most searches to take milliseconds, and See Also or Classify suggestions to pop up immediately. Were I to merge those two environmental databases, I would encounter performance lags.

But performance isn’t the most important justification for splitting my environmentally-related content into two databases. When I’m researching health and regulatory issues related, for example, to mercury contamination in fish, I don’t want to be deluged with hundreds of references to sampling, analytical and data evaluation procedures.

Topical design of my databases makes my use of them more efficient and productive.