Save pdf directly in DTPro2

I work frequently with pdfs of journal articles. When these open inside my Safari browser, I know of two options for getting them into DT. One is to save the pdf to my hard drive, and then import it from within DT. The other is to capture a web archive (using the handy archive bookmarklet in my bookmarks bar). Neither of these options is ideal.

The first one adds a couple of steps.

The second gives me an archive, but not a stand-alone pdf file that I could, for example, e-mail to a friend.

Is there a way I can directly save a stand-alone pdf file from my browser into DT? Or is there a way that from the web archive in DT, I can extract the stand-alone file within DT?

Thanks!

One way is to save the PDF from Safari to a Finder folder, to which you have attached a Folder Action script to import added content to a DT Pro/Office database. Such a Folder Action script is contained in the Extras folder, within the download disk image.

DEVONagent allows one to capture PDF directly to a database.

But here’s what I do. I’ve bookmarked the URLs of the journals, governmental sites and news sources that I routinely visit, in a well-organized Bookmarks group in a DEVONthink Pro Office database.

The browser built into DEVONthink 2 will open to the desired site when I select that bookmark (double-click on the bookmark to open it in a new window). In the table of contents of the current online issue of Science Magazine I will Command-click on the link to each article that I may want to consider for capture into my database, and the linked article will download into a new Tab in the browser. If I wish to capture the article as PDF, I’ll click on the PDF link.

I’ll then examine the downloaded articles in each Tab. If I wish to capture a PDF, I Control-click in the window and choose the contextual menu option, “Capture as PDF”. That’s all there is to that. The PDF is now in my database. This one is “Astronomy’s Greatest Hits” displaying advances in astronomy from 1609 to 2009. Now I’ll close that Tab and take a look at the next article I’ve identified for possible capture from this week’s issue.

Actually, I usually capture articles from Science Magazine as rich text from the HTML view, for other reasons.

Between the Services options and the contextual menu options, I have a much wider and very convenient choice of options within the DEVONthink 2 browser for capture to my database; plain or rich text, PDF, HTML and WebArchive, all available with a keyboard shortcut or the choice among contextual menu options.

Thanks! The folder action does the trick, but I would like to be able to browse within DTPO as you describe.

However, I couldn’t replicate what you describe within DTPO2: whenever I ctrl-click on the pdf in the browser, the contextual menu only includes actions for manipulating the pdf.

I’m attaching a screen shot of what this looks like. What am I doing wrong?
pdf trouble.tiff (561 KB)

I was about to post something similar. I’ve tried the “Capture Current Page as PDF”, but of course that also grabs the surrounding webpage (see attached image).
dt.tiff (369 KB)

I was viewing a PDF file in DTPO2’s built in Web browser, having Command-clicked on a link to a PDF article from an HTML table of contents page, so that the PDF was available for viewing/capture in a new browser Tab. When I clicked on the Tab holding the PDF, I could then Control-click on the PDF and choose the contextual menu option to capture it as PDF to my database.

The image below shows a portion of a Three Panes view of my Bookmarks/ Scientific Journals group. I’ve selected, in the top pane, the bookmark to Science Magazine. The first Tab in the browser window below contains the table of contents of the current issue. I had Command-clicked a link to the PDF version of an article, which was downloaded in the second Tab. I clicked on the second Tab to view the PDF, decided to capture it, Control-clicked within the displayed PDF and selected the option, “Capture PDF”. Now that PDF is in my database.

If I ctrl-click inside the PDF, I also only get the PDF options. The only way I have found to directly save the PDF is by right-clicking on the link itself, in Google for example, and using “Capture Link”.

Hi Bill,

I followed the same workflow as you (using tabs from bookmarked sites) but got the same result as before.

The difference I see between my screenshot and yours is that my pdf has an Acrobat frame around it with command icons for sizing, searching, etc. For some reason yours does not have this frame. Could that be why we are getting different contextual menus (my contextual menu is shown in the screen shot)?

Thanks!
Ryan

sgmiller:

When you ctrl-click to capture the link, where does the pdf file appear in DTPO?

Ryan (and hauntedtapedeck), after looking more closely at the images you attached I see the difference.

Unlike the PDFs I’ve been capturing as described above from Science Magazine, some other online journals and a number of governmental sites (US and EU) that make PDF files directly available to a Web browser, the pages you downloaded, e.g. from sagepub, are not themselves PDF files, but allow one to read, print or save a file. That’s different, and there’s the rub; that’s why the contextual menu option to Capture PDF isn’t available on those pages. I assume access is through a university or library portal, rather than direct subscription to a particular online journal.

The Capture PDF CM option will also create a non-paginated PDF from most HTML pages.

The sagepub page requires a subscription to login. So does Science Magazine, and subscribers can view and download articles present and past. But the sagepub page apparently includes additional protection of its publication rights.

That’s rather like reports produced by the US National Academy of Sciences. These publications can be read online at no cost, but using a very clumsy interface. Yes, these are actually PDF documents, but not presented as such for free. Want the downloadable PDF? Buy it. Yes, income from these publications partially funds operations. But as these publications are often important to national and international policy issues, I’m pushing for free availability. (Once in a while I stop whining and buy one, anyway.)

Anyway, it seems that to capture such journal articles you are back to Safari, and to a two-step procedure (Save As to the Finder, then import to the database) or a one-step procedure (Save As to a Finder folder to which a Folder Action script is attached, which will import added files to the database).

There is a misunderstanding here. If you have the Acrobat plug-in in the Internet Plug-Ins folder, either in the system Library or in your Home Library, you do not get what Bill is describing. The Acrobat plug-in does not have the option, in its contextual menu, to capture the PDF!!! This is the problem. If you remove the plug-in you will be able to capture the PDF. But personally I like the Acrobat plug-in. So no solution!!!
THANKS

Dumb me never thought about removing the Acrobat plugin! I never use the added functions, so being able to save the PDFs directly from DT is a big bonus.

Bill,

Thanks for all your help. The bottom line is, I access journals through a wide variety of portals, and each has its own arbitrary-seeming way of offering pdfs for download, making it difficult to formulate a universal method of grabbing them (outside of the latter two you mention). I have been using a folder with an action attached, the only problem being that it’s pulling all pdfs, academic or otherwise, into DT. No way round that right now.

However, I have an additional query, alluded to earlier in the thread: when I capture pdfs in this way (i.e., browsing the portal in DT, then pulling a pdf into the DT database), the software seems to send it to a folder – which I assume is chosen according to the Auto Classify filters – rather than Inbox, where I would expect it to go. Furthermore, I can’t move the pdf that results anywhere, via drag/drop or via the right-click menu.

There are some default groups in DT2 databases, one of which is “All PDF Documents”. But that’s a different kind of group. No matter how hard you try, you cannot directly move a PDF out of it, just as the Coyote could never catch the Roadrunner. The reason is that this is a smart group that lists all of the PDFs in that database. The PDF isn’t “really” in that group, it’s “really” somewhere else in the database. So long as the PDF exists anywhere in the database, it will show up in All PDF Documents. It can’t be dragged out of there, by definition. No matter what cutting instruments Coyote buys from Ajax, the imaginary rubber band that ties the PDF to that smart group cannot be cut.

Where is that PDF? The Reveal command (Command-R) can solve the mystery, with one of two possible results. Select a PDF listed in All PDF Documents and press Command-R. 1) If that PDF is located in the Inbox or in any group created by the user, Reveal will show that location; mystery solved. 2) If that PDF is unclassified (residing in the top level of the database) and the Reveal command merely points back to the All PDF Documents smart group, I know that a) the Three Panes view is being used and b) that PDF is unclassified.

To handle case 2)b definitively, I can switch to the Split view (which does display unclassified documents in the top level of a view) and once more select the PDF in All PDF Documents and invoke the Reveal command. It works — there’s the location of the PDF! Or, without leaving the Three Panes view, I could Command-click on any selected group, so that unclassified items at the top level appear in the upper pane — the PDF is one of those unclassified documents.

Similar logical smart groups are Today and Yesterday. If I see a document in one of those groups, I cannot move it out, but I can find its location as above.
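For anyone who finds the smart group idea easier to see in code, here is a rough toy model of it in Python. It is only an illustration of the concept described above, not DEVONthink's implementation or its scripting interface; the names Document, SmartGroup and reveal are made up for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Document:
    name: str
    kind: str      # e.g. "pdf" or "rtf"
    location: str  # the group that really holds the document


@dataclass
class SmartGroup:
    """Hypothetical model: a smart group is a saved query, not a container."""
    name: str
    matches: Callable[[Document], bool]

    def contents(self, database: List[Document]) -> List[Document]:
        # Recomputed on every call: nothing is ever stored "in" the smart group.
        return [doc for doc in database if self.matches(doc)]


def reveal(doc: Document) -> str:
    """Toy stand-in for Reveal (Command-R): report the real location."""
    return doc.location


database = [
    Document("Astronomy's Greatest Hits", "pdf", "Inbox"),
    Document("Meeting notes", "rtf", "Projects"),
]
all_pdfs = SmartGroup("All PDF Documents", lambda doc: doc.kind == "pdf")

paper = all_pdfs.contents(database)[0]
paper.location = "Scientific Journals"  # moving the real document works
still_listed = paper in all_pdfs.contents(database)  # and it is still listed
```

The PDF can be moved between real groups as often as you like, but as long as it matches the query it keeps showing up in the smart group, which is why dragging it "out" of All PDF Documents, Today or Yesterday has no meaning.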

OK, all that said, sometimes a PDF captured from a browser view will be sent to Inbox, and sometimes not. If I click on a bookmark displayed in the top pane of the Three Panes view, the Web page will be displayed in the bottom right pane. If in that mode I use a CM option to capture content from the Web page, the capture will be made to the group in which that bookmark is contained.

But if, when launching the page from the bookmark document, I double-click on the bookmark to open the browser in its own window, CM captures made in that mode will be sent to the Inbox.

In both modes of Web page display, captures made via Services will be sent to the Inbox.
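The three cases above can be summed up as a small decision function. This is just a sketch of the behavior as described in this post, written in Python; the function name and the "pane"/"window" and "contextual_menu"/"services" labels are my own shorthand, not anything DEVONthink exposes.

```python
def capture_destination(display: str, method: str, bookmark_group: str) -> str:
    """Return where a capture lands, following the rules described above.

    display: "pane" (bookmark shown in the Three Panes view) or
             "window" (bookmark double-clicked into its own browser window).
    method:  "contextual_menu" or "services".
    """
    if display not in ("pane", "window"):
        raise ValueError(f"unknown display mode: {display!r}")
    if method == "services":
        # Services captures go to the Inbox in both display modes.
        return "Inbox"
    if method == "contextual_menu":
        # Contextual menu captures follow the bookmark's group only in the pane.
        return bookmark_group if display == "pane" else "Inbox"
    raise ValueError(f"unknown capture method: {method!r}")
```

For example, capture_destination("pane", "contextual_menu", "Scientific Journals") gives "Scientific Journals", while the same capture from a separate browser window gives "Inbox".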

Is this anomalous behavior in location of captured data, or useful, predictable behavior? Is it good UI or bad UI? I haven’t quite decided, as sometimes I find it useful. But it can clearly puzzle and irritate a user who isn’t aware of these possibilities.

Well, turns out I was being silly, and trying to move it while looking at it in “Today”; smart group, dumb user :wink:

The thing is that the place DT had filed the pdf was just plain odd: 4 folders deep, amongst some documents I had filed away as ‘archived’. I can’t really see any rhyme or reason to it being there. Nevertheless, I know now that I can always turn to Cmd-R in such instances.

Thanks again, Bill. Your posts always deliver much more than just a straightforward answer to the question at hand.

While Bill’s workflow is a viable option here, it seems to be a case of modifying behavior to fit the system, not the system facilitating optimal usability.

I often receive a link to a pdf via email. As I don’t have DT set up as my default browser, this means I start reading the pdf in Safari. If I decide it’s worth saving, my current workflow is:

  1. Hover mouse over pdf to bring up the options. Select option to open in external viewer
  2. In external viewer (for me, Skim), select the pdf icon in the menu bar with mouse. Pause momentarily (to avoid dragging the whole window), then drag the pdf onto the DT Dock icon

This is still one step too complicated. Safari may not allow you to drag a pdf document straight from it, but you can drag a URI. It would be great if DT would give you the option to save this dragged URI as a pdf, not a bookmark…

It occurred to me that I may as well just AppleScript my desired behavior—which led to the [re?]discovery of the “Add PDF document to DEVONthink” script, which works like a charm.

Yes, that script will work (DT Pro and DT Pro Office only). Here’s still another way:

If you are using Safari and are viewing a PDF, you can use the File > Save As command to save that PDF directly to the Inbox location in the ‘Places’ section of the left column of a Finder window. With one step, the PDF is now available in your Global Inbox (DT Pro and DT Pro Office only).

If the Inbox ‘Place’ isn’t currently visible in a Finder window and you are running under Snow Leopard, you can go to ~/Library/Application Support/DEVONthink Pro 2/ and drag the folder named ‘Inbox’ under ‘PLACES’ in the left column of the Finder window.
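Since the Inbox is an ordinary folder, a script can drop files into it as well. Below is a minimal Python sketch that copies an already-downloaded PDF into that folder. It assumes the folder path quoted above is correct on your machine and that anything placed there is picked up by the Global Inbox as described; the function name copy_to_inbox is made up for this example.

```python
import shutil
from pathlib import Path

# Folder path as quoted in the post above; it may differ on other setups.
DEFAULT_INBOX = (
    Path.home() / "Library" / "Application Support" / "DEVONthink Pro 2" / "Inbox"
)


def copy_to_inbox(source, inbox=DEFAULT_INBOX) -> Path:
    """Copy one file into the Inbox folder and return the new path."""
    source = Path(source)
    inbox = Path(inbox)
    if not source.is_file():
        raise FileNotFoundError(source)
    if not inbox.is_dir():
        raise NotADirectoryError(inbox)
    target = inbox / source.name
    if target.exists():
        # Refuse to overwrite a file that has not been picked up yet.
        raise FileExistsError(target)
    return Path(shutil.copy2(source, target))
```

Calling copy_to_inbox(Path.home() / "Downloads" / "article.pdf") would then be the scripted equivalent of choosing the Inbox ‘Place’ in a Save As dialog (the file name here is only an example).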

Some additional functionalities: Now, if you are viewing a Web page (not PDF) in Safari you can use File > Save As to save the page as a WebArchive directly to the Inbox ‘Place’. It will then be sent to your Global Inbox.

From almost any other application that has a Save or Save As command (Word, Pages, Excel, Numbers, Keynote, etc.), you can save a document directly to the Inbox ‘Place’ and it will be sent to your Global Inbox.

If you have imported a Word or other document already, and wish to edit it, you can click on the ‘Open Externally’ icon in the Toolbar, or Control-click on the document, choose ‘Open With’ and the desired application to open it (but don’t choose an application that would change the filetype). Make your changes to the document and choose File > Save. The changes will be sent directly to the document stored in your database and you will be able to view the changes within your database. (Don’t use File > Save As in this case, unless you wish to navigate to the ‘Inbox Place’ and save a new version of that file to your Global Inbox.)

Good point, Bill. Saving to the inbox is a great option, though I often prefer to drag directly to DT so I can use the destination selector to place it right away. (This wouldn’t be important if I were more disciplined about clearing my inbox.)

Also, I should mention that the script is better than dragging from Skim because it preserves the URI of the document.

This is getting off topic, but since Bill stated this in his reply:

Opening an MS Office file, editing it and saving it is not working for me. It does not save back to the original location, so when I reopen later my changes are not there.

Example: I have a note, with some text I typed, and then I dragged in an Excel file. According to “Info” on the note, the note is stored as “./rtfd/0/Curriculum database fixes & updates needed - punch list.rtfd”. I can’t find where this is in my file system, FWIW.

If I double-click on the Excel file, it opens in Excel, and I can edit it. But according to File > Properties (in Excel), it is stored in ~/Library/Caches/TemporaryItems/DEVONThink Pro.

If I click “Save As” the default location is my Desktop.

I have upgraded to DT2 (Pro Office).

This is quite inconvenient – I need to be able to edit docs I store in DT. Help!

Yes, you can import an Excel file into your database, open and edit it under Excel, press Save in Excel and see the changes in your database.

Your problem was that you apparently incorporated an Excel file as an attachment within a rich text note. An attachment of that kind cannot be modified as I described. Instead, import the actual Excel .xls file into your database.

Select the Excel document in your database view list. Control-click (right-click) and choose ‘Open With’. Choose Excel to open the file. Edit the file in Excel and, when finished, press ‘Save’ ( NOT ‘Save As’). Note: Another way to open the Excel file under Excel is to select it, then click on the ‘Open Externally’ icon in the Toolbar (the default in this case will be Excel).

Now look at that Excel document in your database. The changes you just made are there.

DT Pro/Office 2 users can save files directly to their Global Inbox from almost any application that has the ‘Save’ or ‘Save As’ command, by choosing the Inbox ‘Place’ in the left column of a Finder window. Example: You are viewing a Web page in Safari or Firefox. Choose ‘Save As’, choose ‘Inbox’ as the destination, and the page is saved as a WebArchive in your Global Inbox. Or if you are viewing a PDF in your browser, the same procedure will save the PDF to your Global Inbox. Created a new Excel sheet? Choose the .xls file format, then Save it directly to your Global Inbox by saving it to ‘Inbox’.

As I work with self-contained (Import-captured) databases, I rarely bother to save files externally to my databases. For example, I’ve got templates for Pages, Numbers, Word and some other filetypes that I frequently use that can be called up by Data > New with Templates. If I select a Pages template, a new Pages document appears in my database ready for editing. Select it, click on the ‘Open Externally’ icon in the Toolbar and it opens under Pages. When I’m finished writing I invoke ‘Save’ in Pages and the document is saved back into my database.