annotation tips? linking to specific place in file?

Hi, I am new to DTP and would like to use it for going through my research data. I have thousands of pages that I’ve already converted into PDFs and have broken down into groupings/subgroupings. Unfortunately, most of the files are PDF images of archival data that I took using a digital camera a few years back. The images are clear but not good enough for the OCR to render them into text (a BIG disappointment). So the AI features, at least for this particular project, are not very useful to me.

I would like to use DT to annotate the PDFs so I can analyze the data across the different groups. I saw the comments field and know that it’s readable, but I’m not sure that it’s the best for me.

Here’s what I’d like to be able to do:
** take notes/annotations while having the PDF open (so I would be able to see both windows at once).
** have the annotations automatically linked to the PDF (or other media file) that they’re based on.
** have the annotations file be searchable

I see using DT as primarily a place to really dig into the data - and then be able to export the notes/annotation to Scrivener for writing.

I’m quite new to DT and haven’t really had a chance to learn all the features even though I did go through some of the tutorials. So any tips or suggestions from others would be greatly appreciated.

One final question:

Is it possible to have a link in DT (e.g. in an RTF file) that points directly to a specific spot in another document (either housed internally in DT or externally in another folder on the hard drive)?
Specifically, I have to work with a lot of policy documents, and rather than placing a link to the entire document in an RTF file I’m working on in DT, I would like to have a link to the specific paragraph in the policy document that is relevant to the point I’m working on. (In this case, the document would be text based.)

Thank you!!

Gina

Hi, Gina. Yes, one can place PDF and note windows side by side for notetaking. Although hyperlinking cannot be enabled from a PDF document, one can hyperlink from a rich text note to a PDF or any other type of document.

A hyperlink can only target the Name of a document, not a text string inside the document.

You noted you would like to have your note let you jump directly to a certain point inside the text of a text document. That can be done, although not with a direct link.

The trick is to set up a text string in your rich text note that will correspond exactly to a text string at the point of your reference text document to which you wish to jump. Think of this as similar to folding down a corner of a page in a book, or inserting a bookmark ribbon, except that the marker exists both in your rich text note and in your reference text document.

  1. Phrase marker: One way is to select and copy to the clipboard a string of text at the place in the reference text to which you want to jump. Paste that string into your rich text note, perhaps inside characters such as carets that will let you recognize it as a marker string, along with a note that refers to its reference point, e.g., Aldous Huxley, Eyeless in Gaza, p. 124. Now, to jump from your note to that place in the reference, select that marker string inside the carets. Press the Services > Lookup shortcut (Command-/) to open a new Search window with the bookmark string already inserted. Now set the Search Operator to Phrase. Press Return. Select your reference document in the results list. It will open to the highlighted bookmark (assuming your marker was unique), or you can use Find to jump to the next occurrence (the text string is already entered in Find). If you wish to open the reference document in its own window, use Find to jump to your bookmark place.

Tip: As the search for a phrase doesn’t require modification of the referred to document, this will also work for PDF, HTML and WebArchive documents.

  2. Wildcard marker: This requires typing a unique string at the point in the reference document to which you wish to jump, and the string must always be delimited by asterisks. Example: *B Huxley Eyeless 141*. Select that string (including asterisks), copy it to the clipboard and paste it into your note document, along with a note to yourself that it refers to a bookmark to Aldous Huxley, Eyeless in Gaza, p. 141.

As you are now making changes to your reference document, you may wish to work on a duplicate copy. This also frees you to use highlighting and underlining to your heart’s content, without mutilating your original reference copy.

Just as in 1) above, to jump from your note to the desired place in the reference text select the Wildcard bookmark string, including the enclosing asterisks in this case. Press Command-/ to invoke Lookup. But for this procedure set Operators to Wildcard. Hit Return. Select the reference text from the Search result list. There you are! Note: The current release doesn’t highlight the search string when a Search result is viewed, but the next maintenance release will likely do so. The string will be highlighted in the text if Find is used.

Tip: As a Wildcards search of content is slower than a Phrase search, you might limit the scope of the search to the group in which the reference document is located.

Although Wildcards searches can’t be made on PDF, HTML or WebArchive documents because you can’t directly type the marker strings into those formats, those formats can be converted to plain or rich text using Data > Convert, so that marker strings can be inserted.
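If you create many wildcard markers, a small script can keep them consistent. The following is only a sketch: the handler name makeWildcardMarker and the "B author title page" layout are assumptions modeled on the example above, not anything built into DEVONthink. It builds the asterisk-delimited string and places it on the clipboard, ready to paste into both the reference document and the note.

```applescript
-- Hypothetical helper (not part of DEVONthink): builds a wildcard marker string
-- such as "*B Huxley Eyeless 141*" from an author, a short title and a page number.
on makeWildcardMarker(theAuthor, shortTitle, pageNumber)
	return "*B " & theAuthor & " " & shortTitle & " " & (pageNumber as text) & "*"
end makeWildcardMarker

-- Put the marker on the clipboard so it can be pasted into both documents
set the clipboard to makeWildcardMarker("Huxley", "Eyeless", 141)
```

Because the same string is pasted in both places, the marker in the note and the marker in the reference text cannot drift apart through typing errors.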

I think a much better option would be to use Skim. It offers a couple of different options for file output, but the default (I believe) is to save all annotations as a separate .skim file. This does have some potential drawbacks (addressed below), but the upshot is that annotations (text boxes, ‘anchored notes’, highlights, underlines) have a unique identity that can be referenced through AppleScript (e.g., note 2 of page 5 of document Some Document). Your question has prompted me to look into what should be the fairly trivial task of creating an AppleScript to work with these references. Also, Skim offers a list of annotations on the side of the document.

Dealing With Skim Files:
Dealing with 2 files instead of one can be a bit of a pain, but if you import the PDF into DTP first, it will simply create the file inside the DTP database package. Off the top of my head, I am not sure exactly how the linkage between the PDF and the .skim annotation file is handled, but it is fairly robust. I use this method frequently and have had no problems exporting a PDF from my DTP database, opening the file in Skim, and having Skim read the annotation file in the database. If you modify the annotations and save, a new annotation file is created in the same directory as the exported PDF. (Skim can also save files as bundles that include the PDF and annotations in multiple formats; while only Skim and possibly BibDesk recognize them, they are essentially folders containing the PDF and annotations.)

Thanks Bill and Matty. I appreciate the tips. I’ve played around with Skim and like it a lot - still need to figure out the most effective workflow for me though.

Some quick follow-up questions - Matty, can you please tell me how I would use AppleScript to reference the unique identity of a highlighted text, for example? Do you mean to say that I would be able to set up a link or some other kind of pointer to this text string in another document - for example Scrivener or Mellel?

Also, I’m working with quite a few documents that are basically JPEGs converted into PDFs. I’ve tried to OCR them with Devon, Adobe 8 Pro, etc. and it’s pretty hopeless. I don’t mind reading from the JPEG PDFs, but some of them have quite a few pages and I would like to cut/copy & paste or annotate parts of them. I’m assuming that I could add notes in Skim - but would these notes also have a unique identity?

For DT - does DTP have some kind of typewriter-like tool like the one Adobe has, or any other kind, for working with PDFs that are not text searchable?

Finally - a very newbie question. How exactly do I get a note window in DT so I can take notes side by side with the PDF? I’m almost embarrassed to ask as I’m sure this is something pretty straightforward and probably staring me in the face, but for the life of me, I can’t seem to do it.

Thanks!!!

Gina

@yadayoda

Were I not dealing with a minor crisis with my thesis work right now, I would provide a more developed solution, but here are the portions that I already have:

I like to do my annotations right in Skim, using ‘text notes’ if the annotations are of reasonable length and the PDF has decent margins, or using ‘anchored notes’ if such is not the case. This way, although you may (if you save as PDF w/ separate .skim notes) still have 2 files, I feel that there is a ‘tighter’ connection between them than if you used an RTF, for example, with ‘links’ back to ‘locations’ in Skim.

I use OmniOutliner Professional to handle passages from Skim, though I would imagine DTP’s sheets would also work, assuming that they are sufficiently scriptable (I assume they are, but haven’t checked). Set up some columns to handle document location (unfortunately, OO doesn’t, as far as I can tell, support link creation through AppleScript), page number, note number, cite key (if you are doing academic work), and, of course, one for the actual annotation.

Then use the following snippet to get the relevant info from Skim. It processes either the selected note, or if you have simply selected some text, grabs that selected text (this would be useful if you wanted to have the script go ahead and create a note for that selection; unfortunately, it seems that the AS implementation of note creation is broken right now). I also have included a reference to a subroutine that is commented out. This would process the highlighted text, for example, creating one paragraph from each of the lines.

Unfortunately, I have not been able to actually extract the unique identity of a note through applescript (looks like I was wrong about it being trivial). The note does have one (note 3 of page 4 of document 1), but the closest I can get is the page number.

tell application "Skim"
	try
		-- Basic error handling
		if (count of documents) is 0 then
			error "No documents found."
		end if
		
		--Setting basic properties
		tell document 1
			set theName to name
			set thePath to path
			--set theFile to file
			set theSelection to {a reference to selection} -- Reference to selection
		end tell
		
		--Determining if the selection is text or a note and getting content accordingly
		if (length of (get selection of document 1)) is 0 then
			
			--If selection is a note (or null)
			set theNote to active note of document 1
			if theNote is missing value then error "Nothing selected." --Null selection: stop here instead of continuing without a note
			set thePage to the last item of (get pages for theNote)
			
			--Determining note type and getting appropriate information
			if ((type of theNote) is anchored note) then
				set theText to (get text for theNote) & ":" & return & (get extended text of theNote)
			else
				set theText to get text for theNote
			end if
		else
			--The selection is not a note, but some text
			set thePage to the last item of (get pages for theSelection)
			set theText to get text for theSelection -- Text of selection
			--set theText to my processText(theText)	
		end if
		
		set pageNumber to label of thePage -- Page number
		
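	on error error_message number error_number
		-- Report the problem instead of failing silently (-128 is "user canceled")
		if error_number is not -128 then display alert "Skim" message error_message as warning
	end try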
end tell

The script returns:

theName: Document name
thePath: Document path
theText: Text of selection (either the note or the selected)
pageNumber: Page number

Later today, I will include a link to a service I made with ThisService and an OOP template for how I work with Skim. If you show package contents on the service you will basically find the above script. It only works on a selected note, not selected text, however.

I was finally able to implement something close to the system I had in mind. This script takes either a selected note or a text selection from Skim (if it is a text selection, it also creates a highlight note) and creates a form in a DTP sheet with the selected text or the content of the note, the page number, and the bounds of the note.

--SkimToDTP
--Matt Gacy
--v 1.0.2

property pDestinationGroup : "Reading Notes"

tell application "Skim"
	try
		-- Basic error handling
		if (count of documents) is 0 then error "No documents found."
		
		--Setting basic properties
		set theDocument to front document
		tell theDocument
			set theName to name
			set thePath to path
			set theSelection to {a reference to selection} -- Reference to selection
			set thePage to current page
			set pageNumber to label of thePage
		end tell
		
		--Determining if the selection is text or a note and getting content accordingly	
		if (length of (get selection of document 1)) is greater than 0 then --Selection is text, create highlight note
			set theText to get text for theSelection
			set theText to my processText(theText)
			set theNote to make note with properties {type:highlight note, selection:theSelection} at end of notes of theDocument
		else
			--If there is a selection, it is a note
			set theNote to active note of theDocument
			if theNote is missing value then error "Nothing Selected"
			
			--Determining note type and getting appropriate information
			set theText to get text for theNote
			if ((type of theNote) is anchored note) then ¬
				set theText to theText & ":" & return & (get extended text of theNote) as text
		end if
		
		set noteBounds to get bounds for theNote --as string
		set noteBounds to noteBounds as string
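	on error error_message number error_number
		-- Report the problem and stop, so the DEVONthink block is not run with undefined variables
		if error_number is not -128 then display alert "Skim" message error_message as warning
		return
	end try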
end tell

tell application "DEVONthink Pro"
	try
		if not (exists current database) then error "Please open a database before using this script!"
		set incomingGroup to create location pDestinationGroup
		set refSheet to "Books"
		
		--Checking if refSheet exists, creating it if not
		if not (exists child refSheet of incomingGroup) then
			set refSheet to create record with {name:refSheet, type:sheet, columns:{"Text", "Page", "Bounds"}} in incomingGroup
		else
			set refSheet to child refSheet of incomingGroup
		end if
		
		create record with {name:theName, type:form, cells:{theText, pageNumber, noteBounds}, path:thePath} in refSheet
		
	on error error_message number error_number
		if the error_number is not -128 then display alert "DEVONthink Pro" message error_message as warning
	end try
end tell


on processText(theText) --v 1.1; This cleans up the text a little bit
	set theText to every paragraph of theText as styled text
	set AppleScript's text item delimiters to "- "
	set theText to text items of theText -- splits the text into a list at each "- " (a word hyphenated across a line break)
	set AppleScript's text item delimiters to ""
	set theText to theText as Unicode text -- rejoins list into string with empty delimiters
	
	--Removing unwanted characters at the end of the passage
	repeat while (last character of theText is " ") or (last character of theText is ".")
		set theText to characters 1 thru -2 of theText as styled text
	end repeat
	
	return theText
end processText
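To see what this kind of cleanup does in isolation, here is a standalone variant you can run in Script Editor without Skim or DTP. It is a sketch rather than the script above: the handler name cleanPassage is made up for illustration, it uses plain text instead of styled text, it restores the previous text item delimiters, and it guards against the passage becoming empty while trimming. It does not attempt the paragraph-joining step.

```applescript
-- Hypothetical standalone variant of the cleanup idea (not the original handler)
on cleanPassage(theText)
	set oldDelimiters to AppleScript's text item delimiters
	-- Rejoin words that were hyphenated across line breaks ("anno- tation" becomes "annotation")
	set AppleScript's text item delimiters to "- "
	set theParts to text items of theText
	set AppleScript's text item delimiters to ""
	set theText to theParts as text
	set AppleScript's text item delimiters to oldDelimiters
	
	-- Trim trailing spaces and periods, stopping safely if nothing is left
	repeat while (theText is not "") and ((theText ends with " ") or (theText ends with "."))
		if (length of theText) is 1 then
			set theText to ""
		else
			set theText to text 1 thru -2 of theText
		end if
	end repeat
	return theText
end cleanPassage

cleanPassage("anno- tation is use- ful. ") -- returns "annotation is useful"
```

Note that a genuine dash followed by a space (as in "pre- and post-war") would also be joined, so this is only suitable for text where "- " comes from line-break hyphenation.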

This is the beginning of a script that would take that information from the DTP form and go to the corresponding page. It still needs some work, though. I wasn’t quite sure of the best way to script its interaction with the sheet.

tell application "DEVONthink Pro"
	try
		if not (exists think window 1) then error "No window open."
		if not (exists content record) then error "No document selected."
		
		set thisRecord to content record
		if (thisRecord is missing value) or (type of thisRecord is not form) then error "Please select a form."
		
		tell thisRecord
			set thePath to path
			set theCells to cells
			set thePage to item 2 of theCells as integer
			set theBounds to item 3 of theCells as string
		end tell
		
	on error error_message number error_number
		if the error_number is not -128 then display alert "DEVONthink Pro" message error_message as warning
	end try
end tell

tell application "Skim"
	set theDoc to POSIX path of thePath
	open theDoc
	set theDocument to front document
	set notePage to page thePage of theDocument
	
	repeat with aNote in (get notes of notePage)
		set aBounds to bounds of aNote as string
		if aBounds is equal to theBounds then
			tell theDocument to go to aNote
			exit repeat
		end if
	end repeat
end tell

Unfortunately, its reliance upon note boundaries means that it won’t work if you move them, but it will at least still get you to the right page.
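One more caveat about matching on bounds, offered as a sketch rather than a tested fix: coercing a list of numbers directly to a string depends on whatever AppleScript's text item delimiters happen to be, and with empty delimiters the numbers simply run together, so different bounds could in principle produce the same string. A small helper (boundsToText is a hypothetical name, and the sample numbers are invented) that joins the values with an explicit comma avoids that, provided the same helper is used both when storing the bounds and when comparing them.

```applescript
-- Hypothetical helper: joins a bounds list such as {72, 500, 300, 520}
-- into "72,500,300,520" regardless of the current text item delimiters.
on boundsToText(theBounds)
	set oldDelimiters to AppleScript's text item delimiters
	set AppleScript's text item delimiters to ","
	set boundsText to theBounds as text
	set AppleScript's text item delimiters to oldDelimiters
	return boundsText
end boundsToText

boundsToText({72, 500, 300, 520}) -- returns "72,500,300,520"
```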

@ yadayoda

Gina, the trick is to open each document, the PDF and the note, in its own window. Even on my 13.3-inch ModBook screen I can place a narrow rich text note window alongside a PDF window with the PDF text at a readable magnification.

I give the note the same name as the PDF, plus a bit more, perhaps designating the page range of the PDF document to which the note refers. So if I’m making notes referring to a PDF named Harry, my RTF document would have the name Harry 13-15, cueing me that it refers to pages 13-15 in the Harry PDF. (I don’t display the file type suffix in the names of my database documents.)

Thus, while I’m looking at a PDF I can do a Lookup on its Name. That will open a Search window. For quick response I limit the search to Name. Now the results will list that PDF plus all my annotation notes for that PDF.
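The naming convention is simple enough to do by hand, but for anyone who wants to script it, here is a minimal sketch. The handler name annotationName is an assumption for illustration only; it just builds the name string and does not talk to DEVONthink.

```applescript
-- Hypothetical helper: builds a note name from the PDF's name and a page range,
-- e.g. "Harry 13-15", so that a Lookup on "Harry" finds the PDF and its notes.
on annotationName(pdfName, firstPage, lastPage)
	if firstPage is lastPage then return pdfName & " " & (firstPage as text)
	return pdfName & " " & (firstPage as text) & "-" & (lastPage as text)
end annotationName

annotationName("Harry", 13, 15) -- returns "Harry 13-15"
```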

And as I usually insert a hyperlink to the referred PDF in each note/annotation document, I can “jump” from those notes to the PDF.

True, this isn’t as elegant as Skim, but it has two current advantages for getting stuff done. 1) my notes are searchable in my database; 2) this approach works for all file types, including other text documents that I don’t want to mark up, HTML, WebArchive, Pages, Excel and so on.

Tip: When annotating or making notes about a document that’s open under its parent application, I use a floating DEVONnote window to make notes, then copy/paste that content into a new rich text document in my database. If I didn’t have DEVONnote, I could create a narrow TextEdit window alongside the Pages or Excel window into which notes can be entered, then copy/pasted into my database.