Questions on possible use-case in academic environment (law)

Hi,

As a non-user who just happened to stumble across this software (but having read up on it by now) I would love to have your feedback/expectations on whether DT would be able to do what I intend… So the question is not how it is done, but whether you would expect it to be possible (easily).

Situation: I am the editor of a law journal. I have the last 20+ years of issues as pdf-files (10 per year, one per paper-issue (which is approx. 44 printed pages with approx. 250,000 characters per issue, practically no pictures)) and consider getting DT to make better use of the knowledge contained therein.

Intention 1: use DT for cross-referencing.
Imagine a new decision is handed down by a court. I am working on the text to publish it in an upcoming issue and wish to add references to cases as they were published in my journal (eg a judgment from 2022 references a decision from 2007 and I want to say “that was published in my journal in 2008, page 27”). In order to do that I would want to search the pdf files (without splitting them up on a text-by-text basis) for the case reference.

Intention 2: research/writing.
I want to write an article and wish to dig through the existing pdf files to come up with relevant information using Boolean operations and things like NEAR. For this I added search results from DevonAgent to the database. Would DT cross-reference DA search results to contents of the 20+ years of pdf?

Intention 3: own database.
There are numerous accessible databases which DA is not able to crawl (login, language, silly interface etc…). Imagine I download all documents from the database that contain a reference to section 1 Sales of Goods Act - say 200 texts in HTML. Next I download all that reference section 2 Sales of Goods Act - say 180 texts in HTML. Next I … and so on. There will be hundreds of duplicates, because if the text says “section 1 and 2” the database will spit it out under both of the first two searches. Add that I will search different databases, so the various texts will not be exactly alike (one database will abbreviate the decision, one will provide it “as is” and the third will add a comment to it). Will DT mark/highlight/notify such duplicates and provide a feature that allows me to get rid of all but one?

Thanks in advance for your thoughts!
Tube

Hi @Tube

I think DT should do most of what you are looking for.

I use DT to hold working copies of legislation, case law, legal textbooks and journal articles which I then extract references from for submissions and advices.

Here’s an example of a case search (which may be slightly different to what you want):

The search terms were “stevedores NEAR contract”.

It found 8 cases and 6 textbooks/articles in 0.37 seconds (not all are in the screenshot above). I must say DT’s search is really, really good.

I input the custom meta data with the case citation etc manually for each case (I try to do it each time I import a case into my database) but once it is in, you can use a script to automatically convert it to a case citation whenever you need to reference it (with or without a link back to the relevant page of the PDF). Here’s an example of a case citation that is created by the script I adapted from another user’s post when the pdf is selected (it’s triggered by clicking the “Cit” icon you can see in the menu bar of the screenshot above or by a Keyboard Maestro key trigger):

Scruttons Ltd v Midland Silicones Ltd [1962] AC 446

The link is to my reference database, so obviously won’t work for you if you click on it. When I click on it, it takes me to the page I was looking at when I triggered the citation link:

It’s saved to the clipboard so you can paste it into whatever you’re working on.

I think you’ll find that once you have the cases in your system and set up how you want them, a lot of other useful things will become apparent. For example, I use a script to copy and paste lists of selected case names with citations for lists of authorities for court. Wiki linking is handy for your case notes etc.

My experience has been that DT helps by highlighting suspected duplicates but can’t always get it 100% right, so you have to check before you start deleting.

FWIW I have about 11,000 pdf files in my reference database and about 4,700 markdown notes cross-referenced to the source texts.

There’s a learning curve, but it’s well worth it. :slightly_smiling_face:

Hi @stephenjw

a huge thanks for taking the time to answer this so thoroughly (extra points for B/L :wink: )!

I grasp what you say with the automated citation, but that will be some meters up the learning curve (which I can see from here …).

The case file reference should pinpoint the duplicates, so weeding them out should be doable, but knowing that most can be removed easily is a great asset.

Thanks and a great weekend
Tube

A pleasure, you too.

Best of luck with working out your setup. Let us know how you get on. :slightly_smiling_face:

Wow! I had no idea one could do this. I assume your script is customised to your particular fields but wonder if you would be prepared to share it or some explanation of how you worked it out please?

Hi,

A brief update on the process: I have thrown a lot (at least it feels like a lot) of data at DT and so far it works fine: the search function does what it is supposed to do and the duplicate finder found duplicates with few mistakes (14 out of 1,600 or so). I deleted them by hand because of trust issues regarding the automatic deletion function…

Next step: will add publication dates (date tags, I guess) to each issue of the journal so that I can sort search results by publication date, rather than the name/date of the pdf file. That way I can jump to the actual/original case rather than to a later case that only quotes/references the actual/original.

Come to think of it: it might in the end be easier to add the case file reference/date of decision plus a tag like “actual” to each of the published decisions in the metadata. Then searching for the case file reference + “actual”-tag should give me just that one result. Ah, something for next weekend. Anyone got an idea whether that could be automated?

Learning curve here I come …

Kind regards
Tube

Hi @wrothnie

Sure - here’s the script:

-- produces list of case names with citations for selected authorities
-- assumes custom metadata fields "Citation" for reported citation (eg [2019] 4 All ER 745) and "Neutcite" for neutral citation (eg [2017] UKSC 59)
-- assumes empty custom metadata fields for Citation & Neutcite have been prepopulated with underscore (ie "_")

tell application id "DNtp"
	try
		set the clipboard to ""
		
		set theSelection to the selection
		
		repeat with thisRecord in theSelection
			set caseName to the name without extension of thisRecord
			
			set theURL to reference URL of thisRecord
			
			set customMD to custom meta data of thisRecord
			-- get reported citation if available
			set mdCitation to (mdCitation of customMD)
			if mdCitation is "_" then
				set reportedCitation to ""
			else
				set reportedCitation to mdCitation
			end if
			-- get neutral citation if available
			set mdNeutcite to (mdNeutcite of customMD)
			if mdNeutcite is "_" then
				set neutralCitation to ""
			else
				set neutralCitation to mdNeutcite
			end if
			
			-- set result depending on whether values empty
			-- (a) case name
			if reportedCitation is "" and neutralCitation is "" then
				set theCitation to ""
				display alert "DEVONthink" message "No citation or neutral citation for " & caseName & " in Devonthink" as informational
				-- (b) case name + reportedCitation
			else if reportedCitation is not "" and neutralCitation is "" then
				set theCitation to reportedCitation
				-- (c) case name + neutralCitation
			else if reportedCitation is "" and neutralCitation is not "" then
				set theCitation to neutralCitation
				-- (d) case name + reportedCitation + neutralCitation
			else if reportedCitation is not "" and neutralCitation is not "" then
				set theCitation to reportedCitation & "; " & neutralCitation
			end if
			
			-- display alert "DEVONthink" message "Citation copied with link for " & caseName & " in clipboard" as informational
			
			set the clipboard to (the clipboard) & "[" & caseName & "](" & theURL & ")" & " " & theCitation
			
		end repeat
		
	on error error_message number error_number
		if the error_number is not -128 then display alert "DEVONthink" message error_message as warning
	end try
	
end tell

Here’s the previous post I adapted it from (thanks go to @bws950 & @pete31).

Caveat: I’m not very good at scripting - bluefrog would probably be able to do something in 3 lines which does all the above and pete31’s would additionally draft the legal submission and lodge a notice of appeal. :stuck_out_tongue_winking_eye:

The reason for the “if … mdCitation is "_"” check is so that the script returns only the neutral citation if the case hasn’t been reported.

bws950’s version is a bit more sophisticated in that it has separate metadata fields for the court and first page, whereas mine lumps them into the one field, so that I can copy and paste it in one hit. You could probably do a smart rule with a regex search on the content (a bit like @chrillek’s date regex (which is very cool)) to pull that from the documents and automatically put it in the citation fields but it would likely be a bit hit and miss with some of the poorer quality older scanned reports.
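A rough sketch of the regex idea, under stated assumptions: the handler below shells out to grep and returns the first neutral citation it finds in a block of text. The handler name firstNeutralCitation and the UKSC-only pattern are hypothetical and only for illustration; other courts and reported citations would need their own patterns, and very long documents could exceed the shell’s argument-length limit.

```applescript
-- Hypothetical helper: returns the first UKSC neutral citation found in theText, or "" if there is none
on firstNeutralCitation(theText)
	-- matches eg [2017] UKSC 59
	set thePattern to "\\[[0-9]{4}\\] UKSC [0-9]+"
	-- piping through head means the pipeline still succeeds when grep finds nothing
	set theCommand to "printf '%s' " & quoted form of theText & " | grep -Eo " & quoted form of thePattern & " | head -n 1"
	return do shell script theCommand
end firstNeutralCitation

firstNeutralCitation("see [2017] UKSC 59 at para 12")
```

In DEVONthink the input would presumably be the plain text of the record, and the result could then be written to the citation field, but that part is left out here because it depends on how the fields are set up.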

If you want to have a version which copies the citation without the link (so that you don’t have the link cluttering the citation when you are pasting into a Word document, for example), you can change “set the clipboard to (the clipboard) & "[" & caseName & "](" & theURL & ")" & " " & theCitation” to

set the clipboard to (the clipboard) & " " & caseName & " " & theCitation

A question for @bluefrog or pete31 which has been niggling at me for some time is whether there is a workaround to avoid the script failing if the custom metadata field is empty. My current workaround is to have a smart rule that pre-populates the custom metadata fields with an underscore on importing the file into my inbox, but if I didn’t have to do that it would save a step and simplify the script.
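One possible way around the empty-field failure, as a minimal sketch rather than a definitive answer: wrap each lookup in its own try block, so that a missing field yields an empty string instead of stopping the script. The handler name citationOrEmpty is hypothetical; the field name mdCitation is the one used in the script above.

```applescript
-- Hypothetical helper: read the Citation field, returning "" when the field (or all custom metadata) is missing
on citationOrEmpty(customMD)
	try
		set theValue to mdCitation of customMD
		if theValue is missing value then return ""
		return theValue as string
	on error
		-- the field was never set, so there is no mdCitation item to read
		return ""
	end try
end citationOrEmpty

citationOrEmpty({mdNeutcite:"[2017] UKSC 59"}) -- no mdCitation here, so the handler returns ""
```

Inside the tell block it would be called as “my citationOrEmpty(customMD)”, with a second handler of the same shape for mdNeutcite. With that in place the underscore pre-population and the “is "_"” checks should no longer be needed.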

Best of luck with it. :slightly_smiling_face:

Learning curve here I come …

Ha ha. Good job. It’s a long and winding road.

FWIW I do it by changing the creation date of the file to the date of the decision. The reason is so I can easily sort in chronological order in DTTG. I’ve never needed to know the date my copy of the decision was created.

You can do it from the inspector:

but there’s a bit of pointing and clicking involved which you can speed up with a script triggered by a keystroke. Here’s the script I’m using:

tell application id "DNtp"
	try
		set theRecords to the selected records
		if theRecords is {} then error "Please select some records."
		
		repeat with thisRecord in theRecords
			
			set currentName to name without extension of thisRecord
			set theCreationDate to creation date of thisRecord
			
			display dialog "Enter date of document in DDMMYYYY format for " & currentName & ". Creation date is currently " & theCreationDate default answer "" buttons {"Cancel", "OK"} default button 2
			set theDate to the text returned of the result
			
			set theYear to (characters -4 thru -1 in theDate) as string
			set theMonth to (characters -6 thru -5 in theDate) as string
			set theDay to (characters -8 thru -7 in theDate) as string
			
			try
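				-- set the day to 1 first so that changing the month cannot overflow (eg the 31st in a 30-day month)
				set day of theCreationDate to 1
				set year of theCreationDate to theYear as integer
				set month of theCreationDate to theMonth as integer
				set day of theCreationDate to theDay as integer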
				set hours of theCreationDate to 0
				set minutes of theCreationDate to 0
				set seconds of theCreationDate to 0
				set creation date of thisRecord to theCreationDate
			end try
			
			set the name of thisRecord to currentName & " " & theDay & "." & theMonth & "." & theYear
		end repeat
		
	on error error_message number error_number
		if the error_number is not -128 then
			display dialog error_message buttons {"OK"} default button 1
		end if
	end try
end tell

Again, in case it is of use, if you already have the document date in the file name you can automatically pull from the name (I often get sent briefs with files already named that way). Here’s a basic one I use when the file name structure is “Name DD.MM.YYYY” which can be tweaked for other formats (eg YYYY-MM-DD). You could devise one that covered various permutations in conjunction with a smart rule.

-- Set creation date based on name "NAME DD.MM.YYYY"
-- assumes the date is at the end of the file name

tell application id "DNtp"
	try
		set theRecords to selected records
		if theRecords = {} then error "Please select some records"
		show progress indicator "Setting creation dates... " steps (count theRecords) with cancel button
		
		repeat with thisRecord in theRecords
			set thisName to name without extension of thisRecord
			step progress indicator thisName
			set theCreationDate to creation date of thisRecord
			try
				set seconds of theCreationDate to 0
				set minutes of theCreationDate to 0
				set hours of theCreationDate to 0
				-- set the day to 1 first so that changing the month cannot overflow (eg the 31st in a 30-day month)
				set day of theCreationDate to 1
				set year of theCreationDate to ((characters -4 thru -1 in thisName) as string) as integer
				set month of theCreationDate to ((characters -7 thru -6 in thisName) as string) as integer
				set day of theCreationDate to ((characters -10 thru -9 in thisName) as string) as integer
				set creation date of thisRecord to theCreationDate
			end try
		end repeat
		
		hide progress indicator
		
	on error error_message number error_number
		hide progress indicator
		if the error_number is not -128 then display alert "DEVONthink" message error_message as warning
	end try
end tell
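The character offsets can be adapted to other layouts. As a sketch only, assuming the file name ends in YYYY-MM-DD, a handler along these lines builds the date; the name dateFromISOName is hypothetical.

```applescript
-- Hypothetical helper: build a date from a name ending in YYYY-MM-DD, starting from an existing date
on dateFromISOName(theName, baseDate)
	copy baseDate to newDate
	set day of newDate to 1 -- avoids overflow when the month changes (eg 31 Jan -> Feb)
	set year of newDate to (text -10 thru -7 of theName) as integer
	set month of newDate to (text -5 thru -4 of theName) as integer
	set day of newDate to (text -2 thru -1 of theName) as integer
	set time of newDate to 0 -- midnight
	return newDate
end dateFromISOName

dateFromISOName("Smith v Wilson 2008-03-27", current date)
```

Inside the repeat loop it could replace the individual “set … of theCreationDate” lines, eg “set creation date of thisRecord to my dateFromISOName(thisName, creation date of thisRecord)”. It does not check that the name really ends in a date, so it is best kept inside the existing try block.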

I can see the logic, though, of storing the publication date in a custom metadata field. I haven’t done it that way because AFAIK DTTG doesn’t really do custom metadata (yet).

searching for the case file reference + “actual”-tag should give me just that one result

Do you need to tag by “actual”? When I search by case name, DT always by some special kind of magic seems to put the original decision at the top of the search and the cases referring to it below, eg:

Great! Thank you.

The tip to change so that the cite doesn’t include the link to file is helpful since, as you guessed, yes I do usually put these into a Word document. (I do not want to imagine what would happen if I started sending law firms documents in markdown!)

I added a “tweak” to the no link version:

set the clipboard to (the clipboard) & " *" & caseName & "* " & theCitation

so that the case name appears in italics - in a markdown document which I can then convert to a Word doc - although it will probably be just as easy to select the pasted “no link” citation and apply the italics in Word.

It is interesting that you have a field “orig name” in the custom data when that needs to be the file’s title for the script to run.

Anyway, thank you once again!

A pleasure.

Yes - “markdown” → “meltdown”(?).

The linked version is for my markdown notes and outlines for oral submissions in DT - handy for when you are on your feet.

The “orig name” is really just a place to store the original file name as an undo/backup for when I invent a cunning and sophisticated smart rule which automatically renames all the file names in my brief to something indecipherable and incomprehensible. :face_with_open_eyes_and_hand_over_mouth:

Edit: Italicise the case name - good idea.

Ok, I realise that my data is not uniform. I have several hundred pdf-files each of which is an issue of the journal (let’s call one of them „journal“) and a bunch (several thousands) of standalone decisions (let’s call these „decisions“) plus other stuff.

Now, the script you kindly provided is really great for adding the date to the decisions -THANKS! - (and I even got it to work - and when I grow up I will even be able to add an icon to DT). For the decisions the file name contains the file reference (but not the date, because the file ref is unique). I think I will be able to adjust the script to take the file ref rather than the date and add it as a tag.

Now with the journals, I could either split them into 7-15 separate files to have a number of decisions, or
A) I could leave them as is and add tags indicating an „actual“ decision (to tell me it starts on page 42 of the pdf) or
B) I just add the month of the journal as a date for all journal files and then sort any search results by date to find the „actual“ (or at least the first reference).

In my mind, the work involved in adding a reference to a specific page of the journal (as you did in your original reply) was less than the splitting + referencing steps.

My (probably CURRENTLY very limited) wish/need is to add references to new cases before I publish them. So, if a court in 2022 writes

„We refer to decision X vs Y [case ref ] [citation not-my-journal]“

I want to edit the 2022 decision to read:

„We refer to decision X vs Y [case ref ] [citation not-my-journal] = [citation of-my-journal]“

So that readers can find the decision in my journal. My thinking was that adding references to the journal files would serve that purpose, and to get there I would only need the „actual“ tag without adding other metadata. The simple search (currently) just displays journals which have the case file ref in them without ordering them by date (which would be the pdf-file date and not the publishing date)…

It’s not as adult as you might think :wink:. I found the manual a bit delphic (could be my limitations, not the manual). You just need to put the script in the “Toolbar” subdirectory of the “com.devon-technologies.think3” subfolder in “Application Scripts”. If you have the script icon in your DT menu bar:

(1) open your script folder:

(2) put the script in the Toolbar s/d:

(3) quit & restart DT

(4) right click on the grey area of the top bar in DT:

(5) select “Customize Toolbar”

You’ll see a window with a series of icons, including one for the script you just moved or copied into the Toolbar s/d (in my case, the icon named “Test”):

(6) As the instructions say, you drag it onto your DT toolbar.

If it helps, DT takes some of the pain out of splitting files (as Jim pointed out to me in January).

Wouldn’t that no longer be a problem if you are either setting the creation date or a custom metadata date field to the date of the subsequent decision? You can just sort by ascending date. In my Smith v Wilson example above, DT displays all the subsequent cases referring to Smith v Wilson in ascending date order:

Anyway, I think I see your logic. I guess the options are tagging, custom metadata, maybe a smart rule. If it would be useful for you to have an intermediate “bridging” linking note, you might find the Annotations inspector useful. I use it to make notes about cases with links to subsequent decisions to map lines of authority for a particular proposition.

Hi,

once again, thanks for the help!

For the citation issue I found an amazingly low-tech solution: Scan all yearly indexes of the journal, apply ABBYY to the scan and - presto - I have a searchable list of all decisions with only one hit per case :slight_smile:

That file might also be a perfect bridging note for the cases - quite an intriguing idea, thanks!

Wow - the PDF splitting/extraction feature is really nice!

As for the other issue: I fiddled around with smart rules and scripts only to find that I need more time to properly understand either…
