Scripting dictionary access to "Statistics"

I would find it useful if the “Statistics” available in File > Database Properties > [name of database] were exposed through properties in the scripting dictionary.

An alternative would be a “Save” action on the Properties sheet which would save the statistics as a sheet (.csv) in the incoming group.

It should already be possible to calculate most statistics (except unique words), for example…


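-- Tally the word count and the number of rich text records in the current database
tell application "DEVONthink Pro"
	set totalWords to 0
	set numRichTexts to 0
	set theContents to (contents of current database) as list
	repeat with theRecord in theContents
		-- Count rich text records (RTF and RTFD)
		if type of theRecord is rtf or type of theRecord is rtfd then set numRichTexts to numRichTexts + 1
		-- Words are tallied for every record, not only rich text
		set totalWords to totalWords + (word count of theRecord)
	end repeat
	-- Hand the two tallies back as the script result
	return {totalWords, numRichTexts}
end tell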

I realize that the brute force method of tallying all words is possible; I know that script. I’m looking for access to the tally that DEVONthink has in Database Properties. It is not efficient to do a brute force tally if the data in the property sheet is already available. Or does DEVONthink tally the statistics on the fly whenever Database Properties is viewed?

It does.

Well, then, here’s a brute force script to tally the record count and number of words in all open databases. It is useful when checking whether you are near (or have exceeded) the recommended size of open databases. This script is not guaranteed to work or to match the tallies in Database Properties. It takes a while to run, so be patient.

-- 20130610
-- Get the total number of records and words for all open databases

tell application id "com.devon-technologies.thinkpro2"
	set all_databases to every database
	set grandTotalWords to 0
	set grandTotalRecords to 0
	set theDatabases to ""
	repeat with this_database in all_databases
		set totalWords to 0
		set recordCount to 0
		set theContents to (contents of this_database) as list
		repeat with theRecord in theContents
			set totalWords to totalWords + (word count of theRecord)
			set recordCount to recordCount + 1
		end repeat
		set theDatabases to theDatabases & (the name of this_database) & " (Records: " & (my comma_delimit(recordCount)) & ") -- Words: " & (my comma_delimit(totalWords)) & return
		set grandTotalWords to grandTotalWords + totalWords
		set grandTotalRecords to grandTotalRecords + recordCount
	end repeat
	set theText to theDatabases & return & "Total Words for all Open Databases (Records: " & (my comma_delimit(grandTotalRecords)) & "): " & (my comma_delimit(grandTotalWords))
	create record with {name:"Word and Record Count for Databases", type:rtf, rich text:theText} in display group selector
end tell

(*
	This "comma_delimit" subroutine is provided by Mac OSX Automation
	see: http://www.macosxautomation.com/applescript/sbrt/sbrt-02.html
	Credit: Ben Waldie
*)

on comma_delimit(this_number)
	set this_number to this_number as string
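	-- number_to_text is not defined in this script; it is only reached for numbers in
	-- exponent notation and must be added separately if counts can get that large
	if this_number contains "E" then set this_number to number_to_text(this_number)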
	set the num_length to the length of this_number
	set the this_number to (the reverse of every character of this_number) as string
	set the new_num to ""
	repeat with i from 1 to the num_length
		if i is the num_length or (i mod 3) is not 0 then
			set the new_num to (character i of this_number & the new_num) as string
		else
			set the new_num to ("," & character i of this_number & the new_num) as string
		end if
	end repeat
	return the new_num
end comma_delimit
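
For the sheet (.csv) output suggested at the top of the thread, the per-database summary line could be built as a CSV row instead of rich text. The following is only a minimal sketch in plain AppleScript: the handler names csv_row and replace_text are hypothetical, the database name and counts in the example are made up, and how the resulting text is stored as a sheet in DEVONthink is left out because that depends on the scripting dictionary.

```applescript
-- Hypothetical helper: join a list of values into one CSV row.
-- Every field is wrapped in double quotes; embedded quotes are doubled.
on csv_row(theFields)
	set quotedFields to {}
	repeat with aField in theFields
		set fieldText to (contents of aField) as string
		set end of quotedFields to "\"" & my replace_text(fieldText, "\"", "\"\"") & "\""
	end repeat
	-- Join the quoted fields with commas, restoring the delimiters afterwards
	set oldDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to ","
	set theRow to quotedFields as string
	set AppleScript's text item delimiters to oldDelims
	return theRow
end csv_row

-- Hypothetical helper: replace every occurrence of searchString in theText.
on replace_text(theText, searchString, replacementString)
	set oldDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to searchString
	set theItems to text items of theText
	set AppleScript's text item delimiters to replacementString
	set theText to theItems as string
	set AppleScript's text item delimiters to oldDelims
	return theText
end replace_text

-- Example with made-up values: a header row plus one row per database
set csvText to csv_row({"Database", "Records", "Words"}) & linefeed
set csvText to csvText & csv_row({"Research", 1234, 567890}) & linefeed
return csvText
```

Inside the tell block of the script above, the handler would be called as my csv_row({name of this_database, recordCount, totalWords}), with the raw counts passed in rather than the comma-delimited strings so the sheet columns stay numeric.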



The recommendation was important on 32-bit systems. Now it’s less critical (as long as the 32-bit mode isn’t enabled on 64-bit systems).

That’s interesting and probably not widely known among the customer base.

I suggest adding a sticky topic in the forum containing the recommendation for 32-bit and 64-bit implementations.