Hi, I have imported a new sheet and the behavior in DTP3 seems to make the first row of the sheet the column headings for the sheet. I want to bump the rows down and make the first row the values for the column heading.
Alternatively, I need to know the reference for the column heading I guess for AppleScript.
DTP3 seems to make the first row of the sheet the column headings for the sheet.
That’s pretty standard for tabular data.
Where did the sheet come from?
tell application id "DNtp"
set sel to content record
columns of sel
end tell
Note: There is a small change coming for referring to cells in a selected sheet.
a numbers csv file. it seems it used to import as no headers, but now the first row becomes a header row.
Nothing has changed with the import of CSV. The first row in a CSV file is assumed to be the column headers.
Also, the export options in Numbers make a difference.
- Exporting without table names will definitely treat the first row as column headers.
- Exporting with the table names will have A, B, C, etc. as column headers.
thanks bluefrog. I wrote my script to read row one as the headers now I am debating whether I should have it read the column headers as the actual headers. I am thinking the best option would be to use the headers as proper headers and start my data at row one, however, I am not sure I completely understand your script above. what I need to do is be able to say for all items in columns, test if each matches the relevant headers I want to gather data for. I am under the impression that columns is an array of the header values, is that right? what I need to is to know what column number represent the fields I want.
for example if my header row looks like:
Author Filename Case Filepath Recipient
and I want the Author and Recipient, I run through columns(1)-columns(5) test if it matches “Author” or “Recipient” and then log index 1 and index 5 as the author and recipient indices. does that make any sense at all?
Can you post your code and a screen capture showing the top section of your sheet?
you asked for it! it ain’t pretty but…
tell application id "DNtp"
tell think window 1
set theCustodianArray to {}
set columnArray to {}
set theMetaDataArray to {}
set currRow to 2
set currCol to 1
set currFile to 1
set currCounter to 1
set customMetaD to ""
set DatAuthor to "Author"
set DatCustodian to "Custodian"
set DatAllCustodians to "All Custodians"
set DatConfidential to "Confidential"
set DatFilePath to "File Path"
set DatFileName to "File Name"
set DatMD5Hash to "MD5 Hash"
set DatTitle to "Title"
set DatTimeCreated to "Date Time Created"
set DatSubject to "Subject"
set DatFrom to "From"
set DatTo to "To"
set DatCC to "CC"
set DatBCC to "BCC"
set DatDateTimeSent to "Date Time Sent"
set BUTTON_Returned to "More"
repeat until BUTTON_Returned is "Last"
set valueReturned to (display dialog ("Enter Column with Metadata" & columnArray) with title ("Column Entry") buttons {"Last", "More"} default button "More" default answer "")
set BUTTON_Returned to button returned of valueReturned
set new_column to text returned of valueReturned as number
copy new_column to the end of columnArray
end repeat
-- set columnArray to {new_column}
set rowCnt to number of rows
set colCnt to number of columns
if rowCnt ≠ -1 then -- A sheet is not selected in the frontmost window
-- repeat with theItem in columns
repeat while currCol ≤ colCnt
set datHeader to get cell at row 1 column currCol
-- set datHeader to theItem of columns
if datHeader = DatAuthor then
set authorColumn to currCol
end if
if datHeader = DatCustodian then
set custodianColumn to currCol
end if
if datHeader = DatAllCustodians then
set allCustodiansColumn to currCol
end if
if datHeader = DatConfidential then
set confidentialColumn to currCol
end if
if datHeader = DatFilePath then
set filePathColumn to currCol
end if
if datHeader = DatFileName then
set fileNameColumn to currCol
end if
if datHeader = DatMD5Hash then
set md5HashColumn to currCol
end if
if datHeader = DatTitle then
set titleColumn to currCol
end if
if datHeader = DatTimeCreated then
set timeCreatedColumn to currCol
end if
if datHeader = DatSubject then
set subjectColumn to currCol
end if
if datHeader = DatFrom then
set fromColumn to currCol
end if
if datHeader = DatTo then
set toColumn to currCol
end if
if datHeader = DatCC then
set ccColumn to currCol
end if
if datHeader = DatBCC then
set bccColumn to currCol
end if
if datHeader = DatDateTimeSent then
set timeSentColumn to currCol
end if
set currCol to currCol + 1
end repeat
repeat while currRow ≤ rowCnt
set file_name to get cell at row currRow column 1
set fullfileName to file_name & ".pdf"
repeat with theItem in columnArray
-- set newMetaName to get cell at row 1 column theItem
set newAuthorData to get cell at row currRow column authorColumn
--set newTagComplete to (newTagName & ": " & newTag)
set newCustodianData to get cell at row currRow column custodianColumn
set newAllCustodiansData to get cell at row currRow column allCustodiansColumn
set newConfidentialData to get cell at row currRow column confidentialColumn
set newFilePathData to get cell at row currRow column filePathColumn
set newMd5HashData to get cell at row currRow column md5HashColumn
set newTitleData to get cell at row currRow column titleColumn
set newTimeCreatedData to get cell at row currRow column timeCreatedColumn
set newSubjectData to get cell at row currRow column subjectColumn
set newFromData to get cell at row currRow column fromColumn
set newtoData to get cell at row currRow column toColumn
set newCCData to get cell at row currRow column ccColumn
set newBccData to get cell at row currRow column bccColumn
set newTimeSentData to get cell at row currRow column timeSentColumn
my transfer_data(fullfileName, newAuthorData, newCustodianData, newAllCustodiansData, newConfidentialData, newFilePathData, newMd5HashData, newTitleData, newTimeCreatedData, newSubjectData, newFromData, newtoData, newCCData, newBccData, newTimeSentData)
end repeat
set currRow to currRow + 1
end repeat
end if
end tell
end tell
on transfer_data(sub_file_name, sub_author_data, sub_custodian_data, sub_allcustodians_data, sub_confidential_data, sub_filepath_data, sub_md5hash_data, sub_title_data, sub_timecreated_data, sub_subject_data, sub_from_data, sub_to_data, sub_cc_data, sub_bcc_data, sub_timesent_data)
tell application "DEVONthink 3"
set recoList to lookup records with file sub_file_name
set recoListLength to length of recoList
if recoListLength = 1 then
set theRecord to item 1 of recoList
tell theRecord
--set sub_full_data to the tags of theRecord
set customMetaD to the custom meta data of theRecord
add custom meta data sub_author_data for "author" to it
add custom meta data sub_custodian_data for "custodian" to it
add custom meta data sub_allcustodians_data for "allcustodians" to it
add custom meta data sub_confidential_data for "confidential" to it
add custom meta data sub_filepath_data for "filepath" to it
add custom meta data sub_md5hash_data for "md5hash" to it
add custom meta data sub_title_data for "title" to it
add custom meta data sub_timecreated_data for "datetimecreated" to it
add custom meta data sub_subject_data for "subject" to it
add custom meta data sub_from_data for "from" to it
add custom meta data sub_to_data for "to" to it
add custom meta data sub_cc_data for "cc" to it
add custom meta data sub_bcc_data for "bcc" to it
add custom meta data sub_timesent_data for "datetimesent" to it
-- set the Author of theRecord to newTag
end tell
else
if recoListLength = 0 then
error "Zero records were found!"
else if recoListLength > 1 then
error "More than 1 record was found!"
end if
end if
end tell
end transfer_data
Edited: Use three backticks before and after the code to created fenced code clocks. Thanks.
So you’re trying to apply the cell content in each row as custom metadata to existing files in the database, looking them up by the name in column 1?
So, here is what i am trying to do:
I receive a folder of pdf files along with a spreadsheet (a load file) that has a column with the file name and then a bunch of additional descriptive data for each file. I want to import the folder of files, and then map the metadata from the spreadsheet into the custom metadata now available in DTP3.
To be more precise, some of the header fields (metadata types) I don’t need. So the app is looking for specific headings and identifying those columns associated with those sought after fields and then maps data from those fields back onto the files after it finds the pdfs in the database.
i understand my process if rather brute force and not foolproof, but generally it works if i can clean up the issue with row 1 now being the header row…
exactly like that, yeah. processing thousands to tens of thousands. the filename ideally would include the extension, but generally the way documents are produced is as .pdf or .tiff files and thus the actual file name (for example report.doc) is not the name of the file included in the database unless its produced as a native file (which would generally include .xls and .doc files). if the worst case was having all the metadata assigned to the pdfs that would be a good start.
i have talked to eric about this in the past, and it seems like potentially my conversations about metadata may have taken some hold in the advanced metadata features found in DTP3. still requires code like this to turn this into a real powerhouse for these giant file productions…but i am cheap and prefer technical solutions even if i have to work for them!
So you have potentially tens of thousands of rows in one document or multiple documents?
That would make a difference as timeouts might need to be applied.
And…
well, good question. there are multiple spreadsheets each providing metadata for thousands of documents. making a total of hundreds of thousands of documents. currently my goal was only to be able to process one “sheet” at a time for the thousand or so documents.
these files should only need to be processed once.
i use labels to tag my documents as hot, very hot, irrelevant, etc. in my databases.
s
Stay tuned and come back to this as we have some things going on in here that will likely help streamline this.
why don’t we talk offline about the realities (I can share actual exemplary files)?
Hold the Option key and choose Help > Report bug to start a support ticket and we can discuss more off to the side.