I'm trying to write a script that searches through multiple .html files and tells me whether any of the words I'm looking for appear in the source code, or lets me download the files it finds.
I really need help getting this right.
This is the script I have so far (I'm very new to this):
property search_string : ""
tell application id "com.devon-technologies.thinkpro2"
activate
set theSelection to the selection
if theSelection is not {} then
set downloadstate to false
set downloadStatus to false
set downloadcheck to false
try
repeat
set search_string to display name editor "Type any keywords you want to search after"
if search_string is not "" then exit repeat
end repeat
show progress indicator "Searching links to download......" steps (count of theSelection) with cancel button
repeat with theRecord in theSelection
set this_URL to the URL of theRecord
set this_source to the source of theRecord
set these_links to get text of this_source --base URL this_URL
step progress indicator theSelection
show progress indicator "Searching links to download..." steps (count of theRecord) with cancel button
repeat with this_link in these_links
if the this_link contains "<a href=" or the this_link contains "<img src=" or the this_link contains "title=" or the this_link ends with search_string or the this_link contains "<img width=" or the this_link contains "src=" then
set downloadStatus to false
set downloadstate to false
set downloadcheck to true
end if
if downloadcheck is true then
if the this_link contains "<a href=" or the this_link contains "<img src=" or the this_link contains "title=" or the this_link ends with search_string or the this_link contains "<img width=" or the this_link contains "src=" then
display dialog "found!"
--add download this_link referrer this_URL without automatic
else
if downloadcheck is false then
step progress indicator (name of theRecord) as string
end if
end if
end if
end repeat
if cancelled progress then
hide progress indicator
return
end if
end repeat
hide progress indicator
if downloadstate is false then
display dialog "Couldn't find any files containing the keywords"
end if
on error error_message number error_number
hide progress indicator
if the error_number is not -128 then display alert "DEVONthink Pro" message error_message as warning
end try
end if
end tell
Ideally I'd get an option to choose whether to download a file (if the script finds a source that can be downloaded) or just be told that the keyword was found and on which page, because this will run through multiple HTML files. Can someone please help me with this?
If you want to write new, better code, you're very welcome to do that.
Thanks in advance, it's really appreciated.
A simple solution could just use the “get links of” command and its parameters:
tell application id "com.devon-technologies.thinkpro2"
try
set theSelection to the selection
if theSelection is {} then error "Please select some documents."
repeat
set search_string to display name editor "Type the string you want to search after"
if search_string is not "" then exit repeat
end repeat
show progress indicator "Searching links to download......" steps (count of theSelection) with cancel button
repeat with theRecord in theSelection
set this_URL to the URL of theRecord
set this_source to the source of theRecord
-- All links with a matching URL
set all_links to get links of this_source base URL this_URL
repeat with this_link in all_links
if this_link contains search_string then add download this_link referrer this_URL without automatic
end repeat
-- All links with a matching description
set these_links to get links of this_source base URL this_URL containing search_string
repeat with this_link in these_links
add download this_link referrer this_URL without automatic
end repeat
step progress indicator (name of theRecord) as string
if cancelled progress then exit repeat
end repeat
hide progress indicator
on error error_message number error_number
hide progress indicator
if the error_number is not -128 then display alert "DEVONthink Pro" message error_message as warning
end try
end tell
But parsing the complete HTML source code on your own and handling images too would require much more work.
Thank you very much for the script, Christian.
But I wonder if you could help me extend the script further so that a dialog tells me the string was found and in which file (when several are selected), and also offers to download the link. That way I can choose whether I only want to know that the string was found or also want to download the link.
tell application id "com.devon-technologies.thinkpro2"
activate
try
set theSelection to the selection
if theSelection is {} then error "Please select some documents."
repeat
set search_string to display name editor "Type the string you want to search after"
if search_string is not "" then exit repeat
end repeat
show progress indicator "Searching links to download......" steps (count of theSelection) with cancel button
repeat with theRecord in theSelection
set this_URL to the URL of theRecord
set this_source to the source of theRecord
-- Get all links with a matching description
set these_links to get links of this_source base URL this_URL containing search_string
-- Add all links with a matching URL
set all_links to get links of this_source base URL this_URL
repeat with this_link in all_links
if this_link contains search_string then set these_links to these_links & this_link
end repeat
repeat with this_link in these_links
display alert ((name of theRecord) as string) message this_link as warning buttons {"Cancel", "Skip", "Download"} default button 3
set this_button to the button returned of the result
if this_button is equal to "Download" then
add download this_link referrer this_URL without automatic
else if this_button is equal to "Cancel" then
error number -128
end if
end repeat
step progress indicator (name of theRecord) as string
if cancelled progress then exit repeat
end repeat
hide progress indicator
on error error_message number error_number
hide progress indicator
if the error_number is not -128 then display alert "DEVONthink Pro" message error_message as warning
end try
end tell
I wonder if you could expand this script so that regex syntax can be used in the search string.
I also wonder if you could add one more button to the search results dialog that goes to the page where the result was found.
Neither AppleScript nor DEVONthink’s commands support regular expressions, therefore this would be a major revision.
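One possible workaround, which is not part of the scripts in this thread, is to hand the matching off to a shell tool through the Standard Additions command "do shell script". The sketch below assumes the grep that ships with macOS; the handler name matchesRegex and its parameters are made up for illustration. Inside a tell block it would have to be called as "my matchesRegex(...)". Note that grep matches line by line, and that very large sources may exceed the shell's argument length limit.

```applescript
-- Hypothetical helper (not a DEVONthink command): returns true if any line
-- of theText matches the extended regular expression thePattern.
on matchesRegex(theText, thePattern)
	try
		-- grep -q prints nothing and exits with status 0 on a match.
		-- A non-zero exit status (no match, or an invalid pattern) makes
		-- "do shell script" raise an error, which is caught below.
		do shell script "printf '%s' " & quoted form of theText & " | grep -E -q -e " & quoted form of thePattern
		return true
	on error
		return false
	end try
end matchesRegex

-- Example call with made-up sample text
set sampleSource to "<div itemtype=\"http://schema.org/Article\">"
set wasFound to matchesRegex(sampleSource, "schema\\.org/[A-Za-z]+")
```

In the scanning loop, a test such as "if this_link contains search_string" could then be replaced by "if my matchesRegex(this_link as string, search_string)", at the cost of one shell call per link.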
Unfortunately only up to 3 buttons are supported per alert. However, this script selects the currently processed item before displaying the alert:
tell application id "com.devon-technologies.thinkpro2"
activate
try
set theWindow to viewer window 1
set theSelection to the selection of theWindow
if theSelection is {} then error "Please select some documents."
repeat
set search_string to display name editor "Type the string you want to search after"
if search_string is not "" then exit repeat
end repeat
show progress indicator "Searching links to download......" steps (count of theSelection) with cancel button
repeat with theRecord in theSelection
set this_URL to the URL of theRecord
set this_source to the source of theRecord
-- Get all links with a matching description
set these_links to get links of this_source base URL this_URL containing search_string
-- Add all links with a matching URL
set all_links to get links of this_source base URL this_URL
repeat with this_link in all_links
if this_link contains search_string then set these_links to these_links & this_link
end repeat
repeat with this_link in these_links
set the selection of theWindow to {theRecord}
display alert ((name of theRecord) as string) message this_link as warning buttons {"Cancel", "Skip", "Download"} default button 3
set this_button to the button returned of the result
if this_button is equal to "Download" then
add download this_link referrer this_URL without automatic
else if this_button is equal to "Cancel" then
error number -128
end if
end repeat
step progress indicator (name of theRecord) as string
if cancelled progress then exit repeat
end repeat
hide progress indicator
on error error_message number error_number
hide progress indicator
if the error_number is not -128 then display alert "DEVONthink Pro" message error_message as warning
end try
end tell
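If more than three choices are needed, one alternative to "display alert" is the Standard Additions command "choose from list", which accepts any number of items. This is only a sketch of that idea, not a change to the script above: theName and theLink are placeholder values standing in for the record name and the link from the loop, and the "View" action is an assumed extra option.

```applescript
-- Placeholder values; in the script above these would come from theRecord and this_link
set theName to "Example page"
set theLink to "http://www.example.com/file.pdf"

-- choose from list returns false when the user cancels,
-- otherwise a list containing the selected item(s)
set theChoice to choose from list {"Download", "View", "Skip"} with title theName with prompt theLink default items {"Skip"}

if theChoice is false then
	-- Treat Cancel like the Cancel button of the alert
	error number -128
else
	set theAction to item 1 of theChoice
	-- theAction is now "Download", "View" or "Skip" and can be tested
	-- with the same if / else if chain used for this_button
end if
```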
One thing I can't understand is whether the search string is limited to a certain number of characters.
For example, when I search for a sentence that I know exists in one of the HTML files, the script doesn't find it. It doesn't find schema.org or schema either, so something isn't right when I use the script.
That is why I asked about regexp, but I wonder whether the script could be fixed without regexp so that I could search for a long sentence, for example. Is that possible with your script, Christian?
When I googled around I found this: satimage.fr/software/downloa … age389.pkg, an AppleScript plugin that supports regex, but I don't know how to use it. Maybe it could be used in some way to get this to work.
Ah, I see. Yes, it would be great if I could view the HTML page when the string is found in its source code, or have the choice, as the script offers now, to skip to the next result when there is more than one.
tell application id "com.devon-technologies.thinkpro2"
activate
try
set theWindow to viewer window 1
set theSelection to the selection of theWindow
if theSelection is {} then error "Please select some documents."
repeat
set search_string to display name editor "Type the string you want to search after"
if search_string is not "" then exit repeat
end repeat
show progress indicator "Scanning HTML..." steps (count of theSelection) with cancel button
repeat with theRecord in theSelection
set this_URL to URL of theRecord
set this_source to the source of theRecord
if this_source contains search_string then
display alert ((name of theRecord) as string) message this_URL as warning buttons {"Cancel", "Skip", "View"} default button 3
set this_button to the button returned of the result
if this_button is equal to "View" then
set the selection of theWindow to {theRecord}
else if this_button is equal to "Cancel" then
error number -128
end if
end if
step progress indicator (name of theRecord) as string
if cancelled progress then exit repeat
end repeat
hide progress indicator
on error error_message number error_number
hide progress indicator
if the error_number is not -128 then display alert "DEVONthink Pro" message error_message as warning
end try
end tell
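Unlike the earlier versions, which only tested the links returned by "get links of", this script tests the whole source with AppleScript's "contains" operator, so a sentence or a term such as schema.org can be found anywhere in the HTML. One thing to keep in mind is that "contains" ignores case by default; a "considering case" block makes the comparison strict. A small sketch with a made-up sample string:

```applescript
-- Made-up sample standing in for "the source of theRecord"
set sampleSource to "<div itemscope itemtype=\"http://schema.org/Article\">"

-- Default text comparison ignores case
set foundLoose to sampleSource contains "SCHEMA.ORG" -- true

-- Inside "considering case" the comparison is case-sensitive
considering case
	set foundStrict to sampleSource contains "SCHEMA.ORG" -- false
end considering
```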