I need to get images and their corresponding information fields from several catalog sites and put them into a database or spreadsheet. My plan is to import the sites, extract the image links from the HTML, import the image files, and then extract the text from each product page. I suppose this will be doable if the pages have a standard structure, which is likely. I have a vague intuition that DTPro and AppleScript could help with this, but I’m not familiar with the kind of processing they allow. I would appreciate any tips from those who might be using DTPro in a similar way, and any pointers to AppleScripts or plugins that could be used or modified to perform this task in DTPro in a more automated fashion than I could manage with a simple web browser. Thanks in advance for any insights.
Extracting embedded images is easy; everything else is probably much more complicated, depending on the sites and your actual goals:
tell application id "DNtp"
    set theURL to "http://..."
    set theSource to download markup from theURL
    set theLinks to get embedded images of theSource base URL theURL
    repeat with thisLink in theLinks
        -- Insert code here
    end repeat
end tell
Thanks! This will get me started, for sure.
I tried various things but am unable to import the images.
“download URL thisLink” puts the data into the AppleScript result. How do I save it as a JPG?
“import URL thisLink” requires a POSIX path, which does not exist; there is only a URL. I don’t know how to get the images into DTP or how to specify where they are saved. Any help appreciated.
For example, this script will add all of the images on DEVONtechnologies’ home page to the DEVONthink Download queue (Window > Download Manager) from which you could choose the items you want and their destination in DEVONthink or to the Download folder in the file system.
tell application id "DNtp"
    set theURL to "http://www.devontechnologies.com"
    set theSource to download markup from theURL
    set theLinks to get embedded images of theSource base URL theURL
    repeat with thisLink in theLinks
        set theResult to add download thisLink
    end repeat
end tell
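If you would rather skip the download queue and put the images straight into a group, something along these lines might work. Treat it as an untested sketch: it assumes that the result of “download URL” can be assigned to the “data” property of a newly created picture record, and that “current group” is the destination you want. The handler lastPathComponent is a hypothetical helper of my own, not part of DEVONthink. Please check each command against the scripting dictionary of your DEVONthink version before relying on it.

```applescript
-- Sketch only: the "create record with" + "set data of" pattern is an assumption
-- to verify against the DEVONthink scripting dictionary.
tell application id "DNtp"
    set theURL to "http://www.devontechnologies.com"
    set theSource to download markup from theURL
    set theLinks to get embedded images of theSource base URL theURL
    -- Assumed destination: the group selected in the frontmost window
    set theGroup to current group
    repeat with thisLink in theLinks
        set imageURL to contents of thisLink
        try
            -- Fetch the raw image data for this link
            set theData to download URL imageURL
            -- Name the record after the last part of the URL (hypothetical helper below)
            set theName to my lastPathComponent(imageURL)
            set theRecord to create record with {name:theName, type:picture, URL:imageURL} in theGroup
            set data of theRecord to theData
        on error errMsg
            -- Skip images that fail to download or import, but note the failure
            log "Skipped " & imageURL & ": " & errMsg
        end try
    end repeat
end tell

-- Hypothetical helper: returns the text after the last "/" in a URL
on lastPathComponent(theText)
    set oldDelims to AppleScript's text item delimiters
    set AppleScript's text item delimiters to "/"
    set theName to last text item of theText
    set AppleScript's text item delimiters to oldDelims
    return theName
end lastPathComponent
```

Keeping the URL on each record should make it easier to match the images back to their product pages later, when you get to extracting the text fields.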