I have been trying to remove embedded links from the summary RTF documents generated by DTP before I import them into Scrivener. I am trying regex without making any headway. I’ve tried using (<a href[\s\S]?>[\s\S]?)|(\b(http|https)://.*[^ alt]\b), but that requires access to the RTF source. I’d like to clean the document in DTP after generating the summary.
Has anyone solved the problem of removing embedded links, together with their link text, from RTF files?
E.g. via AppleScript, this example strips all rich text links:
tell application id "DNtp"
	set theRecord to selected record 1
	tell text of theRecord
		set i to 1
		set cnt to number of attribute runs
		repeat while i ≤ cnt
			if exists URL of attribute run i then
				-- Deleting a run shifts the following runs down by one,
				-- so keep i where it is and shrink the count instead.
				set text of attribute run i to ""
				set cnt to cnt - 1
			else
				set i to i + 1
			end if
		end repeat
	end tell
end tell
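The loop above only advances i when nothing was deleted, which matters when two links sit next to each other. Here is a minimal sketch of the same pattern on a plain AppleScript list, so it can be run in Script Editor without DEVONthink. The names fakeRuns and removeItem and the "LINK:" prefix are made up for illustration and are not part of DEVONthink's dictionary.

```applescript
-- Hypothetical stand-in for attribute runs: entries starting with "LINK:"
-- play the role of runs that have a URL.
set fakeRuns to {"Intro ", "LINK:wiki", "LINK:source", " and more text"}

set i to 1
set cnt to count of fakeRuns
repeat while i ≤ cnt
	if (item i of fakeRuns) starts with "LINK:" then
		-- Remove item i; the next item slides into position i,
		-- so i is left alone and only the count shrinks.
		set fakeRuns to my removeItem(fakeRuns, i)
		set cnt to cnt - 1
	else
		set i to i + 1
	end if
end repeat
return fakeRuns

-- Hypothetical helper: returns a copy of theList without the item at idx.
on removeItem(theList, idx)
	set newList to {}
	repeat with k from 1 to count of theList
		if k is not idx then set end of newList to item k of theList
	end repeat
	return newList
end removeItem
```

If i were incremented after every pass, "LINK:source" would slide into the slot just vacated by "LINK:wiki" and be skipped.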
I am seeing an issue with this: it removes links within paragraphs as well as the initial link back to the original, e.g., in captured Wikipedia pages…
Original
Script results
Here’s my offering…
-- Localized labels of the summary's line links; each entry ends with a colon
-- so that it matches the (word 1 & ":") test below.
property localizedLine : {"Line:", "Leitung:", "Ligne:"}
tell application id "DNtp"
	if (count selection) ≥ 1 then
		repeat with thisRecord in (selected records)
			set {recName, recType} to {name, type} of thisRecord
			if (recType as string) is in {"RTF", "RTFD"} then
				tell text of thisRecord
					repeat with textRun in (attribute runs)
						-- Only remove linked runs whose first word is one of the labels above.
						if (exists URL of textRun) and (((word 1 of textRun) & ":") is in localizedLine) then
							set text of textRun to ""
						end if
					end repeat
				end tell
			else
				-- Report records that are not rich text in DEVONthink's log.
				log message (recName as string) info ((recType as string) & " file(s) cannot be processed.")
			end if
		end repeat
	end if
end tell
yielding…