I'm refining my scripts in my spare time. The only command I know of extracts the full text of a PDF. How can I extract only the text on the first page? I understand that a workaround is to take the first n paragraphs of the rich text content, but getting the text of the first page is my real goal…
Thank you in advance.
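tell application id "DNtp" -- DEVONthink 3
	set a to item 1 of (selection as list) -- first selected record
	set b to (rich text of a) -- full text of the record, not just page 1
end tell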
Thanks. It’s OK, I’ll extract the first few hundred words or the first n paragraphs of the PDF's text content as a proxy - it’s good enough for me.
I am reluctant to use too many different tools for any one task. I think DT3 + Better Touch Tool + Text Expander already give me almost everything I need, given that all my tasks are within DT3. The rest is just adjusting the workflow and finding workarounds in AppleScript.
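For anyone wanting to try the first-n-paragraphs proxy mentioned above, here is a minimal sketch in plain AppleScript. The handler name `firstParagraphs` and the count of 5 are made up for illustration, and it assumes the selected record has a text layer that DEVONthink exposes through `plain text`. It does not find the real page boundary; it only approximates the first page.

```applescript
-- Hypothetical helper: return the first n non-empty paragraphs of a text.
on firstParagraphs(theText, n)
	set kept to {}
	repeat with p in paragraphs of theText
		if (count of kept) >= n then exit repeat
		-- skip blank lines, which are common in PDF text layers
		if (contents of p) is not "" then set end of kept to (contents of p)
	end repeat
	-- join the kept paragraphs with linefeeds, then restore the delimiters
	set oldTIDs to AppleScript's text item delimiters
	set AppleScript's text item delimiters to linefeed
	set joined to kept as text
	set AppleScript's text item delimiters to oldTIDs
	return joined
end firstParagraphs

tell application id "DNtp"
	-- assumes at least one record is selected
	set theRecord to item 1 of (selection as list)
	set theText to plain text of theRecord
end tell

-- 5 is an arbitrary choice; adjust to whatever roughly covers page 1
set excerpt to firstParagraphs(theText, 5)
```

Because a PDF text layer often breaks every visual line into its own paragraph, n may need to be much larger than the number of paragraphs you see on the page.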
Related to this, is it possible to use a content-driven query to set the boundaries of the extracted text? For instance, what if I want to extract the text that lies between “Abstract” and “Introduction”?
(Please be gentle. My scripting skills are rudimentary at best.)
What I actually want to do is extract the Abstract (and title) out to a separate document, with an eye toward manipulating a folder full of abstracts with other tools.
Sure. At least for the immediate need, they’re all from the same conference and follow the same format. These are recent, so they all come with a text layer. (OCR not needed.)
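Given that the papers share one format and already have a text layer, the marker-based extraction asked about above could be sketched with text item delimiters, no extra tools needed. This is only a sketch: `textBetween` is a hypothetical handler name, and it assumes the words “Abstract” and “Introduction” appear in the text layer exactly as headings, with “Abstract” not occurring earlier (for example in the title) and “Introduction” not occurring inside the abstract itself.

```applescript
-- Hypothetical helper: return the text between the first startMarker
-- and the next endMarker, or "" if startMarker is not found.
on textBetween(theText, startMarker, endMarker)
	set oldTIDs to AppleScript's text item delimiters
	set resultText to ""
	considering case
		-- split on the start marker
		set AppleScript's text item delimiters to startMarker
		set parts to text items of theText
		if (count of parts) > 1 then
			-- rejoin everything after the first start marker
			set afterStart to (items 2 thru -1 of parts) as text
			-- keep only what precedes the end marker
			set AppleScript's text item delimiters to endMarker
			set resultText to text item 1 of afterStart
		end if
	end considering
	set AppleScript's text item delimiters to oldTIDs
	return resultText
end textBetween

tell application id "DNtp"
	-- assumes at least one record is selected
	set theRecord to item 1 of (selection as list)
	set theText to plain text of theRecord
end tell

set abstractText to textBetween(theText, "Abstract", "Introduction")
```

If the end marker is missing, the handler returns everything after “Abstract”, so it is worth checking the length of the result before saving it to a separate document. The `considering case` block keeps a lowercase “abstract” or “introduction” in running text from being treated as a heading.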