Merge Metadata from one file to text of another

Geo · April 26, 2021, 2:07pm

Hello.

I’m brand new to DT and looking to make it the place where I park all my digital life. Dipping a toe in, I used Andreas Schmidt’s Pocket to DT script (found here) and now have a database of .weblocs with structured data (title, creation date), plus files in my Finder with the title and body text of my web articles. I’d like to merge the two so that the metadata of the former and the text of the latter exist in one (Markdown or .txt) file. Is there a need for another script? Is this even doable? Or is this a built-in feature and am I just being a n00b?

mfg

Geoff

cgrunenberg · April 26, 2021, 2:36pm

A script could probably handle this but this depends on the naming of the files.

Geo · April 26, 2021, 4:17pm

Ah, the AS script import brings in every single file titled text.html. So the titles, frustratingly, don’t line up with those of the .weblocs, which take their titles (accurately) from the web pages.

Can you think of a way to script something that scans the body text to see whether the title matches a file in the other database and, if so, merges them, retaining the metadata of the destination but the body text of the other file? That seems complex.

— Geoff
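The matching described above can be sketched outside DEVONthink. This is a minimal sketch, not DEVONthink's scripting interface: it assumes the bookmark metadata has already been exported into plain dictionaries with hypothetical keys `title`, `url` and `created`, and that each article's text begins with its title. The function names `normalize`, `find_match` and `merge` are made up for illustration.

```python
import re


def normalize(text):
    """Lowercase and collapse punctuation/whitespace so titles compare loosely."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def find_match(body, records, head_chars=500):
    """Return the record whose title appears near the top of body, or None.

    If several titles match, the longest one wins, so that a short title
    such as "Maps" does not shadow "Why Maps Lie".
    """
    head = " " + normalize(body[:head_chars]) + " "
    candidates = [
        r for r in records
        if normalize(r["title"]) and " " + normalize(r["title"]) + " " in head
    ]
    return max(candidates, key=lambda r: len(r["title"]), default=None)


def merge(record, body):
    """Build one Markdown text: metadata as a simple header block, then the body."""
    lines = ["---"]
    for key in ("title", "url", "created"):  # hypothetical metadata keys
        if key in record:
            lines.append(f"{key}: {record[key]}")
    lines.append("---")
    return "\n".join(lines) + "\n\n" + body.strip() + "\n"


if __name__ == "__main__":
    records = [
        {"title": "Why Maps Lie", "url": "https://example.com/maps", "created": "2021-04-01"},
    ]
    body = "Why maps lie\n\nEvery map is a projection..."
    match = find_match(body, records)
    if match is not None:
        print(merge(match, body))
```

Matching on the title alone is fragile: two articles with the same title, or a title that never appears in the saved text, will be mismatched or skipped, so the output would need a manual check before anything is deleted.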

cgrunenberg · April 27, 2021, 7:40am

Which script do you currently use? In the case of HTML pages, Scripts > Rename > To Web Page Title might be able to change the title to the desired one. WARNING: This can’t be undone.

Geo · April 27, 2021, 9:03am

Christian,

Thanks for all your help here. I used Andreas Schmidt’s Pocket to DevonThink script from here on the forums (can’t paste links here for some reason). It’s possible that what it’s doing in the database is creating ‘groups’ of both .weblocs and the saved .html files, and that I’m simply not good enough (yet!) at DT to figure this out.

The HTML pages seem to already be titled after the Web Page Title.

chrillek · April 27, 2021, 9:15am

Trustlevel - the forum software allows posting of links and images only after some posts/time. Since you joined only 19 hours ago …

Geo · April 27, 2021, 12:43pm

Ah, gotcha. Thanks Chrillek.

Geo · April 27, 2021, 4:07pm

I’ve just been bumped up in trust. Here is the link to the specific script:

cgrunenberg · April 27, 2021, 4:52pm

I don’t use Pocket and it’s hard to tell what your account contained, therefore screenshots of items that should be merged (ideally with a visible Info inspector) might be useful.

Geo · April 27, 2021, 5:02pm

Thanks, Christian. I’ve gone ahead and uploaded 3 screenshots of what I believe is the same document repeated thrice.

1: The .webloc import with nice Title, Created By, etc. data. (Database is DEVONthink 2)
2: Manually imported (drag & drop) files produced by the Pocket Import script. That’s a Rich Text. All titled web.
3: (I tested out the batch Markdown conversion. That’s the Markdown made from 2.)

cgrunenberg · April 28, 2021, 7:37am

Unfortunately the “web” items have neither a useful name nor any other metadata (e.g. a URL) that could be used for matching the items. One is a Markdown document, another one an HTML document and there seem to be at least 3 related rich text documents (which might even be almost duplicates as the size is identical). Therefore it’s unclear what exactly should be merged.
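Whether files of identical size really are duplicates can be checked outside DEVONthink once they sit in a Finder folder. The following is a minimal sketch under that assumption: it groups files by size and SHA-256 hash of their content, so it finds exact duplicates only, not "almost" duplicates. The function name `group_identical` is hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_identical(paths):
    """Return lists of paths whose file contents are byte-for-byte identical."""
    groups = defaultdict(list)
    for path in map(Path, paths):
        data = path.read_bytes()
        # Same size and same hash means the same content for practical purposes.
        key = (len(data), hashlib.sha256(data).hexdigest())
        groups[key].append(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    # Example: check every .rtf file in the current folder.
    for group in group_identical(Path(".").glob("*.rtf")):
        print("Identical:", ", ".join(str(p) for p in group))
```

Files that differ by even one byte land in separate groups, so an empty result does not rule out near duplicates.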

Geo · April 28, 2021, 1:06pm

I figured as much, sadly. But I really appreciate all your help here!

Geo · April 28, 2021, 1:27pm

Sorry to jump back on here, but I have discovered that despite their generic “web” titles, the files actually have precisely the right metadata attached to them. So it’s probably my user error and not a flaw in the script. I’ll need to get better at understanding the main window in DT.

andreas_schmidt · May 9, 2021, 4:12pm

Can’t help much with figuring out what happens on your system. But there is probably no need to dive into that. To get an uncluttered page with all that metadata, you could follow these simple steps without the need to fiddle with that script:

1: Start with that .webloc record in screenshot 1
2: Go to the menu bar, script icon (between Window and Help) - Download - [As decluttered note or whatever the English command here is]
3: If you want a text or md record, command-click and select convert → to format of your choosing