Convert hashtags to tag option - exists, but doesn't seem to work, and isn't documented

There is an option in Preferences > Import to “convert hashtags to tags”.

However, I can’t seem to get it to do anything, and it is not documented in the help. (I’ve tried tagging with # and @, in various document types, sent via drag and drop/ share sheets/ etc.)

This is a plain text file I created in TextEdit and saved into DEVONthink 3…

And an RTF…

In fact, the only oddity I see here is the case wasn’t preserved.

But thanks for the screencap! The import option needs to be added to the documentation.

I have the same issue: the Hashtags to Tags function only works intermittently — or, more accurately, it seems to work for some files and not for others.

I was going to do more testing before I reported it, but I think it’s something to do with whether the file was added to an indexed group in the Finder from another program (and therefore automatically appears in DT3) or was added to the indexed group from within DT3. Whether that’s actually the case, or if that’s even possible, I don’t know (as I said, I was going to do more testing), but I could definitely get the command to work with the latter, and not always with the former.

Technically adding files to an indexed folder in the Finder is not the same as adding a file to an indexed group in DEVONthink.

Perhaps they should be treated the same in this instance, but that’s for Development to assess.

PS: I indexed a Finder folder, added the two files in the Finder one at a time, and the hashtags converted as expected. However, creating a file in the indexed group and typing a hashtag obviously doesn’t add the tag. That’s neither an indexing nor an import operation; in smart rule parlance, it would be an On Creation event. 🙂 In this instance you could use Data > Tags > Convert Hashtags to Tags.

Hmm. So it turns out that if I save something to the Finder sidebar item called “Inbox,” that parses hashtags properly. But if I share something to DevonThink via the share menu, or drag something into the dock icon, or save something via the Sorter, it doesn’t parse them.

Sounds like a bug; different workflows should give the same end result with tagging.

if I share something to DevonThink via the share menu, or drag something into the dock icon, or save something via the Sorter,

If you examine smart rule events, you’ll see differing event triggers for these things. Should they all behave the same? Likely, but @cgrunenberg will have to weigh in on this.

Hi Jim,

Not at the desktop at the moment, so I can’t give the details, but I’m talking specifically about invoking the Hashtag to Tags script from the Data Menu, not about automatic invocation. Manual running of the feature works on some files, not on others.

Cheers…


Hashtag to Tags script

? A script? Are you talking about a third-party script in the script menu?

(On my end I’m talking about automatic invocation, which seems to only work via the Finder “Inbox” and not via the other methods I’ve tried. Sounds like @brookter and I are both having similar, but distinct, issues with hashtag-to-tag workflows.)

It’s the standard function run off the Data > Tags menu, the one provided with the basic installation.

This same Data > Tags process is also hit and miss for me. Sometimes I find that if I close DT and re-open it, the conversion will then work. It also seems to get caught up on text files NOT originally created in DT. (In other words, I have fewer issues when I quickly generate a text file and then convert it. It is text files imported from elsewhere and then converted that seem to snag.)

If of interest, I found an AppleScript written for another program (https://c-command.com/forums/showthread.php/5539-Script-to-convert-hashtags-to-EagleFiler-tags) and used it to create the AppleScript below.
If you select a Markdown/text file in DEVONthink, it will replace all of the record’s tags with whatever hashtags it finds in the file. I’m no AppleScript expert, but it works as far as I have tested.

tell application id "DNtp"
	
	-- Get selected record(s) in Devonthink
	set selectedItems to selection
	
	-- Go through each record
	repeat with thisRec in selectedItems
		
		-- Get the text of the record (assuming text/markdown file)
		set _text to plain text of thisRec
		
		-- Use shell script to extract text after character "#", ending with a space/new-line etc
		
		-- UPDATED search line in v2: ignore a "#" inside web links, since we don't want a URL fragment turned into a tag.
		-- grep -o also returns the whitespace matched by (\s|^), so sed strips that along with the "#".
		set grepResults to do shell script "grep -E -o " & quote & "(\\s|^)#\\S*" & quote & " <<< " & quoted form of _text & ¬
			"| sed -E " & quote & "s/[[:space:]#]//g" & quote
		
		
		-- More info about the cryptic line above:
		--
		-- The v1 line also picked up "#ch02lev2sec7" as a tag from a link such as
		-- https://www.oreilly.com/library/view/automotive-spice-in/9781933952291/ch02.html#ch02lev2sec7
		-- which is not what we want, so I looked at the regex to find the problem.
		--
		-- grep -E → enables extended regular expressions (needed on Mac, I think I read)
		--
		-- The regex as written in AppleScript is: "(\\s|^)#\\S*"
		-- (It can be checked at https://regexr.com/.) The actual regex is (\s|^)#\S*, but AppleScript strings need each backslash doubled.
		--
		-- So the regex performs the following search:
		-- (\s|^) → whitespace OR start of line, which rules out a "#" that has text directly before it
		-- #   → the literal character "#"
		-- \S* → keep matching until the next whitespace (capital \S is the inverse of \s, which matches whitespace)
		
		
		-- OLD v1
		--		set grepResults to do shell script "grep -o " & quote & "#\\S*" & quote & " <<< " & quoted form of _text & ¬
		--			"| sed -E " & quote & "s/#//g" & quote
		
		-- transform the output to a list
		set grepResultsList to paragraphs of grepResults
		
		-- Clear the current records tag list
		set tags of thisRec to {}
		
		-- Add found tags one by one
		repeat with foundTag in grepResultsList
			
			set foundTagAsString to foundTag as string
			
			-- If the tag is an empty string (likely a # without text afterwards, like markdown header, skip it)
			if foundTagAsString is not "" then
				-- Add this tag to the records tag list
				set tags of thisRec to tags of thisRec & foundTagAsString
			end if
			
		end repeat
		
	end repeat
	
end tell

  • Update after the first post: I found a problem in the regex string; it should be fixed now.

(If you include it in a smart rule, please make sure to set “markdown” as the file kind in the filter. Also note that it clears all existing tags on the file and replaces them with whatever hashtags the file now contains.)
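
For anyone who wants to try the extraction step outside DEVONthink, here is a minimal shell sketch of the same grep/sed idea. The sample text and the `extract_tags` function name are made up for illustration. It is a variation on the script's pattern rather than a copy: it uses POSIX character classes instead of `\s` and `\S` (which not every grep accepts) and `+` instead of `*`, so a lone `#` produces no match at all.

```shell
#!/bin/sh
# Hypothetical sample note; not taken from any real file in this thread.
sample='# Heading
Meeting notes #work #Teaching_ideas
See https://example.com/page.html#section2 for details
#inbox first on a line'

# extract_tags is an illustrative helper name, not a DEVONthink command.
extract_tags() {
  # 1. Keep only a "#word" that starts a line or follows whitespace,
  #    so the "#section2" URL fragment is ignored.
  # 2. Strip the leading whitespace and "#" characters from each match.
  # 3. Drop anything that ends up empty (e.g. "##" from a Markdown heading).
  printf '%s\n' "$1" \
    | grep -E -o '(^|[[:space:]])#[^[:space:]]+' \
    | sed -E 's/^[[:space:]]*#+//' \
    | sed '/^$/d'
}

tags=$(extract_tags "$sample")
printf '%s\n' "$tags"
```

With the sample above, this should print `work`, `Teaching_ideas` and `inbox`, one per line, and skip both the heading and the link fragment.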


Are you intending to clear the record’s Tags first?

Hi, yes, that is intended for my specific use case.

For Markdown files I like the file itself to be the “master of tags”. Using the above code (as a starting point or example to build other AppleScripts on), I can use, for example, a smart rule to periodically scan a folder or set of files and update tags based on the current text content. I mostly edit Markdown files in other programs (like FSNotes).
(Sometimes I remove text and a tag from a text file, and then I don’t want the record to keep the old tags in DEVONthink.)

…if of interest again, here is my smart rule, which I have now tried successfully numerous times on 200+ Markdown files with various tags. If you have many Markdown files in the folder, be patient (200 files take perhaps 20-40 seconds to update all tags from the Markdown text using the script).

image

For info: the rule name in English is “Update tags on markdown files in FSNotes folder” (which is an indexed folder).

The script to enter in “Edit script” is:

on performSmartRule(selectedItems)
	tell application id "DNtp"
		
		-- Go through each record
		repeat with thisRec in selectedItems
			
			-- Get the text of the record (assuming text/markdown file)
			set _text to plain text of thisRec
			
			-- Use shell script to extract text after character "#", ending with a space/new-line etc
			
			-- UPDATED search line in v2: ignore a "#" inside web links, since we don't want a URL fragment turned into a tag.
			-- grep -o also returns the whitespace matched by (\s|^), so sed strips that along with the "#".
			set grepResults to do shell script "grep -E -o " & quote & "(\\s|^)#\\S*" & quote & " <<< " & quoted form of _text & ¬
				"| sed -E " & quote & "s/[[:space:]#]//g" & quote
			
			
			-- More info about the cryptic line above:
			--
			-- The v1 line also picked up "#ch02lev2sec7" as a tag from a link such as
			-- https://www.oreilly.com/library/view/automotive-spice-in/9781933952291/ch02.html#ch02lev2sec7
			-- which is not what we want, so I looked at the regex to find the problem.
			--
			-- grep -E → enables extended regular expressions (needed on Mac, I think I read)
			--
			-- The regex as written in AppleScript is: "(\\s|^)#\\S*"
			-- (It can be checked at https://regexr.com/.) The actual regex is (\s|^)#\S*, but AppleScript strings need each backslash doubled.
			--
			-- So the regex performs the following search:
			-- (\s|^) → whitespace OR start of line, which rules out a "#" that has text directly before it
			-- #   → the literal character "#"
			-- \S* → keep matching until the next whitespace (capital \S is the inverse of \s, which matches whitespace)
			
			
			-- OLD v1
			--		set grepResults to do shell script "grep -o " & quote & "#\\S*" & quote & " <<< " & quoted form of _text & ¬
			--			"| sed -E " & quote & "s/#//g" & quote
			
			-- transform the output to a list
			set grepResultsList to paragraphs of grepResults
			
			-- Clear the current records tag list
			set tags of thisRec to {}
			
			-- Add found tags one by one
			repeat with foundTag in grepResultsList
				
				set foundTagAsString to foundTag as string
				
				-- If the tag is an empty string (likely a # without text afterwards, like markdown header, skip it)
				if foundTagAsString is not "" then
					-- Add this tag to the records tag list
					set tags of thisRec to tags of thisRec & foundTagAsString
				end if
				
			end repeat
			
		end repeat
		
	end tell
	
end performSmartRule

To run it, right-click the smart rule and choose “Apply Rule”, then wait for it to complete (how long depends on the number of files).

Over time you will accumulate tags that are no longer used, or create tags with similar but not identical names (because you might not remember what you wrote last time), so it can be a good idea to revisit the tag section of the DEVONthink database every now and then to make sure you enter tag names consistently (I, for instance, only use lower-case letters). A script could also be written to walk through empty tags and highlight them for some kind of action.

This hashtag conversion doesn’t seem to work on tags that have an underscore in them, like #Teaching_ideas.

Running beta 5

This is confirmed, but hashtags do not contain punctuation. Obviously this could be extended for use in DEVONthink 3, but this would be non-standard and I wouldn’t suggest it personally.
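
To illustrate the difference, here is a small shell sketch comparing two patterns on the reported hashtag. This is only a guess at the kind of rule involved and is not DEVONthink's actual parser; the sample line and both patterns are assumptions made up for the comparison.

```shell
#!/bin/sh
# Hypothetical input line, using the hashtag reported above.
line='Plan for next term #Teaching_ideas #grading'

# Pattern A: letters and digits only, so matching stops at the underscore.
strict=$(printf '%s\n' "$line" | grep -E -o '#[[:alnum:]]+')

# Pattern B: also accepts underscores inside the hashtag.
relaxed=$(printf '%s\n' "$line" | grep -E -o '#[[:alnum:]_]+')

printf 'strict:\n%s\n' "$strict"
printf 'relaxed:\n%s\n' "$relaxed"
```

Under pattern A the first hashtag is cut off at `#Teaching`, while pattern B keeps `#Teaching_ideas` whole.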

Development would have to assess this.

Do any popular apps or online services support this?

“Convert hashtags to tags” still does not work, AFAICT.

Yes, for one, Agenda supports use of hashtags internally in notes. When notes are exported from Agenda to DEVONthink (via Share extensions) as markdown, the hashtags are not converted to tags.