Follow-up from OpenAI ChatGPT for automatic generation of matching filenames - #3 by syntagm
ChatGPT works extremely well to get some logic into OCRed documents and PDFs, but it would be nice to do this locally with llama2. I did a lot of playing around with it but wasn’t able to turn it into something useful (yet).
First of all, here’s my script:
# function to generate a random string
on randomString(length)
    set theCharacters to "abcdefghijklmnopqrstuvwxyz0123456789"
    set theResult to ""
    repeat length times
        set theResult to theResult & character (random number from 1 to length of theCharacters) of theCharacters
    end repeat
    return theResult
end randomString
# store filecontent in a temporary txt file and return the path to it
on storeFileContent(filecontent)
    -- set uniqueIdentifier to current application's NSUUID's UUID()'s UUIDString as text
    set uniqueIdentifier to my randomString(20)
    set posixtmpfile to POSIX path of (path to temporary items folder) & uniqueIdentifier & ".txt"
    try
        set fhandle to open for access posixtmpfile with write permission
        write filecontent to fhandle as «class utf8»
        close access fhandle
        return posixtmpfile
    on error
        -- don't leave the file handle open if the write failed
        try
            close access posixtmpfile
        end try
    end try
end storeFileContent
on processRecord(theRecord)
    tell application id "DNtp"
        # skip groups and records without text
        if type of theRecord as text is "group" or (word count of theRecord) is 0 then return
        set c to plain text of theRecord
        # cut c down to at most 8000 chars; otherwise take the entire content
        if length of c > 8000 then
            set c to text 1 thru 8000 of c
        end if
        set posixtmpfile to my storeFileContent(c)
        log "temporary filepath: " & posixtmpfile
        # current date as "yyyy-mm" (not used in the prompt yet)
        set currentDate to text 1 thru 7 of (do shell script "date +'%Y-%m'")
        set theCommand to "/opt/homebrew/bin/ollama run llama2:7b-chat-q5_K_M \"You are a filename generation AI. Given the following text, output exactly 1 descriptive filename option that could match this content and could be its filename on disk. Output the possible file name in quotes. Use spaces instead of underscores to separate words. Do not output a file extension, only the name. Include date if applicable. Only output the filename and nothing else, do not chat, no preamble, get to the point. Your output format should be: `Filename: <your suggested filename>`\" \"$(cat " & quoted form of posixtmpfile & ")\""
        log "executing: " & theCommand
        try
            set theResult to do shell script theCommand
            log "command result: " & theResult
            display dialog theResult
            -- set name of theRecord to theResult
        on error errorMessage number errorNumber
            log errorMessage
            display dialog "Error: " & errorMessage & " (" & errorNumber & ")"
        end try
    end tell
end processRecord
on performSmartRule(theRecords)
    tell application id "DNtp"
        repeat with theRecord in theRecords
            my processRecord(theRecord)
        end repeat
    end tell
end performSmartRule
-- this is for testing, so we can just execute it with osascript xxx.applescript and don't need to put it into a smart rule first
tell application id "DNtp"
    set theRecords to selected records
    my performSmartRule(theRecords)
end tell
How to set up llama2
The easiest method is ollama, and I would recommend it: download it from https://ollama.ai and follow the instructions.
Run `which ollama` on the command line to figure out where your ollama installation is. If you used `brew install ollama`, it’s gonna be in /opt/homebrew/bin/ollama, but adjust this path in the script to whatever fits on your system.
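For reference, a minimal setup could look like this (assuming Homebrew; the Mac app from the website starts the server for you automatically):

# install ollama and start the server (skip the serve step if you run the Ollama.app)
brew install ollama
ollama serve

# in another terminal: find the binary path to put into the script
which ollama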
Picking a model
ollama can pull a bunch of models out of the box from its model library, and you can pull them with `ollama pull <model>`:
- 3b models are the smallest and the dumbest
- 7b models are bigger, but require at least 16gb of memory
- 13b models need at least 32gb of memory and are a good bit slower
Quantizations:
- the higher the `q` number in the model name, the higher the bit quantization. tl;dr: higher quantization = more memory, better performance. q4 is the normal one, q5 is a tad better. Higher than q5 is probably not gonna give good results for 3-13b models, so no need to try them (I think?)
- K_M models are said to be the best mix of the bunch (see the size comparison below)
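To get a feeling for what the quantization levels cost, you can pull the same model at two levels and compare (tags as they appear in the ollama library):

# the default q4_0 build vs the q5_K_M build of the same model
ollama pull llama2:7b-chat-q4_0
ollama pull llama2:7b-chat-q5_K_M

# lists all local models with their sizes on disk
ollama list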
Chat vs text vs instruct
- the `-chat` models are finetuned to work like a chat, so like ChatGPT you say “hello model” and it responds to you like a chatbot
- the `-text` models are default LLM models, so a very smart autocompletion engine. You write “Hello world, my name is llama, I am” and it generates text that should come after this. This is going to give you the best results, but needs more thinking about how to tweak it (see the example after this list)
- then there is stuff like `codellama:7b-instruct`, which is for code completion, but the instruct models are said to be better at handling specific instructions (eg: “do X”)
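To see the difference yourself, give both variants a prompt (assuming you pulled a `-chat` and a `-text` tag):

# the chat model answers like an assistant
ollama run llama2:7b-chat "hello model"

# the text model autocompletes the prompt instead of answering it
ollama run llama2:7b-text "Hello world, my name is llama, I am"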
Anyway, I would recommend the following models (pull them with `ollama pull <model>`):
- `llama2` - this is the default model and the same as `7b-chat-q4_0`
- `llama2:7b-chat-q4_K_M` (using K_M instead of the default quantization)
- `llama2:7b-chat-q5_K_M` (same but with q5)
- `llama2:7b-q5_K_M` or `llama2:7b-q4_K_M` - same as above, but non-chat models
- all the models above, but with 13b, if your Mac has enough memory and power
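So for example (the 13b pull assumes your machine can handle it):

# the model used in the script above
ollama pull llama2:7b-chat-q5_K_M

# the 13b equivalent, if you have the memory
ollama pull llama2:13b-chat-q5_K_M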
Testing with DEVONthink
Plug the model and prompt into the script I posted above and run `osascript xxx.applescript` while a document is selected in DEVONthink.
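I.e. save the script somewhere and run it from the terminal; the log lines show up on stderr:

# select a document in DEVONthink first, then:
osascript xxx.applescript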
The prompt I tinkered with is:

You are a filename generation AI. Given the following text, output exactly 1 descriptive filename option that could match this content and could be its filename on disk. Output the possible file name in quotes. Use spaces instead of underscores to separate words. Do not output a file extension, only the name. Include date if applicable. Only output the filename and nothing else, do not chat, no preamble, get to the point. Your output format should be:
Filename: <your suggested filename>

with the file content appended after it via `$(cat ...)`, as in the script above.
My problem is that it flat out ignores my instructions. I tell it to not output a file extension, but it outputs a file extension. I tell it to only output a filename, and it responds with “Sure thing, based on the content you gave me, a filename could be: xxx.txt”.
I got around this by doing some extra work, like extracting only the text inside quotes from the result, or searching for `Filename: xxx` and extracting that, but sometimes it doesn’t even output that.
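Something like this in the shell can do that cleanup before the result even reaches AppleScript (just a sketch; `$theResult` stands for the raw model output and the patterns are illustrative):

# keep only what follows "Filename:", take the first match, strip quotes
echo "$theResult" | sed -n 's/^.*Filename:[[:space:]]*//p' | head -n 1 | tr -d '"'

# fallback: grab the first quoted string anywhere in the output
echo "$theResult" | grep -o '"[^"]*"' | head -n 1 | tr -d '"'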
Then when the file content is long, it often just ignores my instructions completely and starts telling me things about the document lol
I think using the `-chat` models is the wrong way to go about this, and we have to use the non-chat models, which is the way LLMs are designed to work; ChatGPT just spoiled us into the chat method. So a possible prompt could be:
This is the output of a program that outputs filenames based on content
Content:
---
Booking confirmation, flight XXX to YYY
Date: 2023-09-15
Passenger: XXX
---
Filename: Booking Confirmation
Content:
---
blah blah blah
---
Filename: Something Else
Content:
---
{actual file content here}
---
Filename:
The model will then complete the prompt and hopefully output something more coherent, but this needs more work on extracting the result and telling the model when to stop.
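A rough sketch of how that completion-style prompt could be wired up in the shell (the shortened prompt, the `$posixtmpfile` variable and the `head` cutoff are illustrative, and it assumes a non-chat tag like `llama2:7b-text`):

# splice the document content into the completion-style prompt
prompt="$(printf 'This is the output of a program that outputs filenames based on content\nContent:\n---\n%s\n---\nFilename:' "$(cat "$posixtmpfile")")"

# run a non-chat model and keep only the first line of the completion
/opt/homebrew/bin/ollama run llama2:7b-text "$prompt" | head -n 1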
It’s also possible to tweak the models even more with a Modelfile to change the temperature and system prompt, see the ollama README at https://github.com/jmorganca/ollama:
FROM llama2
# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1
# set the system prompt
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
If anyone has ideas or has come up with a good way to get this to work, please share!