Follow-up to "Experimenting with OpenAI API for automatic classification and renaming"
I refined the AppleScript to get rid of the classification server and call the openai CLI directly. With a bit of prompt tuning I got amazing results. For example, turning:
into
Smart rule available here:
To install, make sure you install the openai CLI:
pip3 install openai
- Run `which openai` to get the full path of the executable
- Copy the path and replace `/opt/homebrew/bin/openai` in the script with your path if it’s different
- Update `set OPENAI_API_KEY to "xxx"` to use your OpenAI key
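The path lookup above can also be scripted. This is a minimal sketch, not part of the original smart rule: `resolve_cli` is a hypothetical helper name, and the fallback is simply the Homebrew path the script already uses. As far as I know, AppleScript's `do shell script` runs with a very short PATH, which is likely why the script hard-codes the full path in the first place, so run this in Terminal rather than inside the rule.

```shell
# Hypothetical helper (not part of the original smart rule):
# print the full path of a command, or a fallback path if it is not on PATH.
resolve_cli() {
  name="$1"
  fallback="$2"
  # "|| true" keeps the lookup from aborting a script that runs under "set -e".
  found="$(command -v "$name" 2>/dev/null || true)"
  if [ -n "$found" ]; then
    printf '%s\n' "$found"
  else
    printf '%s\n' "$fallback"
  fi
}

# Usage: prints the path to paste into the script.
resolve_cli openai /opt/homebrew/bin/openai
```

Whatever it prints is the value to use in place of /opt/homebrew/bin/openai in the script.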
To use gpt-3.5-turbo instead of gpt-4, find /opt/homebrew/bin/openai api chat_completions.create -m gpt-4 in the script and change -m gpt-4 to -m gpt-3.5-turbo, or to another model with a larger context window.
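Besides switching to a model with a larger context window, long documents can be handled by capping how much text is sent. The sketch below is an assumption on my part, not something the original script does: `truncate_text` is a hypothetical helper and the 12000-byte default is an arbitrary budget, not an official limit of any model.

```shell
# Hypothetical helper: print at most max_bytes bytes of a text file.
# Note: cutting on a byte boundary can split a multi-byte UTF-8 character.
truncate_text() {
  text_file="$1"
  max_bytes="${2:-12000}"  # assumed budget, tune it for the model you use
  head -c "$max_bytes" "$text_file"
}
```

In the script, this would stand in for the plain `cat` of the temporary file, for example `"$(truncate_text /path/to/tmpfile 12000)"`.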
You can also instruct it to extract additional information, such as city names, dates, and names, and put it into the filename as well. For example, this adjustment gave me very good results and correctly creates filenames like 2023-07 XXX Hotel Booking Confirmation 07/22 -> 07/24:
set currentDate to text 1 thru 7 of (do shell script "date +'%Y-%m'")
set theCommand to "OPENAI_API_KEY='" & OPENAI_API_KEY & "' /opt/homebrew/bin/openai api chat_completions.create -m gpt-4 -g system \"You are a program designed to generate filenames that could match the given text. Output exactly 1 filename that could fit the content and nothing else. Include a date in the format yyyy-mm at the beginning of the filename if present in the content, otherwise use the current date, which is " & currentDate & ". Don't output a file extension, separate words by space. No preamble, no extra output, only output the filename. Keep it concise. If the file is a booking confirmation, include relevant city information and booking dates (mm/dd format, no year) (with -> arrow). For bus/flight tickets, include departure -> destination. For airports, only airport code, NO city name.\" -g user \"$(cat " & posixtmpfile & ")\""
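For testing prompts outside the smart rule, the same call can be written as a plain shell function. This is only a sketch that mirrors the command string above: `suggest_name` and `OPENAI_BIN` are hypothetical names, the prompt is shortened, and `OPENAI_API_KEY` is assumed to be exported in the environment already.

```shell
# Sketch only: mirrors the CLI call that the AppleScript builds.
# OPENAI_BIN and suggest_name are hypothetical names; the prompt is shortened.
# Assumes OPENAI_API_KEY is already exported in the environment.
OPENAI_BIN="${OPENAI_BIN:-/opt/homebrew/bin/openai}"

suggest_name() {
  model="$1"      # e.g. gpt-4 or gpt-3.5-turbo
  text_file="$2"  # file holding the document's plain text
  current_date="$(date +%Y-%m)"
  system_prompt="Output exactly 1 filename that could fit the content and nothing else. Include a date in the format yyyy-mm at the beginning of the filename if present in the content, otherwise use the current date, which is ${current_date}. Don't output a file extension, separate words by space."
  # Quoting "$text_file" keeps paths that contain spaces intact.
  "$OPENAI_BIN" api chat_completions.create -m "$model" \
    -g system "$system_prompt" \
    -g user "$(cat "$text_file")"
}
```

Calling `suggest_name gpt-4 /path/to/text.txt` should then print the model's reply. Newer releases of the openai package appear to spell the subcommand `chat.completions.create`, so check `openai api --help` if the call is rejected.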
Prompt only for better readability:
You are a program designed to generate filenames that could match the given text. Output exactly 1 filename that could fit the content and nothing else. Include a date in the format yyyy-mm at the beginning of the filename if present in the content, otherwise use the current date, which is <currentDate>. Don’t output a file extension, separate words by space. No preamble, no extra output, only output the filename. Keep it concise. If the file is a booking confirmation, include relevant city information and booking dates (mm/dd format, no year) (with -> arrow). For bus/flight tickets, include departure -> destination. For airports, only airport code, NO city name.
(Obviously, the more complex the prompt, the more GPT-3.5 will choke on it. Use GPT-4 for complex instructions.)
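Even with "no preamble, no extra output" in the prompt, a reply can come back with stray blank lines or wrapped in quotes, at least that is my assumption about weaker models. A small clean-up step before renaming might look like the sketch below; `clean_name` is a hypothetical helper, not part of the original script.

```shell
# Hypothetical helper: read the model's reply on stdin, keep the first
# non-empty line, trim surrounding whitespace, and remove one pair of
# wrapping double quotes if present.
clean_name() {
  sed -e '/^[[:space:]]*$/d' \
    | head -n 1 \
    | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e 's/^"\(.*\)"$/\1/'
}
```

It is meant to sit at the end of the pipeline, for example `... | clean_name`, so the rule only ever sees a single trimmed line.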
I’ve also experimented heavily with using Llama 2 for this task so it doesn’t need to go to OpenAI servers, but the results were pretty poor for the 7B and 13B models. I’ll create a separate thread about it so we can tinker together and maybe turn it into something useful.