Experimenting with llama2 LLM for local file classification (renaming, summarizing, analysing)

Follow-up from OpenAI ChatGPT for automatic generation of matching filenames - #3 by syntagm

ChatGPT works extremely well to get some logic into OCRed documents and PDFs, but it would be nice to do this locally with llama2. I did a lot of playing around with it but wasn’t able to turn it into something useful (yet).

First of all, here’s my script:

# function to generate a random string
on randomString(length)
  set theCharacters to "abcdefghijklmnopqrstuvwxyz0123456789"
  set theResult to ""
  repeat length times
    set theResult to theResult & character (random number from 1 to length of theCharacters) of theCharacters
  end repeat
  return theResult
end randomString

# store filecontent into a temporary txt file and return the path to it
on storeFileContent(filecontent)
--  set uniqueIdentifier to current application's NSUUID's UUID()'s UUIDString as text
  set uniqueIdentifier to my randomString(20)
  set posixtmpfile to POSIX path of (path to temporary items folder) & uniqueIdentifier & ".txt"

  try
    set fhandle to open for access posixtmpfile with write permission
    write filecontent to fhandle as «class utf8»
    close access fhandle

    return posixtmpfile
  on error
    try
      close access fhandle
    end try
  end try
end storeFileContent

on processRecord(theRecord)
  tell application id "DNtp"
    if type of theRecord as text is "group" or (word count of theRecord) is 0 then return
    set c to plain text of theRecord

    # truncate c to at most 8000 characters so the prompt stays within the model's context window
    if length of c > 8000 then
      set c to text 1 thru 8000 of c
    end if

    set posixtmpfile to my storeFileContent(c)

    log "temporary filepath: " & posixtmpfile

    # current date as "yyyy-mm" (currently unused, but handy for date-based filenames)
    set currentDate to do shell script "date +%Y-%m"

    set theCommand to "/opt/homebrew/bin/ollama run llama2:7b-chat-q5_K_M \"You are a filename generation AI. Given the following text, output exactly 1 descriptive filename option that could match this content and could be its filename on disk. Output the possible file name in quotes. Use spaces instead of underscores to separate words. Do not output a file extension, only the name. Include date if applicable. Only output the filename and nothing else, do not chat, no preamble, get to the point. Your output format should be: `Filename: <your suggested filename>`\" \"$(cat " & quoted form of posixtmpfile & ")\""

    log "executing: " & theCommand

    try
      set theResult to do shell script theCommand

      log "command result: " & theResult
      display dialog theResult

--      set name of theRecord to theResult
    on error errorMessage number errorNumber
      log errorMessage
      display dialog "Error: " & errorMessage & " (" & errorNumber & ")"
    end try
  end tell
end processRecord

on performSmartRule(theRecords)
  tell application id "DNtp"
    repeat with theRecord in theRecords
      my processRecord(theRecord)
    end repeat
  end tell
end performSmartRule

-- this is for testing so we can just execute with osascript xxx.applescript and don't need to put it into a smartrule first
tell application id "DNtp"
  set theRecords to selected records
  my performSmartRule(theRecords)
end tell

How to set up llama2

The easiest method is ollama and I would recommend it: download it from https://ollama.ai and follow the instructions.

Run which ollama on the command line to find where your ollama installation is. If you used brew install ollama it will be in /opt/homebrew/bin/ollama, but adjust the path in the script to whatever fits your system.
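For example (assuming Homebrew; the exact path depends on your machine):

```shell
# install ollama via Homebrew (or download the app from ollama.ai instead)
brew install ollama

# find the binary path to put into the AppleScript
which ollama
# typically /opt/homebrew/bin/ollama on Apple Silicon, /usr/local/bin/ollama on Intel
```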

Picking a model

ollama can pull a bunch of models out of the box (see its library), and you can pull them with ollama pull <model>:

  • 3b models are the smallest and the dumbest
  • 7b models are bigger, but require at least 16gb of memory
  • 13b models need at least 32gb of memory and are a good bit slower
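For instance, to grab a couple of the 7b variants discussed below and check what you have locally (model tags are from the ollama library; each download is several GB):

```shell
# pull candidate models (several GB each, so this takes a while)
ollama pull llama2:7b-chat-q4_K_M
ollama pull llama2:7b-chat-q5_K_M

# list what's installed locally, with sizes
ollama list
```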

Quantizations:

  • the higher the q number in the model name, the higher the bit quantization. tl;dr: more bits = more memory use and better output quality, but slower generation. q4 is the standard one, q5 is a tad better. Going above q5 is probably not gonna give noticeably better results for 3-13b models, so no need to try them (I think?)
  • K_M models are said to be the best mix of the bunch

Chat vs text vs instruct

  • the -chat models are finetuned to work like a chat, so like chatgpt you say “hello model” and it responds to you like a chatbot
  • the -text models are default LLM models, so a very smart autocompletion engine. You write “Hello world, my name is llama, I am” and it generates text that should come after this. This is going to give you the best results, but needs more thinking how to tweak it
  • then there is stuff like codellama:7b-instruct which are for code completion, but the instruct models are said to be better at handling specific instructions (eg: “do X”)

Anyway I would recommend the following models (pull them with ollama pull <model>):

  • llama2 - this is the default model and the same as 7b-chat-q4_0
  • llama2:7b-chat-q4_K_M (using K_M instead of the default quantization)
  • llama2:7b-chat-q5_K_M (same but with q5)
  • llama2:7b-text-q5_K_M or llama2:7b-text-q4_K_M - same as above, but non-chat models

All of the above also come in 13b variants if your Mac has enough memory and power.

Testing with DEVONthink

Plug the model and prompt into the script I posted above and run osascript xxx.applescript while a document is selected in DEVONthink.

The prompt I tinkered with is

You are a filename generation AI. Given the following text, output exactly 1 descriptive filename option that could match this content and could be its filename on disk. Output the possible file name in quotes. Use spaces instead of underscores to separate words. Do not output a file extension, only the name. Include date if applicable. Only output the filename and nothing else, do not chat, no preamble, get to the point. Your output format should be: Filename: <your suggested filename>

My problem is that it flat out ignores my instructions. I tell it not to output a file extension, and it outputs a file extension. I tell it to only output a filename, and it responds with “Sure thing, based on the content you gave me, a filename could be: xxx.txt”

I got around this by doing some extra work, like extracting only the text inside quotes from the result, or searching for Filename: xxx and extracting that, but sometimes it doesn’t even output that.
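That cleanup can be done in shell before the script uses the result. A minimal sketch (the sample reply is made up; the regexes are just one way to do it):

```shell
# sample chatty model output we want to clean up
reply='Sure thing! Based on the content, a filename could be: "Booking Confirmation 2023-09".txt'

# 1) prefer whatever sits inside the first pair of double quotes
name=$(printf '%s\n' "$reply" | sed -n 's/[^"]*"\([^"]*\)".*/\1/p')

# 2) fall back to anything after "Filename:" if no quotes were found
if [ -z "$name" ]; then
  name=$(printf '%s\n' "$reply" | sed -n 's/.*Filename:[[:space:]]*//p')
fi

# 3) strip a trailing extension the model added anyway
name=$(printf '%s\n' "$name" | sed 's/\.[A-Za-z0-9]\{1,4\}$//')

printf '%s\n' "$name"
# → Booking Confirmation 2023-09
```

The same logic could be bolted onto theCommand in the AppleScript as an extra pipe stage, but it stays a heuristic: if the model outputs neither quotes nor "Filename:", there is nothing to extract.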

Then when the file content is long, it often ignores my instructions completely and just starts telling me things about the document lol

I think using the -chat models is the wrong way to go at this, and we have to use the non-chat models, which is how LLMs are designed to work; ChatGPT just spoiled us into the chat method. So a possible prompt could be:

This is the output of a program that outputs filenames based on content

Content:
---
Booking confirmation, flight XXX to YYY
Date: 2023-09-15
Passenger: XXX
---
Filename: Booking Confirmation

Content:
---
blah blah blah
---
Filename: Something Else

Content:
---
{actual file content here}
---
Filename:

The model will then complete the prompt, so it will hopefully output something more coherent, but this needs more work on extracting the result and telling the model when to stop.

It’s also possible to tweak the models even more with Modelfile to change temperature and system prompt, see: GitHub - jmorganca/ollama: Get up and running with Llama 2 and other large language models locally

FROM llama2

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system prompt
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""

If anyone has ideas or came up with a good way to get this to work, please share :slight_smile:


I think this is brilliant, thanks for sharing.

I’ve been testing this sort of stuff for some time, but local LLMs severely underperform compared to GPT-3.5. I guess we are not there yet. The option for now is to just pay OpenAI and use their API.

My use-case wishlist includes:

  • Automatic Tagging (Might need training on past criteria)
  • Summarize document
  • “Smarter” find related documents, and grouping

Llama3 is out

Llama3 still doesn’t perform as well as the online models.