Adding Llama Models
How to install Ollama and add local AI models to Promptly
Promptly supports running local language models through Ollama, including Llama, Mistral, and other open-source models. This guide will walk you through setting up Ollama and configuring models to use with Promptly.
What is Ollama?
Ollama is an open-source tool that lets you run large language models locally on your Mac. It provides a simple way to download, manage, and use various open-source AI models without sending your data to external services.
Benefits of using local models with Ollama include:
- Privacy: Your data stays on your device
- No API costs: Free to use as much as you want
- No internet required: Works offline
- Low latency: Responses can be faster than cloud services
Installing Ollama
- Visit the Ollama website
- Download the macOS installer
- Open the downloaded file and follow the installation instructions
- Once installed, Ollama will run in the background with a menu bar icon
Pulling Your First Model
Before you can use a model in Promptly, you need to download ("pull") it using Ollama:
Using the Ollama App
- Click the Ollama icon in your menu bar
- Select "Pull Model" from the menu
- Choose a model from the list or enter a specific model name
- Wait for the download to complete (this may take several minutes depending on the model size)
Using Terminal
You can also pull models using the Terminal:
# Pull the Mistral 7B model
ollama pull mistral
# Pull Llama 3 8B
ollama pull llama3
# Pull a specific model version
ollama pull codellama:7b
Popular models to start with:
- mistral - Balanced performance and size (7B parameters)
- llama3 - Meta's Llama 3 model (8B parameters)
- gemma:2b - Google's smaller Gemma model
- codellama - Specialized for coding tasks
Adding Ollama Models to Promptly
Once you've pulled a model with Ollama, add it to Promptly:
- Open Promptly's Preferences (⌘,)
- Navigate to the Models tab
- Find the Ollama group (or create it if it doesn't exist)
- Click the "+" button
- Enter the model details:
- Display Name: A user-friendly name (e.g., "Mistral 7B")
- API Name: Must be prefixed with "ollama:" followed by the model name (e.g., "ollama:mistral")
- Click "Add" to save the model
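The "ollama:" prefix convention above can be sketched in a few lines of Python (the helper name and validation are illustrative, not part of Promptly itself):

```python
def to_ollama_name(api_name: str) -> str:
    """Map a Promptly API name like "ollama:mistral" to the
    underlying Ollama model name ("mistral")."""
    prefix = "ollama:"
    if not api_name.startswith(prefix):
        raise ValueError(f"expected an API name starting with {prefix!r}, got {api_name!r}")
    # Everything after the prefix, including any tag, is the Ollama name
    return api_name[len(prefix):]
```

For example, `to_ollama_name("ollama:mistral")` returns `"mistral"`, and tagged names like `"ollama:codellama:7b"` map to `"codellama:7b"`.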
Hardware Considerations
Local models require significant system resources:
- Memory: Models can use 4GB-32GB RAM depending on their size
- Storage: Model files range from 2GB to 20GB
- CPU/GPU: Models run faster on Apple Silicon Macs or Macs with dedicated GPUs
For the best experience:
- Start with smaller models (7B or less) on machines with limited resources
- Ensure you have at least 8GB of RAM, preferably 16GB+
- Apple Silicon Macs (M1/M2/M3) provide significantly better performance
Troubleshooting
If you're having issues with Ollama models:
- Ensure Ollama is running: Check for the Ollama icon in your menu bar
- Verify the model is pulled: Open Terminal and run "ollama list" to see available models
- Check the API name: The API name in Promptly must exactly match the model name in Ollama, prefixed with "ollama:"
- Restart Ollama: Sometimes restarting the Ollama service can resolve connection issues
- Check system resources: If your Mac is low on memory, models may fail to load
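To rule out a connection problem programmatically, you can probe the local Ollama server, which by default listens on http://localhost:11434 (a rough sketch; the function name and timeout are arbitrary choices):

```python
import urllib.error
import urllib.request


def ollama_is_running(base_url: str = "http://localhost:11434",
                      timeout: float = 2.0) -> bool:
    """Return True if an Ollama server answers at base_url.

    A running server typically responds to GET / with a 200 and the
    text "Ollama is running"; any connection error means it is down
    or unreachable.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

If this returns False, start (or restart) the Ollama app before retrying the model in Promptly.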
Advanced: Customizing Models
Ollama supports creating custom model configurations using Modelfiles:
# Example Modelfile for a custom Mistral configuration
FROM mistral
# Set different parameters
PARAMETER temperature 0.7
PARAMETER top_p 0.9
# Add a custom system message
SYSTEM You are a helpful AI assistant specialized in explaining complex topics simply.
Save this to a file named "Modelfile" and create your custom model:
ollama create mycustom -f ./Modelfile
Then add it to Promptly with the API name "ollama:mycustom".
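Under the hood, clients reach models like this through Ollama's local REST API. As a minimal sketch, this is the kind of JSON body a client might POST to the /api/generate endpoint for the custom model (the endpoint and fields follow Ollama's API; the helper itself is illustrative):

```python
import json


def build_generate_request(model: str, prompt: str) -> str:
    """Build the JSON body for a POST to Ollama's /api/generate.

    Setting "stream" to False requests a single JSON response
    instead of a stream of chunks.
    """
    return json.dumps({
        "model": model,    # e.g. "mycustom" -- no "ollama:" prefix at this layer
        "prompt": prompt,
        "stream": False,
    })


body = build_generate_request("mycustom", "Explain entropy simply.")
```

Note that the "ollama:" prefix is a Promptly convention; the request sent to Ollama uses the bare model name.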