Vision Mode

Learn how to use Vision Mode for visual analysis and screen capture

Promptly's Vision Mode enables comprehensive visual analysis capabilities, allowing you to analyze images, screenshots, and visual content using AI models that support vision.

What is Vision Mode?

Vision Mode allows you to capture and analyze visual content directly within Promptly. When enabled, it automatically captures your screen and sends the image along with your text prompt to vision-capable AI models for analysis.

Supported Models

Vision Mode works with the following AI models:

  • OpenAI: GPT-4o and other vision-enabled models
  • Anthropic: Claude 4 and other vision-capable Claude models
  • Google: Gemini 2.0+ models with vision support
  • Ollama: Vision models like LLaVA and Llama 3.2-Vision

Only models that support visual analysis will show the vision toggle option.

Enabling Vision Mode

Using the Vision Toggle

  1. Open Promptly's floating window
  2. Look for the eye icon (👁️) in the interface
  3. Click the eye icon to toggle Vision Mode on/off
  4. When enabled, the icon will be highlighted to indicate Vision Mode is active

Using Keyboard Shortcut

You can also enable Vision Mode using the keyboard shortcut:

  • Default: Cmd+Shift+S
  • This shortcut can be customized in Promptly's Preferences

How Vision Mode Works

When Vision Mode is enabled:

  1. Automatic Screen Capture: Promptly automatically captures your current screen
  2. AI Analysis: The captured image is sent to your selected vision-capable AI model
  3. Combined Analysis: The AI analyzes both your text prompt and the visual content
  4. Contextual Response: You receive responses that take into account what's visible on your screen

Using Vision Mode Effectively

Best Practices

  • Be Specific: Describe what you want analyzed in the image
  • Clear Screens: Ensure relevant content is visible and not obscured
  • Context Matters: Provide additional context in your text prompt when needed

Example Use Cases

  • Code Review: "Analyze this code for potential issues"
  • Design Feedback: "What improvements can be made to this UI design?"
  • Data Analysis: "Summarize the trends shown in this chart"
  • Troubleshooting: "Help me understand what's wrong with this error message"
  • Content Creation: "Create alt text for this image"

Vision Mode with Custom Actions

Vision Mode integrates seamlessly with Custom Actions:

Auto-Vision Mode

  • Enable Auto-vision mode for specific custom actions
  • When triggered, these actions automatically capture screen content
  • Perfect for actions that regularly need visual context

Setting Up Auto-Vision Actions

  1. Open Promptly Preferences
  2. Navigate to Custom Actions
  3. Create or edit an action
  4. Enable the Auto-vision mode option
  5. Save your action

Now whenever this action is triggered, it will automatically include screen capture without needing to manually enable Vision Mode.

Visual Feedback

Promptly provides clear visual indicators for Vision Mode:

  • Eye Icon: Shows current Vision Mode state
  • Highlighted Toggle: Active when Vision Mode is enabled
  • Contextual Tooltips: Hover over the eye icon for status information
  • Model Filtering: Only vision-capable models show the vision toggle

Troubleshooting Vision Mode

Vision Toggle Not Visible

If you don't see the vision toggle (eye icon):

  • Ensure you've selected a vision-capable AI model
  • Check that your selected model supports image analysis
  • Switch to a supported model like GPT-4o or Claude 4

Screen Capture Issues

If screen capture isn't working properly:

  • Check macOS Privacy & Security settings
  • Ensure Promptly has Screen Recording permissions
  • Try toggling Vision Mode off and on again

Poor Analysis Results

For better vision analysis:

  • Use high-contrast, clear images
  • Ensure text in screenshots is readable
  • Provide specific questions about what you want analyzed
  • Consider the limitations of the AI model you're using

Privacy and Security

When using Vision Mode:

  • Screen captures are sent to your selected AI provider
  • Images are processed according to the AI provider's privacy policy
  • Consider sensitive information that might be visible on screen
  • Use Vision Mode thoughtfully with confidential content

Vision Mode makes it easy to get AI assistance with visual content, whether you're analyzing code, reviewing designs, or getting help with any visual information on your screen.