Question 1

Which vision models are included?

Accepted Answer

VisionTagger includes four preconfigured vision models: Qwen2.5-VL 7B Instruct, Gemma 3 4B IT, InternVL3 8B Instruct, and Pixtral 12B. Smaller models generally run faster, while larger models may produce higher-detail output but require more memory, depending on your Mac and chosen settings. Use the trial to compare models and tweak parameters until the results match your workflow and preferred level of detail.

Question 2

Can I use my own models?

Accepted Answer

Yes. If you have a GGUF-compatible vision model and its matching projector file (also GGUF), you can link them in VisionTagger and use them like the built-in options. You are responsible for ensuring your use of third-party models complies with their licenses and terms.

Question 3

Does VisionTagger require an internet connection?

Accepted Answer

VisionTagger runs locally and does not upload your images or generated metadata. An internet connection is only needed to download models in-app and to check for and download app updates.

Question 4

How does the free trial work?

Accepted Answer

The free trial lets you process up to 100 images at no cost, with no time limit. You can explore the full workflow—model selection, built-in sections, custom fields, and export options—before purchasing.

Question 5

Which image formats and sources are supported?

Accepted Answer

VisionTagger supports common image formats such as JPEG, PNG, TIFF, HEIC, and WebP. You can select images from folders on your Mac or directly from your Photos Library.

Question 6

Can I customize the metadata fields?

Accepted Answer

Yes. In addition to built-in sections (Description, Keywords, Content & Style, Safety & Compliance), you can create custom sections and add your own fields. Each field supports a data type (Boolean, Text, or List of Texts) and its own prompt, so you can tailor exactly what the model extracts.

Question 7

What outputs can VisionTagger create?

Accepted Answer

VisionTagger can export JSON or TXT per image, or a single JSON/TXT file for an entire batch. It can also apply Finder tags. For XMP sidecars and embedding metadata into image files, VisionTagger integrates with [ExifTool](https://exiftool.org) (installed separately).

Question 8

Do I need to install ExifTool?

Accepted Answer

[ExifTool](https://exiftool.org) is only required for XMP sidecars and embedding metadata into image files. If you only export JSON/TXT or apply Finder tags, you do not need ExifTool.

Question 9

Can VisionTagger write back to my Photos Library?

Accepted Answer

Yes. VisionTagger can write metadata back to your Photos Library when you choose that output option. You will always see a publish summary before anything is written.

Question 10

Can I tune the model parameters?

Accepted Answer

Yes. In Settings you can adjust generation parameters such as temperature, max tokens, context length, top-P, and top-K using sliders. This helps you balance creativity versus consistency and control output length and detail.

Question 11

How fast is it, and what Mac do I need?

Accepted Answer

VisionTagger requires Apple Silicon (M1 or later) and runs on macOS Tahoe 26.0 or later. Speed depends on your Mac, the selected model, image resolution, and your chosen metadata fields. Smaller models typically run faster; larger models can produce higher-quality results but may require more RAM.

Question 12

How much disk space do models use?

Accepted Answer

Model downloads are stored locally. Plan for roughly 4–8 GB per model (varies by model).

Question 13

Will VisionTagger overwrite existing files or metadata?

Accepted Answer

VisionTagger shows a publish summary before writing any outputs and warns you if existing files may be overwritten. You can review the actions and confirm before anything is saved.

Question 14

Does VisionTagger collect any usage data or analytics?

Accepted Answer

No. VisionTagger does not include analytics or telemetry, and it does not upload your data. Licensing activation and update checks involve network requests as needed for those functions.

Generate image metadata. Locally.

Choose a model — or bring your own

Define your own metadata schema

Export metadata where you need it

Use cases

System requirements

From images to metadata — in six steps

One-Time Purchase

Frequently Asked Questions