
Generate image metadata. Locally.

Create rich, structured metadata from images using local vision models — without uploads, subscriptions, or data leaving your Mac.

Requires Apple Silicon Mac with macOS 26

VisionTagger will be available soon. Leave us a note and we'll email you on launch.

VisionTagger generated metadata for an image using local AI

Choose a model — or bring your own

Download preconfigured vision models in-app, or link your own GGUF model and projector files. Choose the model that best fits your images and quality requirements, then fine-tune generation with adjustable parameters to get consistent, repeatable results. All processing runs locally on Apple Silicon (M1 or later), using your Mac's own hardware instead of a cloud service.

VisionTagger model selection interface

Define your own metadata schema

Generate only the metadata you actually need. Enable built-in sections such as Description, Keywords, Content & Style, and Safety & Compliance — then extend them with custom sections and fields tailored to your workflow. For each field, choose a data type (Boolean, Text, or List of Texts) and write a prompt that instructs the model exactly what to extract. The result is structured metadata that matches your conventions and stays consistent across large batches.
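To make the section/field/prompt model above concrete, here is a minimal sketch of what such a schema could look like expressed as plain data, together with a small validity check. The section name, field names, and prompts are purely illustrative assumptions, not VisionTagger's actual configuration format.

```python
# Hypothetical sketch of a custom metadata section, as described above.
# Names, prompts, and structure are illustrative, not VisionTagger's file format.
schema = {
    "section": "Content & Style",
    "fields": [
        {
            "name": "contains_people",
            "type": "Boolean",
            "prompt": "Does the image show any recognizable people?",
        },
        {
            "name": "dominant_colors",
            "type": "List of Texts",
            "prompt": "List the three most dominant colors in the image.",
        },
    ],
}

# The three field data types named in the text.
VALID_TYPES = {"Boolean", "Text", "List of Texts"}

def validate(schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the schema is consistent."""
    problems = []
    for field in schema.get("fields", []):
        if field.get("type") not in VALID_TYPES:
            problems.append(f"{field.get('name')}: unknown type {field.get('type')!r}")
        if not field.get("prompt"):
            problems.append(f"{field.get('name')}: missing prompt")
    return problems

print(validate(schema))  # → []
```

Keeping each field a (name, type, prompt) triple is what makes batch output consistent: every image is asked the same questions and answers land in the same typed slots.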

VisionTagger content configuration showing customizable metadata sections and fields

Export metadata where you need it

Publish metadata in the format that best fits your pipeline. For XMP sidecars and embedded metadata, VisionTagger integrates with ExifTool — an industry-standard, widely trusted utility. Your metadata will appear in apps like Adobe Lightroom, Bridge, Capture One, Photo Mechanic, darktable, and any other software that reads XMP. Write back to your Photos Library, export JSON or TXT per image, or generate a single file for an entire run. Add Finder tags for fast organization in macOS. Select multiple outputs at once and configure them together — so one generation pass can feed every destination you use.

Example of VisionTagger publish configuration
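For pipelines that consume the per-image JSON output, a downstream script can be very small. The JSON layout below is an assumption for illustration only; VisionTagger's actual export structure may differ.

```python
# Hypothetical sketch: consuming a per-image JSON export downstream.
# The field names shown here are assumed, not VisionTagger's real layout.
import json

exported = """
{
  "description": "A red kayak on a calm alpine lake at sunrise.",
  "keywords": ["kayak", "lake", "sunrise", "alpine"],
  "content_style": {"mood": "serene", "style": "landscape photography"}
}
"""

meta = json.loads(exported)

# Flatten keywords into one line, e.g. for a DAM import or a search index.
keyword_line = ", ".join(meta["keywords"])
print(keyword_line)  # kayak, lake, sunrise, alpine
```

Because the export is plain JSON, the same file can feed a search index, a spreadsheet, or a custom script without any VisionTagger-specific tooling.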

Use cases

  • Anyone who wants to find that one image later can generate searchable descriptions, keywords, and tags for their library.

  • Photographers managing large shoots speed up culling, cataloging, delivery, and archive workflows with consistent metadata.

  • Designers organizing asset libraries make mood, style, subject matter, and usage easier to search across projects.

  • Researchers and archivists tagging collections create structured, consistent metadata for datasets, records, and long-term preservation.

System requirements

  • macOS Tahoe 26.0 or later

  • Apple Silicon required (M1 or later)

  • For optimal performance with larger models, 16 GB of RAM or more is recommended

  • Model storage: plan for ~4–8 GB per model (downloaded locally)

From images to metadata — in six steps

One-Time Purchase

€29.99
Launch offer €24.99

VAT included (except US & CA)

Free trial: 100 images, no time limit
Single payment. No recurring fees.
Single user. Multiple Macs.

Secure payment via FastSpring


Frequently Asked Questions

Which vision models are included?

VisionTagger includes four preconfigured vision models: Qwen2.5-VL 7B Instruct, Gemma 3 4B IT, InternVL3 8B Instruct, and Pixtral 12B. Smaller models generally run faster, while larger models may produce higher-detail output but require more memory, depending on your Mac and chosen settings. Use the trial to compare models and tweak parameters until the results match your workflow and preferred level of detail.

Can I use my own models?

Yes. If you have a GGUF-compatible vision model and its matching projector file (also GGUF), you can link them in VisionTagger and use them like the built-in options. You are responsible for ensuring your use of third-party models complies with their licenses and terms.

Does VisionTagger require an internet connection?

VisionTagger runs locally and does not upload your images or generated metadata. An internet connection is only needed to download models in-app and to check for and download app updates.

How does the free trial work?

The free trial lets you process up to 100 images at no cost, with no time limit. You can explore the full workflow—model selection, built-in sections, custom fields, and export options—before purchasing.

Which image formats and sources are supported?

VisionTagger supports common image formats such as JPEG, PNG, TIFF, HEIC, and WebP. You can select images from folders on your Mac or directly from your Photos Library.

Can I customize the metadata fields?

Yes. In addition to built-in sections (Description, Keywords, Content & Style, Safety & Compliance), you can create custom sections and add your own fields. Each field supports a data type (Boolean, Text, or List of Texts) and its own prompt, so you can tailor exactly what the model extracts.

What outputs can VisionTagger create?

VisionTagger can export JSON or TXT per image, or a single JSON/TXT file for an entire batch. It can also apply Finder tags. For XMP sidecars and embedding metadata into image files, VisionTagger integrates with ExifTool (installed separately).

Do I need to install ExifTool?

ExifTool is only required for XMP sidecars and embedding metadata into image files. If you only export JSON/TXT or apply Finder tags, you do not need ExifTool.

Can VisionTagger write back to my Photos Library?

Yes. VisionTagger can write metadata back to your Photos Library when you choose that output option. You will always see a publish summary before anything is written.

Can I tune the model parameters?

Yes. In Settings you can adjust generation parameters such as temperature, max tokens, context length, top-P, and top-K using sliders. This helps you balance creativity versus consistency and control output length and detail.
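To illustrate what these controls do, here is a generic sketch of how temperature, top-K, and top-P (nucleus sampling) filter a model's next-token distribution. This is a textbook illustration of the sampling technique, not VisionTagger's internals.

```python
# Generic illustration of temperature, top-K, and top-P filtering.
# Not VisionTagger's implementation; a standard nucleus-sampling sketch.
import math

def filter_logits(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return the renormalized token probabilities after applying all three controls."""
    # Temperature rescales logits: lower values sharpen the distribution.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank tokens from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]  # top-K: keep only the K most likely tokens

    # top-P (nucleus): keep the smallest set whose cumulative mass >= top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

# Lower temperature plus a tighter top_p keeps fewer candidate tokens,
# trading creativity for consistency, exactly the balance described above.
print(filter_logits([2.0, 1.0, 0.5, 0.1], temperature=0.7, top_p=0.9))
```

In practice: for repeatable batch metadata, a low temperature and tight top-P favor consistency; raising them yields more varied, descriptive output.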

How fast is it, and what Mac do I need?

VisionTagger requires Apple Silicon (M1 or later) and runs on macOS Tahoe 26.0 or later. Speed depends on your Mac, the selected model, image resolution, and your chosen metadata fields. Smaller models typically run faster; larger models can produce higher-quality results but may require more RAM.

How much disk space do models use?

Model downloads are stored locally. Plan for roughly 4–8 GB per model (varies by model).

Will VisionTagger overwrite existing files or metadata?

VisionTagger shows a publish summary before writing any outputs and warns you if existing files may be overwritten. You can review the actions and confirm before anything is saved.

Does VisionTagger collect any usage data or analytics?

No. VisionTagger does not include analytics or telemetry, and it does not upload your data. License activation and update checks make network requests only as needed for those functions.