tao-generate-image-grounding

by NVIDIA

Two-step image grounding pipeline: extracts referring expressions from (image, caption) pairs, then grounds each expression to a pixel-space bounding box using a vision-language model (VLM). Use when the user wants to ground captions to bounding boxes, generate phrase-grounded annotations, auto-label images for grounding, or run the image_grounding pipeline. Triggers include 'image grounding', 'phrase grounding', 'ground captions', 'auto-label image grounding', 'image_grounding'.
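The two steps can be illustrated with a minimal Python sketch. It is not the skill's actual implementation or API: the names `GroundedPhrase`, `to_pixel_box`, and `ground_caption` are hypothetical, the extraction and grounding steps are passed in as plain callables (in practice each would wrap a model call), and the grounding step is assumed to return boxes normalized to [0, 1], which may not match what a given VLM emits.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Assumed convention: (x_min, y_min, x_max, y_max), each normalized to [0, 1].
NormBox = Tuple[float, float, float, float]
PixelBox = Tuple[int, int, int, int]


@dataclass
class GroundedPhrase:
    """One referring expression with its pixel-space bounding box (hypothetical type)."""
    phrase: str
    bbox: PixelBox


def to_pixel_box(box: NormBox, width: int, height: int) -> PixelBox:
    """Clamp a normalized box to [0, 1] and scale it to pixel coordinates."""
    x0, y0, x1, y1 = (min(max(v, 0.0), 1.0) for v in box)
    return (round(x0 * width), round(y0 * height),
            round(x1 * width), round(y1 * height))


def ground_caption(
    caption: str,
    image_size: Tuple[int, int],
    extract: Callable[[str], List[str]],
    ground: Callable[[str], Optional[NormBox]],
) -> List[GroundedPhrase]:
    """Run both steps for a single (image, caption) pair.

    `extract` stands in for step 1 (caption -> referring expressions).
    `ground` stands in for step 2 (expression -> normalized box, or None
    when nothing is found); it is assumed to already hold the image.
    """
    width, height = image_size
    grounded = []
    for phrase in extract(caption):
        box = ground(phrase)
        if box is None:
            continue  # skip expressions that could not be grounded
        grounded.append(GroundedPhrase(phrase, to_pixel_box(box, width, height)))
    return grounded


# Stub callables standing in for the model calls, for illustration only.
def fake_extract(caption: str) -> List[str]:
    return ["a dog", "a red ball"]


fake_boxes = {"a dog": (0.1, 0.2, 0.5, 0.9)}
demo = ground_caption("a dog chasing a red ball", (640, 480),
                      fake_extract, fake_boxes.get)
```

Keeping the two steps behind separate callables makes it possible to swap either one out, or to test the coordinate conversion without loading a model.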

