tao-generate-referring-expressions

by NVIDIA

Four-step image referring-expression pipeline: turns images plus KITTI bounding-box labels into region descriptions, scene captions, grounded referring expressions, and (optionally) verified expressions via VLM distillation.

Use when the user wants to generate referring-expression annotations from images with KITTI labels, build region descriptions, produce grouped grounding phrases tied to bboxes, run a double-check verification pass on grounding expressions, auto-label traffic / scene images for referring datasets, or run the image_referring_expression pipeline.

Triggers include 'referring expression', 'region description', 'KITTI labels', 'spatial relationship annotation', 'auto-label image referring expression', 'image_referring_expression'.
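For orientation on the input side, the sketch below shows one way to read bounding boxes out of KITTI-style label text before handing them to a pipeline like this one. It assumes the standard KITTI object-label layout (class name first, 2D box as left, top, right, bottom in fields 5 through 8); the names `KittiBox` and `parse_kitti_labels` are hypothetical and are not part of the skill, whose own input handling is not shown here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class KittiBox:
    """Hypothetical container for one labeled object (not the skill's own type)."""
    label: str
    left: float
    top: float
    right: float
    bottom: float


def parse_kitti_labels(text: str) -> List[KittiBox]:
    """Parse KITTI-style label text into 2D boxes.

    Assumes the standard KITTI column order:
    type, truncated, occluded, alpha, left, top, right, bottom, ...
    Only the class name and the 2D box are kept.
    """
    boxes = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        left, top, right, bottom = (float(v) for v in fields[4:8])
        boxes.append(KittiBox(fields[0], left, top, right, bottom))
    return boxes


# Two made-up label lines, for illustration only.
sample = (
    "car 0.00 0 0.00 100.0 120.0 300.0 260.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0\n"
    "pedestrian 0.00 0 0.00 400.0 150.0 450.0 280.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0"
)
boxes = parse_kitti_labels(sample)
for box in boxes:
    print(box.label, box.left, box.top, box.right, box.bottom)
```

Each parsed box carries the class name and pixel coordinates that a region description or referring expression would be tied to.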

