Multimodal Image Understanding
by Rytia
v1.0.0
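Understands image content by calling a multimodal model. Trigger scenarios: (1) the user asks to analyze, describe, extract, or OCR information from an image, and the current model does not support image input (for example, text-only models such as deepseek-v4 or glm 5.1); (2) the user explicitly asks to "use my vision model" or "call the multimodal API" to look at an image; (3) the user explicitly invokes this skill (/multimodal-i...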
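The trigger scenarios above can be read as a small decision rule. The following is a minimal sketch of that rule, not the skill's actual implementation: the `Request` type, the `should_trigger` function, and both keyword lists are hypothetical names introduced for illustration, and the skill's real matching logic is not shown in this listing.

```python
from dataclasses import dataclass

# Hypothetical keyword lists; the skill's real matching rules are not given in the listing.
IMAGE_TASK_HINTS = ("analyze", "describe", "extract", "ocr")
VISION_REQUEST_HINTS = ("use my vision model", "call the multimodal api")


@dataclass
class Request:
    text: str                          # the user's message
    has_image: bool                    # whether an image is attached
    model_supports_images: bool        # whether the current model accepts image input
    explicit_invocation: bool = False  # user invoked the skill's command directly


def should_trigger(req: Request) -> bool:
    """Return True if any of the three listed trigger scenarios applies."""
    text = req.text.lower()
    # Scenario 3: the user explicitly invoked the skill.
    if req.explicit_invocation:
        return True
    # Scenario 2: the user explicitly asked for a vision model or multimodal API.
    if any(hint in text for hint in VISION_REQUEST_HINTS):
        return True
    # Scenario 1: an image task was requested and the current model is text-only.
    wants_image_task = req.has_image and any(h in text for h in IMAGE_TASK_HINTS)
    return wants_image_task and not req.model_supports_images
```

Under these assumptions, an OCR request with an attached image triggers the skill only when the current model cannot read images itself, while an explicit request or invocation triggers it regardless.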
Description
Downloads: 50
Stars: 1
Installs: 1
Versions: 1
Latest Changes
Install Multimodal Image Understanding with One Click
Get a managed OpenClaw server and install this skill from your dashboard. No SSH, no Docker, no configuration needed.