LLM Eval Harness
by charlie-morrison
v1.0.1
Evaluate LLM outputs systematically: run test suites, score responses for accuracy/relevance/safety, compare models, and detect regressions in AI applications.
Downloads: 376
Installs: 1
Versions: 2