LLM Eval Harness

by charlie-morrison v1.0.1

Evaluate LLM outputs systematically: run test suites, score responses for accuracy, relevance, and safety, compare models, and detect regressions in AI applications.
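
The workflow described above (run a suite, score responses, compare models, flag regressions) can be sketched roughly as follows. This is a minimal illustration, not the skill's actual interface: `EvalCase`, `score_response`, `run_suite`, `detect_regression`, the keyword-match scoring rule, the 0.05 tolerance, and the two stub models are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical types for illustration only; not the skill's real API.
@dataclass
class EvalCase:
    prompt: str
    expected_keywords: List[str]

Model = Callable[[str], str]  # a model is anything mapping a prompt to a response


def score_response(response: str, case: EvalCase) -> float:
    """Fraction of expected keywords found in the response (case-insensitive)."""
    if not case.expected_keywords:
        return 1.0
    text = response.lower()
    hits = sum(1 for kw in case.expected_keywords if kw.lower() in text)
    return hits / len(case.expected_keywords)


def run_suite(model: Model, suite: List[EvalCase]) -> float:
    """Mean score of a model over every case in the suite."""
    if not suite:
        return 0.0
    return sum(score_response(model(c.prompt), c) for c in suite) / len(suite)


def detect_regression(baseline: float, candidate: float, tolerance: float = 0.05) -> bool:
    """Flag a regression when the candidate falls below baseline minus tolerance."""
    return candidate < baseline - tolerance


# Stub models standing in for real LLM calls (assumed for the example).
def model_a(prompt: str) -> str:
    answers = {
        "Capital of France?": "The capital of France is Paris.",
        "2 + 2?": "2 + 2 equals 4.",
    }
    return answers.get(prompt, "")


def model_b(prompt: str) -> str:
    answers = {
        "Capital of France?": "Paris.",
        "2 + 2?": "I am not sure.",
    }
    return answers.get(prompt, "")


SUITE = [
    EvalCase("Capital of France?", ["Paris"]),
    EvalCase("2 + 2?", ["4"]),
]

# Compare the two models on the same suite, treating model_a as the baseline.
scores = {"model_a": run_suite(model_a, SUITE), "model_b": run_suite(model_b, SUITE)}
regressed = detect_regression(scores["model_a"], scores["model_b"])
print(scores, "regression:", regressed)
```

Keyword matching is only a stand-in for accuracy scoring. Relevance or safety checks would need their own scoring functions, which could be plugged in where `score_response` is called.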


Install LLM Eval Harness with One Click

Get a managed OpenClaw server and install this skill from your dashboard. No SSH, no Docker, no configuration needed.
