LLM Eval Harness

by charlie-morrison v1.0.1

Evaluate LLM outputs systematically: run test suites, score responses for accuracy, relevance, and safety, compare models, and detect regressions in AI applications.
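
As a rough illustration of the workflow described above (run a suite, score responses, compare models, flag regressions), here is a minimal Python sketch. It is not this skill's actual API: `EvalCase`, `score_response`, `run_suite`, and `find_regressions` are hypothetical names, the keyword-match scorer is a simple stand-in for real accuracy/relevance/safety scoring, and the "models" are stub functions rather than real LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvalCase:
    """One test case: a prompt plus keywords a good answer should contain (hypothetical schema)."""
    prompt: str
    expected_keywords: List[str]


def score_response(response: str, case: EvalCase) -> float:
    """Return the fraction of expected keywords found in the response (case-insensitive)."""
    if not case.expected_keywords:
        return 1.0
    text = response.lower()
    hits = sum(1 for kw in case.expected_keywords if kw.lower() in text)
    return hits / len(case.expected_keywords)


def run_suite(model: Callable[[str], str], suite: List[EvalCase]) -> float:
    """Run every case through the model and return the mean score."""
    if not suite:
        return 0.0
    return sum(score_response(model(c.prompt), c) for c in suite) / len(suite)


def find_regressions(
    baseline: Dict[str, float],
    current: Dict[str, float],
    tolerance: float = 0.05,
) -> List[str]:
    """Return names whose score dropped by more than `tolerance` relative to the baseline."""
    return sorted(
        name
        for name, score in current.items()
        if name in baseline and baseline[name] - score > tolerance
    )


# Stub models standing in for real LLM calls (assumption for illustration only).
def good_model(prompt: str) -> str:
    return "Paris is the capital; red and blue are primary colors."


def weak_model(prompt: str) -> str:
    return "I think it is red."


suite = [
    EvalCase("What is the capital of France?", ["Paris"]),
    EvalCase("Name two primary colors.", ["red", "blue"]),
]

baseline_scores = {"assistant": run_suite(good_model, suite)}
current_scores = {"assistant": run_suite(weak_model, suite)}
regressed = find_regressions(baseline_scores, current_scores)
```

In this sketch a regression is simply a drop in mean suite score beyond a tolerance; a real harness would likely track per-case scores and use richer scoring methods.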

Downloads: 376
Installs: 1
Versions: 2

Install LLM Eval Harness with One Click

Get a managed OpenClaw server and install this skill from your dashboard. No SSH, no Docker, no configuration needed.