Overview
Open-source LLM evaluation framework (DeepEval) and managed platform from Confident AI for systematic testing.
Details
Confident AI maintains DeepEval, an open-source LLM evaluation framework, alongside a managed platform for systematic LLM testing. DeepEval provides pytest-style evaluation in Python, with metrics covering hallucination, answer relevancy, faithfulness, and many other dimensions.
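A minimal sketch of the pytest-style workflow, assuming the `deepeval` package is installed and an LLM judge is configured (e.g. via `OPENAI_API_KEY`); the input/output strings are illustrative placeholders:

```python
import pytest
from deepeval import assert_test
from deepeval.metrics import AnswerRelevancyMetric
from deepeval.test_case import LLMTestCase


def test_answer_relevancy():
    # Wrap one model interaction as a test case:
    # the prompt sent to your LLM and the response it produced.
    test_case = LLMTestCase(
        input="What is DeepEval?",
        actual_output=(
            "DeepEval is an open-source framework for evaluating "
            "LLM outputs with metrics like relevancy and faithfulness."
        ),
    )
    # Score the output with an LLM-judged metric; the test fails
    # if the relevancy score falls below the threshold.
    metric = AnswerRelevancyMetric(threshold=0.7)
    assert_test(test_case, [metric])
```

Run it like any other test suite with `deepeval test run test_example.py`; because the metric is judged by an LLM, scores are non-deterministic and require API credentials at run time.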
Tags
runtime, evaluation, open-source, testing