MorganaBench SDK (Version 0.1)
Welcome to MorganaBench SDK
High-level overview
The MorganaBench SDK provides typed, validated Python schemas (Pydantic models) for representing:
- Benchmarks (evaluation datasets) for RAG and agentic systems, including inputs and expectations
- Executed benchmark results (inputs + expectations + outputs)
This SDK focuses on interoperable data shapes rather than on executing benchmarks itself.
What it is for
- Load MorganaBench benchmark files into typed Python `mb.entities.Example` objects
- Record executed benchmark results by populating `Example.outputs` for evaluation in MorganaBench
- Example shapes are designed to interoperate with MLflow and LangSmith
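A minimal stand-in sketch of this workflow, using a plain dataclass in place of the real Pydantic `mb.entities.Example` model. The JSONL layout and the field names inside `inputs`, `expectations`, and `outputs` are assumptions for illustration, not the SDK's actual schema; consult the API reference for the real shapes.

```python
import json
from dataclasses import dataclass, field
from typing import Any

# Stand-in for mb.entities.Example (illustrative only; the real SDK
# defines this as a validated Pydantic model with a richer schema).
@dataclass
class Example:
    inputs: dict[str, Any]
    expectations: dict[str, Any]
    outputs: dict[str, Any] = field(default_factory=dict)

# One benchmark record per JSONL line (field names are assumptions).
line = '{"inputs": {"question": "What is RAG?"}, "expectations": {"answer": "retrieval-augmented generation"}}'
record = json.loads(line)
example = Example(inputs=record["inputs"], expectations=record["expectations"])

# Record an executed result by populating outputs for later evaluation.
example.outputs = {"answer": "Retrieval-augmented generation combines a retriever with a generator."}
print(example.outputs["answer"])
```

Keeping inputs, expectations, and outputs on one object is what lets a single `Example` carry a benchmark case before execution and a result record after it.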
Contents:
- User Guide
- API Reference
API entities and matchers: ChatMessage, DateTimeMatcher, EmailMatcher, Environment, EqualsMatcher, Example, Expectations, FreeTextMatcher, InputMetadata, Inputs, MissingMatcher, NoToolCallAssertion, OneParameterAssertion, OptionalMatcher, Outputs, ParameterGroupAssertion, ToolCall, ToolCallAssertion, ToolResult, TurnMetadata, ValueMatcher, parameter_assertion()
Changelog
We follow Semantic Versioning (semver), where versions are written as x.y.z:
- x: major version
- y: minor version
- z: patch version
Patch updates are always backwards compatible; major and minor updates may introduce breaking changes, as is conventional for pre-1.0 releases.
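The compatibility rule above can be sketched as a small version check. `may_break` is a hypothetical helper, not part of the SDK:

```python
# Sketch of the stated policy for 0.x releases: a patch bump is
# backwards compatible; a major or minor bump may introduce breaks.
def may_break(installed: str, candidate: str) -> bool:
    x1, y1, _ = map(int, installed.split("."))
    x2, y2, _ = map(int, candidate.split("."))
    return (x1, y1) != (x2, y2)

print(may_break("0.1.3", "0.1.4"))  # patch bump only -> False
print(may_break("0.1.4", "0.2.0"))  # minor bump -> True
```

A dependency pin following this policy would therefore hold both major and minor fixed (e.g. allow only 0.1.x).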
v0.1.3
- Add retrieval trace event schema to outputs
- Add citations to outputs schema
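An executed example's outputs carrying the v0.1.3 additions (retrieval trace events and citations) might look roughly like the payload below. Every field name here is an assumption for illustration, not the actual schema:

```python
import json

# Hypothetical outputs payload with a retrieval trace and citations.
# Field names (response, retrieval_trace, citations, ...) are guesses.
outputs = {
    "response": "Paris is the capital of France.",
    "retrieval_trace": [
        {"event": "search", "query": "capital of France"},
        {"event": "retrieved", "doc_id": "wiki-paris", "score": 0.93},
    ],
    "citations": [{"doc_id": "wiki-paris", "start": 0, "end": 31}],
}
print(json.dumps(outputs, indent=2))
```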
v0.1.1
- Add the `mb` SDK package with Pydantic models for benchmark inputs, expectations, outputs, and examples
- Provide schema and example generators plus a `make schema` target and generated JSON schema/JSONL docs
- Add serialization tests for entity models and remove the demo test
- Update packaging metadata, build backend, and type-checking includes for the SDK
- Refresh README and add initial API documentation scaffold
v0.1.0
- Initial version
Copyright: 2026, TII AIIR Team | Version: 0.1.4