Active
Project: AI Eval
Version: 1.0.0-alpha10
Component: Code
Priority: Normal
Category: Bug report
Assigned: Unassigned
Reporter:
Created: 25 Apr 2026 at 11:22 UTC
Updated: 25 Apr 2026 at 11:22 UTC
The shipped sample dataset at data/sample.yaml contains rows that exercise graders not present in src/Plugin/AiEvalGrader/:
S07, S08 use expected.route and also_valid_routes, intended for a RouteGrader plugin that is not shipped.
S09, S10 use expected.tier, expected.entity_type, expected.project_id, and also_valid, intended for a StructuredMatchGrader plugin that is not shipped.
These rows came from the local reference implementation, where those graders are defined in a separate module. They were copied into the contrib's sample dataset without their corresponding plugins. Anyone running drush ai-eval:run against the shipped sample with the shipped graders gets nothing useful from these rows: the relevant grader is missing, so the row's expected fields are silently ignored.
sample.yaml with the shipped graders enabled.
drush ai-eval:run.
Replace the orphaned rows with examples that exercise graders we actually ship:
tool_usage_grader examples using expected_tools.
fact_match_grader examples using expected_facts and must_not_contain.
sample.yaml noting that domain-specific fields (privacy tiers, entity types, route classification) belong in your own dataset, not the contrib's sample.
data/sample.yaml: drop S07-S10, add two tool_usage_grader rows and two fact_match_grader rows.
README.md "Dataset Format" examples if any reference the dropped fields.
None.
None.
None. data/sample.yaml is documentation, not part of the data model.
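For illustration only, the replacement rows could look roughly like the sketch below. Only expected_tools, expected_facts, and must_not_contain are field names taken from this issue. The top-level list layout, the id, grader, and input keys, the reuse of the S07-S10 IDs, and every value shown are assumptions, so check them against the actual dataset format before copying anything.

```yaml
# Hypothetical replacement rows for data/sample.yaml.
# Assumed: the dataset is a top-level list, and each row has id, grader,
# and input keys. Only expected_tools, expected_facts, and must_not_contain
# are named in the issue; all values are invented for the example.
#
# Domain-specific fields (privacy tiers, entity types, route classification)
# belong in your own dataset, not in this sample.

- id: S07
  grader: tool_usage_grader
  input: "What is the weather in Berlin right now?"
  expected_tools:
    - get_weather

- id: S08
  grader: tool_usage_grader
  input: "Find recent articles about caching and summarize them."
  expected_tools:
    - search_content
    - summarize

- id: S09
  grader: fact_match_grader
  input: "What is the capital of France?"
  expected_facts:
    - "Paris"
  must_not_contain:
    - "Lyon"

- id: S10
  grader: fact_match_grader
  input: "How many days are in a leap year?"
  expected_facts:
    - "366"
  must_not_contain:
    - "365"
```

Rows of this shape use only fields that the two shipped graders are described as consuming, so none of them would be silently ignored.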