
[feat][evaluation] add skip evaluation feature for evaluators#512

Open
HearyShen wants to merge 6 commits into main from feat/skip_evaluator


Conversation

@HearyShen
Collaborator

Implement ShouldSkipEvaluator method to allow evaluators to skip execution based on input data. This includes adding the interface method, default implementations for prompt and code evaluators, and integration into the evaluation workflow. The feature helps optimize evaluation by skipping unnecessary runs.
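
As a rough illustration of the shape described above, the sketch below models a skip check on an evaluator interface with default implementations that never skip. Everything except the names `ShouldSkip` and `EvaluatorOutputData` is hypothetical: the `SkipChecker` interface, the `EvaluatorInputData` fields, the `Reasoning` field, the `EmptyAnswerSkipper` type, and the convention that a nil result means "do not skip" are assumptions made for this example, not the project's actual API.

```go
package main

import (
	"context"
	"fmt"
)

// EvaluatorInputData and EvaluatorOutputData are simplified stand-ins.
// Their fields are hypothetical, not the project's real definitions.
type EvaluatorInputData struct {
	Fields map[string]string
}

type EvaluatorOutputData struct {
	Reasoning string
}

// SkipChecker is a hypothetical interface carrying the new method.
// Assumed convention: a nil result means "do not skip".
type SkipChecker interface {
	ShouldSkip(ctx context.Context, in *EvaluatorInputData) (*EvaluatorOutputData, error)
}

// PromptEvaluator and CodeEvaluator stand for the two built-in kinds.
// Their default implementations are assumed to never skip.
type PromptEvaluator struct{}

func (PromptEvaluator) ShouldSkip(context.Context, *EvaluatorInputData) (*EvaluatorOutputData, error) {
	return nil, nil
}

type CodeEvaluator struct{}

func (CodeEvaluator) ShouldSkip(context.Context, *EvaluatorInputData) (*EvaluatorOutputData, error) {
	return nil, nil
}

// EmptyAnswerSkipper is an illustrative custom evaluator that skips
// whenever the "answer" field is missing or empty.
type EmptyAnswerSkipper struct{}

func (EmptyAnswerSkipper) ShouldSkip(_ context.Context, in *EvaluatorInputData) (*EvaluatorOutputData, error) {
	if in == nil || in.Fields["answer"] == "" {
		return &EvaluatorOutputData{Reasoning: "skipped: empty answer"}, nil
	}
	return nil, nil
}

func main() {
	in := &EvaluatorInputData{Fields: map[string]string{"answer": ""}}
	checkers := []SkipChecker{PromptEvaluator{}, CodeEvaluator{}, EmptyAnswerSkipper{}}
	for _, c := range checkers {
		out, err := c.ShouldSkip(context.Background(), in)
		fmt.Printf("%T skip=%v err=%v\n", c, out != nil, err)
	}
}
```

Keeping the defaults as no-ops means existing evaluators behave as before unless they opt in to skipping.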

What type of PR is this?

Check the PR title

  • This PR title matches the format: [<type>][<scope>] <description>. For example: [fix][backend] flaky fix
  • The description of this PR title is user-oriented and clear enough for others to understand.
  • Add documentation if the current PR requires user awareness at the usage level.
  • This PR is written in English. PRs not in English will not be reviewed.

(Optional) Translate the PR title into Chinese

(Optional) More detailed description for this PR(en: English/zh: Chinese)

en:
zh(optional):

(Optional) Which issue(s) this PR fixes

@HearyShen HearyShen changed the title feat(evaluator): add skip evaluation feature for evaluators [feat][evaluator] add skip evaluation feature for evaluators May 7, 2026
@HearyShen HearyShen changed the title [feat][evaluator] add skip evaluation feature for evaluators [feat][evaluation] add skip evaluation feature for evaluators May 7, 2026
HearyShen added 3 commits May 7, 2026 12:15
… and create record when skipped

Modify ShouldSkip methods to return EvaluatorOutputData instead of EvaluatorRecord
Update ShouldSkipEvaluator to create a record with output data when skipped
Adjust tests and mocks to reflect the new behavior
Add mock expectation for ShouldSkipEvaluator in all test cases to ensure proper test coverage of evaluator skipping logic
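
The commit notes above say the per-evaluator check returns output data and that `ShouldSkipEvaluator` builds the record when a run is skipped. A minimal sketch of that split follows. The signature of `ShouldSkipEvaluator`, the `skipFunc` type, the `skipEmptyAnswer` and `neverSkip` helpers, the struct fields, and the `"skipped"` status string are all assumptions for illustration; only the type and method names come from the PR text.

```go
package main

import (
	"context"
	"fmt"
)

// Simplified stand-ins; all fields below are hypothetical.
type EvaluatorInputData struct {
	Fields map[string]string
}

type EvaluatorOutputData struct {
	Reasoning string
}

type EvaluatorRecord struct {
	EvaluatorID int64
	Status      string
	Output      *EvaluatorOutputData
}

// skipFunc models a per-evaluator check that returns only output data.
// Assumed convention: nil output means "do not skip".
type skipFunc func(ctx context.Context, in *EvaluatorInputData) (*EvaluatorOutputData, error)

// skipEmptyAnswer is an illustrative check that skips on an empty "answer".
func skipEmptyAnswer(_ context.Context, in *EvaluatorInputData) (*EvaluatorOutputData, error) {
	if in == nil || in.Fields["answer"] == "" {
		return &EvaluatorOutputData{Reasoning: "skipped: empty answer"}, nil
	}
	return nil, nil
}

// neverSkip models a default implementation that never skips.
func neverSkip(context.Context, *EvaluatorInputData) (*EvaluatorOutputData, error) {
	return nil, nil
}

// ShouldSkipEvaluator wraps the output data in a record when the check
// asks to skip. The record is created here, not inside the check.
func ShouldSkipEvaluator(ctx context.Context, evaluatorID int64, check skipFunc, in *EvaluatorInputData) (*EvaluatorRecord, bool, error) {
	out, err := check(ctx, in)
	if err != nil {
		return nil, false, err
	}
	if out == nil {
		return nil, false, nil
	}
	rec := &EvaluatorRecord{EvaluatorID: evaluatorID, Status: "skipped", Output: out}
	return rec, true, nil
}

func main() {
	in := &EvaluatorInputData{Fields: map[string]string{"answer": ""}}
	rec, skipped, err := ShouldSkipEvaluator(context.Background(), 7, skipEmptyAnswer, in)
	fmt.Println(skipped, err, rec != nil)
}
```

Returning plain output data from the check keeps record construction in one place, so individual evaluators do not need to know how records are built or stored.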
@codecov

codecov Bot commented May 7, 2026

Codecov Report

❌ Patch coverage is 89.65517% with 6 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...aluation/domain/service/expt_run_item_turn_impl.go | 60.00% | 4 Missing and 2 partials ⚠️ |

@@            Coverage Diff             @@
##             main     #512      +/-   ##
==========================================
+ Coverage   77.13%   77.15%   +0.01%     
==========================================
  Files         650      650              
  Lines       72599    72657      +58     
==========================================
+ Hits        56001    56059      +58     
- Misses      13257    13258       +1     
+ Partials     3341     3340       -1     
| Flag | Coverage Δ |
|---|---|
| unittests | 77.15% <89.65%> (+0.01%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
|---|---|
| ...odules/evaluation/domain/service/evaluator_impl.go | 83.58% <100.00%> (+0.94%) ⬆️ |
| ...ation/domain/service/evaluator_source_code_impl.go | 84.38% <100.00%> (+0.04%) ⬆️ |
| ...ion/domain/service/evaluator_source_prompt_impl.go | 75.66% <100.00%> (+0.09%) ⬆️ |
| ...aluation/domain/service/expt_run_item_turn_impl.go | 85.80% <60.00%> (-0.85%) ⬇️ |

... and 4 files with indirect coverage changes

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6f1c016...4b5ef43.

HearyShen added 2 commits May 8, 2026 15:52
Add error handling for ShouldSkipEvaluator call and log warning when skip check fails
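
One way the error handling described in this commit could look inside the evaluation workflow is sketched below. It assumes that a failed skip check is treated as non-fatal, so the evaluator still runs after the warning is logged; the PR text only states that a warning is logged. The `evaluateTurn` function and the `failingCheck`, `alwaysSkip`, and `runEvaluator` helpers are hypothetical names introduced for this example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"log"
)

// EvaluatorRecord is a simplified stand-in with a hypothetical field.
type EvaluatorRecord struct {
	Status string
}

type skipCheck func(ctx context.Context) (*EvaluatorRecord, bool, error)
type runFunc func(ctx context.Context) (*EvaluatorRecord, error)

// failingCheck simulates a skip check that returns an error.
func failingCheck(context.Context) (*EvaluatorRecord, bool, error) {
	return nil, false, errors.New("skip check unavailable")
}

// alwaysSkip simulates a skip check that asks to skip the run.
func alwaysSkip(context.Context) (*EvaluatorRecord, bool, error) {
	return &EvaluatorRecord{Status: "skipped"}, true, nil
}

// runEvaluator simulates a normal evaluator run.
func runEvaluator(context.Context) (*EvaluatorRecord, error) {
	return &EvaluatorRecord{Status: "success"}, nil
}

// evaluateTurn calls the skip check first. On error it logs a warning
// and falls through to the normal run (an assumption of this sketch).
func evaluateTurn(ctx context.Context, check skipCheck, run runFunc) (*EvaluatorRecord, error) {
	rec, skipped, err := check(ctx)
	if err != nil {
		log.Printf("[warn] skip check failed, running evaluator anyway: %v", err)
	} else if skipped {
		return rec, nil
	}
	return run(ctx)
}

func main() {
	ctx := context.Background()
	for _, check := range []skipCheck{failingCheck, alwaysSkip} {
		rec, err := evaluateTurn(ctx, check, runEvaluator)
		fmt.Println(rec.Status, err)
	}
}
```

Falling back to a normal run keeps a faulty skip check from silently dropping evaluations; a stricter design could return the error instead.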