Skip to content

feat(code): Suggest experiment opportunities in setup discovery#2109

Open
jurajmajerik wants to merge 2 commits intomainfrom
experiments/discovery-suggestions
Open

feat(code): Suggest experiment opportunities in setup discovery#2109
jurajmajerik wants to merge 2 commits intomainfrom
experiments/discovery-suggestions

Conversation

@jurajmajerik
Copy link
Copy Markdown
Contributor

@jurajmajerik jurajmajerik commented May 8, 2026

Problem

Setup currently only suggests bug fixes and cleanups. We want it to also suggest experiments to run.

Changes

The setup discovery scan now also looks for places in the code where an A/B test would make sense (CTAs, pricing, signup flows, empty states) and shows them as suggestion cards alongside the bugs.

Behind the posthog-code-experiment-suggestions flag. Off in prod, on in dev.

If the user clicks an experiment card, the agent creates a feature flag, a draft experiment with a primary metric, and wires the variants into the code. It reuses the existing run-experiment skill prompt for that part.

The agent only includes an experiment card if it finds a real candidate. When it does, it drops a lower-priority bug/cleanup card to make room.

How did you test this?

Built a small Vite + React + posthog-js app with a landing CTA, pricing, signup, empty state, and one XSS bug. Ran discovery on it. Got an experiment suggestion plus the XSS. Clicked the experiment card and the agent created the flag, the draft experiment (signup_submitted as primary metric, pricing_plan_selected as guardrail), and updated the code.

Prompt Output
image image

Follow-up: add analytics on what experiments are being suggested before we turn the flag on for more people.

Publish to changelog?

no

Adds an experiment-category tier to setup discovery, gated by the
posthog-code-experiment-suggestions flag. When enabled, the agent
scans for A/B-testable surfaces alongside the existing bug and
PostHog-instrumentation tiers, and the slot-reservation rule keeps
at least one experiment card if a credible candidate exists.
Accepted experiment cards build a task prompt that reuses the
run-experiment skill template to scaffold flag, draft experiment,
and code wiring end-to-end.

Defaults off in production; on in dev via import.meta.env.DEV.
# Conflicts:
#	apps/code/src/renderer/features/setup/services/setupRunService.ts
#	apps/code/src/renderer/features/setup/types.ts
#	apps/code/src/renderer/features/setup/utils/buildDiscoveredTaskPrompt.ts
#	apps/code/src/renderer/features/setup/utils/categoryConfig.ts
#	apps/code/src/shared/types/analytics.ts
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 8, 2026

Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
apps/code/src/renderer/features/setup/prompts.ts:39-45
**Duplicated category list (OnceAndOnlyOnce violation)**

`BASE_ALLOWED_CATEGORIES` in `prompts.ts` and `BASE_CATEGORY_ENUM` in `types.ts` enumerate exactly the same nine base categories. If a new category is added to the schema in the future, both constants must be updated in sync — a gap that will inevitably be missed. The prompt's allowed-categories string should be derived from the schema's enum array (e.g. export `BASE_CATEGORY_ENUM` from `types.ts` and call `.join(", ")` here) so there is a single source of truth.

### Issue 2 of 2
apps/code/src/renderer/features/setup/services/setupRunService.ts:357-359
**`import.meta.env.DEV` makes the flag bypass un-testable in dev**

`isFeatureFlagEnabled(EXPERIMENT_SUGGESTIONS_FLAG) || import.meta.env.DEV` means every developer running a local build always gets experiment suggestions, with no way to exercise the flag-off code path without a production build. If someone wants to verify the fallback behaviour (e.g. confirming that `experiment` cards surfaced during a flag-on session still render after the flag is turned off), they cannot do so from a normal dev environment. Consider whether this was intentional or whether a more explicit local override (e.g. a query-param or an env var) would give more control during testing.

Reviews (1): Last reviewed commit: "Merge branch 'main' into experiments/dis..." | Re-trigger Greptile

Comment on lines +39 to +45
const BASE_ALLOWED_CATEGORIES =
"bug, security, dead_code, duplication, performance, stale_feature_flag, error_tracking, event_tracking, funnel";

function buildDiscoveryRules(includeExperiments: boolean): string {
const allowed = includeExperiments
? `${BASE_ALLOWED_CATEGORIES}, experiment`
: BASE_ALLOWED_CATEGORIES;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Duplicated category list (OnceAndOnlyOnce violation)

BASE_ALLOWED_CATEGORIES in prompts.ts and BASE_CATEGORY_ENUM in types.ts enumerate exactly the same nine base categories. If a new category is added to the schema in the future, both constants must be updated in sync — a gap that will inevitably be missed. The prompt's allowed-categories string should be derived from the schema's enum array (e.g. export BASE_CATEGORY_ENUM from types.ts and call .join(", ") here) so there is a single source of truth.

Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/code/src/renderer/features/setup/prompts.ts
Line: 39-45

Comment:
**Duplicated category list (OnceAndOnlyOnce violation)**

`BASE_ALLOWED_CATEGORIES` in `prompts.ts` and `BASE_CATEGORY_ENUM` in `types.ts` enumerate exactly the same nine base categories. If a new category is added to the schema in the future, both constants must be updated in sync — a gap that will inevitably be missed. The prompt's allowed-categories string should be derived from the schema's enum array (e.g. export `BASE_CATEGORY_ENUM` from `types.ts` and call `.join(", ")` here) so there is a single source of truth.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

grok is this true? 🤔

Comment on lines +357 to +359
const includeExperiments =
isFeatureFlagEnabled(EXPERIMENT_SUGGESTIONS_FLAG) ||
import.meta.env.DEV;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 import.meta.env.DEV makes the flag bypass un-testable in dev

isFeatureFlagEnabled(EXPERIMENT_SUGGESTIONS_FLAG) || import.meta.env.DEV means every developer running a local build always gets experiment suggestions, with no way to exercise the flag-off code path without a production build. If someone wants to verify the fallback behaviour (e.g. confirming that experiment cards surfaced during a flag-on session still render after the flag is turned off), they cannot do so from a normal dev environment. Consider whether this was intentional or whether a more explicit local override (e.g. a query-param or an env var) would give more control during testing.

Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/code/src/renderer/features/setup/services/setupRunService.ts
Line: 357-359

Comment:
**`import.meta.env.DEV` makes the flag bypass un-testable in dev**

`isFeatureFlagEnabled(EXPERIMENT_SUGGESTIONS_FLAG) || import.meta.env.DEV` means every developer running a local build always gets experiment suggestions, with no way to exercise the flag-off code path without a production build. If someone wants to verify the fallback behaviour (e.g. confirming that `experiment` cards surfaced during a flag-on session still render after the flag is turned off), they cannot do so from a normal dev environment. Consider whether this was intentional or whether a more explicit local override (e.g. a query-param or an env var) would give more control during testing.

How can I resolve this? If you propose a fix, please make it concise.

@jurajmajerik jurajmajerik requested review from a team May 8, 2026 13:55
Copy link
Copy Markdown
Contributor

@adboio adboio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is sick, thank you!! one minor greptile comment that seems legit but otherwise lgtm

Comment on lines +39 to +45
const BASE_ALLOWED_CATEGORIES =
"bug, security, dead_code, duplication, performance, stale_feature_flag, error_tracking, event_tracking, funnel";

function buildDiscoveryRules(includeExperiments: boolean): string {
const allowed = includeExperiments
? `${BASE_ALLOWED_CATEGORIES}, experiment`
: BASE_ALLOWED_CATEGORIES;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

grok is this true? 🤔

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants