feat(mcp): per-server code_mode_run_js + multi-graph MCP demo #2840
Adds a second MCP server next to the existing per-operation one, exposing two generic tools instead of one-tool-per-operation:

- code_mode_search_tools: takes prompts, generates GraphQL ops, registers them in the session catalog, and returns their TS signatures.
- code_mode_run_js: runs an async arrow against the session catalog inside a JS sandbox (V8 isolate), with tools.<name>(vars) bound to the registered ops.

Includes:

- router/internal/codemode: harness, sandbox, server, storage, tsgen, yoko (query-gen client), calltrace, observability.
- proto/wg/cosmo/code_mode/yoko/v1: Yoko query-gen Connect API.
- router/pkg/config: code_mode config block (auth, sandbox limits, query-generation endpoint, named-ops, mutation approval).
- demo/code-mode: local federation + Yoko mock + start scripts + MCP client config snippets (Claude Code, Claude Desktop, Codex).
- demo/code-mode-connect: alternate demo against an external yoko Connect supergraph (set YOKO_DIR).
- router-tests: end-to-end named-ops integration test.
Previously a single non-serializable leaf (BigInt, function, symbol,
undefined, circular ref) caused the sandbox to drop the entire return
value and surface a NotSerializable error. Multi-source aggregations
that put raw federation errors or upstream payloads into a debug field
lost the rest of an expensive run.
The validator now mutates each bad leaf to the sentinel string
"<<non-serializable: KIND>>" and records {path, kind} in a new warnings
array on the response envelope. The successful result still flows
through. NotSerializable now only fires when JSON.stringify itself
throws after sanitization.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
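The sanitization pass described above can be sketched as a recursive walk that swaps each bad leaf for a sentinel and records a warning. This is an illustrative stand-alone Go version, not the actual validator: names like `sanitize` and `Warning` are hypothetical, and circular-reference handling is omitted.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// Warning records where a non-serializable leaf was replaced.
type Warning struct {
	Path string `json:"path"`
	Kind string `json:"kind"`
}

// sanitize walks a decoded value and replaces leaves that
// encoding/json cannot marshal (funcs, channels, complex numbers)
// with a sentinel string, collecting one warning per replacement.
func sanitize(v any, path string, warnings *[]Warning) any {
	switch val := v.(type) {
	case map[string]any:
		for k, child := range val {
			val[k] = sanitize(child, path+"."+k, warnings)
		}
		return val
	case []any:
		for i, child := range val {
			val[i] = sanitize(child, fmt.Sprintf("%s[%d]", path, i), warnings)
		}
		return val
	default:
		if v == nil {
			return nil
		}
		switch reflect.TypeOf(v).Kind() {
		case reflect.Func, reflect.Chan, reflect.Complex64, reflect.Complex128:
			kind := reflect.TypeOf(v).Kind().String()
			*warnings = append(*warnings, Warning{Path: path, Kind: kind})
			return fmt.Sprintf("<<non-serializable: %s>>", kind)
		}
		return v
	}
}

func main() {
	var warnings []Warning
	result := map[string]any{
		"data":  "ok",
		"debug": func() {}, // would normally make json.Marshal fail
	}
	clean := sanitize(result, "$", &warnings)
	out, err := json.Marshal(clean)
	fmt.Println(string(out), warnings, err)
}
```

The key property is the one the commit message calls out: the successful part of the result survives, and a NotSerializable error is reserved for the case where marshaling still fails after this pass.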
…down Moves the long inline tool/resource description string constants out of server.go into a new internal/codemode/server/descriptions sub-package, where each description lives in its own .md file embedded via go:embed. Lets the prose be edited as readable markdown without touching Go source. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The proxy now retries the initial dial with exponential backoff and supervises the live session via session.Wait(); when the upstream disconnects it reconnects transparently so downstream clients (Claude Desktop, etc.) keep working across router restarts. KeepAlive pings make dead connections surface quickly instead of waiting for the next call. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eState.wg WriteTimeout was 30s while executeTimeout is 120s, so net/http would cut off legitimately long code_mode_run_js responses mid-flight. Lift the write deadline above the configured execute timeout and switch the listener to ReadHeaderTimeout so a body upload doesn't share the 30s budget either. The unused sync.WaitGroup on executeState (and its matching state.wg.Wait() in Execute) was a no-op leftover; remove both. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the building blocks for a demo router that turns any GraphQL graph into MCP tools, hosts multiple operation collections side by side, and hot-reloads tools when .graphql files change on disk.

- pkg/mcpserver/introspection.go: fetch upstream SDL via standard introspection so a collection can target any GraphQL endpoint without a local schema file.
- pkg/mcpserver/multi_server.go: mount many GraphQLSchemaServer handlers on one shared listener, each with its own path, operations dir, and optional upstream override.
- pkg/mcpserver/operations_watcher.go: per-collection directory watcher that picks up added/modified/removed .graphql files and triggers a tools reload after a settling tick.
- mcp.config.yaml: working demo config wiring four public graphs (anilist, countries, rickandmorty, swapi) as separate MCP servers.
Merges code-mode-v2 (#2825) and adds a per-sub-server code_mode_run_js MCP tool that lets an LLM compose multiple operations in one V8-sandboxed JS call instead of N MCP round-trips.

The new tool is auto-registered on every MultiServer sub-server that has at least one operation loaded. Each sub-server gets its own sandbox bound to its upstream URL; tools.<name>(vars) dispatches via the existing op catalog (no Yoko, no storage backend, no session state). Reuses the codemode/sandbox, codemode/harness, and codemode/tsgen packages from the merged PR. The description bundled inline includes the typed TS catalog plus shape constraints (single async-arrow root, no top-level await, no IIFE, sandbox console disabled, etc.) so the LLM lands on the right syntax first try.

Compile fixes from the merge:

- Add EnsureIndexed(ctx) error to yoko.Searcher and *yoko.Client
- Rename opsFragment -> newOpsFragment to match call sites/tests
- Update the fakeYoko test fake with an EnsureIndexed counter
- Resolve the mcpServer field conflict in router_config.go (keep the MultiServer type, add codeModeServer alongside)
The watcher and applyToolDiff already logged on every reload, but the messages didn't make it clear that a tools/list_changed notification was being broadcast to connected MCP clients. Two tweaks:

- operations_watcher: "...reloading tools and notifying connected clients"
- applyToolDiff: "MCP tool refresh broadcast to connected clients (tools/list_changed)" with a total_tools count; the no-change branch now says "no notification sent to clients" so the distinction between "ticked, nothing useful happened" and "ticked, clients notified" is visible at a glance.
The previous description duplicated the per-op MCP tools' inputSchema info as an inline TypeScript bundle, and used hardcoded example operation names (SearchAnime, GetA/B, GetCountries) that don't necessarily exist on every server. The model could read those examples as available bindings, producing cross-server confusion (e.g. tools.SearchAnime appearing in the countries server's description). Drop the TS catalog and the foreign-named examples. The description now:

- Lists the actual bound operation names on this server, nothing more
- Points the model at the per-op MCP tools (and get_operation_info) for variable shapes and return types: a single source of truth, no duplication
- Uses generic placeholder names (tools.SomeOp) in syntax examples
- Trims to roughly half the prior length (3.6 KB vs 7.0 KB) by removing the redundant "WHEN TO USE" example list, since RETURN-VALUE DISCIPLINE already covers the same ground
Walkthrough

Adds a Code Mode MCP server with JS sandbox/harness, storage (memory/Redis), Yoko proto/client/mock, router integration (multi-server), configs/demos/scripts, schema/config updates, and comprehensive tests.

Changes

Code Mode Stack
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
Router-nonroot image scan passed: ✅ No security vulnerabilities found in the image.
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

@@ Coverage Diff @@
## main #2840 +/- ##
===========================================
- Coverage 64.32% 41.98% -22.35%
===========================================
Files 311 275 -36
Lines 44308 29500 -14808
Branches 4767 0 -4767
===========================================
- Hits 28502 12386 -16116
+ Misses 15783 15727 -56
- Partials 23 1387 +1364
Actionable comments posted: 15
Note
Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
router/pkg/config/config.schema.json (1)
2777-2885: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Per-server OAuth configs bypass the current schema guardrails.
The `if`/`then` at the end of `mcp` only validates the top-level `oauth` block. `servers[].oauth.enabled: true` currently passes schema validation without requiring `jwks`, and it also never triggers the shared `mcp.server.base_url` requirement needed for OAuth discovery. That lets broken multi-server auth configs through validation and pushes the failure to runtime.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/pkg/config/config.schema.json` around lines 2777 - 2885, The schema's existing if/then only enforces OAuth requirements for the top-level mcp.oauth, but does not validate per-server OAuth blocks (servers[].oauth), so servers[].oauth.enabled: true can bypass requiring jwks and the shared mcp.server.base_url; update the schema to add a conditional that detects any servers array entry with oauth.enabled === true (e.g., an "if" that uses "properties": {"servers": {"items": {"properties": {"oauth": {"properties": {"enabled": {"const": true}},"required":["enabled"]}}}}}) and a corresponding "then" that requires that the server's oauth.jwks has minItems:1 and that the top-level mcp.server.base_url is present (require "jwks" under that server object and add "server": {"required":["base_url"]} to the overall then), ensuring servers[].oauth.jwks and mcp.server.base_url are validated when any per-server oauth.enabled is true; reference the "servers" array, "servers[].oauth.enabled", "servers[].oauth.jwks", and top-level "server.base_url" symbols when making the change.
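A conditional of the kind this finding asks for might look like the following. This is a sketch only: the property names follow the comment's description, not the actual contents of config.schema.json.

```json
{
  "if": {
    "properties": {
      "servers": {
        "contains": {
          "properties": {
            "oauth": {
              "properties": { "enabled": { "const": true } },
              "required": ["enabled"]
            }
          },
          "required": ["oauth"]
        }
      }
    },
    "required": ["servers"]
  },
  "then": {
    "properties": {
      "server": { "required": ["base_url"] }
    },
    "required": ["server"]
  }
}
```

`contains` matches when any array entry satisfies the subschema, which captures "some server enables OAuth". Requiring `jwks` on the offending entry itself would need an additional `items`-level `if`/`then`, since a top-level `then` cannot target only the matching array element.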
🟡 Minor comments (14)
router/pkg/config/config.schema.json-2789-2792 (1)
2789-2792: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Enforce the leading slash that `servers[].path` promises.

The description says this must be a mount path like `/mcp`, but the schema accepts any string right now. Adding a simple `^/` pattern will catch invalid paths before startup.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/pkg/config/config.schema.json` around lines 2789 - 2792, The servers[].path property currently allows any string but the description requires a leading slash; update the JSON Schema for the "path" property (servers[].path) to include a pattern ensuring it starts with "/" (e.g., add "pattern": "^/") so invalid mount paths are rejected by schema validation; keep the existing "description" and "type": "string" and, if present, add a clear "pattern" error-friendly message or ensure existing validation tooling surfaces the pattern failure.

router/pkg/mcpserver/introspection.go-110-112 (1)
110-112: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Limit the non-2xx body read before truncating it.

`io.ReadAll(resp.Body)` still buffers the entire error response even though only 256 characters are reported. A large HTML/proxy error page can make startup allocate far more memory than needed here.

Suggested fix
- raw, _ := io.ReadAll(resp.Body) + raw, _ := io.ReadAll(io.LimitReader(resp.Body, 1024)) return "", fmt.Errorf("introspect %s: status %d: %s", url, resp.StatusCode, truncate(string(raw), 256))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/pkg/mcpserver/introspection.go` around lines 110 - 112, The code reads the entire resp.Body into memory before truncating, which can allocate huge buffers for large error pages; change the read to a bounded read (e.g., use io.LimitReader on resp.Body limited to 256 or 257 bytes) so only the portion needed for truncate(...) is buffered, then pass that limited content into truncate() and the error message; update the error branch that checks resp.StatusCode in introspection.go to use the bounded read against resp.Body (reference resp, resp.Body and truncate).

router/internal/codemode/calltrace/calltrace.go-85-91 (1)
85-91: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

`Enabled` should treat a nil/empty `FileRecorder` as disabled.

A typed-nil `*FileRecorder` or an empty `path` currently reports enabled, even though recording is effectively off.

Suggested fix
func Enabled(recorder Recorder) bool { - switch recorder.(type) { + switch r := recorder.(type) { case nil, NopRecorder, *NopRecorder: return false + case *FileRecorder: + return r != nil && r.path != "" default: return true } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/calltrace/calltrace.go` around lines 85 - 91, The Enabled(recorder Recorder) function currently returns true for a typed-nil *FileRecorder or a FileRecorder with an empty path; change it to treat those as disabled by first checking if recorder == nil, then attempt a type assertion to *FileRecorder (or FileRecorder) and, if that type matches, return false when the asserted pointer is nil or when its path field (e.g., r.path or r.Path) is empty; keep the existing NopRecorder checks for other recorder types and otherwise return true. Ensure you reference Enabled and the Recorder/FileRecorder types and inspect the FileRecorder.path field when deciding enabled/disabled.

router/internal/codemode/calltrace/calltrace.go-82-82 (1)
82-82: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Don’t ignore file write failures.

Discarding `Write` results can silently lose trace entries.

Suggested fix
- _, _ = file.Write(line) + n, err := file.Write(line) + if err != nil || n != len(line) { + return + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/calltrace/calltrace.go` at line 82, The file write result is being ignored (_, _ = file.Write(line)), which can drop trace entries silently; change this to capture the return values (n, err := file.Write(line)) and handle errors from file.Write inside the enclosing function in calltrace.go (e.g., return the error up the call chain or log it via the package's logger/standard error) and ensure you also handle short writes (n != len(line)) appropriately by retrying or treating it as an error so trace writes are never silently lost.

demo/code-mode/mcp-configs/claude.desktop.json-4-4 (1)
4-4: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Ensure the placeholder path is documented in the README.md.

`/ABSOLUTE/PATH/TO/cosmo/...` requires manual substitution by each user. If the README doesn't already call this out, add a note explaining where the binary lives and how to build/locate it.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@demo/code-mode/mcp-configs/claude.desktop.json` at line 4, The "command" field in claude.desktop.json contains a placeholder path (the value starting with "/ABSOLUTE/PATH/TO/cosmo/...") that must be manually substituted by users; update README.md to document this: state that the user must replace that placeholder in the "command" key with the actual mcp-stdio-proxy binary location, describe how to build or where to find the mcp-stdio-proxy binary (brief build steps or typical install/output directory), and show one short clear instruction on how to locate the built binary so users can set the correct path.

router/mcp.config.yaml-73-82 (1)
73-82: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Hardcoded personal home-directory paths won't work outside the author's machine (or in a Docker build).
Every `file_system` storage provider path is anchored to `/Users/asoorm/go/src/github.com/wundergraph/conference-mcp-demo/...`. With `watch: true` on each server, the router will attempt to watch these paths on startup; on any other host (including a Docker container) this will fail or silently produce no-op watchers.

Consider one of:
- Replace with paths relative to the repo root or to `$PWD` (e.g., `./demo/code-mode/operations/<name>`).
- Parameterise via an env-var overlay or a `--config-override` flag.
- Add a prominent `# TODO: replace with your local path` comment so the next person isn't surprised.

📝 Suggested placeholder form
storage_providers: file_system: - id: rickandmorty - path: /Users/asoorm/go/src/github.com/wundergraph/conference-mcp-demo/.operations/rickandmorty + path: /path/to/conference-mcp-demo/.operations/rickandmorty # TODO: update for your environment - id: countries - path: /Users/asoorm/go/src/github.com/wundergraph/conference-mcp-demo/.operations/countries + path: /path/to/conference-mcp-demo/.operations/countries # TODO: update for your environment - id: anilist - path: /Users/asoorm/go/src/github.com/wundergraph/conference-mcp-demo/.operations/anilist + path: /path/to/conference-mcp-demo/.operations/anilist # TODO: update for your environment - id: swapi - path: /Users/asoorm/go/src/github.com/wundergraph/conference-mcp-demo/.operations/swapi + path: /path/to/conference-mcp-demo/.operations/swapi # TODO: update for your environment🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/mcp.config.yaml` around lines 73 - 82, The storage_providers.file_system entries currently use hardcoded absolute home-directory paths (ids: rickandmorty, countries, anilist, swapi) which will fail on other machines or containers; update these to use relative paths (e.g., ./... relative to the repo root) or substitute an environment-variable-based placeholder (e.g., ${OPERATIONS_DIR}/<name>) so paths can be overridden at runtime, and add a short TODO comment above the file_system block indicating that implementers must set OPERATIONS_DIR or adjust paths before running.

demo/code-mode/mcp-configs/README.md-113-117 (1)
113-117: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Documentation contradicts the actual config value for `require_mutation_approval`.

Line 115 states the demo config sets `require_mutation_approval: false`, but `demo/code-mode/router-config.yaml` line 42 sets it to `true`. This will mislead anyone testing prompt `04-mutation-not-approved.txt`: the comment says to skip it only for approval-rejection demos, but with `true` it will actually enforce approval.

📝 Proposed fix
-The historical prompt name mentions "not approved", but the demo config sets `require_mutation_approval: false` in `demo/code-mode/router-config.yaml`. -That means this prompt is not declined by operator approval in the default demo; it should run like a normal mutation if the mock can generate the operation. -Skip this prompt when you specifically need to demonstrate approval rejection. +The demo config sets `require_mutation_approval: true` in `demo/code-mode/router-config.yaml`, +so this prompt will trigger the mutation-approval flow. If you want to skip approval gating, +set `require_mutation_approval: false` before running the demo.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@demo/code-mode/mcp-configs/README.md` around lines 113 - 117, The README incorrectly states the demo config sets require_mutation_approval: false while the actual router config (demo/code-mode/router-config.yaml) sets require_mutation_approval: true; update either the README text or the config so they match: either change the README line referencing require_mutation_approval to state "true" and explain that sample-prompts/04-mutation-not-approved.txt will be declined by default, or change the router-config.yaml require_mutation_approval setting to false if the intent is to not enforce approval by default; ensure references to sample-prompts/04-mutation-not-approved.txt and the require_mutation_approval setting are consistent.

demo/code-mode/Makefile-3-4 (1)
3-4: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

`$(wildcard ../cli/.env)` checks the wrong path and silently disables env-file loading.

Make expands `$(wildcard ...)` relative to its own working directory (`demo/code-mode/`), so `../cli/.env` resolves to `demo/cli/.env`, not the repo-root `cli/.env`. The conditional is therefore always false, and the `--env-file` flag is never appended even when `cli/.env` exists.

The `--env-file ../cli/.env` string embedded in the shell command is fine as-is: the recipe runs under `cd ..`, so the shell sees `demo/../cli/.env` = `cli/.env` from the root.

Only the wildcard guard needs the corrected path:

🐛 Proposed fix
-wgc_env_arg = $(if $(wildcard ../cli/.env),--env-file ../cli/.env,) +wgc_env_arg = $(if $(wildcard ../../cli/.env),--env-file ../cli/.env,)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@demo/code-mode/Makefile` around lines 3 - 4, The Makefile's conditional uses $(wildcard ../cli/.env) relative to demo/code-mode/, which points at demo/cli/.env and is therefore false; update the guard so it checks the repo-root file by using the correct relative path ../../cli/.env (i.e., change the wildcard in the wgc_env_arg definition), leaving the wgc_router command (--env-file ../cli/.env) unchanged because the recipe runs from the parent dir; this fixes the conditional so --env-file is appended when cli/.env exists.

router/core/graph_server.go-1403-1410 (1)
1403-1410: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Both Code Mode error branches reuse the "failed to reload MCP server" message, shadowing the adjacent `s.mcpServer` log.

Both branches emit "failed to reload MCP server" but they concern `s.codeModeServer`, not `s.mcpServer` (which is already handled with the same message just above at lines 1398–1402). Any log-based alert or grep for that message will conflate the two. The first branch also double-wraps: the outer log message already implies a reload failure, making the `fmt.Errorf("failed to print Code Mode schema SDL: %w", printErr)` wrapper redundant.

🔧 Proposed fix
- if printErr != nil { - s.logger.Error("failed to reload MCP server", zap.Error(fmt.Errorf("failed to print Code Mode schema SDL: %w", printErr))) - } else if mErr := s.codeModeServer.Reload(executor.ClientSchema, sdl); mErr != nil { - s.logger.Error("failed to reload MCP server", zap.Error(mErr)) - } + if printErr != nil { + s.logger.Error("failed to print Code Mode schema SDL", zap.Error(printErr)) + } else if mErr := s.codeModeServer.Reload(executor.ClientSchema, sdl); mErr != nil { + s.logger.Error("failed to reload Code Mode server", zap.Error(mErr)) + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/core/graph_server.go` around lines 1403 - 1410, The log messages for the code-mode reload use the wrong text and double-wrap errors: update the two s.codeModeServer error branches so they emit a distinct message (e.g., "failed to reload Code Mode server") instead of "failed to reload MCP server", and simplify the first branch to log the original printErr directly (no extra fmt.Errorf wrapping). Locate the block using astprinter.PrintString(executor.ClientSchema) and s.codeModeServer.Reload(...) and change the s.logger.Error calls to use the new message and pass the original error values (printErr and mErr) so logs clearly distinguish code-mode failures from s.mcpServer failures.

router/internal/codemode/server/observability_handler_test.go-50-58 (1)
50-58: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Guard span indexing before accessing element 0.

Line 57 and Line 101 access `[0]` before asserting length; if span emission regresses, the test panics instead of failing with a clear assertion.

Suggested hardening
- assert.Equal(t, []tracetest.SpanStub{{ + gotStubs := normalizedSpanStubs(traces.Ended()) + require.Len(t, gotStubs, 1) + assert.Equal(t, []tracetest.SpanStub{{ Name: "MCP Code Mode - Search", SpanKind: trace.SpanKindServer, Attributes: []attribute.KeyValue{ attribute.String("mcp.tool", "code_mode_search_tools"), attribute.String("mcp.status", "success"), }, - InstrumentationLibrary: normalizedSpanStubs(traces.Ended())[0].InstrumentationLibrary, - }}, normalizedSpanStubs(traces.Ended())) + InstrumentationLibrary: gotStubs[0].InstrumentationLibrary, + }}, gotStubs)Also applies to: 94-102
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/server/observability_handler_test.go` around lines 50 - 58, The test indexes normalizedSpanStubs(traces.Ended())[0] before asserting the slice length, which can cause a panic if no spans are emitted; change the test to first capture the slice into a variable (e.g., spans := normalizedSpanStubs(traces.Ended())), assert its length with require.Len/require.Greater/ assert.LenEqual (or require.NotEmpty) to ensure at least one element, then access spans[0] when constructing the expected tracetest.SpanStub for comparison with assert.Equal; apply the same guard for other occurrences (the second access around spans[94-102] / similar sites).

router/internal/codemode/sandbox/preamble.go-20-27 (1)
20-27: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Make splice marker detection whitespace/line-ending tolerant.

Line 23 requires an exact match; CRLF or incidental spacing will miss the marker and return `1`, breaking user-code line mapping.

Suggested fix
- if line == spliceComment { + if strings.TrimSpace(line) == spliceComment { return i + 2 }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/sandbox/preamble.go` around lines 20 - 27, The splice marker detection in userCodeStartLine currently requires an exact line match and fails on CRLF or incidental spacing; update the check in userCodeStartLine to compare a trimmed version of each line (e.g., use strings.TrimSpace(line)) against spliceComment so CR ("\r") and extra whitespace won't prevent detection, leaving the rest of the function behavior (including the return i+2) unchanged.

demo/code-mode/yoko-mock/main_test.go-129-132 (1)
129-132: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Windows stub misses the Search (`resume`) behavior.

Line 130–Line 132 only emulate Index output. Search-path tests depend on writing `resumeMessage` to the `-o` file, so Windows runs won’t reflect the same behavior.

Suggested direction
- // Minimal Windows fallback — only Index path is exercised in CI on Unix. - script = "@echo off\r\ntype \"" + indexFile + "\"\r\n" + // Keep parity with Unix stub: detect "resume" and honor "-o <FILE>". + // (or explicitly skip Search tests on Windows if parity is intentionally unsupported)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@demo/code-mode/yoko-mock/main_test.go` around lines 129 - 132, The Windows branch in the runtime.GOOS == "windows" stub only emits the Index output and omits the Search/resume behavior expected by tests; update the Windows fake script (the script variable built in that branch) to also write the resumeMessage to the -o output file path (the same path the non-Windows shell writes to) when invoked with the Search/resume code path so tests that rely on resumeMessage being written to the output file will pass; locate the script construction near the indexFile and resumeMessage variables and add the conditional output logic mirroring the non-Windows branch.

router/internal/codemode/server/descriptions/search_tool.md-18-18 (1)
18-18: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix malformed quoting in the filter example.
The sentence currently has broken nested quotes, which makes the instruction ambiguous. Rewrite to keep each example quoted cleanly.
✏️ Proposed wording
-- Any required filters/arguments but never specific values ("employee by id - not "employee 123", "employee filtered by department name" - not "employee in department 'Engineering'"). +- Any required filters/arguments but never specific values ("employee by id" — not "employee 123"; "employee filtered by department name" — not "employee in department 'Engineering'").🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/server/descriptions/search_tool.md` at line 18, Fix the malformed nested quotation in the example sentence that currently reads: Any required filters/arguments but never specific values ("employee by id - not "employee 123", "employee filtered by department name" - not "employee in department 'Engineering'"). Rewrite it so each example uses consistent, non-conflicting quotes (for example: "employee by id" — not "employee 123"; "employee filtered by department name" — not "employee in department 'Engineering'") to remove ambiguity and broken nesting.

demo/code-mode/start.sh-79-102 (1)
79-102: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

`start_background` does not use `exec`, unlike `start_background_root`.

In `start_background_root` (lines 92-102), the command uses `exec "$@"`, which replaces the subshell. However, `start_background` (lines 79-90) uses `"$@"` directly without `exec`. This means `--down` will signal the subshell rather than the `go run` process directly.

For `go run` commands this may work because the subshell exits when the child exits, but it's inconsistent with the pattern in `start_background_root` and the comment in `code-mode-connect/start.sh` (lines 88-90) that explains why `exec` is needed.

♻️ Suggested fix: Use exec for consistency
start_background() { local name="$1" local cwd="$2" shift 2 echo "Starting $name" ( cd "$cwd" - "$@" + exec "$@" ) > "$LOG_DIR/$name.log" 2>&1 & append_pid "$name" "$!" }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@demo/code-mode/start.sh` around lines 79 - 102, The start_background function launches the command in a subshell but omits exec, causing signals (like --down) to target the subshell rather than the actual child process; make start_background mirror start_background_root by using exec "$@" inside the subshell so the subshell is replaced by the child process (preserve the existing cd "$cwd", redirection to "$LOG_DIR/$name.log", backgrounding and append_pid "$name" "$!").
🧹 Nitpick comments (22)
router/internal/codemode/observability/logging.go (1)
18-25: ⚡ Quick win

Use warn/error levels for the failure helpers.
These are real failure paths, so emitting them at `Info` makes them invisible to the log filters and alerts that usually watch `Warn`/`Error`. Keeping lifecycle events at `Info` is fine, but the failure helpers should log at a higher severity.

Suggested fix
- logger.Info("code mode transpile failure", + logger.Warn("code mode transpile failure", zap.String("session_id", sessionID), zap.String("diagnostic", diagnostic), )- logger.Info("code mode tool invocation failure", + logger.Error("code mode tool invocation failure", zap.String("session_id", sessionID), zap.String("op_name", opName), zap.Error(err), )Also applies to: 39-47
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/observability/logging.go` around lines 18 - 25, The LogTranspileFailure helper is logging actual failures at Info level; change its call from logger.Info to logger.Error (preserving the message and zap.String fields) so failures emit at error severity, and apply the same change to the other failure helper functions in this file (the other transpile/failure helpers near LogTranspileFailure) so all real failure paths use logger.Error rather than Info while keeping the nil logger short‑circuit and existing field structure intact.

router/internal/codemode/calltrace/calltrace_test.go (1)
35-46: ⚡ Quick win: Expand coverage for recorder state semantics.
Please add assertions for request/response direction (if `kind` is added) and `Enabled` behavior for `(*FileRecorder)(nil)` / an empty-path recorder to prevent regressions.

Also applies to: 49-51
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/calltrace/calltrace_test.go` around lines 35 - 46, Update the test to assert recorder state semantics: extend the existing assertions on []Record to check the Record.kind (or direction) values for request vs response entries once a kind field is added, and add cases that verify Enabled() returns false for (*FileRecorder)(nil) and for a FileRecorder configured with an empty path (and true for a normal recorder) to prevent regressions; target the Record type assertions and the FileRecorder.Enabled method in the calltrace tests (e.g., in the test that currently compares records) and add short subtests asserting Enabled() behavior for nil and empty-path receivers.

router/internal/codemode/sandbox/semaphore.go (1)
1-16: 💤 Low value: LGTM — canonical Go buffered-channel semaphore, correctly implemented.
`acquire` fills a slot (sends), `release` drains one (receives); `ctx.Done()` exits cleanly on cancellation. One contract worth documenting (or enforcing via `defer`): `release` blocks indefinitely if called without a preceding successful `acquire`. Callers should always use `defer s.release()` immediately after a successful `acquire` call to avoid goroutine leaks on any future refactor.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/sandbox/semaphore.go` around lines 1 - 16, The semaphore is correct but lacks a contract warning: add a short doc comment on the Sandbox type and the acquire/release methods (referencing Sandbox.acquire and Sandbox.release and the s.sem channel) stating that release will block if called without a successful acquire and callers must call release only after a successful acquire, recommending the canonical pattern "if err := s.acquire(ctx); err == nil { defer s.release() }". Optionally include a small usage example in the comment to show the defer pattern to prevent goroutine leaks.

router/core/router_config.go (1)
338-342: 💤 Low value: Consider adding `code_mode` telemetry to `Usage()`.
The MCP usage block (lines 338–342) tracks MCP settings but omits the new Code Mode server. Adding `usage["code_mode_enabled"] = c.mcp.CodeMode.Enabled` here would keep the usage telemetry consistent with the new feature surface.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/core/router_config.go` around lines 338 - 342, The Usage() MCP telemetry block is missing the new Code Mode flag; inside the Usage() method where usage["mcp"] and related keys are set (the c.mcp.* entries), add a new entry usage["code_mode_enabled"] = c.mcp.CodeMode.Enabled so the Code Mode server state is reported alongside mcp, mcp_enable_arbitrary_operations, mcp_exclude_mutations and mcp_expose_schema.

router/internal/codemode/storage/memory_backend_test.go (1)
215-222: ⚡ Quick win: Prefer `WaitGroup.Go` over manual Add/Done goroutine wiring.
Lines 219–222 can be simplified and made less error-prone with `wg.Go(...)` in this codebase.

Suggested refactor

```diff
 for i := range goroutines {
-	wg.Add(1)
-	go func(i int) {
-		defer wg.Done()
+	i := i
+	wg.Go(func() {
 		_, err := backend.Append(ctx, "shared", []SessionOp{{
 			Name:        "shared-op",
 			Body:        fmt.Sprintf("query Shared%d { shared%d }", i, i),
 			Kind:        OperationKindQuery,
 			Description: fmt.Sprintf("Shared %d", i),
 		}})
 		errs <- err
-	}(i)
+	})
 }
```

Based on learnings: In Go code (Go 1.25+), prefer using `sync.WaitGroup.Go(func())` and avoid manual `wg.Add(1)` + `go func(){ defer wg.Done() }()`.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/storage/memory_backend_test.go` around lines 215 - 222, Replace the manual wg.Add(1) + go func(...){ defer wg.Done() ... }() pattern with the sync.WaitGroup.Go method to simplify and reduce error-proneness: change the goroutine spawn that calls backend.Append(ctx, "shared", []SessionOp{{...}}) and writes to errs so that you call wg.Go(func() { _, err := backend.Append(...); errs <- err }) (keeping the same use of goroutines slice, errs channel, ctx, backend.Append and SessionOp) and remove the explicit Add/Done calls.

demo/code-mode/mcp-stdio-proxy/main.go (1)
198-211: 💤 Low value: Consider: inconsistent error handling between tools and resources.
Tool handlers (lines 177-195) wrap errors using `SetError` and return the result, while resource handlers (lines 200-210) return errors directly. This may be intentional based on MCP semantics, but verify that returning raw errors from resource handlers is the desired behavior for downstream clients.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@demo/code-mode/mcp-stdio-proxy/main.go` around lines 198 - 211, The resource handler passed to localServer.AddResource currently returns errors directly from upstream.awaitSession and session.ReadResource (in the closure around upstream.awaitSession and session.ReadResource), which is inconsistent with the tool handlers that use SetError on the result and return the result object; update the resource handler to mirror the tool-handler pattern: on failure create a mcp.ReadResourceResult, call SetError with a descriptive message that includes req.Params.URI and the underlying error, and return the result (nil error), so downstream clients receive errors via the MCP result object rather than raw errors.

router/internal/codemode/yoko/client_test.go (1)
321-336: 💤 Low value: Consider using `sync.WaitGroup.Go` for cleaner goroutine management.
Based on learnings, Go 1.25+ prefers `wg.Go(func() { ... })` over manual `wg.Add(1)` + `go func() { defer wg.Done(); ... }()` patterns. This applies to lines 321-336 and 364-379.

♻️ Example refactor

```diff
 var wg sync.WaitGroup
-wg.Add(2)
 results := make([]*yokov1.SearchResponse, 2)
 errs := make([]error, 2)
-go func() {
-	defer wg.Done()
+wg.Go(func() {
 	results[0], errs[0] = client.Search(context.Background(), "session-1", []string{"first"})
-}()
+})
 <-indexStarted
-go func() {
-	defer wg.Done()
+wg.Go(func() {
 	results[1], errs[1] = client.Search(context.Background(), "session-2", []string{"second"})
-}()
+})
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/yoko/client_test.go` around lines 321 - 336, Replace the manual wg.Add(1)+go+defer wg.Done() pattern with sync.WaitGroup.Go to simplify goroutine management: remove wg.Add(2) and instead call wg.Go twice, each wrapping the existing goroutine bodies that call client.Search, storing into results[i] and errs[i]; ensure you capture the loop/index by assigning a local index variable (e.g., i := 0/1) inside each Go closure so results and errs are written to the correct slot; apply the same change to the second pair of goroutines around the client.Search usages (the block using indexStarted and releaseIndex), keeping the time.Sleep and close(releaseIndex) logic unchanged.

router/internal/codemode/server/search_handler_test.go (1)
300-327: ⚡ Quick win: Consider using `sync.WaitGroup.Go` for cleaner goroutine management.
The manual `wg.Add(1)` followed by `go func() { defer wg.Done(); ... }()` pattern can be simplified using `wg.Go(func())` in Go 1.25+, which manages Add/Done automatically. This applies to multiple test functions in this file.

Example refactor

```diff
-wg.Add(1)
-go func() {
-	defer wg.Done()
+wg.Go(func() {
 	result, err := srv.handleSearch(ctx, searchToolRequest(t, "session-1", map[string]any{
 		"prompts": []string{"orders", "customers"},
 	}))
 	require.NoError(t, err)
 	results[0] = result
-}()
+})
```

Based on learnings: "In Go code (Go 1.25+), prefer using sync.WaitGroup.Go(func()) to run a function in a new goroutine."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/server/search_handler_test.go` around lines 300 - 327, Replace the manual wg.Add(1); go func() { defer wg.Done(); ... }() pattern with sync.WaitGroup.Go to simplify goroutine management: for the two goroutines that invoke srv.handleSearch (which use searchToolRequest and write into results), call wg.Go(func() { ... }) so you no longer need explicit wg.Add or defer wg.Done; keep the same body that calls srv.handleSearch with ctx and the "session-1" prompts, assigns into results[0]/results[1], and asserts NoError, and leave the rest of the test (waiting on searcher.block, close, wg.Wait, and assertions on searcher.callCount and results equality) unchanged.

router/internal/codemode/storage/redis_backend_test.go (1)
182-207: ⚡ Quick win: Consider using `sync.WaitGroup.Go` for the concurrent test.
Per repository conventions for Go 1.25+, the manual WaitGroup pattern can be simplified:

Suggested refactor

```diff
 var wg sync.WaitGroup
 errs := make(chan error, goroutines)
 for i := 0; i < goroutines; i++ {
-	wg.Add(1)
-	go func(worker int) {
-		defer wg.Done()
+	wg.Go(func(worker int) func() {
+		return func() {
 			ops := make([]SessionOp, 0, opsPerGoroutine)
 			for j := 0; j < opsPerGoroutine; j++ {
 				ops = append(ops, SessionOp{Name: fmt.Sprintf("op_%02d_%02d", worker, j), Body: "query { ok }", Kind: OperationKindQuery})
 			}
 			_, err := backend.Append(ctx, "session-1", ops)
 			errs <- err
-	}(i)
+		}
+	}(i))
 }
```

Based on learnings: "In Go code (Go 1.25+), prefer using sync.WaitGroup.Go(func()) to run a function in a new goroutine."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/storage/redis_backend_test.go` around lines 182 - 207, The test uses the manual WaitGroup pattern (wg.Add, go func, defer wg.Done) to launch goroutines; update it to use sync.WaitGroup.Go for Go 1.25+ by replacing the wg.Add + go { defer wg.Done(); ... } pattern with wg.Go(func() { ... }) to spawn each worker; keep the errs channel and the Append call (backend.Append) and ensure the loop worker variable is captured correctly (pass worker into the closure or use the loop variable safely) so opsPerGoroutine, goroutines, errs, backend.Append and backend.opsKey("session-1") behavior remain unchanged.

router/internal/codemode/sandbox/errors.go (1)
179-200: 💤 Low value: Consider validating unknown Base64 characters in VLQ decoding.
If `segment` contains a character not in `base64VLQ`, the map lookup silently returns `0`, potentially producing incorrect decoded values. Since source maps should always contain valid Base64 VLQ, this is unlikely to cause issues in practice.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/sandbox/errors.go` around lines 179 - 200, The decodeVLQSegment function silently treats unknown Base64 characters as zero due to direct map lookup from base64VLQ; change the lookup to check presence (e.g., value, ok := base64VLQ[r]) and return a descriptive error when ok is false, referencing the offending rune/character, so callers can fail fast instead of producing incorrect decoded values; update any callers/tests if they expect nil error on bad input and ensure the error mentions decodeVLQSegment, the invalid character, and that it violates VLQ/Base64 input.

router/internal/codemode/server/lifecycle.go (1)
173-182: 💤 Low value: Redundant header clone.
`req.Clone(ctx)` already deep-copies request headers. The explicit `cloned.Header = req.Header.Clone()` on line 179 is unnecessary.

♻️ Proposed simplification

```diff
 func (t staticBearerRoundTripper) RoundTrip(req *http.Request) (*http.Response, error) {
 	next := t.next
 	if next == nil {
 		next = http.DefaultTransport
 	}
 	cloned := req.Clone(req.Context())
-	cloned.Header = req.Header.Clone()
 	cloned.Header.Set("Authorization", "Bearer "+t.token)
 	return next.RoundTrip(cloned)
 }
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/server/lifecycle.go` around lines 173 - 182, The RoundTrip implementation in staticBearerRoundTripper redundantly clones headers after calling req.Clone; remove the extra cloned.Header = req.Header.Clone() line so the code uses the headers already copied by req.Clone(req.Context()), then set the Authorization header on the cloned request (cloned.Header.Set(...)) and call next.RoundTrip(cloned) as before; this touches the staticBearerRoundTripper.RoundTrip function only.

router/internal/codemode/sandbox/sandbox_test.go (1)
623-635: ⚡ Quick win: Use `wg.Go` instead of the manual `wg.Add`/`go func` pattern.
Per repository guidance for Go 1.25+, prefer `sync.WaitGroup.Go(func())` to let WaitGroup manage Add/Done automatically.

♻️ Suggested fix

```diff
 var wg sync.WaitGroup
 for range 5 {
-	wg.Add(1)
-	go func() {
-		defer wg.Done()
+	wg.Go(func() {
 		_, err := s.Execute(context.Background(), ExecuteRequest{
 			SessionID: "s1",
 			ToolNames: []string{"ping"},
 			WrappedJS: `async () => await tools.ping()`,
 		})
 		assert.NoError(t, err)
-	}()
+	})
 }
```

Based on learnings: "In Go code (Go 1.25+), prefer using sync.WaitGroup.Go(func()) to run a function in a new goroutine, letting the WaitGroup manage Add/Done automatically."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/sandbox/sandbox_test.go` around lines 623 - 635, The test uses manual wg.Add(1) and launching a goroutine that defers wg.Done; replace that pattern by calling wg.Go(func() { ... }) so the WaitGroup manages Add/Done automatically; update the loop that calls s.Execute with ExecuteRequest (SessionID "s1", ToolNames ["ping"], WrappedJS `async () => await tools.ping()`) to use wg.Go, removing the explicit wg.Add and defer wg.Done lines and keeping the assert.NoError(t, err) inside the function.

router/internal/codemode/harness/envelope.go (1)
63-73: 💤 Low value: Potential stack overflow on deeply nested error causes.
The `cloneErrorEnvelope` function recursively clones the `Cause` chain without depth limiting. While the sandbox itself truncates cause chains at depth 5 (per `TestExecuteErrorCauseChainTruncatesAfterDepthFive`), if a malformed or externally constructed `ErrorEnvelope` with a deeper chain is passed, this could cause a stack overflow.

🛡️ Optional: Add depth guard to match sandbox behavior

```diff
 func cloneErrorEnvelope(err *ErrorEnvelope) *ErrorEnvelope {
+	return cloneErrorEnvelopeWithDepth(err, 0)
+}
+
+func cloneErrorEnvelopeWithDepth(err *ErrorEnvelope, depth int) *ErrorEnvelope {
 	if err == nil {
 		return nil
 	}
+	var cause *ErrorEnvelope
+	if depth < 5 {
+		cause = cloneErrorEnvelopeWithDepth(err.Cause, depth+1)
+	}
 	return &ErrorEnvelope{
 		Name:    err.Name,
 		Message: err.Message,
 		Stack:   err.Stack,
-		Cause:   cloneErrorEnvelope(err.Cause),
+		Cause:   cause,
 	}
 }
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/harness/envelope.go` around lines 63 - 73, cloneErrorEnvelope currently recurses unbounded through ErrorEnvelope.Cause and can overflow on deeply nested/malformed chains; modify cloneErrorEnvelope (or create a helper) to enforce a maximum depth (match sandbox behavior, e.g., depth limit 5) when cloning Cause chains, stopping cloning further causes once the depth limit is reached (return nil or a truncated marker) to prevent stack overflow while keeping Name/Message/Stack for cloned nodes; reference the cloneErrorEnvelope function and the ErrorEnvelope.Cause field when making the change.

router/internal/codemode/server/search_handler.go (2)
56-61: 💤 Low value: Singleflight result type assertion without error handling.
The singleflight's `Do` returns `(any, error, bool)` but the error is ignored with `_`. If `handleSearchStateful` were to return an error in the future, this would panic on the type assertion.

Currently `handleSearchStateful` always returns `*mcp.CallToolResult` and never errors (the function signature returns only a result), so this is safe. However, the pattern is fragile if the function signature changes.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/server/search_handler.go` around lines 56 - 61, The code ignores the error from s.searchGroup.Do and directly type-asserts the result to *mcp.CallToolResult, which can panic if Do returns an error or a different type; update the call to capture the error and the shared bool: value, err, _ := s.searchGroup.Do(key, func() (any, error) { return s.handleSearchStateful(ctx, sessionID, prompts), nil }); if err != nil { return nil, err } then safely type-assert value to *mcp.CallToolResult (checking the assertion and returning an error if it fails) so searchSingleFlightKey/s.searchGroup.Do/handleSearchStateful use a robust pattern that won’t panic if signatures change.
224-248: 💤 Low value: `extractGraphQLVariablesBlock` may return incorrect results for complex operation syntax.
The function uses simple byte scanning without proper GraphQL parsing. While it handles nested parentheses, it doesn't account for:

- Comments containing parentheses: `query Foo /* ( */ { ... }`
- String literals in directives: `query Foo @deprecated(reason: "(old)") { ... }`

However, since this extracts from generated operation bodies (presumably well-formed), these edge cases may not occur in practice.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/server/search_handler.go` around lines 224 - 248, extractGraphQLVariablesBlock uses a naive byte scan and can be tripped by parentheses inside comments or string literals; update extractGraphQLVariablesBlock to properly skip GraphQL comments and string/block-string literals when scanning for the variable parentheses or, better, replace this ad-hoc scanner by using a real GraphQL parser/tokenizer (e.g., github.com/graphql-go/graphql or any existing project tokenizer) to locate the operation variable definitions; specifically, implement scanning that recognizes and ignores // and /* */ comments and both "..." and """...""" string literals (or call a tokenizer) before treating '(' and ')' as structural tokens so the function reliably returns the variable block.

router/internal/codemode/sandbox/sandbox_preamble.js (1)
45-55: ⚡ Quick win: Use strict null checks in the helper functions.
`notNull` and `compact` currently rely on `==`/`!=`. This repo's JS guidelines require strict equality, so these checks should be expanded explicitly to `=== null || === undefined` and the corresponding `!==` forms.

Suggested fix

```diff
 globalThis.notNull = (v, msg) => {
-  if (v == null) throw new Error(msg ?? "notNull: value was null/undefined");
+  if (v === null || v === undefined) throw new Error(msg ?? "notNull: value was null/undefined");
   return v;
 };
 globalThis.compact = (v) => {
-  if (Array.isArray(v)) return v.map(compact).filter((x) => x != null);
+  if (Array.isArray(v)) return v.map(compact).filter((x) => x !== null && x !== undefined);
   if (v && typeof v === "object") {
     const out = {};
     for (const k in v) {
       const c = compact(v[k]);
-      if (c != null) out[k] = c;
+      if (c !== null && c !== undefined) out[k] = c;
     }
     return out;
   }
```

As per coding guidelines "Always use strict equality (`===` and `!==`) instead of loose equality (`==` and `!=`)."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/sandbox/sandbox_preamble.js` around lines 45 - 55, Update the helper functions to use strict null/undefined checks: in globalThis.notNull replace the loose check (v == null) with an explicit (v === null || v === undefined); in globalThis.compact change the Array branch filter to use (x !== null && x !== undefined) instead of (x != null), and change the object guard from (v && typeof v === "object") to (v !== null && v !== undefined && typeof v === "object"); likewise when assigning c check c !== null && c !== undefined before setting out[k]. Keep the same return/throw behavior and function names (globalThis.notNull and globalThis.compact).

router/internal/codemode/storage/redis_backend.go (1)
345-350: ⚡ Quick win: Non-atomic TTL setting could leave keys without expiry.
The `setWithTTL` method performs `Set` and `Expire` as separate operations. If the process crashes or the Redis connection fails between these calls, the key could remain without a TTL. Consider using `Set` with the `EX` option for atomic TTL setting.

🔧 Suggested fix

```diff
 func (b *RedisBackend) setWithTTL(ctx context.Context, key string, value []byte) error {
-	if err := b.client.Set(ctx, key, value, 0).Err(); err != nil {
-		return err
-	}
-	return b.client.Expire(ctx, key, b.sessionTTL).Err()
+	return b.client.Set(ctx, key, value, b.sessionTTL).Err()
 }
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/storage/redis_backend.go` around lines 345 - 350, The current setWithTTL in RedisBackend calls Set and then Expire separately which can leave keys without TTL if interrupted; change it to perform an atomic set-with-ttl by passing b.sessionTTL as the expiration argument to b.client.Set (use the client Set overload that accepts a time.Duration/expiration or the EX option), remove the separate b.client.Expire call, and propagate any error from the single Set call so key creation and TTL are applied in one atomic operation.

router/internal/codemode/sandbox/sandbox.go (1)
111-143: 💤 Low value: Dead code: MaxConcurrent validation after defaults.
The check on lines 113-115 will never trigger an error because `withDefaults` (called on line 112) already sets `MaxConcurrent` to `defaultMaxConcurrent` (4) if it's `<= 0`. The validation is defensive but unreachable.

🧹 Consider removing dead validation

```diff
 func New(cfg Config) (*Sandbox, error) {
 	cfg = withDefaults(cfg)
-	if cfg.MaxConcurrent <= 0 {
-		return nil, errors.New("sandbox max concurrent must be positive")
-	}
 	if cfg.StorageLookup == nil {
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/sandbox/sandbox.go` around lines 111 - 143, The MaxConcurrent validation in New is dead because withDefaults already enforces cfg.MaxConcurrent > 0 (setting it to defaultMaxConcurrent), so remove the redundant check (the if block that returns an error) to eliminate unreachable code; update or remove any related error message handling in New and keep withDefaults as the single source of truth for defaulting/validation of MaxConcurrent.

router/internal/codemode/sandbox/execute.go (1)
153-181: ⚖️ Poor tradeoff: Potential goroutine leak in `awaitWithContext` is bounded but worth noting.
The goroutine spawned on line 159 may not return if `value.Await()` hangs even after `rt.Close()`. The 100ms timeout on line 177 abandons the goroutine without cleanup. While this is documented as acceptable (bounded by GC and per-call freshness), consider adding a metric or log for monitoring abandoned await goroutines in production.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/sandbox/execute.go` around lines 153 - 181, In awaitWithContext, when the ctx.Done() path abandons the spawned goroutine (the one calling value.Await()) after the 100ms timeout, add an observability signal: emit a metric and/or log an error including context (e.g., indicate awaiting promise abandoned, include the promise/value identity and ctx.Err()) so production can monitor these abandoned await goroutines; update the code around the ctx.Done() branch (referencing awaitWithContext, the done channel, value.Await(), and rt.Close()) to increment a counter and/or call processLogger.Errorf(...) before returning nil, ctx.Err().

router/internal/codemode/storage/memory_backend.go (1)
258-280: ⚖️ Poor tradeoff: Bundle truncation could be expensive for large session op lists.
The `renderCapped` function iteratively removes operations from the end and re-renders until the bundle fits within `maxBundleBytes`. For sessions with many operations and an expensive renderer, this could result in O(n) render calls.

Consider a binary search approach if performance becomes an issue, though for typical session sizes this linear approach is likely acceptable.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/storage/memory_backend.go` around lines 258 - 280, The renderCapped function currently re-renders in a linear loop removing one op at a time (O(n) renders); change it to a binary-search style search over the number of ops to keep to reduce render calls: keep the initial full-render check using b.renderer.Render(ctx, ops, b.Schema()), then if too large perform a binary search on the range [0, len(ops)] calling b.renderer.Render(ctx, ops[:mid], b.Schema()) to find the largest mid such that len(truncated) <= b.maxBundleBytes, preserving ctx.Err() checks and error propagation; update logic in renderCapped (and references to b.renderer.Render, b.maxBundleBytes, ops) so it returns the best truncated bundle or an empty string on failure.

router/internal/codemode/sandbox/host.go (1)
168-177: 💤 Low value: GraphQL success determination may be misleading.
The success attribute is set based on HTTP status code (`resp.StatusCode < 400`), but GraphQL APIs typically return errors in the response body with a 200 OK status. The `graphQLErrors` check on lines 169-172 logs errors but doesn't influence the `codemode.op.success` attribute. This could cause misleading observability data where a request with GraphQL errors is marked as successful.

Consider also checking for GraphQL errors when determining success:

🔧 Suggested improvement

```diff
 result := normalizeGraphQLResponse(resp.StatusCode, respBody)
-if errorsJSON := graphQLErrors(result); errorsJSON != "" {
+errorsJSON := graphQLErrors(result)
+hasGraphQLErrors := errorsJSON != ""
+if hasGraphQLErrors {
 	span.SetAttributes(attribute.String("codemode.graphql.errors", errorsJSON))
 	observability.LogToolInvocationFailure(s.cfg.Logger, state.req.SessionID, name, fmt.Errorf("graphql errors: %s", errorsJSON))
 }
-span.SetAttributes(attribute.Bool("codemode.op.success", resp.StatusCode < 400))
+span.SetAttributes(attribute.Bool("codemode.op.success", resp.StatusCode < 400 && !hasGraphQLErrors))
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/sandbox/host.go` around lines 168 - 177, The codemode.op.success attribute is currently set only from HTTP status (resp.StatusCode < 400) which ignores GraphQL-level errors; update the logic in the block using normalizeGraphQLResponse and graphQLErrors so success is false if either resp.StatusCode >= 400 or graphQLErrors(result) returns a non-empty string: compute errorsJSON := graphQLErrors(result) first, set span.SetAttributes(attribute.Bool("codemode.op.success", resp.StatusCode < 400 && errorsJSON == "")), and keep the existing observability.LogToolInvocationFailure calls to include errorsJSON and non-2xx statuses (use the existing names normalizeGraphQLResponse, graphQLErrors, span.SetAttributes, and observability.LogToolInvocationFailure).

router/internal/codemode/server/server.go (1)
325-344: 💤 Low value: Cross-origin protection is bypassed for all paths.
The `AddInsecureBypassPattern("/{path...}")` call on line 327 effectively disables cross-origin protection for the entire MCP server. While this may be intentional for MCP client compatibility, it should be documented or made configurable for production deployments.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@router/internal/codemode/server/server.go` around lines 325 - 344, The handler currently unconditionally calls AddInsecureBypassPattern("/{path...}") which disables CrossOriginProtection for all paths; update Server.handler to only add that insecure bypass when explicitly enabled via a config flag or environment-controlled field (e.g., add a boolean like Server.allowInsecureMCPBypass or use an existing config field) and otherwise omit the AddInsecureBypassPattern call (or restrict it to a specific safe pattern); reference the CrossOriginProtection creation and the call to AddInsecureBypassPattern in Server.handler and make the behavior configurable and documented for production use.
demo/code-mode/yoko-mock/go.mod

```
module github.com/wundergraph/cosmo/demo/code-mode/yoko-mock

go 1.25.0

require (
	connectrpc.com/connect v1.19.1
	github.com/dgraph-io/ristretto/v2 v2.4.0
	github.com/stretchr/testify v1.11.1
	github.com/wundergraph/cosmo/router v0.0.0
	google.golang.org/protobuf v1.36.10
)

require (
	github.com/cespare/xxhash/v2 v2.3.0 // indirect
	github.com/davecgh/go-spew v1.1.1 // indirect
	github.com/dustin/go-humanize v1.0.1 // indirect
	github.com/pmezard/go-difflib v1.0.0 // indirect
	golang.org/x/sys v0.40.0 // indirect
	gopkg.in/yaml.v3 v3.0.1 // indirect
)

replace github.com/wundergraph/cosmo/router => ../../../router
```
Explicitly pin go.opentelemetry.io/otel/sdk to >= v1.43.0 to avoid PATH-hijacking CVEs.
go.opentelemetry.io/otel/sdk versions v1.21.0–v1.39.0 are vulnerable to PATH hijacking (arbitrary code execution) on macOS/Darwin; the first patch was v1.40.0. A follow-up incomplete fix (CVE-2026-39883) left BSD platforms vulnerable through v1.42.x; upgrading to v1.43.0 or above fully remediates both.
This module's github.com/wundergraph/cosmo/router dependency (replaced with the local path) brings in google.golang.org/grpc, which can transitively resolve a vulnerable otel/sdk version unless explicitly overridden here.
🛡️ Proposed fix
```diff
 require (
 	github.com/cespare/xxhash/v2 v2.3.0 // indirect
 	github.com/davecgh/go-spew v1.1.1 // indirect
 	github.com/dustin/go-humanize v1.0.1 // indirect
+	go.opentelemetry.io/otel/sdk v1.43.0 // indirect
 	github.com/pmezard/go-difflib v1.0.0 // indirect
 	golang.org/x/sys v0.40.0 // indirect
 	gopkg.in/yaml.v3 v3.0.1 // indirect
 )
```

Based on learnings: "In every module go.mod, explicitly pin go.opentelemetry.io/otel/sdk to v1.40.0 or later. Do not rely on transitive resolution for this dependency."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@demo/code-mode/yoko-mock/go.mod` around lines 1 - 22, The go.mod currently
relies on transitive versions that may pull a vulnerable
go.opentelemetry.io/otel/sdk; explicitly add a require entry for
go.opentelemetry.io/otel/sdk v1.43.0 (or later) in the existing require block to
pin the safe version, then run go mod tidy to update the go.sum and vendor
state; locate the module declaration and the require blocks in this file (module
github.com/wundergraph/cosmo/demo/code-mode/yoko-mock and the existing
require(...) section) and insert the new require line alongside the other direct
dependencies.
	s.mu.Lock()
	if existing, ok := s.schemas[id]; ok {
		s.mu.Unlock()
		existing.mu.RLock()
		existingSession := existing.sessionID
		existing.mu.RUnlock()
		log.Printf("Index schema_id=%s reused dir=%s session_id=%s", id, existing.dir, existingSession)
		return connect.NewResponse(&yokov1.IndexResponse{SchemaId: id}), nil
	}
	s.mu.Unlock()

	dir, err := os.MkdirTemp("", "yoko-schema-"+id+"-")
	if err != nil {
		return nil, connect.NewError(connect.CodeInternal, fmt.Errorf("create schema temp dir: %w", err))
	}
	if err := os.WriteFile(filepath.Join(dir, "schema.graphql"), []byte(schemaSDL), 0o600); err != nil {
		_ = os.RemoveAll(dir)
		return nil, connect.NewError(connect.CodeInternal, fmt.Errorf("write schema.graphql: %w", err))
	}

	sessionID, err := s.runCodexIndex(ctx, dir)
	if err != nil {
		_ = os.RemoveAll(dir)
		return nil, connect.NewError(connect.CodeInternal, fmt.Errorf("codex pre-warm: %w", err))
	}

	entry := &schemaEntry{dir: dir, sessionID: sessionID}
	s.mu.Lock()
	s.schemas[id] = entry
	s.mu.Unlock()
Serialize first-time indexing per schema ID.
Two concurrent Index calls for the same SDL can both miss s.schemas[id], each create a temp dir and Codex session, and then race on s.schemas[id] = entry. The later write wins, which breaks the idempotent contract and leaks the loser’s temp dir/session state. Re-check before storing or gate this path with a per-schemaID singleflight.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@demo/code-mode/yoko-mock/main.go` around lines 157 - 186, Concurrent Index
calls can race creating temp dirs and sessions; after creating dir and session
via os.MkdirTemp and s.runCodexIndex you must re-check s.schemas[id] (and/or use
a singleflight) before storing to avoid leaking the loser’s dir/session. Modify
the code around schemaEntry creation: acquire s.mu, check if entry already
exists (s.schemas[id]) and if so, release lock, clean up the newly created
dir/session (remove dir and close/terminate session if applicable), and return
the existing schema response; otherwise set s.schemas[id]=entry and release the
lock. Ensure cleanup handles both the temp dir and any resources created by
runCodexIndex when abandoning the new entry.
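The re-check-before-store pattern the comment asks for can be sketched as follows. This is a minimal, hedged illustration: `store` and `entry` are hypothetical stand-ins for the demo's server and `schemaEntry`, and the expensive build stands in for `os.MkdirTemp` plus `runCodexIndex`; the loser of the race would clean up its own dir/session at the marked spot.

```go
package main

import (
	"fmt"
	"sync"
)

// entry is a hypothetical stand-in for the demo's schemaEntry.
type entry struct{ dir, sessionID string }

type store struct {
	mu      sync.Mutex
	schemas map[string]*entry
	built   int // counts how many expensive builds were actually stored
}

// index re-checks the map after the expensive build and discards the
// loser's work, so the first stored entry stays authoritative.
func (s *store) index(id string) *entry {
	s.mu.Lock()
	if e, ok := s.schemas[id]; ok {
		s.mu.Unlock()
		return e
	}
	s.mu.Unlock()

	// Expensive work happens outside the lock (temp dir + Codex session
	// in the real code).
	e := &entry{dir: "/tmp/" + id, sessionID: id + "-session"}

	s.mu.Lock()
	defer s.mu.Unlock()
	if existing, ok := s.schemas[id]; ok {
		// Lost the race: clean up e's dir/session here, return the winner.
		return existing
	}
	s.built++
	s.schemas[id] = e
	return e
}

func main() {
	s := &store{schemas: map[string]*entry{}}
	var wg sync.WaitGroup
	results := make([]*entry, 8)
	for i := range results {
		wg.Add(1)
		go func(i int) { defer wg.Done(); results[i] = s.index("sdl-hash") }(i)
	}
	wg.Wait()
	for _, r := range results[1:] {
		if r != results[0] {
			panic("callers observed different entries")
		}
	}
	fmt.Println("all callers share one entry")
}
```

A per-schema-ID `singleflight.Group` (golang.org/x/sync/singleflight) would additionally collapse the duplicate builds themselves, at the cost of an extra dependency.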
	employee.Tag = tag
	employee.UpdatedAt = time.Now().String()
	return employee, nil
Avoid returning the shared in-memory employee pointer.
At Line 66, returning employee leaks internal mutable state after unlocking, which can lead to concurrent read/write races and cross-request aliasing. Keep the in-place update, but return a copy.
Proposed fix
if id == employee.ID {
employee.Tag = tag
employee.UpdatedAt = time.Now().String()
- return employee, nil
+ out := *employee
+ if employee.Details != nil {
+ d := *employee.Details
+ out.Details = &d
+ }
+ return &out, nil
}

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

	employee.Tag = tag
	employee.UpdatedAt = time.Now().String()
	out := *employee
	if employee.Details != nil {
		d := *employee.Details
		out.Details = &d
	}
	return &out, nil
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@demo/pkg/subgraphs/employees/subgraph/schema.resolvers.go` around lines 64 -
66, The code currently returns the shared in-memory pointer `employee`, leaking
internal mutable state; instead, after performing the in-place update (setting
`employee.Tag` and `employee.UpdatedAt`), create and return a copy to avoid
aliasing and races: make a new value (e.g., `copied := *employee`) and if
`Employee` contains reference fields (slices, maps, pointers) deep-copy those
(or allocate new slices/maps and copy elements) and return the address/value of
`copied` rather than the original `employee` pointer from the resolver function
in schema.resolvers.go.
require (
	connectrpc.com/connect v1.19.1
	github.com/MicahParks/jwkset v0.11.0
	github.com/alicebob/miniredis/v2 v2.34.0
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Checking otel/sdk pins in router-tests/go.mod:"
rg -n 'go\.opentelemetry\.io/otel/sdk' --glob 'router-tests/go.mod'
echo
echo "Cross-module otel/sdk visibility:"
rg -n 'go\.opentelemetry\.io/otel/sdk' --glob '**/go.mod'

Repository: wundergraph/cosmo
Length of output: 1898
Update go.opentelemetry.io/otel/sdk to v1.40.0 or later in router-tests/go.mod.
The module currently pins go.opentelemetry.io/otel/sdk v1.39.0 (line 36), which is below the required minimum version. Additionally, the replace directive at line 213 downgrades this to v1.28.0, further violating the dependency pinning policy.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router-tests/go.mod` at line 8, Update the go.mod to ensure the OpenTelemetry
SDK dependency meets the minimum required version: change the require entry for
go.opentelemetry.io/otel/sdk to v1.40.0 or later and remove or update any
replace directive that forces go.opentelemetry.io/otel/sdk to v1.28.0 (the
replace at the bottom of the file) so the module is not downgraded; after
updating, run `go mod tidy` to verify the final resolved version is v1.40.0+ and
that there are no remaining replace entries forcing older otel/sdk versions.
	golang.org/x/time v0.9.0
)

require github.com/tetratelabs/wazero v1.9.0 // indirect
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Checking otel/sdk pins in router/go.mod:"
rg -n 'go\.opentelemetry\.io/otel/sdk' --glob 'router/go.mod'
echo
echo "Checking all module go.mod files for otel/sdk pins:"
rg -n 'go\.opentelemetry\.io/otel/sdk' --glob '**/go.mod'

Repository: wundergraph/cosmo
Length of output: 1886
Update go.opentelemetry.io/otel/sdk to v1.40.0 or later in both require and replace directives.
Line 46 pins go.opentelemetry.io/otel/sdk to v1.39.0, and line 197 replaces it with v1.28.0—both below the required v1.40.0 minimum. While updating dependencies here, align these pins with the project policy.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/go.mod` at line 95, Update the go.mod entries that pin
go.opentelemetry.io/otel/sdk: change the require directive currently at v1.39.0
and the replace directive currently at v1.28.0 so both use v1.40.0 or later;
locate and edit the lines referencing module name "go.opentelemetry.io/otel/sdk"
in the require and replace blocks and set their versions to at least v1.40.0 to
satisfy the project policy and keep them consistent.
func renderBundleBlocks(blocks []string, omitted int) string {
	var b strings.Builder
	b.WriteString(graphQLErrorAlias)
	b.WriteByte('\n')
	b.WriteString(responseAlias)
	b.WriteString("\n\n")

	if len(blocks) == 0 {
		b.WriteString("declare const tools: {};")
	} else {
		b.WriteString("declare const tools: {\n")
		for i, block := range blocks {
			if i > 0 {
				b.WriteString("\n\n")
			}
			b.WriteString(indentBlock(block, " "))
		}
		b.WriteString("\n};")
	}

	b.WriteString("\n\n")
	b.WriteString(notNullHelper)
	b.WriteByte('\n')
	b.WriteString(compactHelper)

	if omitted > 0 {
		fmt.Fprintf(&b, "\n// truncated: %d ops omitted", omitted)
	}

	return b.String()
Keep the bundle prelude in sync with the persisted-ops contract.
renderBundleBlocks no longer emits the "Known limitation: union and interface selections are typed as unknown." banner, but the persisted-bundle expectations in router-tests/code_mode_named_ops_test.go still include it. That means bundles rendered here will drift from the tested output contract.
Suggested fix
const (
defaultMaxBundleBytes = 64 * 1024
graphQLErrorAlias = "type GraphQLError = { message: string; path?: (string | number)[]; extensions?: Record<string, unknown> };"
responseAlias = "type R<T> = Promise<{ data: T | null; errors?: GraphQLError[] }>;"
+ knownLimitationNote = "// Known limitation: union and interface selections are typed as unknown."
notNullHelper = "declare function notNull<T>(value: T | null | undefined, message?: string): T;"
compactHelper = "declare function compact<T>(value: T): T;"
)
@@
b.WriteString(graphQLErrorAlias)
b.WriteByte('\n')
b.WriteString(responseAlias)
- b.WriteString("\n\n")
+ b.WriteByte('\n')
+ b.WriteString(knownLimitationNote)
+ b.WriteString("\n\n")

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/internal/codemode/tsgen/tsgen.go` around lines 87 - 116,
renderBundleBlocks no longer emits the "Known limitation: union and interface
selections are typed as unknown." banner, causing a mismatch with the
persisted-ops contract; update renderBundleBlocks to prepend or include that
exact banner string (the same text used in persisted bundles/tests) in the
output alongside graphQLErrorAlias/responseAlias so rendered bundles match
router-tests/code_mode_named_ops_test.go expectations. Ensure the banner
placement and wording exactly match the contracted text and that it appears when
appropriate (i.e., in the same conditions the persisted-bundle expects) within
the function renderBundleBlocks.
func (s *GraphQLSchemaServer) ensureCodeModeSandbox() (*sandbox.Sandbox, error) {
	if s.codeModeSandbox != nil {
		return s.codeModeSandbox, nil
	}
	sb, err := sandbox.New(sandbox.Config{
		RouterGraphQLEndpoint: s.routerGraphQLEndpoint,
		StorageLookup:         s.codeModeStorageLookup,
		Logger:                s.logger.With(zap.String("component", "code_mode_sandbox")),
	})
	if err != nil {
		return nil, fmt.Errorf("create code mode sandbox: %w", err)
	}
	s.codeModeSandbox = sb
	return sb, nil
Guard sandbox lazy initialization against concurrent calls.
handleCodeModeRunJS can run in parallel, and this check-then-assign path lets multiple first requests create separate sandboxes before one overwrites s.codeModeSandbox. That leaks the loser and makes initialization nondeterministic. Please protect this with a mutex or sync.Once.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/pkg/mcpserver/code_mode.go` around lines 144 - 157, The lazy
initialization of s.codeModeSandbox in ensureCodeModeSandbox is racy when
handleCodeModeRunJS runs concurrently; protect it with a mutex-backed
double-check: add a field like codeModeMu sync.Mutex on GraphQLSchemaServer,
then in ensureCodeModeSandbox do the existing fast-path nil check, acquire
s.codeModeMu, re-check s.codeModeSandbox, create the sandbox (sb, err :=
sandbox.New(...)) only if still nil, assign s.codeModeSandbox = sb, release the
mutex, and return sb or the error—this fixes the race without changing the
external behavior (alternatively you may use sync.Once plus a stored initErr
field if you prefer Once semantics).
	for _, h := range m.handlers {
		h.RegisterRoutes(mux)

		// Upstream-bound handlers carry their own SDL — load it now so their
		// tools are ready before any client connects. Supergraph-bound handlers
		// wait for the router's Reload(supergraphSchema, ...) call.
		if h.HasUpstreamSchema() {
			doc, err := parseSDL(h.upstreamSchemaSDL)
			if err != nil {
				return fmt.Errorf("mcp server %q: parse upstream SDL: %w", h.graphName, err)
			}
			if err := h.Reload(doc, nil); err != nil {
				return fmt.Errorf("mcp server %q: initial reload: %w", h.graphName, err)
			}
		}

		// Per-collection operations directory watcher: hot-reloads tools when
		// .graphql / .gql files are added, modified, or removed.
		// Supergraph-bound handlers without an initial Reload yet still benefit —
		// the watcher is no-op until the first Reload populates a schema, after
		// which it picks up file changes on the next tick.
		watchEnabled, interval := h.WatchSettings()
		if watchEnabled && h.OperationsDir() != "" {
			handler := h // capture loop variable for the callback
			err := WatchOperationsDir(handler.Context(), handler.OperationsDir(), interval, func() {
				if err := handler.ReloadOperations(); err != nil {
					m.logger.Warn("hot-reload of MCP operations failed",
						zap.String("name", handler.graphName),
						zap.String("path", handler.path),
						zap.Error(err))
					return
				}
				m.logger.Info("MCP operations hot-reloaded",
					zap.String("name", handler.graphName),
					zap.String("path", handler.path))
			}, m.logger.With(zap.String("mcp_server_name", handler.graphName)))
			if err != nil {
				return fmt.Errorf("mcp server %q: start operations watcher: %w", h.graphName, err)
			}
		}
Tear down started watchers on partial startup failure.
Start launches per-handler watchers as it iterates. If a later handler then fails parseSDL, Reload, or watcher setup, Start returns an error while the earlier watchers keep running unless the caller remembers to call Stop on a failed start. Consider validating/priming all handlers first, or canceling any already-started handlers before returning the error.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/pkg/mcpserver/multi_server.go` around lines 63 - 102, Start currently
begins per-handler watchers as it loops and may return on a later error while
earlier watchers keep running; fix by either (A) priming/validating all handlers
before starting any watchers (run parseSDL and h.Reload for HasUpstreamSchema()
for every handler first), or (B) track started watcher resources and teardown on
error (maintain a startedHandlers/startedWatchCancels slice while iterating and
on any parseSDL/Reload/WatchOperationsDir error iterate it to call the
appropriate stop/cancel method on each handler or invoke m.Stop() to cleanly
stop already-started handlers) so no watcher remains running after Start returns
an error; use the existing symbols parseSDL, h.Reload, WatchOperationsDir, and
m.Stop/handler stop method names to locate and implement the changes.
			if pendingReload {
				pendingReload = false
				logger.Info("operations directory changed; reloading tools and notifying connected clients")
				onChange()
			}
Guard onChange against panics to keep the watcher alive.
A panic in onChange() will terminate the goroutine, silently disabling further reloads.
🛡️ Proposed hardening
if pendingReload {
	pendingReload = false
	logger.Info("operations directory changed; reloading tools and notifying connected clients")
-	onChange()
+	func() {
+		defer func() {
+			if r := recover(); r != nil {
+				logger.Error("operations reload callback panicked", zap.Any("panic", r))
+			}
+		}()
+		onChange()
+	}()
}

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/pkg/mcpserver/operations_watcher.go` around lines 69 - 73, The call to
onChange() inside the pendingReload block can panic and kill the watcher
goroutine; wrap the invocation in a panic-safe guard (e.g., defer/recover) so
the goroutine stays alive and future reloads still occur. Specifically, in the
pendingReload handling where logger.Info(...) and onChange() are called, invoke
onChange through a small safety wrapper or add a deferred recover around the
call that logs the recovered panic (including the panic value and stack) via
logger.Error and continues execution; do not change pendingReload semantics or
suppress the original logging.
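The recover-and-continue behavior can be demonstrated in isolation. A minimal sketch, assuming only the standard library: `runGuarded` plays the role of the safety wrapper around `onChange`, and the loop stands in for the watcher's tick goroutine.

```go
package main

import "fmt"

// runGuarded calls the reload callback behind recover, returning the
// recovered value (nil when the callback completed normally), so one bad
// reload cannot kill the watcher goroutine.
func runGuarded(onChange func()) (recovered any) {
	defer func() { recovered = recover() }()
	onChange()
	return nil
}

func main() {
	ticks := 0
	for i := 0; i < 3; i++ {
		tick := i
		if r := runGuarded(func() {
			ticks++
			if tick == 1 {
				panic("reload blew up")
			}
		}); r != nil {
			// In the real watcher this would be logger.Error with the
			// panic value and stack; the loop keeps ticking afterwards.
			fmt.Println("recovered:", r)
		}
	}
	fmt.Println("ticks:", ticks) // prints "ticks: 3"
}
```

Capturing `debug.Stack()` inside the deferred recover (as the prompt suggests) is the one addition worth making in production, since the recovered value alone rarely locates the fault.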
// fileFingerprint identifies a file's relevant state (modification time + size).
type fileFingerprint struct {
	modTime time.Time
	size    int64
}

// snapshotOperationFiles returns a map of path → fingerprint for every .graphql / .gql
// file under dir. Used by the watcher to detect added, removed, or modified operations.
func snapshotOperationFiles(dir string) (map[string]fileFingerprint, error) {
	out := map[string]fileFingerprint{}
	walkErr := filepath.WalkDir(dir, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			return nil
		}
		ext := strings.ToLower(filepath.Ext(path))
		if ext != ".graphql" && ext != ".gql" {
			return nil
		}
		info, err := d.Info()
		if err != nil {
			return nil // skip unreadable entries; treat as if absent
		}
		out[path] = fileFingerprint{modTime: info.ModTime(), size: info.Size()}
		return nil
modTime + size fingerprints can miss real file edits.
If content changes without size change and filesystem timestamp granularity is coarse, changes can be missed and tools won’t reload.
🔍 More reliable fingerprinting option
type fileFingerprint struct {
modTime time.Time
size int64
+ hash string
}
...
info, err := d.Info()
if err != nil {
return nil // skip unreadable entries; treat as if absent
}
- out[path] = fileFingerprint{modTime: info.ModTime(), size: info.Size()}
+ sum, err := hashFile(path)
+ if err != nil {
+ return nil
+ }
+ out[path] = fileFingerprint{modTime: info.ModTime(), size: info.Size(), hash: sum}
return nil
})Also applies to: 115-122
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@router/pkg/mcpserver/operations_watcher.go` around lines 81 - 107, The
current fileFingerprint (modTime, size) in snapshotOperationFiles can miss
content-only edits; change the fingerprinting to include a content hash: update
the fileFingerprint type (or add a new field like digest string) and in
snapshotOperationFiles read the file contents (e.g., os.ReadFile) and compute a
stable hash (SHA256) to store alongside or instead of modTime/size; use that
digest when comparing snapshots (also update the other snapshot-using code
around the similar block at lines 115-122 to compare digests) so edits that
don't change size or have coarse mtime granularity are detected reliably.
Summary
- code_mode_run_js MCP tool that auto-registers on every MultiServer collection with at least one operation loaded. Each server gets its own V8 sandbox bound to its upstream URL; tools.<OperationName>(vars) dispatches via the existing op catalog (no Yoko, no storage backend, no session state).
- Disabled by default (code_mode.enabled defaults to false).
- Logs a total_tools count, plus "...reloading tools and notifying connected clients" on the watcher tick.
Draft — primarily here so CI can produce a Docker image for the conference demo. Not intended to merge as-is:
pkg/mcpserver/).
yoko.Searcher interface, opsFragment rename, fakeYoko test stub) — see commit 56a3a393f for details.
28f10365f refactor(mcp): slim code_mode_run_js description, scope to bound names
eadb47be1 chore(mcp): clarify reload logs as client-broadcast notifications
56a3a393f feat(mcp): per-server code_mode_run_js tool, merging code-mode-v2
1720a9260 feat(mcp): demo multi-graph MCP server with hot-reloading operations
9088da4ed chore: wip multi-collection mcp

Test plan
- tools/list on each sub-server returns code_mode_run_js plus the per-op tools
- code_mode_run_js against /countries returns just a number for "how many countries speak Spanish" (return-value discipline)

Summary by CodeRabbit
Code Mode Support & Demo Environments