Workaround: route NuGet RestoreTask to transient TaskHost in server or mt modes #13660

Open
OvesN wants to merge 8 commits into dotnet:main from OvesN:workaround-restore-issues-in-mt

OvesN commented Apr 30, 2026

Fixes #13315

Context

NuGet's RestoreTask holds static singletons (PluginManager, EnvironmentVariableWrapper) that assume
one invocation per process. Two MSBuild modes break that assumption:

  • MSBuild Server (DOTNET_CLI_USE_MSBUILD_SERVER=1 / MSBUILDUSESERVER=1): one process services many
    builds back-to-back.
  • Multi-threaded MSBuild (/mt): RestoreTask is already routed to a TaskHost for thread safety
    because it has not been migrated, but that TaskHost is a long-lived sidecar reused for every
    invocation in the build, so statics leak between projects.

In both cases NuGet's static state survives past its intended scope.

This PR routes RestoreTask to a transient TaskHost (nodeReuse=false) when either mode is active, so
the spawned MSBuild.exe exits after Execute() and the statics die with it.

Changes Made

Why the original attempt didn't work

In the original commit, the server-mode trigger read Traits.Instance.UseMSBuildServer, which checks
MSBUILDUSESERVER. That env var is 0 or null in the worker process where tasks run, because it is
stripped in two places:

  1. NodeLauncher.DisableMSBuildServer zeroes it before spawning the Server child
    (recursion guard).
  2. OutOfProcServerNode.HandleServerNodeBuildCommand overwrites the server's env from
    the client's snapshot, which doesn't include MSBuild internals.

Net effect: the workaround branch never fired.
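
To make the failure concrete, here is a small simulation of the two steps above. It is only a sketch under stated assumptions: plain dictionaries stand in for real process environments, the ServerEnvSketch class and IsServerFlagSet helper are hypothetical, and the "equals 1" check is an assumption about how the flag is read, not a quote from MSBuild.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical simulation of the two stripping steps; not the actual MSBuild code.
public static class ServerEnvSketch
{
    private const string UseServer = "MSBUILDUSESERVER";

    // Assumption: the flag counts as "on" only when the variable equals "1".
    public static bool IsServerFlagSet(IReadOnlyDictionary<string, string> env) =>
        env.TryGetValue(UseServer, out string value) && value == "1";

    public static void Main()
    {
        // The user opted into server mode in the client process.
        var clientEnv = new Dictionary<string, string> { [UseServer] = "1" };

        // Step 1: the server child starts from a copy with the flag zeroed (recursion guard).
        var serverEnv = new Dictionary<string, string>(clientEnv) { [UseServer] = "0" };

        // Step 2: the server replaces its environment with the client's snapshot,
        // modeled here as a copy without the MSBuild-internal variable.
        var snapshot = new Dictionary<string, string>(clientEnv);
        snapshot.Remove(UseServer);
        serverEnv = new Dictionary<string, string>(snapshot);

        Console.WriteLine(IsServerFlagSet(clientEnv)); // flag visible in the client
        Console.WriteLine(IsServerFlagSet(serverEnv)); // flag lost where tasks run
    }
}
```

Either step alone is enough to hide the flag from the process that runs the task, which is why a check placed there cannot see the user's original choice.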

What this PR does

  • Sidecar env var: Traits.OriginalUseMSBuildServerEnvVarName (_MSBUILDORIGINALUSESERVER).
    NodeLauncher.DisableMSBuildServer stashes the original MSBUILDUSESERVER value into
    it; OutOfProcServerNode re-applies it after the env-snapshot restore. Exposed as
    Traits.Instance.WasLaunchedInMSBuildServerMode.
  • Allow-list: TaskRouter.IsKnownProblematicTask, currently NuGet.Build.Tasks.RestoreTask only.
  • Routing decision: AssemblyTaskFactory.CreateTaskInstance. When the task is on the
    allow-list AND (/mt mode OR WasLaunchedInMSBuildServerMode), it forces
    useTaskFactory = true and forceTransientTaskHost = true.
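
The routing rule in the bullets above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the PR: the RoutingSketch class, the ShouldForceTransientTaskHost method, and its parameter names are hypothetical, and the ordinal string comparison is an assumption. Only the task name and the shape of the condition come from the description.

```csharp
using System;

// Hypothetical sketch of the routing rule described above; not the actual MSBuild code.
public static class RoutingSketch
{
    // The allow-list currently holds a single task, per the PR description.
    private const string KnownProblematicTask = "NuGet.Build.Tasks.RestoreTask";

    // Assumption: tasks are matched by full type name with an ordinal comparison.
    public static bool IsKnownProblematicTask(string taskFullName) =>
        string.Equals(taskFullName, KnownProblematicTask, StringComparison.Ordinal);

    // True when the description says both useTaskFactory and
    // forceTransientTaskHost should be forced to true.
    public static bool ShouldForceTransientTaskHost(
        string taskFullName, bool multiThreadedMode, bool wasLaunchedInServerMode) =>
        IsKnownProblematicTask(taskFullName)
        && (multiThreadedMode || wasLaunchedInServerMode);

    public static void Main()
    {
        // RestoreTask under /mt: forced onto a transient TaskHost.
        Console.WriteLine(ShouldForceTransientTaskHost(KnownProblematicTask, true, false));
        // RestoreTask with neither mode active: routing is left unchanged.
        Console.WriteLine(ShouldForceTransientTaskHost(KnownProblematicTask, false, false));
        // A task that is not on the allow-list is never affected.
        Console.WriteLine(ShouldForceTransientTaskHost("Some.Other.Task", true, true));
    }
}
```

Keeping the allow-list check separate from the mode check means the workaround stays inert for every other task and for ordinary single-process builds.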

Diagnostic logging added

Per-invocation TaskHost diagnostics in TaskHostTask.Execute (low importance, captured
in binlog) — complements the existing ExecutingTaskInTaskHost message. Records
ProcessId, ParentProcessId, NewNodeContext, IsSidecar, NodeReuseEffective.
Useful when investigating any TaskHost-routing question.

Testing

Unit tests are in src/Build.UnitTests/BackEnd/TaskRouter_IntegrationTests.cs.

Manual end-to-end with dotnet restore App.csproj /bl:r.binlog:

  • With DOTNET_CLI_USE_MSBUILD_SERVER=1 (server mode) and again with /mt, opened the
    binlog and confirmed two 'TaskHost details for task "RestoreTask"' lines with
    IsSidecar=False and NodeReuseEffective=False, and a different ProcessId for each
    dotnet restore invocation.

Notes

AR-May and others added 2 commits April 22, 2026 10:52
Workaround for static singleton state issues in NuGet RestoreTask
(e.g., PluginManager, EnvironmentWrapper) that persist across builds
when running in sidecar TaskHost processes.

When /mt mode or MSBuild server (MSBUILDUSESERVER=1) is active,
RestoreTask is now forced to run in a transient (non-sidecar) TaskHost
that terminates after execution, ensuring all static state is cleaned up.

Changes:
- TaskRouter: Add IsKnownProblematicTask() to identify tasks by full name
- AssemblyTaskFactory: Force transient TaskHost for problematic tasks
- Tests: Add unit and integration tests for the workaround

Fixes dotnet#13315

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…alue of MSBUILDUSESERVER. Add logging for task host spawning details.
OvesN force-pushed the workaround-restore-issues-in-mt branch from 9793967 to a1ce582 April 30, 2026 13:08
OvesN changed the title from "Workaround restore issues in mt" to "Workaround: route NuGet RestoreTask to transient TaskHost in server or mt modes" Apr 30, 2026
OvesN marked this pull request as ready for review April 30, 2026 13:53
Copilot AI review requested due to automatic review settings April 30, 2026 13:53

OvesN commented Apr 30, 2026

/review


github-actions Bot commented Apr 30, 2026

Expert Code Review (command) completed successfully!


Copilot AI left a comment

Pull request overview

Implements a workaround to isolate NuGet’s RestoreTask (which relies on process-wide static singletons) by forcing it onto a transient TaskHost when running under MSBuild Server mode or /mt, preventing static state from leaking across builds/invocations.

Changes:

  • Add an internal “original server mode” env var (_MSBUILDORIGINALUSESERVER) to preserve server-mode detection through environment snapshotting.
  • Route allow-listed “known problematic” tasks (currently NuGet.Build.Tasks.RestoreTask) to a non-sidecar (transient) TaskHost in /mt or server mode.
  • Add TaskHost diagnostic logging and integration tests validating routing + per-invocation process isolation.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

File | Description
src/Framework/Traits.cs | Adds _MSBUILDORIGINALUSESERVER env var name and trait for detecting server-mode launch context.
src/Build/BackEnd/Components/Communications/NodeLauncher.cs | Stashes/restores the original MSBUILDUSESERVER value into the new internal env var around child process creation.
src/Build/BackEnd/Node/OutOfProcServerNode.cs | Preserves the internal env var across SetEnvironment(...) so server node can still detect original server-mode intent.
src/Build/BackEnd/Components/RequestBuilder/TaskRouter.cs | Introduces allow-list logic to identify “known problematic” tasks by full type name.
src/Build/Instance/TaskFactories/AssemblyTaskFactory.cs | Forces problematic tasks to run in TaskHost and disables sidecar reuse in /mt or server mode.
src/Build/BackEnd/Components/Communications/NodeProviderOutOfProcTaskHost.cs | Extends host acquisition to report host PID / creation status for diagnostics.
src/Build/Instance/TaskFactories/TaskHostTask.cs | Logs per-invocation TaskHost details (PID, reuse, etc.) into the build log/binlog.
src/Build.UnitTests/BackEnd/TaskRouter_IntegrationTests.cs | Adds integration tests for problematic-task routing and “fresh process per invocation” behavior.

Comment thread src/Build.UnitTests/BackEnd/TaskRouter_IntegrationTests.cs Outdated
Comment thread src/Framework/Traits.cs Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

github-actions Bot left a comment

Expert Review — 24-Dimension Analysis

Summary

# | Dimension | Verdict
1 | Backward Compatibility | ✅ LGTM (test-only)
2 | ChangeWave Discipline | ✅ LGTM (test-only)
3 | Performance | ✅ LGTM (test-only)
4 | Allocation Awareness | ✅ LGTM (test-only)
5 | Test Coverage | ✅ LGTM: covers MT mode, server mode, fresh-process guarantee, and no-workaround fallback
6 | Error Message Quality | ✅ LGTM (test-only)
7 | Logging Fidelity | ✅ LGTM (test-only)
8 | String Comparison | ✅ LGTM: regex matches ASCII digits only, ShouldContain is ordinal
9 | API Surface | ✅ LGTM (test-only)
10 | Target Authoring | ✅ LGTM (test-only)
11 | Cross-Platform | ✅ LGTM: uses Path.Combine, Process.GetCurrentProcess().Id, standard APIs
12 | Code Simplification | ✅ LGTM: boilerplate duplication is acceptable for test readability
13 | Concurrency | ✅ LGTM: xunit.runner.json enforces parallelizeTestCollections: false / maxParallelThreads: 1
14 | Naming Precision | ✅ LGTM: names are descriptive and consistent with existing tests
15 | SDK Integration | ✅ LGTM (test-only)
16 | Evaluation Model | ✅ LGTM (test-only)
17 | Correctness | ✅ LGTM: fake RestoreTask FullName matches TaskRouter.IsKnownProblematicTask check; Build() overload is self-contained
18 | Documentation Accuracy | 📝 NIT: see inline comment
19 | Dependency Management | ✅ LGTM (test-only)
20 | Scope Discipline | ✅ LGTM: tests + comment cleanup in the same PR is reasonable
21 | Security | ✅ LGTM (test-only)
22 | Build Infrastructure | ✅ LGTM (test-only)
23 | Binary Log Compatibility | ✅ LGTM (test-only)
24 | Error Handling | ✅ LGTM (test-only)

Findings: 1 NIT

One nit on the server-mode test's EnableNodeReuse comment: "// Load-bearing: see the /mt counterpart." is a fragile cross-reference that could become a dead pointer. Consider duplicating the 1-line explanation inline. Details in the inline comment.

The comment cleanup is well-executed overall: removed XML docs restated what the method name already says, the "// Arrange", "// Act", and "// Assert" markers were noise for these straightforward tests, and the condensed "// Load-bearing:" comments on the MT test preserve the non-obvious reasoning. The tests themselves are correct and well-structured.

Note

🔒 Integrity filter blocked 2 items

The following items were blocked because they don't meet the GitHub integrity level.

  • #13660 pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #13660 pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by Expert Code Review (command) for issue #13660 · ● 9.7M

Comment thread src/Build.UnitTests/BackEnd/TaskRouter_IntegrationTests.cs
OvesN requested a review from AR-May April 30, 2026 14:08
/// This is a temporary workaround until the task authors fix their static state issues.
/// See https://github.com/dotnet/msbuild/issues/13315
/// </summary>
private static readonly string[] s_knownProblematicTaskNames =
JanProvaznik commented Apr 30, 2026

or in this case just const string if it's one problematic task...

Development

Successfully merging this pull request may close these issues.

Execute Restore tasks in the TaskHost node in /mt mode or when msbuild server is on.
