holo-q/python-devtools-mcp

python-devtools

Live runtime inspection for any Python app — MCP-powered.

Connect Claude Code (or any MCP client) to your running Python process.
Query state, eval expressions, inspect objects, read source — all while the app runs.


┌─────────────────┐         TCP/JSON-lines         ┌──────────────────┐
│                  │ ◄──────────────────────────────►│                  │
│   Your App       │       localhost:auto            │   MCP Bridge     │
│   (3 lines)      │                                 │   (stdio ↔ TCP)  │
│                  │                                 │                  │
└─────────────────┘                                 └────────┬─────────┘
                                                             │ MCP stdio
                                                    ┌────────▼─────────┐
                                                    │   Claude Code    │
                                                    │   or any MCP     │
                                                    │   client         │
                                                    └──────────────────┘

Install · Quick Start · Wrapper Mode · Tools · Threading · Security


Install

# Install once (includes MCP bridge dependency)
pip install python-devtools

Or with uv:

uv add python-devtools

Or run the MCP bridge through uv:

uv run --project /abs/project/path --with mcp python-devtools

Quick Start

1. Embed in your app

import python_devtools as devtools

devtools.register('app', my_app)
devtools.register('db', database)
devtools.start(app_id='my-app')  # localhost:<auto free port>

Three lines. Your app now speaks devtools.

2. Connect your agent

The bridge connects to already-running apps only — it never launches your program.

Claude Code — install as a plugin (preferred, MCP applies globally once enabled):

/plugin marketplace add holo-q/python-devtools-mcp
/plugin install python-devtools@python-devtools

Or the manual route — add to your .claude/settings.json:

{
  "mcpServers": {
    "python-devtools": {
      "command": "python-devtools"
    }
  }
}

Codex CLI — one-shot global registration:

codex mcp add python-devtools -- python-devtools

This writes the [mcp_servers.python-devtools] block into ~/.codex/config.toml. Alternatively, the repo ships a .codex/config.toml for project-scoped use (active once you trust the project).

If you prefer uv-managed execution and want to guarantee the MCP SDK is present:

{
  "mcpServers": {
    "python-devtools": {
      "type": "stdio",
      "command": "uv",
      "args": ["run", "--project", "/abs/project/path", "--with", "mcp", "python-devtools", "--app-id", "my-app"]
    }
  }
}

3. Inspect live state

Claude can now reach into your running app:

> run("len(app.users)", app_id="my-app")
→ 42

> inspect("app.config", app_id="my-app")
→ {type: AppConfig, attrs: [{name: debug, type: bool, repr: True}, ...]}

> run("app.users[0].email", app_id="my-app")
→ alice@example.com

> source("type(app.users[0]).validate", app_id="my-app")
→ def validate(self): ...
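
The shape of the inspect result above can be approximated with a few lines of plain Python. This is a minimal sketch, not the library's implementation: describe, max_repr, and the AppConfig stand-in are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


def describe(obj, max_repr=80):
    """Return a dict shaped like the inspect output shown above (hypothetical helper)."""
    attrs = []
    for name in sorted(vars(obj)):
        if name.startswith("_"):
            continue  # public attributes only
        value = getattr(obj, name)
        attrs.append({
            "name": name,
            "type": type(value).__name__,
            "repr": repr(value)[:max_repr],  # cap long reprs
        })
    return {"type": type(obj).__name__, "attrs": attrs}


@dataclass
class AppConfig:  # stand-in for your own config object
    debug: bool = True
    workers: int = 4


print(describe(AppConfig()))
```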

Discovery (No Fixed Ports)

devtools.start() defaults to port=0, so each app instance binds an available free port.

  • App side: each running instance writes {app_id, host, port, pid} to a local registry
  • Bridge side: tools resolve app_id to the current endpoint from that registry
  • Unknown app_id: the bridge pings candidates and returns the list of running apps
  • Crash/system-crash safety: stale registry records are pruned automatically when liveness checks fail

This removes the need to reserve one static port per app.
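
The discovery flow above can be sketched with the standard library alone. Everything here is an assumption made for illustration: the file naming, the helper names (write_record, tcp_alive, resolve), and the use of a TCP connect as the liveness check. The real registry may work differently.

```python
import json
import os
import socket
import tempfile
from pathlib import Path


def write_record(registry_dir, app_id, host, port):
    """App side: publish this instance's endpoint as one JSON file."""
    record = {"app_id": app_id, "host": host, "port": port, "pid": os.getpid()}
    path = Path(registry_dir) / f"{app_id}-{record['pid']}.json"
    path.write_text(json.dumps(record))
    return path


def tcp_alive(record, timeout=0.2):
    """Liveness check: can a TCP connection be opened to the recorded endpoint?"""
    try:
        with socket.create_connection((record["host"], record["port"]), timeout=timeout):
            return True
    except OSError:
        return False


def resolve(registry_dir, app_id, is_alive=tcp_alive):
    """Bridge side: prune stale records, then return the live one for app_id."""
    match = None
    for path in sorted(Path(registry_dir).glob("*.json")):
        record = json.loads(path.read_text())
        if not is_alive(record):
            path.unlink()  # stale entry left behind by a dead process
        elif record["app_id"] == app_id:
            match = record
    return match


# Demo: one live listener on an OS-assigned port, plus one stale record.
registry = tempfile.mkdtemp()
listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # port 0 asks the OS for a free port
listener.listen()
live_port = listener.getsockname()[1]
write_record(registry, "my-app", "127.0.0.1", live_port)

closed = socket.socket()
closed.bind(("127.0.0.1", 0))
dead_port = closed.getsockname()[1]
closed.close()  # nothing listens on this port any more
Path(registry, "ghost-0.json").write_text(
    json.dumps({"app_id": "ghost", "host": "127.0.0.1", "port": dead_port, "pid": 0})
)

found = resolve(registry, "my-app")
remaining = sorted(p.name for p in Path(registry).glob("*.json"))
listener.close()
```

After resolve runs, only the record for the live listener is left in the directory, which is the behaviour the bullets describe for stale entries.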


Wrapper Mode

Don't want to modify your app's source? Wrap it:

python-devtools --app-id myapp -- uv run myapp.py
python-devtools --app-id flask-dev -- flask run
python-devtools --app-id worker --port 9230 -- python worker.py
python-devtools --readonly -- python myapp.py        # locks down mutation tools

If --app-id is omitted, the wrapper synthesizes one from the entry script and pid (e.g. myapp-12345) so multiple instances of the same script don't collide in the registry.

This injects a devtools server into the child process via sitecustomize.py, so no code changes are needed. The child gets a TCP server on startup, and __main__ is auto-registered as main:

> run("dir(main)")
→ ['__builtins__', '__file__', 'app', 'config', 'db', ...]

> run("main.app.config['DEBUG']")
→ True

How it works

The wrapper prepends a generated sitecustomize.py to PYTHONPATH. When the child Python interpreter starts, site.py imports it, which:

  1. Chains to any existing sitecustomize.py (removes inject dir from path, imports original, restores)
  2. Starts the devtools TCP server on a free port (or your configured port)
  3. Registers __main__ — the module ref is captured early but populated later with the script's globals

The python_devtools package is also added to PYTHONPATH, so it doesn't need to be installed in the child's environment.

Non-Python children (e.g., python-devtools -- node app.js) are harmless — the env vars are set but nothing reads them.
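
The PYTHONPATH injection described above can be sketched as follows. This is an illustration under stated assumptions, not the wrapper's actual code: build_child_env and the STUB body are hypothetical, and the real generated file also chains to an existing sitecustomize.py and registers __main__.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical stub body; the real generated file does more (see the steps above).
STUB = "import python_devtools\npython_devtools.start()\n"


def build_child_env(package_parent, base_env=None):
    """Return (env, inject_dir) with the generated stub first on PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    inject_dir = tempfile.mkdtemp(prefix="devtools-inject-")
    Path(inject_dir, "sitecustomize.py").write_text(STUB)
    parts = [inject_dir, str(package_parent)]
    if env.get("PYTHONPATH"):
        parts.append(env["PYTHONPATH"])  # keep whatever the user already had
    env["PYTHONPATH"] = os.pathsep.join(parts)
    return env, inject_dir


env, inject_dir = build_child_env("/opt/tools", base_env={"PYTHONPATH": "/existing"})
print(env["PYTHONPATH"])
```

The child would then be launched with something like subprocess.run(command, env=env). Because the inject directory comes first, the interpreter's site module finds the generated sitecustomize.py before any other.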


Pair with the MCP bridge for Claude Code access:

# Terminal 1: run your app with devtools injected
python-devtools --app-id myapp -- uv run myapp.py

# Claude Code config: MCP bridge routes by app_id
# .claude/settings.json
{
  "mcpServers": {
    "python-devtools": {
      "command": "python-devtools"
    }
  }
}

Tools

| Tool | Description | Mutates |
| --- | --- | --- |
| running_apps | List reachable app IDs discovered from the local registry (stale entries are auto-pruned) | |
| run | Eval an expression or exec a statement in the app's live namespace | yes |
| call | Call a callable at a dotted path with args/kwargs | yes |
| set_value | Set an attribute or item at a dotted path | yes |
| inspect | Structured inspection: type, repr, public attrs, recursive | |
| list_path | Shallow enumeration: attrs, keys, or items at a path | |
| repr_obj | Quick type + repr; fastest tool, minimal overhead | |
| source | Get source code of a function, class, or method | |
| state | List all registered namespaces and their types | |
| logs | Indexed log tail (default), paging (before_id), and follow mode (after_id + wait_seconds) | |
| screenshot | Capture the live app's GUI as a PNG (requires set_screenshot_fn) | |
| winshot | Render a code snippet in an isolated offscreen window and return a PNG (requires set_winshot_fn) | yes |
| ping | Connection health check (returns running apps when app_id is omitted) | |

Every tool accepts an optional app_id argument. If no default app ID is set on the bridge and the supplied app ID is not found, the bridge pings known endpoints, prunes stale records, and returns the running apps list.

run also accepts optional max_result_chars and max_result_lines limits (both default to 0, meaning no truncation). Set either to a value above 0 to compact very large text outputs for model readability: the response then carries a head/tail preview and a top_patterns summary, so models can still reason about long log spew.
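
To make the compaction idea concrete, here is a minimal sketch of a head/tail preview with a pattern summary. Only top_patterns is a name taken from the text above; compact, the other result keys, and the even head/tail split are assumptions, and the real compaction may differ.

```python
from collections import Counter


def compact(text, max_lines=0, top_n=3):
    """Return text unchanged when within the limit, else a head/tail preview."""
    lines = text.splitlines()
    if max_lines <= 0 or len(lines) <= max_lines:
        return {"truncated": False, "text": text}
    head_n = max_lines // 2
    tail_n = max_lines - head_n  # always at least 1 when max_lines >= 1
    patterns = Counter(line.strip() for line in lines if line.strip())
    return {
        "truncated": True,
        "total_lines": len(lines),
        "head": lines[:head_n],
        "tail": lines[-tail_n:],
        "top_patterns": patterns.most_common(top_n),  # most repeated lines first
    }


spew = "\n".join(["retrying connection"] * 50 + ["connected", "ready"])
result = compact(spew, max_lines=4)
print(result["head"], result["tail"], result["top_patterns"][0])
```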

Mutation tools (run, call, set_value, winshot) attach a devtools_warning field if the target app's pid/port changes immediately after the call — surfacing crashes or restarts that would otherwise look like silent successes.
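
The pid/port check can be pictured as a before-and-after comparison around the call. The sketch below is an assumption about how such a watchdog could look: with_restart_warning, lookup, and the warning wording are hypothetical, and only the devtools_warning field name comes from the text above.

```python
def with_restart_warning(lookup, call):
    """Run call() and flag the response if the app's pid or port changed."""
    before = lookup()
    response = dict(call())
    after = lookup()
    if after is None:
        response["devtools_warning"] = "app is no longer registered after the call"
    elif (after["pid"], after["port"]) != (before["pid"], before["port"]):
        response["devtools_warning"] = (
            f"app restarted: pid {before['pid']} -> {after['pid']}, "
            f"port {before['port']} -> {after['port']}"
        )
    return response


# Simulated registry lookups: the app restarts while the call is in flight.
snapshots = iter([{"pid": 100, "port": 5000}, {"pid": 101, "port": 5003}])
result = with_restart_warning(lambda: next(snapshots), lambda: {"ok": True})
print(result)
```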

Debugging Pattern (Timeline First)

For agent-assisted debugging, prefer logs over brute-force state probing:

  1. Take a tail snapshot: logs(limit=200, app_id="my-app")
  2. Reproduce the issue and follow new lines: logs(after_id=<last_id>, wait_seconds=5, app_id="my-app")
  3. Page older context when needed: logs(before_id=<first_id>, limit=200, app_id="my-app")

Log results include stable id indices so agents can move backward/forward deterministically.
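
The three query modes above map naturally onto a bounded buffer with monotonically increasing ids. The following is a sketch under assumptions: LogBuffer and its query method are hypothetical, limit is assumed to be positive, and the blocking behaviour of wait_seconds is left out.

```python
import itertools
from collections import deque


class LogBuffer:
    """Bounded log store that gives every line a monotonically increasing id."""

    def __init__(self, capacity=1000):
        self._entries = deque(maxlen=capacity)
        self._ids = itertools.count(1)

    def append(self, message):
        self._entries.append({"id": next(self._ids), "message": message})

    def query(self, limit=200, before_id=None, after_id=None):
        entries = list(self._entries)
        if after_id is not None:
            # Follow mode: oldest-first lines newer than the cursor.
            return [e for e in entries if e["id"] > after_id][:limit]
        if before_id is not None:
            entries = [e for e in entries if e["id"] < before_id]
        return entries[-limit:]  # tail, or the page just before before_id


buf = LogBuffer()
for i in range(1, 11):
    buf.append(f"line {i}")

tail = buf.query(limit=3)                 # most recent three lines
older = buf.query(limit=3, before_id=8)   # the page before the tail
newer = buf.query(after_id=10)            # nothing new yet
```

Because ids never repeat or reorder, the first id of one page can be passed as before_id for the previous page, and the last id as after_id when following.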

Argparse Integration

For apps that already use argparse:

import argparse
import python_devtools as devtools

parser = argparse.ArgumentParser()
devtools.add_arguments(parser)  # adds --devtools, --devtools-port, --devtools-app-id, --devtools-readonly

args = parser.parse_args()
devtools.from_args(args, app=my_app, db=database)

python myapp.py --devtools --devtools-app-id myapp
python myapp.py --devtools --devtools-app-id myapp --devtools-port 9230
python myapp.py --devtools --devtools-app-id myapp --devtools-readonly

Threading Safety

GUI apps, game loops, and anything with a main-thread constraint need an invoker:

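import concurrent.futures
import queue

import python_devtools as devtools

main_queue = queue.Queue()

def invoke_on_main(fn):
    """Route devtools calls onto the main thread and wait for the result."""
    future = concurrent.futures.Future()
    main_queue.put((fn, future))
    return future.result(timeout=10)  # raises TimeoutError if the main loop stalls

devtools.set_main_thread_invoker(invoke_on_main)
devtools.start()

# In your main loop (`running` is your app's own loop flag):
while running:
    while not main_queue.empty():
        fn, future = main_queue.get()
        try:
            future.set_result(fn())
        except Exception as exc:
            # Hand the error back to the devtools caller instead of crashing the loop
            future.set_exception(exc)
    # ... rest of frame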

Without an invoker, calls run inline on the TCP handler thread (a one-time warning is emitted).

GUI Capture (screenshot / winshot)

Two optional callbacks expose visual state to the agent. Both run through the main-thread invoker, so they have access to the app's framebuffer / GL context.

def capture() -> bytes:
    """Capture current visual state as PNG bytes."""
    ...

def render_winshot(code: str) -> bytes:
    """Render `code` into an isolated offscreen window, return PNG bytes."""
    ...

devtools.set_screenshot_fn(capture)        # enables the `screenshot` MCP tool
devtools.set_winshot_fn(render_winshot)    # enables the `winshot` MCP tool

  • screenshot captures the whole live app — useful for "what does the user actually see right now".
  • winshot captures only the code the agent passes (a single panel, a test widget, a component in isolation) — useful for verifying UI changes without spinning up the full app state.

Both tools cleanly error out with an actionable message if the app hasn't registered the corresponding callback. The bridge converts the returned PNG bytes into an Image MCP payload so Claude Code renders it inline.
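
Both callbacks must return PNG bytes. If your toolkit only hands you a raw pixel buffer, a PNG can be produced with the standard library. The sketch below is one way to do it under stated assumptions: rgb_to_png and _chunk are hypothetical helpers, the input is assumed to be 8-bit RGB in row-major order, and the solid red frame stands in for a real framebuffer read.

```python
import struct
import zlib


def _chunk(kind, data):
    """Build one PNG chunk: length, type, payload, CRC over type + payload."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def rgb_to_png(width, height, rgb):
    """Encode a raw 8-bit RGB buffer (row-major, 3 bytes per pixel) as PNG bytes."""
    if len(rgb) != width * height * 3:
        raise ValueError("rgb buffer does not match width * height * 3")
    stride = width * 3
    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + rgb[y * stride:(y + 1) * stride] for y in range(height))
    # IHDR: width, height, bit depth 8, colour type 2 (truecolour), no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(raw))
        + _chunk(b"IEND", b"")
    )


def capture() -> bytes:
    """Stand-in capture callback: a 2x2 solid red frame instead of a real framebuffer."""
    return rgb_to_png(2, 2, b"\xff\x00\x00" * 4)


png = capture()
```

If your GUI toolkit can already export PNG data, using that directly is simpler than encoding by hand.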

Readonly Mode

Lock down mutation tools for safer inspection:

devtools.start(readonly=True)

Or on the bridge side, pass --readonly in the MCP config:

{
  "mcpServers": {
    "python-devtools": {
      "command": "python-devtools",
      "args": ["--readonly"]
    }
  }
}

In readonly mode, run, call, set_value, and winshot are not registered — only inspection and screenshot are available. The server also enforces this at the protocol layer, so a stale read-write bridge against a readonly app still gets rejected.
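
Server-side enforcement means the check happens where commands are dispatched, not only where tools are registered. A minimal sketch of that idea follows. The request shape ("tool", "args"), the response shape, and the dispatch function are assumptions for illustration; the wire format is not documented in this README.

```python
MUTATING = {"run", "call", "set_value", "winshot"}  # tools the README marks as mutating


def dispatch(request, handlers, readonly):
    """Reject mutating commands up front when the server is readonly."""
    tool = request.get("tool")
    if readonly and tool in MUTATING:
        return {"ok": False, "error": f"{tool} is disabled: server is readonly"}
    handler = handlers.get(tool)
    if handler is None:
        return {"ok": False, "error": f"unknown tool: {tool}"}
    return {"ok": True, "result": handler(**request.get("args", {}))}


# Toy handlers standing in for the real tool implementations.
handlers = {
    "repr_obj": lambda path: f"<repr of {path}>",
    "run": lambda code: eval(code),
}

print(dispatch({"tool": "run", "args": {"code": "1 + 1"}}, handlers, readonly=True))
print(dispatch({"tool": "repr_obj", "args": {"path": "app"}}, handlers, readonly=True))
```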

Plugin / Extension Layout

This repo doubles as a Claude Code plugin and a Codex CLI project-scoped extension. The relevant files:

.claude-plugin/
├── plugin.json          # Claude Code plugin manifest
└── marketplace.json     # Lets the repo serve itself as a single-plugin marketplace
.mcp.json                # MCP server registration consumed by Claude Code on plugin enable
.codex/
└── config.toml          # Codex CLI project-scoped MCP registration (loads on trust)
AGENTS.md                # Auto-loaded by Codex; agent-facing usage doc for python-devtools

Claude Code install flow — plugin enable wires the MCP server globally (every Claude Code session, not just inside this repo):

/plugin marketplace add holo-q/python-devtools-mcp
/plugin install python-devtools@python-devtools

Codex CLI install flow — Codex has no plugin system, only config files. Cleanest path:

codex mcp add python-devtools -- python-devtools     # writes ~/.codex/config.toml globally

For project scope only, run codex inside this repo and accept the trust prompt — .codex/config.toml is then honored automatically.

Security

LOCAL_TRUSTED — loopback only, no auth, eval enabled.

This is a development tool, not a production service.

  • Binds to localhost only — non-loopback connections are rejected
  • No authentication — anyone on localhost can connect
  • eval/exec is unrestricted — full access to your Python process
  • The readonly flag disables mutation tools but does not add auth

Do not expose to networks. Do not run in production.
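
A loopback guard of the kind listed above can be written with the ipaddress module. This is a sketch of the general technique, not the server's actual check; is_loopback_peer is a hypothetical name.

```python
import ipaddress


def is_loopback_peer(peer_host):
    """True when the connecting peer address is a loopback address."""
    try:
        addr = ipaddress.ip_address(peer_host)
    except ValueError:
        return False  # not an IP literal; refuse rather than guess
    # IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) shows up on dual-stack sockets.
    mapped = getattr(addr, "ipv4_mapped", None)
    return (mapped or addr).is_loopback


for host in ("127.0.0.1", "::1", "::ffff:127.0.0.1", "192.168.1.10"):
    print(host, is_loopback_peer(host))
```

In an accept loop, the peer host from socket.accept() would be passed to this function and the connection closed when it returns False.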

Observable State

For GUI status indicators, the server exposes:

devtools.running          # bool — is the server listening?
devtools.n_clients        # int  — currently connected clients
devtools.n_commands       # int  — total commands processed
devtools.last_command_time  # float — time.time() of last command
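
As an illustration, these four attributes are enough to drive a one-line status indicator. The sketch below assumes nothing beyond the attribute names listed above. status_line is a hypothetical helper, and a SimpleNamespace stands in for the devtools module so the example runs without a live server.

```python
import time
from types import SimpleNamespace


def status_line(dt, now=None):
    """Format the four observable attributes into a one-line status string."""
    if not dt.running:
        return "devtools: stopped"
    now = time.time() if now is None else now
    if dt.n_commands:
        idle = f"last command {now - dt.last_command_time:.0f}s ago"
    else:
        idle = "no commands yet"
    return f"devtools: {dt.n_clients} client(s), {dt.n_commands} command(s), {idle}"


# Stand-in for the python_devtools module.
fake = SimpleNamespace(running=True, n_clients=1, n_commands=12, last_command_time=1000.0)
print(status_line(fake, now=1005.0))
```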

Architecture

python-devtools/
├── __init__.py      # Module API — register, start, stop, set_*_fn, add_arguments, from_args
├── _core.py         # DevTools orchestrator — lifecycle, argparse, callback registration
├── _registry.py     # Local app registry (XDG cache) — app_id → host/port/pid lookup
├── _server.py       # TCP JSON-lines server — accept loop, dispatch, log capture, loopback guard
├── _resolve.py      # Object resolution — eval/exec, inspect, serialize, compaction
├── _cli.py          # MCP stdio bridge — app-id router, mutation pid/port watchdog
└── _wrap.py         # Wrapper mode — sitecustomize.py injection via PYTHONPATH

The app runtime server (__init__, _core, _server, _resolve, _registry) is pure stdlib — zero deps in your app's process. The MCP bridge (_cli) uses the bundled mcp dependency, which ships in the base install (no extras to remember).

Wire-level summary

agent ──► MCP stdio ──► _cli (bridge) ──► TCP JSON-lines ──► _server ──► _resolve ──► your objects
                            │                                   │
                            └─ registry (~/.cache/python-devtools/registry/*.json) ──┘

Each devtools.start() writes {app_id, host, port, pid, readonly} to the registry; the bridge resolves the agent's app_id to the freshest live entry, prunes dead ones via ping, and reuses TCP clients per endpoint.
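
For readers unfamiliar with JSON-lines framing, each message is one JSON object terminated by a newline. The sketch below shows the framing only. The message fields ("id", "tool", "code", "result") are hypothetical because the actual wire schema is not documented here, and socket.socketpair() stands in for the bridge-to-app TCP connection.

```python
import json
import socket


def send_message(sock, message):
    """Write one JSON object followed by a newline."""
    sock.sendall(json.dumps(message).encode("utf-8") + b"\n")


def read_message(reader):
    """Read one newline-terminated JSON object; None on a closed connection."""
    line = reader.readline()
    return json.loads(line) if line else None


# socketpair() stands in for the bridge <-> app TCP connection.
bridge, app = socket.socketpair()
app_reader = app.makefile("rb")
bridge_reader = bridge.makefile("rb")

send_message(bridge, {"id": 1, "tool": "run", "code": "len(app.users)"})
request = read_message(app_reader)
send_message(app, {"id": request["id"], "result": 42})
reply = read_message(bridge_reader)

for closable in (app_reader, bridge_reader, bridge, app):
    closable.close()
```

Since json.dumps escapes newlines inside strings, a raw newline can only appear as the message delimiter, which is what makes line-based reading safe.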


MIT License