docs: custom autosummary template + residual triage (PR 2.5) by igerber · Pull Request #410 · igerber/diff-diff

igerber · 2026-05-10T00:46:16Z

Summary

Closes the Sphinx WARNING gap PR #407 left. PR #407 dropped warnings from 2,170 to 635 (71%) by adding `:no-index:` to all 78 page-level `.. autoclass::` directives. The empirical spike on `bacon.rst` revealed that `:no-index:` on autoclass suppresses CLASS-level cross-reference registration only — it doesn't propagate to autodoc-generated members when `:members:` is implicit (via `autodoc_default_options["members"] = True` in `docs/conf.py:39-44`). 614 of those 635 residuals were member-level duplicates.

PR 2.5 closes both that gap and the 21 non-duplicate residuals so PR 3 (queued: `-W` enforcement in `docs-tests.yml`) can land cleanly.

Result: `make html` is now 0 ERRORs / 0 WARNINGs (down from 2,170 at PR #407 start; 635 at PR #407 merge).

Architecture

User-decided this session: custom autosummary class template that adds `:no-members:` to the autoclass in stubs (Option A). Effect:

Stubs: header + autosummary tables linking to page-level (no inline member docs)
Page-level pages: keep current full inline member rendering (UX preserved)
Class cross-refs `:py:class:DiDResults` → stub URL (per PR docs: add :no-index: to page-level autoclass directives (PR 2 of 2) #407 page-level `:no-index:`)
Member cross-refs `:py:meth:DiDResults.summary` → page-level URL (only place still registered)
Function cross-refs `:py:func:diff_diff.compute_power` → page-level URL (stub now `:no-index:` via custom function.rst template)

Sphinx's `autosummary_generate_overwrite=True` default means a single clean build regenerates all 50 class stubs + 48 function stubs from the new templates — no hand-edit per stub.

Spike-before-bulk discipline

Per `feedback_spike_before_bulk_mechanical_edit` (lesson saved from PR #407): the plan included an explicit empirical spike of just the class.rst template. The spike caught that 41 of 44 remaining residuals were FUNCTION duplicates (not class), surfacing the need for a function.rst template too. The remaining 3 page-vs-page were pinned to specific module-name conflicts (`diff_diff.chaisemartin_dhaultfoeuille`, `diff_diff.stacked_did`, `diff_diff.trop` — function names that shadow module names) and resolved with targeted page-level `:no-index:` on the autofunction. Without the spike, the bulk pass would have committed to a 1-template architecture that left ~44 function dups unhandled.

Constructor visibility note

The new class template removes the explicit `.. automethod:: init` line from stubs (would re-register `init` and defeat the `:no-members:` suppression). The constructor signature still appears on page-level via `autodoc_class_signature = "separated"` (`docs/conf.py:46`). Stubs lose the per-class `init` rendering; page-level keeps it.

Residual triage (Cat A–E)

Five categories of non-duplicate warnings, all fixed:

Cat A — Source docstring (5): `SyntheticDiDResults.in_time_placebo:36` blank-line-before-prose; 4 unreferenced footnotes in `TripleDifference` (added `[1]`/`[2]` refs in narrative)
Cat B — Cross-refs (6): 4 `:doc:` refs to `.md` files that Sphinx can't resolve without `myst-parser` → converted to literal text per existing `docs/choosing_estimator.rst:302` pattern; 2 broken markdown links in T18/T19 notebooks (4 in T19 + 2 broken-rendered REGISTRY.html refs in T18 also fixed at the same time)
Cat C — Orphan citations (3): removed `[CS2021]`, `[AHIW2021]`, `[RR2023]` from `docs/benchmarks.rst` (grep-confirmed safe — no narrative references)
Cat D — Toctree (2): T19 + T20 added to `Tutorials: Business Applications` toctree at `docs/index.rst:80-83`
Cat E — Autosummary entry typos (2): `~DiDResults.ci` → `~DiDResults.conf_int`; removed nonexistent `~CallawaySantAnnaResults.aggregate`

Plan-reviewer found that the original Cat E plan (suggested `suppress_warnings = ["autosummary"]` based on "Rust backend not built") was wrong: Sphinx emits autosummary import-failure warnings without `type=`, so the suppress filter is a no-op. The actual fixes were RST typos.

Page-vs-page function fix (Cat F surprise)

Spike rebuild surfaced 3 final duplicates from `chaisemartin_dhaultfoeuille`, `stacked_did`, `trop` — function names that shadow their module names. The conflict was Python's same-namespace ambiguity: `from .X import X` shadows the module, but Sphinx still registers both. Fix: `:no-index:` on the page-level autofunction (3 sites). Also switched 3 path-qualified autoclass directives to top-level paths (`diff_diff.X.Y` → `diff_diff.Y`) for consistency with project convention (PR #406 export pattern).

Methodology references

N/A - no methodology changes; documentation infrastructure only.

Validation

Tests added/updated: No test changes; doc-snippet test parity preserved (111 passed, 4 skipped — same as PR docs: add :no-index: to page-level autoclass directives (PR 2 of 2) #407 baseline).
Backtest / simulation / notebook evidence: Spike rebuild iterations: baseline 635 warnings → 62 (class template only) → 21 (+ function template) → 18 (+ page-vs-page page-level autofunction `:no-index:`) → 1 (+ Cat A–E triage) → 0 (T18 link fix to `.rst` extension).
HTML spot-checks (verified):
- Page-level `results.html`: 281 (PR docs: add :no-index: to page-level autoclass directives (PR 2 of 2) #407) → 286 (PR 2.5) `<dt` count — slight increase from napoleon docstring rendering tweaks; full member docs preserved
- Stub `diff_diff.DiDResults.html`: class header + autosummary tables only; no inline member `
  ` for class docs (the 21 remaining are napoleon-rendered Attributes section, NOT registered)
- T19 + T20 in tutorials nav: confirmed
- 50 class stubs include `:no-members:`; 48 function stubs include `:no-index:`

Security / privacy

Confirm no secrets/PII in this PR: Yes

Closes the gap PR #407 left: 614 of 635 PR #407-residual warnings were member-level duplicate-object descriptions because :no-index: on page-level autoclass directives doesn't propagate to autodoc-generated members. PR 2.5 adds a custom autosummary class.rst template with :no-members: on the stub autoclass, and a function.rst template with :no-index: on the stub autofunction. All 50 class stubs and 48 function stubs are auto-regenerated on the next clean build (autosummary_generate_overwrite=True default). Architecture: - Class cross-refs resolve to stub URL (per PR #407 page-level :no-index:) - Member cross-refs resolve to page-level URL (only place still registered) - Function cross-refs resolve to page-level URL (stub now :no-index:) - Page-level pages keep current full inline member rendering (UX preserved) - Stubs become "header + autosummary tables linking to page-level" Also addresses 18 non-duplicate residuals so make html now warns 0: - 4 footnote-not-referenced + 1 def-list blank-line in source docstrings - 4 :doc: refs to .md files (myst-parser absent) converted to literal text - 4 broken markdown links to REGISTRY.html in T18/T19 notebooks - 3 orphan benchmarks.rst citations removed - T19 + T20 added to Tutorials: Business Applications toctree - 2 stale autosummary entries fixed (DiDResults.ci, CallawaySantAnnaResults.aggregate) - 3 page-vs-page function dups (chaisemartin_dhaultfoeuille, stacked_did, trop) resolved by adding :no-index: to page-level autofunction directives and switching 3 path-qualified autoclass directives to top-level paths Result: Sphinx make html went 2,170 (PR #407 start) -> 0 warnings (PR 2.5 end). This unblocks PR 3 (queued: -W enforcement in docs-tests.yml). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-05-10T00:52:06Z

✅ Looks good

Executive Summary

No estimator behavior, math, weighting, variance/SE, identification, or default behavior changes were introduced.
Methodology-impacting code changes are docstring-only and align with the Methodology Registry for SyntheticDiD and TripleDifference.
Autosummary class/function stub changes are broadly consistent and address the tracked Sphinx duplicate-warning tech debt.
I found one P2 documentation/inventory regression: three shadowed convenience functions are now unindexed everywhere.
No secrets found in the non-notebook diff. Full Sphinx build was not run; local import failed because numpy is not installed in this review environment.

Methodology

No P0/P1 findings.

Affected methods checked:

SyntheticDiDResults.in_time_placebo: docstring formatting only at diff_diff/results.py:L1269-L1281; no estimator logic changed. Registry documents the in-time placebo reuse of the fit snapshot normalization at docs/methodology/REGISTRY.md:L1660-L1666.
TripleDifference: reference footnote wiring only at diff_diff/triple_diff.py:L356-L364 and diff_diff/triple_diff.py:L474-L481; no estimator logic changed. Registry primary source remains Ortiz-Villavicencio & Sant'Anna at docs/methodology/REGISTRY.md:L1696-L1725.

Code Quality

No P0/P1/P2 findings.

Pattern-wide inference grep found no new inline inference computation in the changed Python hunks. Existing project-wide hits are outside this PR’s changed code.

Performance

No findings.

The PR is documentation/template focused and does not alter runtime estimator paths.

Maintainability

No P0/P1 findings.

P3 — Informational: tracked docs tech debt addressed

Impact: The autosummary refresh and custom templates directly address existing Sphinx duplicate/stale-stub debt already tracked in TODO.md:L134-L139.

Concrete fix: No action required.

Tech Debt

No untracked blocker-level tech debt found.

The remaining limitation around documentation inventory below is not statistical correctness and does not block approval.

Security

No findings.

I grepped the non-notebook diff for common secret patterns and did not find credentials or private keys.

Documentation/Tests

P2 — Shadowed convenience functions lose Python-domain targets

Impact: The three page-level convenience functions that shadow module names are now documented with :no-index:: diff_diff.chaisemartin_dhaultfoeuille at docs/api/chaisemartin_dhaultfoeuille.rst:L159-L160, diff_diff.stacked_did at docs/api/stacked_did.rst:L70-L71, and diff_diff.trop at docs/api/trop.rst:L76-L77. Since these functions are not also present as indexed autosummary stubs, :py:func: cross-references and intersphinx inventory entries for those public convenience functions will not have a function target.

Concrete fix: Keep exactly one indexed target for each shadowed function. A clean shape is to replace the page .. module:: diff_diff.<name> directive with .. currentmodule:: diff_diff.<name> where needed, then remove :no-index: from the .. autofunction:: diff_diff.<name> directive so the public function remains registered without colliding with the module target.

P3 — Validation not locally reproduced

Impact: I did not run make html; this environment is missing runtime docs dependencies (import diff_diff fails on missing numpy). Static audits covered the diff, sibling surfaces, no-index/no-members patterns, workflow-path changes, and secret patterns.

Concrete fix: Rely on CI’s Sphinx build for final confirmation, or run make html in an environment with the docs/runtime dependencies installed.

Reviewer flagged that page-level :no-index: on .. autofunction:: (added in initial PR 2.5 commit to resolve 3 page-vs-page duplicates) made diff_diff.{trop,stacked_did,chaisemartin_dhaultfoeuille} convenience functions absent from the py:function inventory entirely - so :py:func: cross-refs and intersphinx pointers to those public functions wouldn't resolve. Empirical investigation surfaced the actual conflict source: each of the 3 files had a top-level ``.. module:: diff_diff.<name>`` directive that registered the module FQN, colliding with the function FQN of the same name. Removing the .. module:: directive lets .. autofunction:: register the function as the canonical (and only) :py:function entry. The .. autoclass:: directives on those pages were already converted to top-level paths (diff_diff.X form) in the initial PR 2.5 commit and don't depend on the implicit module context. Verified: make html shows 0 ERRORs / 0 WARNINGs / 0 duplicates; all 3 convenience functions present in objects.inv as py:function entries. Doc-snippet test parity preserved (111 passed, 4 skipped). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

igerber · 2026-05-10T14:07:53Z

/ai-review

github-actions · 2026-05-10T14:11:41Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 477c1c19cc38017481f832d2af5d7c2a022a977f

✅ Looks good

Executive Summary

Previous P2 finding is resolved: diff_diff.chaisemartin_dhaultfoeuille, diff_diff.stacked_did, and diff_diff.trop now have indexed page-level autofunction targets.
No estimator math, weighting, SE/variance, assumptions, defaults, or runtime behavior changed.
Methodology-touching edits are docstring/reference formatting only and match docs/methodology/REGISTRY.md.
Required audits completed: sibling-surface mirror, pattern-wide inference grep, reciprocal indexed-target check, and workflow-path audit. No workflow path changes were in scope.
One P2 documentation inventory issue remains: three plotting helper stubs are now :no-index: without a page-level indexed replacement.

Methodology

No P0/P1 findings.

Severity: P3-informational
Impact: SyntheticDiDResults.in_time_placebo changed only docstring spacing at diff_diff/results.py:L1271-L1281. Registry behavior for normalization, regularization reuse, and infeasibility-only NaN handling matches docs/methodology/REGISTRY.md:L1660-L1669.

Concrete fix: No action required.

Severity: P3-informational
Impact: TripleDifference changed only docstring footnote wiring and adds the Gruber reference at diff_diff/triple_diff.py:L356-L364 and diff_diff/triple_diff.py:L474-L481. Registry still identifies Ortiz-Villavicencio & Sant'Anna as the primary implementation source at docs/methodology/REGISTRY.md:L1696-L1705.

Concrete fix: No action required.

Code Quality

No P0/P1/P2 findings.

Pattern-wide grep for inline inference anti-patterns (t_stat = ... / se, if se > 0 else 0.0, compute_p_value, compute_confidence_interval) found no new affected Python hunks. The only Python changes are docstrings.

Performance

No findings.

The PR is documentation and Sphinx-template focused; no runtime estimator path changes were introduced.

Maintainability

No P0/P1 findings.

Severity: P3-informational
Impact: The autosummary template/stub refresh addresses tracked Sphinx duplicate/stale-stub debt in TODO.md:L134-L139. Static checks found all generated class stubs have :no-members:, all generated function stubs have :no-index:, and no generated stub still has .. automethod:: __init__.

Concrete fix: No action required.

Tech Debt

No unmitigated blocker-level tech debt found.

The docs-warning cleanup is already covered by TODO.md:L134-L139. The remaining item below is documentation inventory, not statistical correctness.

Security

No findings.

Secret-pattern grep over changed non-notebook source/docs found no obvious credentials, private keys, or tokens.

Documentation/Tests

Severity: P2
Impact: Three public plotting helpers now have no indexed Python-domain target. Their autosummary stubs are explicitly :no-index: at docs/api/_autosummary/diff_diff.plot_bacon.rst:L6-L7, docs/api/_autosummary/diff_diff.plot_power_curve.rst:L6-L7, and docs/api/_autosummary/diff_diff.plot_pretrends_power.rst:L6-L7, but the reciprocal page-level sweep found no corresponding indexed .. autofunction:: for those names. Nearby sibling pages document other plotting/power helpers but omit these targets: docs/api/visualization.rst:L8-L80, docs/api/power.rst:L108-L148, and docs/api/pretrends.rst:L89-L101.

Concrete fix: Add indexed page-level directives for the three functions, likely diff_diff.plot_bacon in docs/api/visualization.rst, diff_diff.plot_power_curve in docs/api/power.rst, and diff_diff.plot_pretrends_power in docs/api/pretrends.rst; or selectively omit :no-index: from those three stubs if they are intended to be the canonical targets.

Severity: P3-informational
Impact: I did not run make html; the review environment is read-only, and Sphinx builds write output. Static RST/template/indexing audits were completed instead.

Concrete fix: Rely on CI’s Sphinx build, or run make html in a writable docs environment.

R2 reviewer flagged 3 plotting helpers (plot_bacon, plot_power_curve, plot_pretrends_power) as having :no-index: stubs without page-level indexed replacements. Added page-level .. autofunction:: directives in visualization.rst, power.rst, and pretrends.rst. Proactive sweep (per feedback_ci_reviewer_pattern_checks) surfaced 7 sibling dataset loaders (load_card_krueger, load_castle_doctrine, load_divorce_laws, load_mpdta, load_dataset, list_datasets, clear_cache) with the same class of issue: docs/api/datasets.rst used bare names under .. module:: diff_diff.datasets, registering them at diff_diff.datasets.X instead of the canonical diff_diff.X export FQN. Removed the .. module:: directive and switched to fully-qualified diff_diff.X names so they appear at the same FQN as their autosummary listings. Verified: full coverage sweep finds 51/51 functions and 59/59 classes in py:function / py:class inventory under their canonical diff_diff.X FQN. make html shows 0 ERRORs / 0 WARNINGs / 0 duplicates. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

igerber · 2026-05-10T14:55:04Z

/ai-review

github-actions · 2026-05-10T15:00:16Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 0062778c0d6ce44d7e8dcec03fdd6eada4f609d8

✅ Looks good

Executive Summary

No estimator math, weighting, variance/SE, assumptions, or runtime defaults changed.
Methodology-touching hunks are docstring/reference formatting only and match docs/methodology/REGISTRY.md.
Previous P2 plotting-helper issue is resolved: plot_bacon, plot_power_curve, and plot_pretrends_power now have page-level indexed autofunction targets.
Required static audits passed: sibling-surface mirror, no-index function reciprocal targets, autosummary pattern sweep, inference anti-pattern grep, and workflow-scope check.
Only residual item is P3: TODO.md still appears to list Sphinx docs debt this PR addresses.

Methodology

Severity: P3-informational
Impact: SyntheticDiDResults.in_time_placebo changed docstring spacing only at diff_diff/results.py:L1271-L1281. The documented “infeasibility-only NaN” behavior matches docs/methodology/REGISTRY.md:L1667-L1669.
Concrete fix: No action required.

Severity: P3-informational
Impact: TripleDifference only wires existing references into the docstring at diff_diff/triple_diff.py:L356-L364 and diff_diff/triple_diff.py:L474-L481. The primary implementation source and assumption checks remain aligned with docs/methodology/REGISTRY.md:L1696-L1705.
Concrete fix: No action required.

Code Quality

No P0/P1/P2 findings.

Pattern-wide grep for inline inference anti-patterns found no new affected Python hunks. The only Python changes are docstrings.

Performance

No findings.

The PR is documentation/Sphinx infrastructure only; no runtime estimator path changes were introduced.

Maintainability

No P0/P1/P2 findings.

Static mirror checks passed: all class stubs have :no-members:, all function stubs have :no-index:, and no generated stub still has .. automethod:: __init__. The reciprocal function-target audit found every no-index function stub has a page-level autofunction target.

Workflow transitive-dependency audit: no workflow or pytest-selection changes were in the supplied diff. Existing docs workflow paths already include docs, package code, tests/test_doc_snippets.py, tests/conftest.py, pyproject.toml, and the workflow itself at .github/workflows/docs-tests.yml:L6-L26.

Tech Debt

Severity: P3-informational
Impact: TODO.md:L134-L139 still lists Sphinx duplicate/stale-autosummary/autodoc debt that this PR appears to address. This is not a correctness issue, but it may mislead future review/planning.
Concrete fix: After CI confirms the Sphinx build is warning-free, update or remove the resolved TODO rows.

Security

No findings.

Secret-pattern scan over changed Python/RST/Markdown surfaces, excluding the notebook carve-out paths, found no obvious credentials or private keys.

Documentation/Tests

No P0/P1/P2 findings.

Severity: P3-informational
Impact: Prior plotting-helper inventory gap is resolved at docs/api/visualization.rst:L174-L179, docs/api/power.rst:L150-L155, and docs/api/pretrends.rst:L103-L108.
Concrete fix: No action required.

Severity: P3-informational
Impact: I could not run the doc-snippet tests because pytest is not installed in this review environment. I also did not run a full Sphinx build.
Concrete fix: Rely on CI’s docs build/doc-snippet job for executable validation.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs: custom autosummary template + residual triage (PR 2.5)#410

docs: custom autosummary template + residual triage (PR 2.5)#410
igerber wants to merge 3 commits intomainfrom
docs/sphinx-member-duplicate-fix

igerber commented May 10, 2026

Uh oh!

github-actions Bot commented May 10, 2026

Uh oh!

igerber commented May 10, 2026

Uh oh!

github-actions Bot commented May 10, 2026

Uh oh!

igerber commented May 10, 2026

Uh oh!

github-actions Bot commented May 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

igerber commented May 10, 2026

Summary

Architecture

Spike-before-bulk discipline

Constructor visibility note

Residual triage (Cat A–E)

Page-vs-page function fix (Cat F surprise)

Methodology references

Validation

Security / privacy

Uh oh!

github-actions Bot commented May 10, 2026

✅ Looks good

Executive Summary

Methodology

Code Quality

Performance

Maintainability

Tech Debt

Security

Documentation/Tests

Uh oh!

igerber commented May 10, 2026

Uh oh!

github-actions Bot commented May 10, 2026

✅ Looks good

Executive Summary

Methodology

Code Quality

Performance

Maintainability

Tech Debt

Security

Documentation/Tests

Uh oh!

igerber commented May 10, 2026

Uh oh!

github-actions Bot commented May 10, 2026

✅ Looks good

Executive Summary

Methodology

Code Quality

Performance

Maintainability

Tech Debt

Security

Documentation/Tests

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant