Skip to content

docs: custom autosummary template + residual triage (PR 2.5)#410

Open
igerber wants to merge 3 commits intomainfrom
docs/sphinx-member-duplicate-fix
Open

docs: custom autosummary template + residual triage (PR 2.5)#410
igerber wants to merge 3 commits intomainfrom
docs/sphinx-member-duplicate-fix

Conversation

@igerber
Copy link
Copy Markdown
Owner

@igerber igerber commented May 10, 2026

Summary

Closes the Sphinx WARNING gap PR #407 left. PR #407 dropped warnings from 2,170 to 635 (71%) by adding `:no-index:` to all 78 page-level `.. autoclass::` directives. The empirical spike on `bacon.rst` revealed that `:no-index:` on autoclass suppresses CLASS-level cross-reference registration only — it doesn't propagate to autodoc-generated members when `:members:` is implicit (via `autodoc_default_options["members"] = True` in `docs/conf.py:39-44`). 614 of those 635 residuals were member-level duplicates.

PR 2.5 closes both that gap and the 21 non-duplicate residuals so PR 3 (queued: `-W` enforcement in `docs-tests.yml`) can land cleanly.

Result: `make html` is now 0 ERRORs / 0 WARNINGs (down from 2,170 at PR #407 start; 635 at PR #407 merge).

Architecture

User-decided this session: custom autosummary class template that adds `:no-members:` to the autoclass in stubs (Option A). Effect:

  • Stubs: header + autosummary tables linking to page-level (no inline member docs)
  • Page-level pages: keep current full inline member rendering (UX preserved)
  • Class cross-refs `:py:class:DiDResults` → stub URL (per PR docs: add :no-index: to page-level autoclass directives (PR 2 of 2) #407 page-level `:no-index:`)
  • Member cross-refs `:py:meth:DiDResults.summary` → page-level URL (only place still registered)
  • Function cross-refs `:py:func:diff_diff.compute_power` → page-level URL (stub now `:no-index:` via custom function.rst template)

Sphinx's `autosummary_generate_overwrite=True` default means a single clean build regenerates all 50 class stubs + 48 function stubs from the new templates — no hand-edit per stub.

Spike-before-bulk discipline

Per `feedback_spike_before_bulk_mechanical_edit` (lesson saved from PR #407): the plan included an explicit empirical spike of just the class.rst template. The spike caught that 41 of 44 remaining residuals were FUNCTION duplicates (not class), surfacing the need for a function.rst template too. The remaining 3 page-vs-page were pinned to specific module-name conflicts (`diff_diff.chaisemartin_dhaultfoeuille`, `diff_diff.stacked_did`, `diff_diff.trop` — function names that shadow module names) and resolved with targeted page-level `:no-index:` on the autofunction. Without the spike, the bulk pass would have committed to a 1-template architecture that left ~44 function dups unhandled.

Constructor visibility note

The new class template removes the explicit `.. automethod:: init` line from stubs (would re-register `init` and defeat the `:no-members:` suppression). The constructor signature still appears on page-level via `autodoc_class_signature = "separated"` (`docs/conf.py:46`). Stubs lose the per-class `init` rendering; page-level keeps it.

Residual triage (Cat A–E)

Five categories of non-duplicate warnings, all fixed:

  • Cat A — Source docstring (5): `SyntheticDiDResults.in_time_placebo:36` blank-line-before-prose; 4 unreferenced footnotes in `TripleDifference` (added `[1]`/`[2]` refs in narrative)
  • Cat B — Cross-refs (6): 4 `:doc:` refs to `.md` files that Sphinx can't resolve without `myst-parser` → converted to literal text per existing `docs/choosing_estimator.rst:302` pattern; 2 broken markdown links in T18/T19 notebooks (4 in T19 + 2 broken-rendered REGISTRY.html refs in T18 also fixed at the same time)
  • Cat C — Orphan citations (3): removed `[CS2021]`, `[AHIW2021]`, `[RR2023]` from `docs/benchmarks.rst` (grep-confirmed safe — no narrative references)
  • Cat D — Toctree (2): T19 + T20 added to `Tutorials: Business Applications` toctree at `docs/index.rst:80-83`
  • Cat E — Autosummary entry typos (2): `~DiDResults.ci` → `~DiDResults.conf_int`; removed nonexistent `~CallawaySantAnnaResults.aggregate`

Plan-reviewer found that the original Cat E plan (suggested `suppress_warnings = ["autosummary"]` based on "Rust backend not built") was wrong: Sphinx emits autosummary import-failure warnings without `type=`, so the suppress filter is a no-op. The actual fixes were RST typos.

Page-vs-page function fix (Cat F surprise)

Spike rebuild surfaced 3 final duplicates from `chaisemartin_dhaultfoeuille`, `stacked_did`, `trop` — function names that shadow their module names. The conflict was Python's same-namespace ambiguity: `from .X import X` shadows the module, but Sphinx still registers both. Fix: `:no-index:` on the page-level autofunction (3 sites). Also switched 3 path-qualified autoclass directives to top-level paths (`diff_diff.X.Y` → `diff_diff.Y`) for consistency with project convention (PR #406 export pattern).

Methodology references

  • N/A - no methodology changes; documentation infrastructure only.

Validation

  • Tests added/updated: No test changes; doc-snippet test parity preserved (111 passed, 4 skipped — same as PR docs: add :no-index: to page-level autoclass directives (PR 2 of 2) #407 baseline).
  • Backtest / simulation / notebook evidence: Spike rebuild iterations: baseline 635 warnings → 62 (class template only) → 21 (+ function template) → 18 (+ page-vs-page page-level autofunction `:no-index:`) → 1 (+ Cat A–E triage) → 0 (T18 link fix to `.rst` extension).
  • HTML spot-checks (verified):
    • Page-level `results.html`: 281 (PR docs: add :no-index: to page-level autoclass directives (PR 2 of 2) #407) → 286 (PR 2.5) `<dt` count — slight increase from napoleon docstring rendering tweaks; full member docs preserved
    • Stub `diff_diff.DiDResults.html`: class header + autosummary tables only; no inline member `
      ` for class docs (the 21 remaining are napoleon-rendered Attributes section, NOT registered)
    • T19 + T20 in tutorials nav: confirmed
    • 50 class stubs include `:no-members:`; 48 function stubs include `:no-index:`

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

Closes the gap PR #407 left: 614 of 635 PR #407-residual warnings were
member-level duplicate-object descriptions because :no-index: on page-level
autoclass directives doesn't propagate to autodoc-generated members. PR 2.5
adds a custom autosummary class.rst template with :no-members: on the stub
autoclass, and a function.rst template with :no-index: on the stub
autofunction. All 50 class stubs and 48 function stubs are auto-regenerated
on the next clean build (autosummary_generate_overwrite=True default).

Architecture:
- Class cross-refs resolve to stub URL (per PR #407 page-level :no-index:)
- Member cross-refs resolve to page-level URL (only place still registered)
- Function cross-refs resolve to page-level URL (stub now :no-index:)
- Page-level pages keep current full inline member rendering (UX preserved)
- Stubs become "header + autosummary tables linking to page-level"

Also addresses 18 non-duplicate residuals so make html now warns 0:
- 4 footnote-not-referenced + 1 def-list blank-line in source docstrings
- 4 :doc: refs to .md files (myst-parser absent) converted to literal text
- 4 broken markdown links to REGISTRY.html in T18/T19 notebooks
- 3 orphan benchmarks.rst citations removed
- T19 + T20 added to Tutorials: Business Applications toctree
- 2 stale autosummary entries fixed (DiDResults.ci, CallawaySantAnnaResults.aggregate)
- 3 page-vs-page function dups (chaisemartin_dhaultfoeuille, stacked_did, trop)
  resolved by adding :no-index: to page-level autofunction directives and
  switching 3 path-qualified autoclass directives to top-level paths

Result: Sphinx make html went 2,170 (PR #407 start) -> 0 warnings (PR 2.5 end).
This unblocks PR 3 (queued: -W enforcement in docs-tests.yml).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

✅ Looks good

Executive Summary

  • No estimator behavior, math, weighting, variance/SE, identification, or default behavior changes were introduced.
  • Methodology-impacting code changes are docstring-only and align with the Methodology Registry for SyntheticDiD and TripleDifference.
  • Autosummary class/function stub changes are broadly consistent and address the tracked Sphinx duplicate-warning tech debt.
  • I found one P2 documentation/inventory regression: three shadowed convenience functions are now unindexed everywhere.
  • No secrets found in the non-notebook diff. Full Sphinx build was not run; local import failed because numpy is not installed in this review environment.

Methodology

No P0/P1 findings.

Affected methods checked:

  • SyntheticDiDResults.in_time_placebo: docstring formatting only at diff_diff/results.py:L1269-L1281; no estimator logic changed. Registry documents the in-time placebo reuse of the fit snapshot normalization at docs/methodology/REGISTRY.md:L1660-L1666.
  • TripleDifference: reference footnote wiring only at diff_diff/triple_diff.py:L356-L364 and diff_diff/triple_diff.py:L474-L481; no estimator logic changed. Registry primary source remains Ortiz-Villavicencio & Sant'Anna at docs/methodology/REGISTRY.md:L1696-L1725.

Code Quality

No P0/P1/P2 findings.

Pattern-wide inference grep found no new inline inference computation in the changed Python hunks. Existing project-wide hits are outside this PR’s changed code.

Performance

No findings.

The PR is documentation/template focused and does not alter runtime estimator paths.

Maintainability

No P0/P1 findings.

P3 — Informational: tracked docs tech debt addressed

Impact: The autosummary refresh and custom templates directly address existing Sphinx duplicate/stale-stub debt already tracked in TODO.md:L134-L139.

Concrete fix: No action required.

Tech Debt

No untracked blocker-level tech debt found.

The remaining limitation around documentation inventory below is not statistical correctness and does not block approval.

Security

No findings.

I grepped the non-notebook diff for common secret patterns and did not find credentials or private keys.

Documentation/Tests

P2 — Shadowed convenience functions lose Python-domain targets

Impact: The three page-level convenience functions that shadow module names are now documented with :no-index:: diff_diff.chaisemartin_dhaultfoeuille at docs/api/chaisemartin_dhaultfoeuille.rst:L159-L160, diff_diff.stacked_did at docs/api/stacked_did.rst:L70-L71, and diff_diff.trop at docs/api/trop.rst:L76-L77. Since these functions are not also present as indexed autosummary stubs, :py:func: cross-references and intersphinx inventory entries for those public convenience functions will not have a function target.

Concrete fix: Keep exactly one indexed target for each shadowed function. A clean shape is to replace the page .. module:: diff_diff.<name> directive with .. currentmodule:: diff_diff.<name> where needed, then remove :no-index: from the .. autofunction:: diff_diff.<name> directive so the public function remains registered without colliding with the module target.

P3 — Validation not locally reproduced

Impact: I did not run make html; this environment is missing runtime docs dependencies (import diff_diff fails on missing numpy). Static audits covered the diff, sibling surfaces, no-index/no-members patterns, workflow-path changes, and secret patterns.

Concrete fix: Rely on CI’s Sphinx build for final confirmation, or run make html in an environment with the docs/runtime dependencies installed.

Reviewer flagged that page-level :no-index: on .. autofunction:: (added
in initial PR 2.5 commit to resolve 3 page-vs-page duplicates) made
diff_diff.{trop,stacked_did,chaisemartin_dhaultfoeuille} convenience
functions absent from the py:function inventory entirely - so :py:func:
cross-refs and intersphinx pointers to those public functions wouldn't
resolve.

Empirical investigation surfaced the actual conflict source: each of the 3
files had a top-level ``.. module:: diff_diff.<name>`` directive that
registered the module FQN, colliding with the function FQN of the same
name. Removing the .. module:: directive lets .. autofunction:: register
the function as the canonical (and only) :py:function entry. The
.. autoclass:: directives on those pages were already converted to
top-level paths (diff_diff.X form) in the initial PR 2.5 commit and
don't depend on the implicit module context.

Verified: make html shows 0 ERRORs / 0 WARNINGs / 0 duplicates; all 3
convenience functions present in objects.inv as py:function entries.
Doc-snippet test parity preserved (111 passed, 4 skipped).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented May 10, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 477c1c19cc38017481f832d2af5d7c2a022a977f


✅ Looks good

Executive Summary

  • Previous P2 finding is resolved: diff_diff.chaisemartin_dhaultfoeuille, diff_diff.stacked_did, and diff_diff.trop now have indexed page-level autofunction targets.
  • No estimator math, weighting, SE/variance, assumptions, defaults, or runtime behavior changed.
  • Methodology-touching edits are docstring/reference formatting only and match docs/methodology/REGISTRY.md.
  • Required audits completed: sibling-surface mirror, pattern-wide inference grep, reciprocal indexed-target check, and workflow-path audit. No workflow path changes were in scope.
  • One P2 documentation inventory issue remains: three plotting helper stubs are now :no-index: without a page-level indexed replacement.

Methodology

No P0/P1 findings.

Severity: P3-informational
Impact: SyntheticDiDResults.in_time_placebo changed only docstring spacing at diff_diff/results.py:L1271-L1281. Registry behavior for normalization, regularization reuse, and infeasibility-only NaN handling matches docs/methodology/REGISTRY.md:L1660-L1669.

Concrete fix: No action required.

Severity: P3-informational
Impact: TripleDifference changed only docstring footnote wiring and adds the Gruber reference at diff_diff/triple_diff.py:L356-L364 and diff_diff/triple_diff.py:L474-L481. Registry still identifies Ortiz-Villavicencio & Sant'Anna as the primary implementation source at docs/methodology/REGISTRY.md:L1696-L1705.

Concrete fix: No action required.

Code Quality

No P0/P1/P2 findings.

Pattern-wide grep for inline inference anti-patterns (t_stat = ... / se, if se > 0 else 0.0, compute_p_value, compute_confidence_interval) found no new affected Python hunks. The only Python changes are docstrings.

Performance

No findings.

The PR is documentation and Sphinx-template focused; no runtime estimator path changes were introduced.

Maintainability

No P0/P1 findings.

Severity: P3-informational
Impact: The autosummary template/stub refresh addresses tracked Sphinx duplicate/stale-stub debt in TODO.md:L134-L139. Static checks found all generated class stubs have :no-members:, all generated function stubs have :no-index:, and no generated stub still has .. automethod:: __init__.

Concrete fix: No action required.

Tech Debt

No unmitigated blocker-level tech debt found.

The docs-warning cleanup is already covered by TODO.md:L134-L139. The remaining item below is documentation inventory, not statistical correctness.

Security

No findings.

Secret-pattern grep over changed non-notebook source/docs found no obvious credentials, private keys, or tokens.

Documentation/Tests

Severity: P2
Impact: Three public plotting helpers now have no indexed Python-domain target. Their autosummary stubs are explicitly :no-index: at docs/api/_autosummary/diff_diff.plot_bacon.rst:L6-L7, docs/api/_autosummary/diff_diff.plot_power_curve.rst:L6-L7, and docs/api/_autosummary/diff_diff.plot_pretrends_power.rst:L6-L7, but the reciprocal page-level sweep found no corresponding indexed .. autofunction:: for those names. Nearby sibling pages document other plotting/power helpers but omit these targets: docs/api/visualization.rst:L8-L80, docs/api/power.rst:L108-L148, and docs/api/pretrends.rst:L89-L101.

Concrete fix: Add indexed page-level directives for the three functions, likely diff_diff.plot_bacon in docs/api/visualization.rst, diff_diff.plot_power_curve in docs/api/power.rst, and diff_diff.plot_pretrends_power in docs/api/pretrends.rst; or selectively omit :no-index: from those three stubs if they are intended to be the canonical targets.

Severity: P3-informational
Impact: I did not run make html; the review environment is read-only, and Sphinx builds write output. Static RST/template/indexing audits were completed instead.

Concrete fix: Rely on CI’s Sphinx build, or run make html in a writable docs environment.

R2 reviewer flagged 3 plotting helpers (plot_bacon, plot_power_curve,
plot_pretrends_power) as having :no-index: stubs without page-level
indexed replacements. Added page-level .. autofunction:: directives in
visualization.rst, power.rst, and pretrends.rst.

Proactive sweep (per feedback_ci_reviewer_pattern_checks) surfaced 7
sibling dataset loaders (load_card_krueger, load_castle_doctrine,
load_divorce_laws, load_mpdta, load_dataset, list_datasets, clear_cache)
with the same class of issue: docs/api/datasets.rst used bare names
under .. module:: diff_diff.datasets, registering them at
diff_diff.datasets.X instead of the canonical diff_diff.X export FQN.
Removed the .. module:: directive and switched to fully-qualified
diff_diff.X names so they appear at the same FQN as their autosummary
listings.

Verified: full coverage sweep finds 51/51 functions and 59/59 classes
in py:function / py:class inventory under their canonical diff_diff.X
FQN. make html shows 0 ERRORs / 0 WARNINGs / 0 duplicates.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@igerber
Copy link
Copy Markdown
Owner Author

igerber commented May 10, 2026

/ai-review

@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 0062778c0d6ce44d7e8dcec03fdd6eada4f609d8


✅ Looks good

Executive Summary

  • No estimator math, weighting, variance/SE, assumptions, or runtime defaults changed.
  • Methodology-touching hunks are docstring/reference formatting only and match docs/methodology/REGISTRY.md.
  • Previous P2 plotting-helper issue is resolved: plot_bacon, plot_power_curve, and plot_pretrends_power now have page-level indexed autofunction targets.
  • Required static audits passed: sibling-surface mirror, no-index function reciprocal targets, autosummary pattern sweep, inference anti-pattern grep, and workflow-scope check.
  • Only residual item is P3: TODO.md still appears to list Sphinx docs debt this PR addresses.

Methodology

Severity: P3-informational
Impact: SyntheticDiDResults.in_time_placebo changed docstring spacing only at diff_diff/results.py:L1271-L1281. The documented “infeasibility-only NaN” behavior matches docs/methodology/REGISTRY.md:L1667-L1669.
Concrete fix: No action required.

Severity: P3-informational
Impact: TripleDifference only wires existing references into the docstring at diff_diff/triple_diff.py:L356-L364 and diff_diff/triple_diff.py:L474-L481. The primary implementation source and assumption checks remain aligned with docs/methodology/REGISTRY.md:L1696-L1705.
Concrete fix: No action required.

Code Quality

No P0/P1/P2 findings.

Pattern-wide grep for inline inference anti-patterns found no new affected Python hunks. The only Python changes are docstrings.

Performance

No findings.

The PR is documentation/Sphinx infrastructure only; no runtime estimator path changes were introduced.

Maintainability

No P0/P1/P2 findings.

Static mirror checks passed: all class stubs have :no-members:, all function stubs have :no-index:, and no generated stub still has .. automethod:: __init__. The reciprocal function-target audit found every no-index function stub has a page-level autofunction target.

Workflow transitive-dependency audit: no workflow or pytest-selection changes were in the supplied diff. Existing docs workflow paths already include docs, package code, tests/test_doc_snippets.py, tests/conftest.py, pyproject.toml, and the workflow itself at .github/workflows/docs-tests.yml:L6-L26.

Tech Debt

Severity: P3-informational
Impact: TODO.md:L134-L139 still lists Sphinx duplicate/stale-autosummary/autodoc debt that this PR appears to address. This is not a correctness issue, but it may mislead future review/planning.
Concrete fix: After CI confirms the Sphinx build is warning-free, update or remove the resolved TODO rows.

Security

No findings.

Secret-pattern scan over changed Python/RST/Markdown surfaces, excluding the notebook carve-out paths, found no obvious credentials or private keys.

Documentation/Tests

No P0/P1/P2 findings.

Severity: P3-informational
Impact: Prior plotting-helper inventory gap is resolved at docs/api/visualization.rst:L174-L179, docs/api/power.rst:L150-L155, and docs/api/pretrends.rst:L103-L108.
Concrete fix: No action required.

Severity: P3-informational
Impact: I could not run the doc-snippet tests because pytest is not installed in this review environment. I also did not run a full Sphinx build.
Concrete fix: Rely on CI’s docs build/doc-snippet job for executable validation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant