perf(redis_admin): drop clear-by-filter preview load; reuse chart numbers#909
perf(redis_admin): drop clear-by-filter preview load; reuse chart numbers#909thomasrockhu-codecov wants to merge 5 commits intomainfrom
Conversation
…bers Render the clear-by-filter confirmation page synchronously from the display hints the frequency chart already has (bucket count, bucket share, sampled-window total, queue depth) instead of doing a streaming LRANGE scan on the request thread. Operators confirm based on the identity-defining (queue, task, repoid, commit) filter, not on a precise count, so the approximation "will tombstone ~X (~Y% of queue at depth Z)" is sufficient for the human in the loop. The chunked background-job worker that ALL three actions (dry-run / keep-one / clear-all) fan out to still computes the exact match count as it walks, so the audit log entry stays precise. Why: the previous synchronous streaming-count walk held the gunicorn worker for many seconds per click on a 200-500k-deep queue, and a separate lazy-preview attempt (PR #907) still costs network + cache round-trips per render. The chart already has these numbers — we just needed to pass them through. Chart form (`_frequency_chart.html`): - Switched from POST to GET (the four `bucket_*` / `total_*` hints are display-only, non-secret, idempotent, bookmarkable; CSRF is unnecessary). - Added four hidden inputs: `bucket_count`, `bucket_pct`, `total_depth`, `total_visible`. Confirmation view (`clear_by_filter_view`): - GET path now does ZERO Redis I/O. Removed the queryset materialise + `_CLEAR_BY_FILTER_SAMPLE_SIZE` slice, the `streaming_celery_count` call, and the synthetic-target `_substitute_filter_any` wiring (the view never calls `celery_broker_clear` directly anymore so the synthetic target is moot — the chunked job builds its own filter from the URL params). - Added `_coerce_int` / `_coerce_float` helpers for graceful URL-hint parsing; missing / malformed hints fall back to "Approximate count not available" rather than 500. - All three POST actions (`dry_run`, `clear_keep_one`, `clear_all`) now uniformly fan out to `start_celery_broker_clear_job` and 302 to the progress page. Dry-run no longer keeps the synchronous shape — the chunked worker reports the exact count there too. Confirmation template (`clear_by_filter.html`): - Replaced the "Matched: N message(s)" header + sample-targets table with a static filter-rows fieldset (queue / task / repo-with-link / truncated-commit-with-tooltip) and an approximate-count callout. - Callout shape: "Will tombstone approximately X messages (~Y% of the queue at depth ~Z)" when all four hints are present, or "Approximate count not available — the chunked clear will compute the exact match count as it scans" otherwise. - Kept the typed-confirmation pattern and the three submit actions exactly as-is; removed the conditional rendering that hid `clear_keep_one` for sub-2 buckets (we no longer know the count on render). Migration note: the chart-form method change (POST → GET) means operators with stale tabs may submit one final POST after this deploys. The view continues to read scope params from POST when present so those POSTs still work.
`streaming_celery_count` was a thin dry-run wrapper over `_streaming_celery_clear` whose only caller was the synchronous preview path in `clear_by_filter_view`. With the preview now rendered from chart-supplied display hints (no Redis I/O at all), the helper is dead code: the chunked clear job calls `_streaming_celery_clear` directly with `dry_run=True` for its own count pass, and per-message clear paths use materialised `CeleryBrokerQueue` rows instead of a streaming triple. Also tightens the `_FilterWildcard` / `_substitute_filter_any` docstrings to reflect the single remaining call site (`_run_celery_broker_clear_job_body`).
Replaces the destructive-action assertions in `CeleryBrokerClearByFilterViewTest` with a focused contract suite that pins the new view's behaviour: * GET with chart hints renders the approximate-count callout and makes ZERO Redis calls (asserted via a `MagicMock` proxy on `redis_admin_conn.get_connection` whose `method_calls` must be empty after the request). * GET without hints renders "Approximate count not available", still no Redis calls. * GET with malformed hints (non-numeric, divide-by-zero) coerces to `None` via `_coerce_int` / `_coerce_float` and falls back to the "not available" branch instead of raising. * GET resolves the `core.Repository` admin link when `repoid` is set so the chart's repo cell and the confirmation page link round-trip to the same URL. * All three POST actions (`clear_all`, `clear_keep_one`, `dry_run`) return 302 to `…/clear-by-filter/job/<uuid>/` (the chunked background-job worker handles the actual clearing). * Typed-confirm gate still applies to both destructive actions. * Legacy `action=confirm` shim still 302s to the progress page. * Refusal of unscoped clears keeps the per-bucket flow non- overlapping with `clear_by_scope`. * Superuser gating preserved. End-to-end exercise of the chunked worker (queue actually drained, audit-log shape, `keep_one` survivor placement, etc.) lives in `test_celery_broker_clear_job.py`. The destructive-mutation tests that used to live on `CeleryBrokerClearByFilterViewTest` were racing the chunked worker thread anyway -- this PR resolves that race by shifting their coverage to the dedicated chunked-job test file (where the existing `_wait_for_job` helper waits for worker completion). Also drops the two `streaming_celery_count` tests (`test_streaming_celery_count_returns_full_queue_match_count`, `test_streaming_celery_count_wildcard_task_name_matches_any_task`) because the helper itself is gone -- the chunked-job worker calls `_streaming_celery_clear` directly, and the underlying wildcard semantic is already covered by `test_celery_broker_clear_synthetic_target_with_wildcard_task_name_drains_all_matches`. `test_clear_by_filter_view_dry_run_preview_unchanged` in `test_celery_broker_clear_job.py` updated to `test_clear_by_filter_view_dry_run_redirects_to_progress_page`: asserts the new 302-to-progress shape and that `start_celery_broker_clear_job` is invoked with `dry_run=True`.
Updates the URLs table row for `clear-by-filter/` and the "chart-driven targeted clear" section to reflect that the confirmation page now: * Renders fully synchronously with **zero Redis I/O** on GET. * Receives `bucket_count` / `bucket_pct` / `total_visible` / `total_depth` from the chart's per-row form, extrapolates `count ≈ bucket_count / total_visible * total_depth`, and surfaces a "approximately N (~Y% of queue at depth Z)" callout. * Falls back to "Approximate count not available" when the hints are absent or malformed (hand-crafted URL). * Spawns the chunked worker for **all three** POST actions (`dry_run` included) -- the destructive 200-render shape is gone; everything 302s to the progress page. Also amends the per-bucket "Clear queue…" bullet under "Frequency chart" to mention the new hint passthrough so the docs explain why the page renders instantly even when the queue is hundreds of thousands deep.
Two trivial whitespace nits surfaced by `ruff format --check`: the helper-block separator after `_coerce_float` and an inlined `reverse(...)` call in `clear_by_filter_view`. No semantic change.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 25d9949. Configure here.
| when match_count >= 2. Clears every match EXCEPT the | ||
| lowest-index one; "drop duplicate retries but leave one | ||
| running." | ||
| * `clear_all` — destructive (red text). Clears every match. |
There was a problem hiding this comment.
POST form drops chart hints on confirmation failure
Medium Severity
The POST form carries queue_name, task_name, repoid, and commitid as hidden inputs but omits the four chart-hint fields (bucket_count, bucket_pct, total_visible, total_depth). When a destructive POST fails typed confirmation, params is request.POST, so _coerce_int(params.get("bucket_count")) returns None for all four hints. The approx dict becomes None and the template falls back to "Approximate count not available" — even though the operator just saw the approximate count on the initial GET. The operator loses impact context at exactly the moment they need to re-confirm.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 25d9949. Configure here.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #909 +/- ##
=======================================
Coverage 91.89% 91.90%
=======================================
Files 1316 1316
Lines 50586 50576 -10
Branches 1625 1625
=======================================
- Hits 46485 46480 -5
+ Misses 3795 3790 -5
Partials 306 306
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
…t numbers) Conflict resolution notes: - services.py docstrings for `_substitute_filter_any` and the chunked clear job: combined #908's "dry-run vs live-clear must agree" framing with #909's more general "rule survives a future field/truthiness change" framing. Both are accurate for the post-merge call sites; the merged wording covers both modes. - test_celery_broker_queue.py: took #909's structure (no preview test class) wholesale because the preview surface is gone. #908's `_post_clear` thread-join helper is unnecessary in the view tests because the post-#909 view tests only assert on the 302 redirect URL, not on post-worker queue state. End-to-end drain assertions live in `test_celery_broker_clear_job.py`, which #908 already wired up with a thread-join helper that survives this merge.
|
Folded into #905; keeping this open as the per-feature review record. |
… maths Two CI failures on the consolidated branch HEAD: 1. `test_streaming_clear_verify_before_lset_skips_drifted_entry` was a left-over from PR #899 that asserted the verify-before- LSET guard skipped drifted slots. PR #908 deletes that guard in `_streaming_celery_clear` (the whole point of the restoration), so the test now sees `total_lset == 1` for a drifted slot and fails. The merge picked up #909's copy of the test file (it was the conflict-resolution baseline) which re-introduced the test. Apply #908's deletion verbatim and keep its replacement comment in place. 2. `test_typed_confirm_failure_preserves_chart_hint_callout` asserted `round(78500 / 20000 * 211052) == 828381`. The correct value is 828379 (the docstring arithmetic was off- by-two). The test logic and the production behaviour are both correct; the expected literal is what was wrong.


Why
The lazy preview added in #907 still costs seconds because the
count requires a streaming
LRANGEscan of the full queue —the very work the preview was supposed to defer. On the
deepest queues (200k–500k+ messages) the operator-facing
confirmation page can take 5–15 seconds to first byte.
But the operator already saw the bucket count and queue depth
on the chart that linked them here. The chart's per-row "Clear
queue" button is the only way to land on this page in normal
use, and the chart computed both numbers to render its own
row. So the confirmation page can extrapolate an "approximately
N messages (~Y% of queue)" callout from those passed-through
hints — no further Redis call required, and no async polling /
skeleton placeholder either.
What
_frequency_chart.html): per-row "Clearqueue" form now carries
bucket_count,bucket_pct,total_visible, andtotal_depthas hidden inputs. Methodis
GETsince these are non-secret display hints.clear_by_filter_viewGET: rip out all lazy-previewwiring (no
LLEN, nostreaming_celery_count, no previewcache, no background job, no sample materialization). Read
the four hint params via new
_coerce_int/_coerce_floathelpers (so a hand-edited URL never 500s) and render the
confirmation page synchronously.
clear_by_filter_viewPOST: all three actions(
dry_run,clear_keep_one,clear_all) now uniformlyspawn the chunked background-job worker and 302 to the
progress page. The synchronous-
dry_run-rerender shape isgone — the chunked worker walks the queue and computes the
exact match count as it scans, regardless of whether the
pass tombstones anything. Audit log lands at job completion
under
mode=chunked-dry-runfor the dry-run case.clear_by_filter.html: skeleton placeholders, JSinclude, and CSS include all deleted. Replaced with a static
filter fieldset (queue, task, repo with admin link, commit
with full-SHA tooltip) and an "Approximate impact"
fieldset showing either the extrapolated count or
"Approximate count not available — the chunked clear will
compute the exact match count as it scans." Typed-
confirmation pattern + three submits unchanged.
services.streaming_celery_count: deleted (only callerwas the deleted preview surface; chunked worker uses
_streaming_celery_cleardirectly withdry_run=True).clear-by-filter/URLs row + "chart-driventargeted clear" section rewritten to document the chart-hint
passthrough, the synchronous-no-Redis-I/O GET, and the
uniform 302-to-progress shape across all three actions.
(preview/skip-count routes, JS, CSS, cache helpers, the two
REDIS_ADMIN_CLEAR_BY_FILTER_PREVIEW_*settings) was nevermerged to
main, so there was nothing on the file system todelete beyond
streaming_celery_count. If perf(redis_admin): lazy clear-by-filter preview (count off the request thread) #907 lands first,this PR will need a follow-up to drop those leftovers; until
then the surface is already clean.
Approximate-count display
Rendered inline (no JS) when all four hints are present and
total_visible > 0:Falls back to:
…when any hint is missing / non-numeric / divides by zero.
Tests
CeleryBrokerClearByFilterViewTestrewritten as a contractsuite over the new view shape:
with the right numbers and makes zero Redis calls
(asserted via a
MagicMockproxy onredis_admin_conn.get_connectionwhosemethod_callsmustbe empty after the request).
total_visiblefalls back to "Approximate count notavailable", still no Redis calls.
core.Repositoryadmin link whenrepoidis set.
clear_all/clear_keep_one/dry_run(with validtyped-confirm where required) each 302 to
…/clear-by-filter/job/<uuid>/;start_celery_broker_clear_jobis invoked once with the right
dry_run/keep_oneflags.200 + error message.
action=confirmform shape still aliases through tothe chunked-job path.
test_clear_by_filter_view_dry_run_preview_unchangedintest_celery_broker_clear_job.pyrenamed to…_dry_run_redirects_to_progress_pageand updated to assertthe new shape.
End-to-end exercise of the chunked worker (queue actually
drained, audit-log shape,
keep_onesurvivor placement) livesin
test_celery_broker_clear_job.py(existing).Migration note
This PR removes URL routes that #907 would have introduced
(
clear-by-filter/preview/,clear-by-filter/skip-count/)but those routes were never merged to
main, so there'snothing externally linkable to break. The
clear-by-filter/route itself is unchanged.
If #907 merges first, rebase this PR on top of it; the
leftover preview helpers / settings / static files / preview
test file that #907 introduces should then be deleted as part
of the rebase.
Test plan
uv run ruff check redis_admin/uv run ruff format --check redis_admin/pytest -q apps/codecov-api/redis_admin/tests/(294passed, run inside the api container which provides
Postgres for the Django ORM tests)
drill-down on a deep queue, click "Clear queue…" on a
bucket, confirm the page renders sub-second with the
approximate-count callout.
Made with Cursor
Note
Medium Risk
Changes the celery broker admin clear-by-filter flow and how destructive/dry-run actions are executed (now always asynchronous), which could affect operator workflows and queue-clearing behavior if the new job-spawn path misfires. Risk is mitigated by preserving typed-confirmation gates, legacy
action=confirmcompatibility, and expanded tests around zero-Redis GET and redirects.Overview
Speeds up the celery broker
clear-by-filteroperator flow by making the confirmation page fully synchronous with zero Redis I/O, reusing count/depth hints passed from the frequency chart to show an approximate impact instead of doing a live scan.Updates the chart’s per-bucket Clear queue… action to
GETthe confirmation URL withbucket_count/bucket_pct/total_visible/total_depthhints, and simplifies the confirmation template to a static filter summary (including a repo admin link) plus the approximate-impact callout.Unifies
POSThandling so all three actions (dry_run,clear_keep_one,clear_all) now spawn the existing chunked background clear job and redirect to the progress page; removes the now-unusedstreaming_celery_counthelper and updates docs/tests accordingly.Reviewed by Cursor Bugbot for commit 25d9949. Bugbot is set up for automated code reviews on this repo. Configure here.