Skip to content

fix(transcript_mirror): swallow CancelledError in eager-flush done callback#943

Open
SuperMarioYL wants to merge 1 commit intoanthropics:mainfrom
SuperMarioYL:fix/transcript-mirror-batcher-cancelled-callback
Open

fix(transcript_mirror): swallow CancelledError in eager-flush done callback#943
SuperMarioYL wants to merge 1 commit intoanthropics:mainfrom
SuperMarioYL:fix/transcript-mirror-batcher-cancelled-callback

Conversation

@SuperMarioYL
Copy link
Copy Markdown

Summary

Fixes #930.

The eager-flush task in TranscriptMirrorBatcher.enqueue is registered with

self._flush_task.add_done_callback(lambda t: t.exception())

The intent is to retrieve the task's exception so asyncio doesn't warn about an unretrieved one, but on a cancelled task Task.exception() itself raises CancelledError (Python 3.8+). asyncio's callback machinery then logs

ERROR asyncio: Exception in callback ... <lambda>(<Task cancell...>)
  File ".../transcript_mirror_batcher.py", line 91, in <lambda>
    self._flush_task.add_done_callback(lambda t: t.exception())
                                                 ^^^^^^^^^^^^^
asyncio.exceptions.CancelledError

every time an in-flight eager-flush task is cancelled — i.e. on every event-loop teardown with pending eager flushes, and whenever Query.close() cancels child tasks while a drain is mid-flight. Functionality is unaffected (the inner _drain handles cancellation correctly via async with self._lock); the bug is pure log noise but it pollutes test output and production logs for users running with session_store_flush=\"eager\".

Fix

Replace the lambda with a small named _swallow_done_exception helper that skips cancelled tasks before calling .exception():

def _swallow_done_exception(task: asyncio.Task[None]) -> None:
    if task.cancelled():
        return
    task.exception()

Equivalent to the inline form lambda t: None if t.cancelled() else t.exception() but names the intent.

Test

Added TestTranscriptMirrorBatcher.test_eager_flush_cancellation_does_not_log_callback_error:

  • Builds a batcher with max_pending_entries=0 so any enqueue schedules an eager flush.
  • Uses a hanging store so the eager flush blocks indefinitely.
  • Cancels the in-flight flush task and yields the loop so the done callback fires inside the caplog.at_level("ERROR", logger="asyncio") block.
  • Asserts the asyncio logger recorded no Exception in callback records at ERROR level.

Verified the test fails on master (one CancelledError record) and passes with the fix.

Validation

  • python -m ruff check src/ tests/ — clean
  • python -m ruff format --check src/ tests/ — clean
  • python -m mypy src/ — clean (Success: no issues found in 24 source files)
  • python -m pytest tests/ — 746 passed, 5 skipped (full suite)

Notes

…llback

Replace the lambda passed to ``add_done_callback`` with a small named
function that skips cancelled tasks before calling ``Task.exception()``.

Why
---

In Python 3.8+, ``Task.exception()`` raises ``CancelledError`` when the
task was cancelled. The previous lambda did not handle that, so asyncio's
callback machinery logged

    ERROR asyncio: Exception in callback ... <lambda>(<Task cancell...>)
        File ".../transcript_mirror_batcher.py", line 91, in <lambda>
            self._flush_task.add_done_callback(lambda t: t.exception())
        asyncio.exceptions.CancelledError

every time an in-flight eager-flush task was cancelled — i.e. on every
event-loop teardown with pending eager flushes, and whenever
``Query.close()`` cancels child tasks while a drain is mid-flight.

Functionality is unaffected (the inner ``_drain`` already handles
cancellation correctly via the ``async with self._lock``). The fix is
purely about silencing the ``Exception in callback`` log noise that
otherwise pollutes test output and production logs whenever the SDK
shuts down with pending eager flushes.

Test
----

Added ``test_eager_flush_cancellation_does_not_log_callback_error``
which:

1. Builds a batcher with ``max_pending_entries=0`` so any enqueue
   schedules an eager flush.
2. Uses a hanging store so the eager flush blocks indefinitely.
3. Cancels the in-flight flush task and waits for the done-callback
   to fire.
4. Asserts that the ``asyncio`` logger recorded no
   ``Exception in callback`` records at ERROR level.

The test fails on master (one CancelledError record) and passes
with the fix.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: TranscriptMirrorBatcher.enqueue's add_done_callback lambda raises CancelledError when task is cancelled

1 participant