add db cleanup for PyPy #472
Open
mattip wants to merge 2 commits into python:main from
zzzeek
reviewed
May 10, 2026
    # drop rows created by the previous benchmark
    session.query(Person).delete(synchronize_session=False)
    session.query(Address).delete(synchronize_session=False)
    session.expunge_all()
I think it would be a more realistic test to use a new session for each loop here, like:

    for loops in range(loops):
        with DBSession() as session:
            # everything else here

That's cheaper than calling expunge_all() and makes it clear where we're starting the operation.
zzzeek
reviewed
May 10, 2026
    total_dt = 0.0

    for loops in range(loops):
        with DBSession() as session:
cool. does it work / fix the problem? this "with" syntax for the session was introduced probably long after this test suite was written
Contributor
Author
Yes, this fixes the warning/crash.
Claude helped me come up with this fix for hundreds of emitted warnings, with the justification:
Because SQLite reuses primary key IDs after a full table DELETE (there is no AUTOINCREMENT sequence to preserve), the next loop iteration inserts new Person and Address objects with the same IDs (1..N) as the previous run.
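The ID reuse described here can be checked with the standard-library sqlite3 module alone. This is a minimal sketch, not the benchmark's code: the `person` table and the `ids_after_full_delete` helper are hypothetical names chosen for illustration.

    ```python
    import sqlite3


    def ids_after_full_delete(autoincrement: bool) -> list[int]:
        """Insert 3 rows, delete them all, insert 3 more; return the new IDs.

        Hypothetical helper for illustration only.
        """
        conn = sqlite3.connect(":memory:")
        pk = "INTEGER PRIMARY KEY"
        if autoincrement:
            pk += " AUTOINCREMENT"
        conn.execute(f"CREATE TABLE person (id {pk}, name TEXT)")

        # First "benchmark loop": rows get IDs 1..3.
        conn.executemany(
            "INSERT INTO person (name) VALUES (?)", [("a",), ("b",), ("c",)]
        )
        # Full table delete, as the benchmark does between loops.
        conn.execute("DELETE FROM person")
        # Second "benchmark loop".
        conn.executemany(
            "INSERT INTO person (name) VALUES (?)", [("d",), ("e",), ("f",)]
        )

        ids = [row[0] for row in conn.execute("SELECT id FROM person ORDER BY id")]
        conn.close()
        return ids


    print(ids_after_full_delete(autoincrement=False))  # IDs start again at 1
    print(ids_after_full_delete(autoincrement=True))   # IDs continue past the old ones
    ```

Without AUTOINCREMENT the second batch gets 1, 2, 3 again, which is the precondition for the identity-key collision described next.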
On CPython this is harmless: refcounting immediately collects the old objects when new_person is reassigned each iteration, clearing the identity map's weakrefs before the next loop. On PyPy, GC is deferred: the old objects remain alive, so the identity map still holds live entries for IDs 1..N. When the new objects are committed with the same keys, SQLAlchemy's _register_persistent detects the collision and emits:

SAWarning: Identity map already had an identity for (Person, (1,), None), replacing it with newly flushed object. Are there load operations occurring inside of an event handler within the flush?

This warning fires for every inserted row on every benchmark loop, flooding stderr and causing the benchmark to fail on PyPy.
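The refcounting versus deferred-GC difference can be illustrated without SQLAlchemy. In this rough sketch a plain weakref.WeakValueDictionary stands in for the session's identity map (an assumption made for illustration; it is not SQLAlchemy's actual class), and `Person` and `stale_entries` are hypothetical names.

    ```python
    import gc
    import weakref


    class Person:
        """Hypothetical stand-in for a mapped class."""

        def __init__(self, pk: int) -> None:
            self.pk = pk


    def stale_entries(force_gc: bool) -> int:
        """Return how many entries survive after the only strong ref is dropped."""
        # Stand-in for an identity map: values are held by weak reference only.
        identity_map = weakref.WeakValueDictionary()

        new_person = Person(1)
        identity_map[("Person", 1)] = new_person

        # Rebinding the name drops the last strong reference to the first object,
        # as happens on each benchmark iteration.
        new_person = Person(1)

        if force_gc:
            # On PyPy the first object lingers until a collection runs.
            gc.collect()
        return len(identity_map)


    # On CPython both calls report 0, because refcounting frees the object at once.
    # On PyPy the first call may still report 1 until the collector has run.
    print(stale_entries(force_gc=False))
    print(stale_entries(force_gc=True))
    ```

Using a fresh session per loop, as suggested in the review, sidesteps the question entirely, because each new session starts with an empty identity map.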
Fix: call session.expunge_all() immediately after the bulk deletes to explicitly clear the stale session state before timing begins.

It would be nice to have someone who understands SQLAlchemy better than I do review this for correctness; I am not sure this is the right place for a fix.