Problem/Motivation
Drupal uses buffered database queries by default. This loads the entire result set into memory before it's available to the caller.
PDO also supports unbuffered queries - these are streamed to the client as the results come back.
In Drupal we have a lot of operations both during updates/batch/cron and also regular page loading where we load multiple rows from the database and iterate over them. I think we could reduce our memory usage by using buffered queries.
Big question is - what breaks when we do that?
Steps to reproduce
Proposed resolution
1. In this issue and/or spin-off issues, fix test failures when unbuffered queries are in use. This would allow sites to actually use them (at least with core and well-behaving contrib modules).
2. In another spin-off issue, add a core test environment that configures the database connection to be unbuffered or possibly default all core tests to use unbuffered queries (since buffered queries are more permissive).
3. Figure out whether we can make that same change for contrib modules (probably needs to be 12.x or 13.x only).
4. Once core + contrib are tested, we could consider flipping the default in core in a major release.
Remaining tasks
User interface changes
Introduced terminology
API changes
Data model changes
Release notes snippet
| Comment | File | Size | Author |
|---|---|---|---|
| #5 | Screenshot 2026-02-27 at 17.49.09.png | 186.08 KB | catch |
| #5 | Screenshot 2026-02-27 at 17.49.05.png | 181.88 KB | catch |
| #2 | Screenshot from 2026-02-27 14-11-13.png | 293.34 KB | catch |
| #2 | Screenshot from 2026-02-27 14-11-07.png | 293.12 KB | catch |
Issue fork drupal-3576212
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #2
catchOn the Umami front page with a cold cache, this takes 2mb of the peak memory usage for ::execute() and more like 4mb off the incl (not peak) memory usage.
I'm not seeing meaningful change in the headline figures for the request though.
Will also try on a page that's loading a lot more rows from the db.
Comment #4
catchWell, quite a lot breaks. Looks like it's mostly in migrate.
Comment #5
catchMore xhprof screenshots - in this case from a custom JSON:API endpoint that loads 2000 entities * 3 in chunks of 100.
I'm also seeing 10-15mb off the total memory usage (both peak and non-peak) in xhprof, and can't see another explanation for it being lower apart from this change, e.g. from about 105mb for this endpoint down to 90mb.
Comment #6
catchComment #7
heddnMany of the failures are related to
General error: 2014 Cannot execute queries while other unbuffered queries are active. Consider using PDOStatement::fetchAll(). Alternatively, if your code is only ever going to run against mysql, you may enable query buffering by setting the Pdo\Mysql::ATTR_USE_BUFFERED_QUERY attribute.Comment #8
catchComment #9
catchSlowly working through the test failures.
Migrate's SQL idmap was explicitly setting a statement as a property then iterating over it when other queries are run - this needed a fetchAll().
Entity loading was doing some cache gets while iterating over a result set, this is already dealt with by #3576129: Reduce calls to FieldDefinition::getColumns() so cherry-picked across here.
Node search also passed a statement around while executing other queries - another ::fetchAll().
There are some test failures in migrate_drupal that I think are different to idMap, but they look fairly plugin-specific. All that code will go in #3572280: Remove Migrate Drupal so ignoring that for now, these appear to be nearly 100% of the remaining kernel test failures.
I've added some bullets to the issue summary on how I think we can approach this - essentially get core compliant, then enable it for tests, then the same for contrib, then eventually (and optionally) flip the default for new installs.
Comment #10
catch#2347867: Race conditions with lock/cache, session storage - add a non-transactional database connection would 'fix' a lot of the remaining test failures here - because what's commonly happening is foreaching over a query, then checking some kind of entity metadata (like whether the entity type has an integer or string ID key), and this then loads the entity definitions, which does a cache get, which issues a database query. If the cache backends used a different database connection, then MySQL wouldn't mind this at all.
Comment #11
catchComment #12
catchThis isn't fully green, but that's not really the point of this issue as such:
- all functional tests are green
- remaining kernel test failures entirely (or at least mostly) in migrate_drupal which is about to be removed
- the change that causes the test failures will be removed from the MR prior to commit, it's there to flush out the test failures.
Comment #13
alexpottAre we sure about this always being a good thing? Using buffered queries means that the database server can releases locks, table resources, and the result cursor immediately after sending all rows — freeing the server to handle other queries sooner. Also even when iterating around a set of 50 results I think it might be quicker to use an unbuffered query - yes it''ll use more memory but that's a trade off we're making. It feels like this is going to need some careful investigation about whether this does provide benefits. I suspect that the benefits might be very situation dependent and not generic enough to warrant switching the default behaviour.
Comment #14
catchThat might be true but right now it's not possible to use unbuffered queries with core at all, so I think we could do 1-3 in the issue summary, and by the time we've done all that we might have some data about whether we want to go ahead with #4.
For most select queries in core we do very little with the results except foreach over them and collect them into an array anyway.
Comment #15
alexpottBut this change will still result from queries have longer read locks against a table right - so anything wanting to write to the table will have to wait longer - no?
Comment #16
catchNo because InnoDb doesn't do read locking at all, only when you issue
SELECT .. FOR UPDATEto explicitly acquire a lock (andFOR SHAREI think but never used or even encountered that in the wild). Are you thinking of MyISAM or am I missing something? https://dev.mysql.com/doc/refman/8.4/en/innodb-locking-reads.htmlAccording to https://www.php.net/manual/en/mysqlinfo.concepts.buffering.php the thing that doesn't get freed until the entire set is fetched is only the mysql connection (and the ability to make other queries on the same connection, which is what the MR has fixes for) - but we keep the mysql connection until the end of the request anyway so that bit is not really relevant for us at all.
Comment #17
alexpottRe #16 well that's good to be wrong about! That makes it more likely this change might work out as a good thing. Does the transaction isolation level make a difference here? I wonder what happens if a commit happens while you are reading from a buffered query. I asked Claude and said this...
READ COMMITTED
Each row fetch sees the latest committed state at the time that specific row is fetched
A commit between two of your fetch_row calls will be visible in subsequent fetches
This is more pronounced than with buffered queries — the exposure window is the full iteration time, not just query execution time
Classic non-repeatable read problem, stretched across potentially minutes
REPEATABLE READ (InnoDB default)
The MVCC read snapshot is established at the start of the transaction (or first read)
All row fetches, no matter how spread out in time, read from the same snapshot
Concurrent commits during your slow iteration are invisible — InnoDB serves old row versions from the undo log as needed
This is the key reason REPEATABLE READ is the default — it makes unbuffered iteration safe and consistent
... so I do wonder about the impact of this.
Comment #18
catchI wouldn't trust what Claude says about anything...
MR is green now: https://git.drupalcode.org/project/drupal/-/pipelines/758574
Just to be clear, this issue will not switch from buffered to unbuffered queries, all I want to do is make core compliant in this issue, then an issue to add test coverage (not sure how yet, might need a test-only driver that's otherwise identical to the mysql driver?).
Then we can have a proper policy issue about whether to switch the default and how after all that. The commit to change the default in this issue is only to get useful test runs.
Comment #19
alexpott@catch but claude is correct here. We are increasing the scope for non-repeatable reads with using the recommended READ COMMITTED transaction level. Because the window for data to change while we are reading the table is increasing.
Comment #20
catchReverted that initial commit for now, so the MR contains only the changes actually proposed here (and the entirety of #3576129: Reduce calls to FieldDefinition::getColumns()).
Need to remember to unrevert that query to validate any further changes in this MR though.
Comment #21
catchNot sure this window is 'longer' except in very specific situations.
For a buffered query, MySQL executes the query, and all of the results get loaded into memory before PHP gets a statement back.
For an unbuffered query, PHP immediately gets a statement back, and then MysQL streams the results to PHP as they come in.
When there is a single row returned, there should be no measurable difference between the two as far as I can tell - you get one row back and process it and you're done.
For multiple rows, it should mean that in the unbuffered case, we start processing results earlier than the unbuffered case, then whether we finish processing those rows earlier or later would depend on how much work is being done in PHP. It could well be that the full result set is processed in the same time, because we are essentially interleaving i/o wait with CPU, instead of a big i/o wait followed by CPU.
If you are doing a CPU intensive task like writing out CSV files or processing images or something for each row, then sure it could make a difference, but I don't see it increasing the window much or at all for Drupal.
Where it will make a bigger difference is the relative load between PHP and MySQL.
Comment #22
smustgrave commentedAppears to need a rebase.
Comment #23
catchRebased.
Comment #24
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #25
catchRebased.
Comment #26
catchComment #27
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #28
catchRebased again.
Comment #29
mondrakecan we have a run for all DBs please? it seems I cannot trigger them.
Comment #30
mondrakeDo we still need to have iterable StatementInterface objects after this? It looks like when a foreach is involved, now the approach is to iterate over a fetchAll result.
Comment #31
catchAt the moment, the StatementInterface being iterable doesn't save any memory because the queries are buffered.
After this MR, there is a memory saving if you iterate over StatementInterface, but you can only do so if you don't execute other queries while iterating - because to do so causes a MySQL error.
The changes in the MR are to remove the places in core that do that, so that it'll possible to run core with unbuffered queries. But I don't think it means dropping StatementInterface.
Kicked off pipelines on other database combinations.
Comment #32
mondrakeThis addresses point 1 of the proposed resolution in the IS. Per se, I think this is OK and independent from the other points.
Right now,
StatementInterfaceis defined asand
StatementBaseis defined asthis make StatementInterface directly iterable in a foreach loop. My question is whether this is still relevant once we do this, or we could move the iterability to the
ResultBaseclass instead.Green on all db, this LGTM now.
Comment #33
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #34
catchRebased - the node search plugin moved. Manually re-applied the same changes to the new location.
Comment #35
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #36
catchMissed one spot in the rebase.