Problem/Motivation
Unnecessary Cartesian joins make queries particularly slow on heavily populated tables, because each row of each table might be matched with every row of every other table during the execution of the query. In particular, this can happen when the whole join is scanned while looking for something that _does not_ occur in the tables (an entity type that is not tracked, for example).
Two known locations performing Cartesian joins are:
- in the rest_oai_pmh_is_valid_entity_type() query: https://git.drupalcode.org/project/rest_oai_pmh/-/blob/ca670d019e6e6b261...
- in the "liberal cache" plugin: https://git.drupalcode.org/project/rest_oai_pmh/-/blob/ca670d019e6e6b261...
There may be others.
Steps to reproduce
Perform the queries separately, and see that the set of columns returned contains the columns of all tables under the Cartesian join.
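As a rough illustration of what to look for, here is a minimal sketch using Python's built-in sqlite3 module rather than the module's actual Drupal queries. The tables `demo_a` and `demo_b` and their columns are hypothetical stand-ins, not the real rest_oai_pmh schema. It shows that selecting from two tables with no join condition returns the columns of both tables, and one row per pairing of rows.

```python
import sqlite3

# Hypothetical stand-in tables; NOT the real rest_oai_pmh schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demo_a (id INTEGER, entity_type TEXT);
    CREATE TABLE demo_b (id INTEGER, entity_type TEXT);
    INSERT INTO demo_a VALUES (1, 'node'), (2, 'node'), (3, 'node');
    INSERT INTO demo_b VALUES (10, 'taxonomy_term'), (11, 'taxonomy_term');
""")

# No join condition: every row of demo_a is paired with every row of demo_b.
cur = conn.execute("SELECT * FROM demo_a, demo_b")
columns = [description[0] for description in cur.description]
rows = cur.fetchall()

print(columns)    # columns from both tables appear in the result
print(len(rows))  # row count is the product of the two table sizes
```

With three and two rows the product is small, but the same shape on large tables grows multiplicatively.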
Proposed resolution
Rework the process and/or queries to avoid Cartesian joins. Given that the two queries in question only check for the existence of rows matching particular conditions, we could perform multiple individual queries and return the disjunction of their results. Alternatively, reworking each one into separate queries combined with UNION ALL might achieve the same result in a single query.
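The two options could look roughly like the following sketch. It uses Python's built-in sqlite3 module for illustration only and is not the code from the merge request; the tables `demo_records` and `demo_sets`, the `entity_type` column, and both function names are assumptions made for this example.

```python
import sqlite3

# Hypothetical table names for illustration; NOT the real rest_oai_pmh schema.
TABLES = ("demo_records", "demo_sets")


def build_db():
    """Create an in-memory database with two small, unrelated tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE demo_records (entity_type TEXT, entity_id INTEGER);
        CREATE TABLE demo_sets (entity_type TEXT, entity_id INTEGER);
        INSERT INTO demo_records VALUES ('node', 1), ('node', 2);
        INSERT INTO demo_sets VALUES ('taxonomy_term', 5);
    """)
    return conn


def exists_separate(conn, entity_type):
    """Option 1: one small query per table, combined with a logical OR."""
    return any(
        conn.execute(
            f"SELECT 1 FROM {table} WHERE entity_type = ? LIMIT 1",
            (entity_type,),
        ).fetchone() is not None
        for table in TABLES  # table names come from the fixed tuple above
    )


def exists_union(conn, entity_type):
    """Option 2: the same per-table checks combined with UNION ALL."""
    row = conn.execute(
        "SELECT 1 FROM demo_records WHERE entity_type = :t "
        "UNION ALL "
        "SELECT 1 FROM demo_sets WHERE entity_type = :t "
        "LIMIT 1",
        {"t": entity_type},
    ).fetchone()
    return row is not None


conn = build_db()
for candidate in ("node", "taxonomy_term", "media"):
    print(candidate, exists_separate(conn, candidate), exists_union(conn, candidate))
```

Neither variant pairs rows across tables, so an untracked type such as `media` costs one filtered lookup per table instead of a scan of the full cross product. One difference worth checking when reworking a real query: a Cartesian join yields no rows at all if any joined table is empty, whereas both variants here still report matches from the non-empty tables.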
Remaining tasks
- get code into MR
- review
- merge
- release
User interface changes
None.
API changes
None.
Data model changes
None.
Issue fork rest_oai_pmh-3461070
Comments
Comment #3
adam-vessey commented
Comment #4
adam-vessey commented
Comment #5
adam-vessey commented
Comment #6
bibliophileaxe
We have been using this patch in production; it works and has noticeably improved performance.
Comment #9
joecorall commented
Thank you!
Comment #10
joecorall commented