Problem/Motivation
A major reason why Drupal is slow is that getting entity data out of the database is slow. The data for a single entity instance is stored in many different database tables, so getting a single entity instance out of the database results in a complicated join query. With the relational database storage used by Drupal, getting entity data out of the database will always be slow.
Conditions and sorts also often have to be run against multiple different tables, which works against efficient database queries. If we were to try to store more fields as columns in a single table, we'd also run into index limitations due to sheer index length.
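As a rough illustration of the listing problem described above, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names are simplified, hypothetical stand-ins for Drupal's entity tables, not the real schema. The point is only that the condition sits on one field table and the sort on another, so no single index can serve both.

```python
import sqlite3

# Simplified, hypothetical stand-ins for entity tables: one base table,
# one data table and one table per field.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE node (nid INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE node_field_data (nid INTEGER, title TEXT);
CREATE TABLE node__field_color (entity_id INTEGER, field_color_value TEXT);
CREATE TABLE node__field_price (entity_id INTEGER, field_price_value INTEGER);
""")

sample = [(1, "Chair", "red", 40), (2, "Table", "red", 120), (3, "Lamp", "blue", 25)]
for nid, title, color, price in sample:
    conn.execute("INSERT INTO node VALUES (?, 'product')", (nid,))
    conn.execute("INSERT INTO node_field_data VALUES (?, ?)", (nid, title))
    conn.execute("INSERT INTO node__field_color VALUES (?, ?)", (nid, color))
    conn.execute("INSERT INTO node__field_price VALUES (?, ?)", (nid, price))


def list_red_products_by_price(connection):
    """Filter on one field table and sort on another: three joins are needed."""
    return connection.execute("""
        SELECT n.nid, d.title, p.field_price_value
        FROM node n
        JOIN node_field_data d ON d.nid = n.nid
        JOIN node__field_color c ON c.entity_id = n.nid
        JOIN node__field_price p ON p.entity_id = n.nid
        WHERE c.field_color_value = 'red'
        ORDER BY p.field_price_value DESC
    """).fetchall()


listing = list_red_products_by_price(conn)
```

With only three rows this is instant; the concern raised in this issue is how such a query behaves once each table holds a large amount of data.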
Proposed resolution
Add the database driver for MongoDB to Drupal core as experimental. With MongoDB, all data for an entity instance is stored in a single JSON document, so getting a single entity instance out of the database is always a very simple query: a single row from a single database table, the same as a key-value store and just as fast. The database driver is a full database driver: Drupal uses only a MongoDB database, and no other database, relational or otherwise, is necessary.
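To make the single-document idea concrete, here is a minimal sketch in plain Python. It deliberately uses an in-memory dict instead of a MongoDB connection so it stays self-contained. The document layout and field names are hypothetical; the actual layout used by the driver is not described in this issue.

```python
import json

# Hypothetical document shape: base properties and field values for one
# entity live together in one document instead of in several tables.
documents = {
    1: {
        "nid": 1,
        "type": "product",
        "title": "Chair",
        "field_color": [{"value": "red"}],
        "field_price": [{"value": 40}],
    },
}


def load_entity(collection, nid):
    """Load one entity with a single key lookup, no joins involved."""
    return collection.get(nid)


entity = load_entity(documents, 1)
missing = load_entity(documents, 99)

# The whole entity serializes to one JSON document.
entity_json = json.dumps(entity, sort_keys=True)
```

Against a real MongoDB collection, the lookup would presumably be a single find by ID (for example `find_one({"nid": 1})` in PyMongo), but that is an assumption about how the driver queries, not something stated in this issue.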
The database driver has a contrib module. The code with a readme on how to install Drupal on MongoDB on DDEV can be found here
Remaining tasks
Hard requirements:
- #3398773: Make the conditions in joins in dynamic queries use Condition objects
- #3457537: Add a number addExpression specific functions
- #3474627: Make the nightwatch tests work with MongoDB
- #3474680: CommentNonNodeTest::testCommentFunctionality() is failing for MongoDB
- #3475512: Replace hardcoded database queries with dynamic queries
- #3475719: Set entity schema installation, module configuration installation and content creation in the right order in kerneltests for MongoDB
- #3475921: The method ContentEntityBase::getLoadedRevisionId() should the return value with the correct type
- #3476360: Allow config items to have database driver overrides
- Views YAML files use the entity tables, while MongoDB will use only the base table with entity queries. A solution needs to be created.
- Finally: add the database driver to core
Not hard requirements:
- #3450706: Let for all database drivers the module name setting default to the driver name
- #3457582: Add method to the database connection class indicating the use of relational database storage
- #3475925: The method ContentEntityBase::getRevisionId() should not return string values
- #3476096: Change the boolean constants to have boolean values in NodeInterface, CommentInterface and FileInterface
Minimum requirements for MongoDB
- The minimum required version for MongoDB is 7.0. This is the most current version of MongoDB.
- A MongoDB replica set is required. MongoDB with a replica set is the minimum for transaction support. A single MongoDB instance does not support transactions. Drupal needs database transactions to do what it needs to do.
API changes
None
Data model changes
Entity instances are stored in JSON documents for MongoDB.
Release notes snippet
TBD
Comments
Comment #5
catch
Adding a note about listing queries to the issue summary, since that's also a severe performance/scalability issue that MongoDB can significantly help with, especially with large datasets.
For views, at devdays we discussed adding an entity query views backend, and converting all core (and starshot once it has them) shipped views to use it. An entity query view will be interoperable between mongodb and relational database drivers. Some previous work/discussion is in https://www.drupal.org/project/efq_views and #3395848: Entity Query views backend.
I think we could add mongodb as alpha/beta to core without resolving the views issue, but would need to fix it to make mongodb stable. However an entity query backend for views has a lot of positives in its own right, not just for mongodb.
Comment #6
lauriii
I've discussed the idea with @daffie and @catch in detail at DrupalCon Lille and Drupal Dev Days, and on a high level the proposal is fine. Adding a new database driver is not part of the strategic focus for the team at the moment, which may lead to some delays with reviews in cases where there's higher priority work waiting for feedback from committers.
I'm leaving the tag on because I'd like to be involved in the process again when we have a better sense of what the impact is on UX (such as Views) and the ecosystem.
Comment #7
daffie commented
Comment #8
daffie commented
Comment #9
mxh commented
Is that statement backed by any performance numbers / benchmarks?
What I've often seen as a performance bottleneck is not the database itself, but the connection to it. When a Drupal site grows (as shown by its number of configuration items), one request to a Drupal page may take hundreds, sometimes over a thousand, database queries. This is because of cache tag lookups, config reads and finally the queries for content entities. With a local database without any latency, performance is usually not a problem.
Comment #10
catch
@mxh cache tag lookups can be replaced with redis so they won't hit the database at all.
On a site with lots of data, entity listing queries can individually take a very long time - anything from hundreds of milliseconds to dozens of seconds for a single query if it has conditions and sorts on different database tables.
It's less that it's the main reason that Drupal is slow, since sites with much less data have different performance issues, and more that it's one of the hardest issues to address.
Comment #11
mxh commented
Thanks for the reply and tip @catch, using Redis might actually be an option to try out. My key point in #9 is that it doesn't make sense to replace a Porsche with a Ferrari to go faster when the speed limit is 50 on a one-lane bridge. I'm sorry that I might have brought in something off-topic here - I will move it into a separate issue in case there's more room for discussion on it.
Comment #12
daffie commented
Added #3474627: Make the nightwatch tests work with MongoDB as a child issue.
Comment #13
catch
Comment #14
daffie commented
Added #3474680: CommentNonNodeTest::testCommentFunctionality() is failing for MongoDB as a child issue.
Comment #15
daffie commented
#3457537: Add a number addExpression specific functions is added to the list.
Comment #16
daffie commented
Removed one occurrence of #3457537: Add a number addExpression specific functions from the list. It was there twice.
Comment #17
daffie commented
Removed #3450699: Install the user module in the site install process as it is no longer necessary.
Comment #18
daffie commented
Added #3475512: Replace hardcoded database queries with dynamic queries.
Comment #19
daffie commented
Added #3475719: Set entity schema installation, module configuration installation and content creation in the right order in kerneltests for MongoDB as a child issue.
Comment #20
daffie commented
Added #3475921: The method ContentEntityBase::getLoadedRevisionId() should the return value with the correct type as a child issue.
Comment #21
daffie commented
Added #3475925: The method ContentEntityBase::getRevisionId() should not return string values as a very-nice-to-have child issue.
Comment #22
daffie commented
Added #3476175: Change the filter in the overview page of the dblog module to a condition object to the list of child issues.
Comment #23
daffie commented
Added #3476096: Change the boolean constants to have boolean values in NodeInterface, CommentInterface and FileInterface as a not hard requirement.
Comment #24
daffie commented
Added #3476360: Allow config items to have database driver overrides as an alternative solution for #3395848: Entity Query views backend.
Comment #25
andypost
In the meantime, a new PECL 2.0.0 release is out: https://github.com/mongodb/mongo-php-driver/releases/tag/2.0.0
There are some deprecations, so I filed #3518608: Upgrade mongodb to 2.0