Problem/Motivation

The main reason Drupal is slow is that getting entity data out of the database is slow. The data for a single entity instance is stored across many different database tables, so getting a single entity instance out of the database results in a complicated join query. With the relational database storage used by Drupal, getting entity data out of the database will always be slow.

Conditions and sorts also often have to run against multiple different tables, which makes those queries hard for the database to execute efficiently. If we tried to store more fields as multiple columns in one table, we would also run into index limitations due to sheer index length.
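To illustrate the point about conditions and sorts spanning tables, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names are simplified stand-ins for per-field storage, not Drupal's actual schema, and the data is made up. The condition is on one field table and the sort on another, so both must be joined to the base table, and no single index can serve both the filter and the ordering.

```python
import sqlite3

# In-memory database with one base table and one table per field.
# Table and column names are hypothetical simplifications.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE node__field_color (entity_id INTEGER, field_color_value TEXT);
    CREATE TABLE node__field_price (entity_id INTEGER, field_price_value INTEGER);
""")
conn.executemany("INSERT INTO node VALUES (?, ?)",
                 [(1, "Chair"), (2, "Table"), (3, "Lamp")])
conn.executemany("INSERT INTO node__field_color VALUES (?, ?)",
                 [(1, "red"), (2, "red"), (3, "blue")])
conn.executemany("INSERT INTO node__field_price VALUES (?, ?)",
                 [(1, 80), (2, 40), (3, 20)])

# Condition on the color table, sort on the price table:
# two joins are needed just to list matching entities.
rows = conn.execute("""
    SELECT n.nid, n.title
    FROM node n
    JOIN node__field_color c ON c.entity_id = n.nid
    JOIN node__field_price p ON p.entity_id = n.nid
    WHERE c.field_color_value = 'red'
    ORDER BY p.field_price_value ASC
""").fetchall()

print(rows)  # [(2, 'Table'), (1, 'Chair')]
```

Each additional condition or sort on another field adds another join to a query of this shape.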

Proposed resolution

Add the database driver for MongoDB to Drupal core as experimental. All data for an entity instance is stored in a single JSON document, so getting a single entity instance out of the database is always a very simple query: a single row from a single database table, the same as a key-value store and just as fast. The database driver is a complete database driver. Drupal uses only a MongoDB database, and no other database, relational or otherwise, is necessary.

The database driver has a contrib module. The code with a readme on how to install Drupal on MongoDB on DDEV can be found here

Remaining tasks

Hard requirements:

Not hard requirements:

Minimum requirements for MongoDB

- The minimum required version for MongoDB is 7.0. This is the most current version of MongoDB.
- A MongoDB replica set is required. MongoDB with a replica set is the minimum for transaction support. A single MongoDB instance does not support transactions. Drupal needs database transactions to do what it needs to do.

API changes

None

Data model changes

Entity instances are stored in JSON documents for MongoDB.
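As a rough illustration of what "one JSON document per entity instance" means, the sketch below uses only the Python standard library. The document shape and field names are assumptions made for the example; the actual layout used by the MongoDB driver may differ. A plain dict keyed by entity ID stands in for a collection.

```python
import json

# Hypothetical document shape for one entity instance; base properties
# and field values live together instead of in separate tables.
node_doc = {
    "nid": 1,
    "type": "article",
    "title": "Chair",
    "field_color": [{"value": "red"}],
    "field_price": [{"value": 80}],
}

# Stand-in for a collection: entity ID mapped to a serialized document.
collection = {node_doc["nid"]: json.dumps(node_doc)}


def load_entity(nid):
    """Load a whole entity with a single lookup by ID, with no joins."""
    return json.loads(collection[nid])


entity = load_entity(1)
print(entity["title"], entity["field_color"][0]["value"])  # Chair red
```

The whole entity comes back from one lookup, which is the property the proposal relies on.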

Release notes snippet

TBD

Comments

daffie created an issue. See original summary.

catch credited alexpott.

catch credited lauriii.

catch credited longwave.

catch’s picture

Version: 11.0.x-dev » 11.x-dev
Issue summary: View changes
Related issues: +#3395848: Entity Query views backend

Adding a note about listing queries to the issue summary, since that's also a severe performance/scalability issue that MongoDB can significantly help with, especially with large datasets.

For views, at devdays we discussed adding an entity query views backend, and converting all core (and starshot once it has them) shipped views to use it. An entity query view will be interoperable between mongodb and relational database drivers. Some previous work/discussion is in https://www.drupal.org/project/efq_views and #3395848: Entity Query views backend.

I think we could add mongodb as alpha/beta to core without resolving the views issue, but would need to fix it to make mongodb stable. However an entity query backend for views has a lot of positives in its own right, not just for mongodb.

lauriii’s picture

I've discussed the idea with @daffie and @catch in detail at DrupalCon Lille and Drupal Dev Days, and on a high level the proposal is fine. Adding a new database driver is not part of the strategic focus for the team at the moment, which may lead to some delays with reviews in cases where there's higher priority work waiting for feedback from committers.

I'm leaving the tag on because I'd like to be involved in the process again when we have a better sense on what's the impact on UX (such as Views) and the ecosystem.

daffie’s picture

Version: 11.x-dev » 11.0.x-dev
Issue summary: View changes
daffie’s picture

Version: 11.0.x-dev » 11.x-dev
mxh’s picture

The main problem for why Drupal is slow is, because of getting entity data out of the database is slow.

Is that statement backed by any performance numbers / benchmarks?

What I've often seen as a performance bottleneck is not the database itself, but the connection to it. When a Drupal site grows (as shown by its number of configuration items), one request to a Drupal page may take hundreds, sometimes over one thousand, database queries. These come from cache tag lookups, config reads and finally content entity queries. With a local database without any latency, performance is usually not a problem.

catch’s picture

@mxh cache tag lookups can be replaced with redis so they won't hit the database at all.

On a site with lots of data, entity listing queries can individually take a very long time - anything from hundreds of milliseconds to dozens of seconds for a single query if it has conditions and sorts on different database tables.

It's less that it's the main reason that Drupal is slow, since sites with much less data have different performance issues, and more that it's one of the hardest issues to address.

mxh’s picture

@mxh cache tag lookups can be replaced with redis so they won't hit the database at all.

Thanks for the reply and tip, @catch. Using Redis might actually be an option to try out. My key point in #9 is that it doesn't make sense to replace a Porsche with a Ferrari to go faster when the speed limit is 50 on a one-lane bridge. I'm sorry that I might have brought in something off-topic here. I will move it into a separate issue in case there's more room for discussion on it.

daffie’s picture

Issue summary: View changes

Removed one occurrence of #3457537: Add a number addExpression specific functions from the list. It was there twice.

andypost’s picture

Meanwhile, a new PECL 2.0.0 release is out: https://github.com/mongodb/mongo-php-driver/releases/tag/2.0.0

There are some deprecations, so I filed #3518608: Upgrade mongodb to 2.0

Version: 11.x-dev » main

Drupal core is now using the main branch as the primary development branch. New developments and disruptive changes should now be targeted to the main branch.

Read more in the announcement.