Problem/Motivation

PHP 8.1 adds Fibers. This meta issue groups together potential use cases.

https://php.watch/versions/8.1/fibers

https://www.php.net/releases/8.1/en.php

https://clue.engineering/2021/fibers-in-php

Steps to reproduce

Proposed resolution

Fibers only work when you're using an async API.

That is, the flow needs to look something like this:

1. Fiber::start() executes a callback.

2. Code inside the callback calls a non-blocking function: it sends an async Guzzle request or an async MySQL query, for example.

3. Immediately after doing so, it calls Fiber::suspend().

4. The code that called Fiber::start() can then call some other thing.

5. When we get back to the code in #3 (because other fibers finished or suspended), we'll be at the point after Fiber::suspend(). The code can then continue if it has something back, or call Fiber::suspend() again.
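To make the flow above concrete, here is a minimal sketch of one fiber's lifecycle. FakeAsyncHandle is a made-up stand-in for whatever a real non-blocking call would return (an async HTTP promise, a mysqli connection with a pending query); it is not an existing API. Only Fiber itself is real PHP 8.1.

```php
<?php

// Hypothetical stand-in for a pending non-blocking operation. The result
// only becomes available after it has been polled a number of times.
final class FakeAsyncHandle {
  private int $pollsRemaining;

  public function __construct(private string $result, int $pollsNeeded = 2) {
    $this->pollsRemaining = $pollsNeeded;
  }

  // Returns TRUE once the simulated operation has completed.
  public function poll(): bool {
    return --$this->pollsRemaining <= 0;
  }

  public function getResult(): string {
    return $this->result;
  }
}

$log = [];

$fiber = new Fiber(function () use (&$log): string {
  // Kick off the non-blocking work, then hand control back straight away.
  $handle = new FakeAsyncHandle('rows', 2);
  $log[] = 'query sent';
  Fiber::suspend();

  // Each time we are resumed, check whether the result has come back.
  while (!$handle->poll()) {
    $log[] = 'not ready, suspending again';
    Fiber::suspend();
  }
  $log[] = 'got ' . $handle->getResult();
  return $handle->getResult();
});

// The caller starts the fiber and regains control at the first suspend().
$fiber->start();
$log[] = 'caller does other work';

// Keep resuming until the callback has returned.
while (!$fiber->isTerminated()) {
  $fiber->resume();
}
$result = $fiber->getReturn();
```

Nothing here runs in parallel: the caller and the callback simply take turns on the same thread, which is why the work inside the fiber has to be non-blocking for this to help.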

This means we need at least two things to make Fibers work:

1. Something that controls multiple similar code paths (cron execution, queue running, big pipe placeholder rendering).

2. Code that has an 'async mode' for being called inside a fiber, where it will execute non-blocking things and call Fiber::suspend()
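A rough sketch of those two pieces, assuming nothing about how core would actually wire them up. runConcurrently() and the task factory are hypothetical names for illustration only; the first plays the part of the controller (cron, a queue runner, placeholder rendering) and the second is code with an 'async mode'.

```php
<?php

/**
 * Runs every task in its own fiber and resumes them round-robin until done.
 *
 * @param array<string, callable> $tasks
 *
 * @return array<string, mixed>
 *   Return values, keyed like $tasks.
 */
function runConcurrently(array $tasks): array {
  $fibers = [];
  $results = [];
  foreach ($tasks as $key => $task) {
    $fiber = new Fiber($task);
    $fiber->start();
    $fibers[$key] = $fiber;
  }
  while ($fibers) {
    foreach ($fibers as $key => $fiber) {
      if ($fiber->isTerminated()) {
        $results[$key] = $fiber->getReturn();
        unset($fibers[$key]);
        continue;
      }
      // Not terminated after start() means suspended, so resume is safe.
      $fiber->resume();
    }
  }
  return $results;
}

$trace = [];

// Builds a task that "waits" $ticks times before finishing.
$makeTask = function (string $name, int $ticks) use (&$trace): callable {
  return function () use ($name, $ticks, &$trace): string {
    for ($i = 1; $i <= $ticks; $i++) {
      $trace[] = "$name waiting ($i)";
      // 'Async mode': only suspend when actually running inside a fiber.
      if (Fiber::getCurrent() !== null) {
        Fiber::suspend();
      }
    }
    $trace[] = "$name done";
    return "$name result";
  };
};

$results = runConcurrently([
  'a' => $makeTask('a', 2),
  'b' => $makeTask('b', 1),
]);
```

The trace shows the interleaving: task b finishes while task a is still waiting, even though a was started first.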

Per discussion in this issue, combined with a blog post from Matt Glaman, we may have an initial candidate.

1. We add Fibers to Big Pipe placeholder rendering, so that the rendering of each placeholder happens inside a fiber.

2. We add async MySQL support via a new mysqli driver, then we implement support for that, as well as Fiber::suspend(), in the views SQL Query plugin. Async MySQL queries: #3259709: Create the database driver for MySQLi

This will allow views listing queries to be executed non-blocking, so that other big pipe placeholders can be rendered in the meantime. Those other placeholders might execute other async queries, or they might be loading and rendering entities. Either way, that work fills in the time while queries are returning, whether they take 30ms or 1s.

Views queries are a relatively easy candidate, because they happen in one place, they often take long enough to make a difference to response times, and there's only one at a time. There is lots of other database and file i/o in core but this could be harder to implement with Fibers (like scanning and parsing YAML files). But once we've got one example going we can add more.

Other possible applications:
Queues: #1189464: Add an 'instant' queue runner
Cron: #3257727: Run automated cron in a Fiber
Rebuilds: #3257725: Add a cache prewarm API

Remaining tasks

User interface changes

API changes

Data model changes

Release notes snippet

Comments

catch created an issue. See original summary.

catch’s picture

Issue summary: View changes
berdir’s picture

Random thought: Could we use fibers for render lazy builders? I'm not sure if there is a downside to using them for fast blocks (we might want to do that long-open issue to indicate blocks/render things are not-cache-worthy, which could then also mean not-fiber-worthy), but for slower blocks that do SQL queries or load things, that could be very interesting.

catch’s picture

Because the fiber doesn't actually run in parallel, I think we'd need the MySQL query to run async/non-blocking. So in the fiber, execute a non blocking query, suspend the fiber, get on with other stuff in the main thread, then go back to see if the query has come back.

Because with lazy builders we know what we're going to run before we actually run it, and they're self-contained, we ought to be able to make that work - but we'd have to build the async bits into the entity query API, views etc. to be able to do it.

aaronmchale’s picture

"but we'd have to build the async bits into the entity query API, views etc. to be able to do it"

That might actually be worth doing as a stepping stone to solving #2218651: [meta] Make Drupal compatible with persistent app servers like ReactPHP, PHP-PM, PHPFastCGI, FrankenPHP, Swoole, almost like a first step to see how much we can make async or how these Core APIs behave.

catch’s picture

MySQL's async support is only available with mysqli, not PDO. So we'd need... a mysqli database driver first. But it would allow us to do async things without forking.
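For reference, a minimal sketch of what the mysqli side could look like when called inside a fiber. MYSQLI_ASYNC, mysqli_poll() and reap_async_query() are existing mysqli APIs (async mode needs mysqlnd); asyncQuery() itself is a hypothetical helper, not a proposed driver API, and it assumes the caller is already running in a fiber.

```php
<?php

/**
 * Sends a query without blocking, then suspends until the result is readable.
 *
 * Assumes it is called from inside a fiber, and that mysqli is backed by
 * mysqlnd, which MYSQLI_ASYNC requires.
 */
function asyncQuery(mysqli $connection, string $sql): mysqli_result|bool {
  // Returns immediately; the server works on the query in the background.
  $connection->query($sql, MYSQLI_ASYNC);

  do {
    // Let whoever resumed us get on with other work first.
    Fiber::suspend();

    // mysqli_poll() takes its arrays by reference and rewrites them.
    $read = $error = $reject = [$connection];
    // Zero timeout: check whether a result is ready without waiting.
    $ready = mysqli_poll($read, $error, $reject, 0, 0);
    if ($ready === false) {
      throw new RuntimeException('mysqli_poll() failed.');
    }
  } while ($ready === 0);

  // Collect the result of the query sent above.
  return $connection->reap_async_query();
}
```

Note that a connection can only have one async query in flight at a time, so running several queries at once would need several connections.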

Another very vague idea I had was file_scan_directory() and similar - i.e. when we're scanning and parsing files, doing the filesystem work in a Fiber using non blocking streams, and the CPU-intensive parsing outside it, interleaving the two instead of one then the other. But whether we'd actually see a measurable gain from that is hard to tell without trying.

daffie’s picture

"MySQL's async support is only available with mysqli, not PDO. So we'd need... a mysqli database driver first. But it would allow us to do async things without forking."

If we get such a database driver, how much better would Drupal get? What could we do that we cannot do now?

mondrake’s picture

A mysqli driver, extending from the PDO-mysql one, shouldn't be too much effort. Actually, in DruDbal there is an implementation. In essence, the significant differences would be, in the Statement class:

  • no support for named placeholders in SQL statements (only positional ones)
  • stringification of resultsets must be done explicitly
  • rowCount() requires some tweaks
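To illustrate the first point, here is a deliberately naive sketch of turning named placeholders into positional ones. namedToPositional() is a hypothetical helper written for this comment, not code from DruDbal or core, and it does not skip string literals or comments that happen to contain a colon.

```php
<?php

/**
 * Rewrites ":name" placeholders as "?" and orders the values to match.
 *
 * @param string $sql
 *   SQL using named placeholders.
 * @param array<string, mixed> $args
 *   Values keyed by placeholder, including the leading colon.
 *
 * @return array{0: string, 1: list<mixed>}
 *   The rewritten SQL and the values in positional order.
 */
function namedToPositional(string $sql, array $args): array {
  $values = [];
  $converted = preg_replace_callback(
    '/:[A-Za-z_][A-Za-z0-9_]*/',
    function (array $match) use ($args, &$values): string {
      if (!array_key_exists($match[0], $args)) {
        throw new InvalidArgumentException("No value for placeholder {$match[0]}");
      }
      // Values are collected in the order the placeholders appear.
      $values[] = $args[$match[0]];
      return '?';
    },
    $sql
  );
  return [$converted, $values];
}

[$sql, $values] = namedToPositional(
  'SELECT nid FROM node WHERE type = :type AND status = :status',
  [':status' => 1, ':type' => 'article']
);
```

The argument array can be in any order; the output values follow the order in which the placeholders appear in the SQL string.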
catch’s picture

@daffie let's take something like https://www.drupal.org/dashboard - there are four or five different views on there in blocks.

If you hit a completely cold cache, you have to execute all the views building those blocks, including both the listing query, then rendering the entities within them.

The idea being explored is that for each lazy builder (assuming we're able to loop over them), you'd render it up until you have to run a views listing query. When it's time to run that query (or whatever point is appropriate to delegate to the Fiber), it's done inside a Fiber with a non-blocking query. You then ::suspend() the Fiber, return to the main process, and start the next lazy builder. If that also has to run a views listing query, you'd run it non-blocking and ::suspend() as well. Then you loop over all the lazy builders again to see which queries have come back, render the results for any that have, and continue polling until everything is rendered.

The page still can't finish rendering until the slowest query is finished, but if you have five slow queries that take 500ms each, then you might be able to run them all at once in the space of 600ms-1s instead of definitely having to run them sequentially in 2.5s. While you're running those simultaneous queries, you can also get on with PHP/CPU-intensive tasks like rendering - assuming we're able to arrange things so that these can be interleaved.
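The arithmetic can be checked with a small simulation. This is only a sketch: simulatedQuery() is a hypothetical function that fakes a non-blocking query with a deadline, and the durations are scaled down to 50ms each so it runs quickly. No database is involved.

```php
<?php

/**
 * Pretends to be a non-blocking query that completes after $ms milliseconds.
 *
 * Inside a fiber it suspends while waiting; outside one it just sleeps.
 */
function simulatedQuery(int $ms): void {
  $deadline = hrtime(true) + $ms * 1_000_000;
  while (hrtime(true) < $deadline) {
    if (Fiber::getCurrent() !== null) {
      Fiber::suspend();
    }
    else {
      usleep(1000);
    }
  }
}

// Runs the queries one after another and returns elapsed milliseconds.
function timeSequential(array $durations): float {
  $start = hrtime(true);
  foreach ($durations as $ms) {
    simulatedQuery($ms);
  }
  return (hrtime(true) - $start) / 1e6;
}

// Runs each query in its own fiber and returns elapsed milliseconds.
function timeConcurrent(array $durations): float {
  $start = hrtime(true);
  $fibers = [];
  foreach ($durations as $ms) {
    $fiber = new Fiber(fn () => simulatedQuery($ms));
    $fiber->start();
    $fibers[] = $fiber;
  }
  while ($fibers) {
    foreach ($fibers as $i => $fiber) {
      if ($fiber->isTerminated()) {
        unset($fibers[$i]);
      }
      else {
        $fiber->resume();
      }
    }
    // Avoid spinning the CPU between polls.
    usleep(1000);
  }
  return (hrtime(true) - $start) / 1e6;
}

$durations = [50, 50, 50, 50, 50];
$sequentialMs = timeSequential($durations);
$concurrentMs = timeConcurrent($durations);
```

The sequential run takes at least the sum of the durations, while the concurrent run should land close to the slowest single query plus polling overhead. In a real page the gap between the two is where rendering and other PHP work would fit.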

Fibers only allow us to parallelize i/o, not CPU, because there's no parallel execution at all. So it has to be something like an async MySQL or Solr query or HTTP request to make any difference, and there has to be something to be getting on with while it's running: either more i/o or something else.

daffie’s picture

@catch: Thank you for your explanation!

To me this all sounds like something that is very important (if not crucial) for Drupal to have. I am not sure that creating such a database driver is as easy as @mondrake says, but he has experience with MySQLi and I do not.
I have created #3259709: Create the database driver for MySQLi. It would be great if @catch, as a release manager, could post a reply on that issue.

Edit: Changed the link to #3259709: Create the database driver for MySQLi.

andypost’s picture

Instead of writing a db driver, I'm sure it's better to start with rendering, for example a FibersRenderStrategy or something similar. It could be a big win for sites serving mostly page cache from APCu/redis/memcache.

andypost’s picture

A good example of async drivers is [open]swoole, which spent all of last year moving its redis/mysql connectors in and out of core.

There was a lot of discussion about PDO within fibers in the RFC. IIRC it ended up as a no-go for now because of the many globals inside the PDO extensions :(
But all the native PHP extensions (pgsql/sqlite) work fine. No idea about mysql, but mysqli should too.

There's another step forward https://wiki.php.net/rfc/mysqli_support_for_libmysql

catch’s picture

@daffie the link in #10 is back to this issue, which issue did you mean?

daffie’s picture

"@daffie the link in #10 is back to this issue, which issue did you mean?"

My mistake: It is: #3259709: Create the database driver for MySQLi.

berdir’s picture

That's not the same thing, that's just changing how blocks within regions are rendered.

catch’s picture

Issue summary: View changes
kingdutch’s picture

#2218651: [meta] Make Drupal compatible with persistent app servers like ReactPHP, PHP-PM, PHPFastCGI, FrankenPHP, Swoole was mentioned in #5 but not added as related issue, doing so now

In my opinion #2218651: [meta] Make Drupal compatible with persistent app servers like ReactPHP, PHP-PM, PHPFastCGI, FrankenPHP, Swoole is the more concrete problem to tackle. Fibers feels like more of an implementation detail. In most cases I think Drupal should lean on libraries that abstract that away for us.

https://clue.engineering/2021/fibers-in-php is a good article on the topic for anyone not familiar with Fibers (added to the IS too).

fgm’s picture

Relevant: @matt glaman's experiments with making BigPipe concurrent using fibers https://mglaman.dev/blog/can-we-use-concurrency-speed-streamed-bigpipe-r...

andypost’s picture

Fibers don't provide parallelism; they're just coroutines in the same thread.

catch’s picture

It looks to me like the work Matt Glaman has done in the blog post and the mysqli driver could meet in the middle.

Following on from #9:

1. Render bigpipe placeholders as (or in?) fibers.
1b. Add a mysqli driver with async support
2. Add support to views to execute the main listing query in ::execute() async. It would call Fiber::suspend() immediately after doing so.
3. When the fiber is resumed, views checks if its query has come back. If so it keeps going, if not it calls ::suspend() again.
4. We could also do this outside views in PagerSelectExtender::query() specifically, where we execute two queries, both the count and the listing query. That gives us two places to Fiber::suspend().

It looks like Fiber::getCurrent() would be enough to tell if we're inside a Fiber or not, which would allow us to run the non-async code path for those cases, that way nothing changes.
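A quick sketch of that check. Fiber::getCurrent() returns NULL when no fiber is running; suspendIfInFiber() and executeListingQuery() are hypothetical names used only to show the branching, and the returned strings stand in for the two code paths.

```php
<?php

/**
 * Suspends only when running inside a fiber.
 *
 * @return bool
 *   TRUE if the current fiber was suspended and later resumed, FALSE if
 *   there was no fiber to suspend.
 */
function suspendIfInFiber(): bool {
  if (Fiber::getCurrent() === null) {
    return false;
  }
  Fiber::suspend();
  return true;
}

function executeListingQuery(): string {
  // Outside a fiber nothing changes: no suspension, blocking code path.
  return suspendIfInFiber() ? 'async path' : 'blocking path';
}

// Called normally, for example from code that knows nothing about fibers.
$outside = executeListingQuery();

// Called from a fiber, for example from placeholder rendering.
$fiber = new Fiber('executeListingQuery');
$fiber->start();
$fiber->resume();
$inside = $fiber->getReturn();
```

Callers that never create a fiber get the same behaviour as today, which is what would make the change safe to roll out incrementally.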

There are going to be some steps in the middle, like dealing with

$result->setFetchMode(\PDO::FETCH_CLASS, 'Drupal\views\ResultRow');

in views query execution.

We also need a way to detect if the driver supports async queries and an API to execute them.

The view SQL query ::getConnection() method supports both replicas and a base_database setting on the view, this would allow sites to set up a separate async connection for mysqli leaving all other queries using the PDO driver. Would mean configuring each individual view as to whether it should use the async connection or not though. Or a site could switch to the mysqli driver for everything then it would just come down to whether we're executing inside a fiber or not.

Solr also supports async, so we could take a similar approach in search_api contrib too.

fgm’s picture

@andypost indeed, that's why I mentioned concurrency, not parallelism (cf. https://go.dev/blog/waza-talk). That's also mostly what Node.js does, and it still usually provides a nice performance boost when multiple I/O related tasks are concurrent, like here (DB + rendering).

catch’s picture

Title: [PHP 8.1] Fibers » [meta] Use Fibers for concurrency in core
Issue summary: View changes

Re-titling now we actually require PHP 8.1

In Slack @mglaman mentioned possibly using Fibers for post-response tasks. We'd need to find post-response things where we can actually use non-blocking APIs, but if we can that'd be a good idea.

I think Big Pipe + Views + mysql async is likely to be the most dramatic improvement here and also ironically simpler than some other options, since it's quite centralised, so have updated the issue summary with an outline of how it seems to be coming together.

catch’s picture

Issue summary: View changes
mondrake’s picture

How about adding a ‘Fibers’ tag to group together all Fibers related issues?

catch’s picture

Issue tags: +fibers

Updated #3257725-12: Add a cache prewarm API with a more concrete proposal - no patch or anything but would be good to get more eyes on the idea. Let's add that tag yeah.

andypost’s picture

FYI, the latest Xdebug release includes flamegraph support, and it uses fibers in the tests for it: https://github.com/xdebug/xdebug/commit/41ce86ab55979f386c9e612bd377b6ed...

Version: 10.0.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 11.x-dev » main

Drupal core is now using the main branch as the primary development branch. New developments and disruptive changes should now be targeted to the main branch.

Read more in the announcement.