Following up on my idea from #2908394-37: Use Composer to build sites without forcing users to learn Composer, I thought it would be interesting to find out if what I proposed there would actually work.

An hour or so later.. turns out that it does work perfectly! :)

Comment  File                           Size      Author
#2       ship-it.png                    14.76 KB  amateescu
#2       php-libraries-in-a-phar.patch  8.46 MB   amateescu

Comments

amateescu created an issue. See original summary.

amateescu’s picture

Here's what I did. First, let's start with the basics: create a new folder and add some PHP libraries through Composer:

mkdir test
cd test
composer require commerceguys/addressing:1.0.0-beta3
composer require commerceguys/intl:0.7.4
composer dump-autoload -a --no-dev

You will end up with the following structure:

$ ls
composer.json  composer.lock  vendor

Now, I used the box utility to generate our Phar archive, so we will also need two more files. One is box.json where we put the configuration needed by box to generate the Phar, and the other is phar-bootstrap.php, which is a Phar file stub that will be executed at inclusion of the Phar, which makes it perfect for handling the autoloading of the files in the archive.

Their content is this:

box.json

{
    "files": [
        "composer.json",
        "composer.lock"
    ],
    "directories": ["vendor"],
    "output": "libraries.phar",
    "stub": "phar-bootstrap.php"
}

phar-bootstrap.php

<?php
Phar::mapPhar();

$basePath = 'phar://' . __FILE__ . '/';
require $basePath . 'vendor/autoload.php';

__HALT_COMPILER();

The contents of the test folder should look like this now:

$ ls
box.json  composer.json  composer.lock  phar-bootstrap.php  vendor

Now that everything is properly set up, we just need to run $ box build, which will generate a libraries.phar file, as instructed in the box.json configuration file mentioned above.

The easiest way to test that our archive behaves like we want it to is to create a simple index.php file with the following code:

<?php

require 'libraries.phar';

var_export(class_exists('CommerceGuys\Addressing\Address'));

Running it will give this awesome output:

$ php index.php 
true            

Now, we can go one step further and check that it also works in the context of a Drupal installation, so I attached a patch that does just that. After applying it on an 8.5.x codebase, I'm very happy to share this screenshot of /devel/php :D

This means that you can now install the Address module either from the tarball or through drush without any mucking around with composer or Ludwig ;)

ressa’s picture

This looks amazing, thanks for testing your idea out and proving that it works! Would this service be very RAM/CPU-demanding for drupal.org? I have heard that the Composer process can require close to 2 GB of RAM, but perhaps that's not the case here?

dsnopek’s picture

This is super cool!

I'd love to see a contrib version of this working before embarking down the path of getting it in core (since that path can take a long, long time to walk).

bojanz’s picture

@dsnopek
Compared to Ludwig, this would be much harder to do in contrib.
I received pushback for asking maintainers to commit a ludwig.json that I wrote for them, then do some manual testing.
Requiring them to create and host a phar would be science fiction.
Also note that this approach currently requires modifying a core file to add the include line, because there is no place a module can hook into that is early enough (this caused huge problems for Composer Manager back in the day, and was only recently worked around in Ludwig).

That said, I am very interested in this concept.
@amateescu
We should figure out whether there is a memory cost to this approach. How does autoloading work? Does requiring the phar load every file into memory, or is it still on-demand?

dsnopek’s picture

@bojanz I think the idea is that there's an external service that generates the phar based on the composer.json files of the modules in the Drupal codebase, and the Drupal site builder then downloads the phar and puts it in the right place on their site (or, if they have insecure file permissions, it could be written automatically by a Drupal module). I think you are right that this would require patching core to load the phar, though. But I don't think this would require module maintainers to do anything additional if they have a working composer.json.

I think what would be required to do this in contrib is:

  • Code for the service which can generate the phars. This could be a small Symfony app, maybe?
  • A Drupal module to generate links to that service when composer stuff needs to be installed
  • A core patch to load the phar

That's still nontrivial. :-) But it would allow folks to test and iterate on it before proposing something like this for core inclusion.

amateescu’s picture

@bojanz:

We should figure out whether there is a memory cost to this approach. How does autoloading work? Does requiring the phar load every file into memory, or is it still on-demand?

We don't load every file into memory, it is still on-demand. That's kinda awesome :)

As for memory usage, I measured it with:

echo memory_get_usage() . PHP_EOL; # 361264
require 'libraries.phar';
echo memory_get_usage() . PHP_EOL; # 1279792

So the memory usage of just including the Phar is 918,528 bytes (about 897 KB), less than 1 MB. I suspect that's the size of the autoload classmap generated by Composer. I also did the same test earlier today with core's composer.json, and the memory usage of requiring that Phar was somewhere around 1.6 MB, probably because of the larger classmap.
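As a quick check of the two memory_get_usage() readings quoted above, the difference works out to 918,528 bytes, which is 897 KiB. The values are copied from the measurement; nothing else is assumed:

```php
<?php
// Readings copied from the measurement above.
$before = 361264;   // memory_get_usage() before requiring libraries.phar
$after  = 1279792;  // memory_get_usage() after requiring libraries.phar

$delta = $after - $before;

// Report the cost of the include in bytes and in KiB (1 KiB = 1024 bytes).
printf("%d bytes = %.1f KiB\n", $delta, $delta / 1024);
```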

@ressa:

This looks amazing, thanks for testing your idea out and proving that it works! Would this service be very RAM/CPU-demanding for drupal.org? I have heard that the Composer process can require close to 2 GB of RAM, but perhaps that's not the case here?

This process will be resource-intensive indeed, even on drupal.org servers, but I assume we can build a cache around the submitted composer.json files, probably by using a hash of the file contents, so most of the time users will simply get a pre-built Phar :)
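A minimal sketch of how such a cache key could be derived, assuming PHP 7.3 or later. The function names phar_cache_key() and sort_keys_recursive() are hypothetical, and normalizing the JSON before hashing is an added assumption (so that whitespace and key order do not produce different keys); hashing the raw file contents would also match the idea described above.

```php
<?php
/**
 * Hypothetical helper: sort the keys of JSON objects recursively.
 * Lists (sequentially indexed arrays) keep their order, since it can matter.
 */
function sort_keys_recursive(array &$data): void {
    foreach ($data as &$value) {
        if (is_array($value)) {
            sort_keys_recursive($value);
        }
    }
    unset($value);
    if (array_keys($data) !== range(0, count($data) - 1)) {
        ksort($data);
    }
}

/**
 * Hypothetical helper: derive a cache key from the contents of a composer.json.
 * Throws JsonException if the input is not valid JSON.
 */
function phar_cache_key(string $composerJson): string {
    $data = json_decode($composerJson, true, 512, JSON_THROW_ON_ERROR);
    sort_keys_recursive($data);
    return hash('sha256', json_encode($data, JSON_UNESCAPED_SLASHES));
}
```

A service could then look for a stored archive named after phar_cache_key($contents) and only run Composer and the Phar build when none exists. Note that decoding to arrays treats an empty JSON object and an empty list as the same value, which is acceptable for a cache key in this sketch.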

@dsnopek:

You are spot on with the steps needed for moving forward with this concept :)

amateescu’s picture

Code for the service which can generate the phars. This could be a small Symfony app, maybe?

It would be great if this code could be easily wrapped in a Symfony Console command, so Drush and Drupal Console can just use it as-is without the need to duplicate all the effort.

andypost’s picture

Any idea how Phar works with OPcache? There should be at least some overhead from decompressing classes.

amateescu’s picture

Any idea how Phar works with OPcache? There should be at least some overhead from decompressing classes.

This user note from the manual says that phar:// is supported by OPcache, so we should be fine in this regard.

Edit: this is also confirmed by @ircmaxell: https://stackoverflow.com/questions/29023056/can-php-5-5-5-load-phar-fil...

wim leers’s picture

👏 Can't wait to see where this goes!

bojanz’s picture

@amateescu
Thank you for clarifying, that is great news!

The main problem to overcome is the problem of packaging dependencies that already ship with core. You accidentally already tested this use case, Address pulled in doctrine/collections which is already in core.
Another good example is payment gateway SDKs, they love Guzzle, but it's not practical to ship a Guzzle in each gateway PHAR when it's already included in core. It gets even worse with some packages pulling in half of Symfony, which would probably break core if included.
Ideally we'd compare the list of module packages with the ones in core, and strip what's already included (or fail if the constraints aren't compatible).
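The comparison described above could be sketched roughly as follows. This is a simplification under stated assumptions: it compares the locked versions in two decoded composer.lock files rather than evaluating version constraints (a fuller version would probably use a constraint library such as composer/semver), and the function names locked_versions() and partition_packages() are hypothetical.

```php
<?php
/**
 * Hypothetical helper: map package name => locked version from the
 * "packages" section of a decoded composer.lock.
 */
function locked_versions(array $lock): array {
    $versions = [];
    foreach ($lock['packages'] ?? [] as $package) {
        $versions[$package['name']] = $package['version'];
    }
    return $versions;
}

/**
 * Split a module's locked packages into three groups:
 *  - bundle:           not in core, must ship in the module's Phar
 *  - provided_by_core: same version already in core, can be stripped
 *  - conflicts:        in core at a different version, needs a decision
 */
function partition_packages(array $coreLock, array $moduleLock): array {
    $core = locked_versions($coreLock);
    $result = ['bundle' => [], 'provided_by_core' => [], 'conflicts' => []];
    foreach (locked_versions($moduleLock) as $name => $version) {
        if (!isset($core[$name])) {
            $result['bundle'][$name] = $version;
        } elseif ($core[$name] === $version) {
            $result['provided_by_core'][$name] = $version;
        } else {
            $result['conflicts'][$name] = ['core' => $core[$name], 'module' => $version];
        }
    }
    return $result;
}
```

Both arguments are expected to be the result of json_decode(file_get_contents('composer.lock'), true). A build would only package the "bundle" group and would fail, or ask for a decision, whenever "conflicts" is not empty.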

bojanz’s picture

There will be a Composer meeting at DrupalCon Vienna, Tuesday 5PM at "The Brasserie". This will be one of the topics discussed.

catch’s picture

This looks really interesting. We still have the update manager/authorize.php code in core for securely writing to the file system from the UI, it's under-maintained to say the least but it's there.

mile23’s picture

This process will be resource-intensive indeed, even on drupal.org servers, but I assume we can build a cache around the submitted composer.json files, probably by using a hash of the file contents, so most of the time users will simply get a pre-built Phar :)

So hang on.... What's the scope here?

Are we saying that we'll download a tarball of a module, and then Drupal will say, "woops... needs a phar file!" and request one for the module?

Because that's a huge amount of complexity for a process that stands a good chance of breaking in a lot of different places.

If we're going to have d.o type 'composer install' for us, then we have to include core, so that it can tell us all the useful stuff that composer would otherwise. Because if we don't, then this is a tool that only the Composer haters will use, it won't work anyway, and no one will maintain it, which is how you end up with:

We still have the update manager/authorize.php code in core for securely writing to the file system from the UI, it's under-maintained to say the least but it's there.

So that means we should do the following:

  • Determine the composer.json file used to build this site. Which is generally impossible so no need to read further... :-) (drupal/drupal might be it, but we don't know. Commands like COMPOSER=../../../../../../composer.json composer install are allowed.)
  • Figure out if we need to merge other files. So basically our Drupal site will re-implement wikimedia/composer-merge-plugin so we can get drupal/core's composer.json requirements, which will break when drupal/drupal eventually no longer does merges.
  • Merge in composer.json files from extensions.
  • Merge in platform specifications so we'll learn about PHP version problems if they exist.
  • Turn that into one big composer.json file to submit to d.o.
  • d.o performs the install and generates the phar file.
  • If there's an error, we have to figure out how to report that back to the user.
  • If there's no error, the site downloads a 50 MB phar file and places it in a web-accessible directory.
  • Site deletes its existing vendor/ directory because we just made it obsolete.
  • Site uses autoloader from phar file.

That's a lot of complexity. Is it more complex than figuring out why people hate Composer and solving that instead?

amateescu’s picture

I'm not sure I want to reply to the negative tone in #15 so I'll just say this:

Yes, I think all this "complexity" is worth doing because it's something that we can do for people who can't (or won't) use Composer themselves. This comment from @webchick sums it up pretty well: #2477789-7: Use composer to build sites.

mile23’s picture

I don't mean to have a negative tone, in #15 or elsewhere.

There are something like a million different issues with slightly similar goals, and none of them really define what's actually needed. That's why I started #2908394: Use Composer to build sites without forcing users to learn Composer, so we could get a definition of the problem.

It might be that building a phar file on d.o is the best solution to the problem once it's defined. I don't know.

Regarding @webchick's comment #7 in that issue: scroll down to #13 and start reading: #2477789-13: Use composer to build sites

moshe weitzman’s picture

Wow, this is impressive. The work of a real mad scientist.

However, I'm not sure many people would use it. Presumably the target audience is folks who won't learn to use Composer. Are those folks really going to deploy phar files into their site? That setup gets very hard to debug and audit. Editing vendor code becomes impossible. Patching vendor code becomes impossible.

My first impression is that this cure is as bad as the disease (Composer aversion). Of course this is still an experiment - perhaps my reservations will be addressed or become unimportant.

greg.1.anderson’s picture

While this technique looks like it could potentially evolve into something useful, please be aware that what you have built here is a system with two autoloaders, which can be unstable. It is unclear to me whether your long-term goal is to put all of a site's modules into a single phar, or if you plan on having one phar per module. With the former, it will be difficult to reuse cached builds as you suggest in #7; the latter would produce a large number of autoload files, one per module.

Please see the trouble with two autoloaders for an explanation. When building a system with multiple autoload files, it is possible to run into problems if you mix two different versions of the same library, even if the versions are semver-compatible. If, in an experiment, you build two autoload files on the same day, and they both contain the most recent versions of all of the libraries, then the resulting system is very likely to work. If, however, you attempt to roll out the same plan on a larger scale, and employ caching, it becomes more likely that you will end up with duplicate mismatched dependencies.

This process could be improved a lot if you wrote a tool to scan through the composer.lock file of Drupal, and generate appropriate conflict entries for all of Drupal's dependencies, excluding everything newer and older than what is in Drupal. Then, if you built all of your module phar files against that lock file, you should never experience a problem with dependency mismatches between a module and Drupal core. You would also need a separate cache of .phar files for each minor release of Drupal. However, you still would run the risk that two separate modules might conflict with each other, if each had a slightly different version of a dependency not provided by Drupal.
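A rough sketch of the kind of tool described above, assuming the input is a decoded composer.lock. The function name conflict_entries() is hypothetical, and skipping dev versions is an added assumption, since branch names have no "newer" or "older" to exclude.

```php
<?php
/**
 * Hypothetical helper: build a composer.json "conflict" section that rules
 * out every version of each locked package except the one in the lock file.
 */
function conflict_entries(array $lock): array {
    $conflicts = [];
    foreach ($lock['packages'] ?? [] as $package) {
        $version = $package['version'];
        // Skip branch-style versions such as "dev-master" or "1.x-dev".
        if (strpos($version, 'dev-') === 0 || substr($version, -4) === '-dev') {
            continue;
        }
        // Drop a leading "v" (as in "v3.2.8") so the constraint uses a plain number.
        $version = preg_replace('/^v(?=\d)/', '', $version);
        // Conflict with anything older or newer than the locked version.
        $conflicts[$package['name']] = "<$version || >$version";
    }
    ksort($conflicts);
    return $conflicts;
}
```

The returned array could be written into the "conflict" key of the composer.json used to build each module's phar, so that Composer refuses to resolve any of core's dependencies to a different version.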

Respectfully, I think you should consider the points made in #15 again. The issue is not only complexity; there is also the question of whether the system will work at all. Composer Manager did manage to correctly merge together all of the different composer.json files and build a single, self-consistent autoload file, so the task is not impossible. However, that project is now deprecated by its maintainers, again due to complexity.

Before further progress can be made here, a strategy for maintaining consistent versions for dependencies must be designed. If something presents itself, then the question of complexity and maintainability must be addressed. I'd really like it if we could come up with a good system that behaved functionally equivalently to drush pm-download, but at the moment I remain skeptical about how feasible that goal is. I will help out if I can, though.

greg.1.anderson’s picture

Although, I just had an idea. Possibly this might work if Drupal immediately loaded the autoload files from all of the phar-based modules. You would have to make sure to use the same technique Drupal core uses to maintain the order of the autoloaders. Maybe keep Drupal's on top. I am using this sort of technique successfully in a Drush PR. See: https://github.com/drush-ops/drush/blob/0ab8d0c5ff9db30231c4e0d808a04307...

The important part is that you'd have to make sure that all of the autoload files were installed before you ran any code from any module. If you could guarantee that, then this technique might work with the easier-to-manage one-phar-per-module strategy.

However, complexity is still an issue. If equivalent efforts were made to optimize `composer update`, and it became easier to run Composer operations from the Drupal admin interface, then we might be better off. Just food for thought.

I might suggest that for your next trick, you build two phar modules and deliberately put different patch versions of the same library in each one. If you can then guarantee that files for that library will only be loaded from one of the two phars, then you will be getting somewhere.
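One way to check the outcome of such an experiment is to group the loaded classes of a library by the archive they came from. The sketch below works on a plain map of class name to file path; the helper names archive_of() and archives_for_namespace() are hypothetical, and it assumes that files loaded from an archive are reported with a phar:// prefix and that the archive name ends in ".phar".

```php
<?php
/**
 * Hypothetical helper: return the archive part of a phar:// path,
 * or null when the file is not inside a Phar.
 */
function archive_of(string $file): ?string {
    return preg_match('#^phar://(.+?\.phar)(?:/|$)#', $file, $m) ? $m[1] : null;
}

/**
 * Group the classes under a namespace prefix by the archive that provided them.
 * $classFiles maps class name => file path.
 */
function archives_for_namespace(array $classFiles, string $prefix): array {
    $archives = [];
    foreach ($classFiles as $class => $file) {
        if (strpos($class, $prefix) !== 0) {
            continue;
        }
        $archive = archive_of($file) ?? '(not in a phar)';
        $archives[$archive][] = $class;
    }
    return $archives;
}
```

The map could be built at the end of a request from get_declared_classes() and ReflectionClass::getFileName(). If the result for a library's namespace has more than one key, its classes were loaded from more than one place.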

fgm’s picture

Seeing this makes me wonder: could it not be the first step towards a "compile" of a whole site, which could then be delivered as a single phar called from the front controller? This is only tangentially related to the need for some (presumably entry-level) sites not to use a Composer CLI, but it looks like something which could align with the new out-of-the-box experience for 8.5, as it means distributions could come as just a phar, probably an index.php, and assets.

alison’s picture

Very interesting, impressive work, thank you @amateescu!

Expresses interest in seeing what happens next...

groovedork’s picture

Would this technique make it harder to do manipulations to code? As a privacy expert I have a tendency to modify modules that reach out to third parties.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.0-alpha1 will be released the week of January 17, 2018, which means new developments and disruptive changes should now be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

sinasalek’s picture

Although bundling Drupal as a phar file is a great idea, especially for security, simplicity and portability, I think it has serious limitations as well.
What about UI files like images, JS and CSS? Many modules contain those.
And what about UI dependencies? Consider the increasing popularity of asset-packagist.org; many projects are going to use it for downloading all their dependencies. Phar alone certainly can't handle that.
Yes, it is possible to serve UI files via PHP, but it's slow and in some cases complicated. For example, it can create duplication. And if a separate phar is generated for each Composer config, there will be a huge number of variations, which requires a considerable amount of disk space.

The question is: if the end user is going to update the site through the UI, what difference does it make whether they use phar, Ludwig, or a Composer UI in Drupal?
#2538090: Allow the Update Manager to automatically resolve Composer dependencies

Version: 8.6.x-dev » 8.7.x-dev

Drupal 8.6.0-alpha1 will be released the week of July 16, 2018, which means new developments and disruptive changes should now be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

andypost’s picture

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

andypost’s picture

Version: 8.9.x-dev » 9.1.x-dev

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

nod_’s picture

I'm with fgm in #21: I wish we could have a drupal.phar file that's easy to build and deploy, either for distributions or custom projects. I know the problems of autoloaders and UI assets and all that still apply, but that's something I'd like to see too.

wim leers’s picture

+1!

Yes it is possible to serve ui files via php but it's slow and in some cases complicated. for example it can create duplication.

I know the problems of […] ui assets and all that still applies but that's something I'd like to see too.

That can easily be solved by having those files served by a CDN or reverse proxy :) CDNs are a commodity nowadays. Heck, even cheap shared hosting has a reverse proxy built in for free nowadays.

nod_’s picture

With aggregation they would be written in a folder. Doesn't help for images but at least all the render blocking things would be fine.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

nod_’s picture

FrankenPHP has a way of packaging apps, so we'd have PHP, the webserver, and Drupal in a single executable: https://frankenphp.dev/docs/embed/

Version: 11.x-dev » main

Drupal core is now using the main branch as the primary development branch. New developments and disruptive changes should now be targeted to the main branch.

Read more in the announcement.

andypost’s picture