Convert routes to CMI [#1840914]

Problem/Motivation

Dries and catch both suggested in #1801570: DX: Replace hook_route_info() with YAML files and RouteBuildEvent that routes should be CMI configs instead of the current YAML implementation.

Proposed resolution

Convert route system to use CMI. This should be as simple as modifying Drupal\Core\Routing\RouteBuilder to use CMI instead of manually parsing the YAML files.

Remaining tasks

Submit a patch that converts route system to use CMI.
Convert any existing routes to use CMI

List of Existing Routes (please add any that get changed/committed before this patch gets committed)

router_test

User interface changes

None

API changes

None

Comments

Comment #2

donquixote CreditAttribution: donquixote commented 14 November 2012 at 22:13

Should the "dynamic routes" also be config? Or only the yml ones?

(setting my reservations against config aside for one moment)

Comment #3

catch

he/him

English

CreditAttribution: catch commented 14 November 2012 at 22:23

OK i promised to write up my concerns with the original patch (now status quo), now here they are:

1. Router items are the only place in core we expect manual creation of YAML files, and more or less every Drupal developer will be creating them so it's an in-your-face requirement.

2. There is absolutely nothing in common between the YAML files and the event listener in terms of how routes are defined and altered. This is a massive departure from info hook arrays where at least you're dealing with the same thing.

3. The only pattern I can think of where we have manual file creation + alter hook is .info and hook_system_info_alter(), however hook_system_info_alter() is an emergency/novelty hook that ideally wouldn't exist, and is rarely used.

4. We already have a pattern for static/default items in YAML files with the ability to dynamically create other objects with the same structure, this is CMI and configuration entities.

5. menu_rebuild() is an extremely expensive operation, for example views currently does this to register router items on behalf of views - loads all the views in hook_menu_alter() and asks each display if it provides menu items.

function views_menu_alter(&$callbacks) {
  $our_paths = array();
  $views = views_get_applicable_views('uses_hook_menu');
  foreach ($views as $data) {

If routes were config, then when saving a view, you could do a config()->save() and that'd be the end of it. You'd need to compile the router again but at least any router item change doesn't require loading every single view then.

There have been two primary objections to routes as config:

a. They're not UI editable. We've already got lots of configuration that's no UI-editable, that doesn't make it any less configuration. PHP memory_limit and innodb_buffer_size aren't UI-editable either.

b. What happens when a module updates its defaults? This needs to be solved for CMI anyway - for views, image styles and other things where defaults would previously have auto-updated. There are existing models for this kind of issue - for example configuration files on Ubuntu - if the default changes during a version upgrade, it alerts you, lets you overwrite, keep yours, diff them etc. If we solve that for CMI then it's solved for routes as well.

Comment #4

donquixote CreditAttribution: donquixote commented 14 November 2012 at 22:55

Point 1: Future of info hooks.
=====================
In that other issue's summary, it was stated that info hooks are considered "legacy". This is quite a strong statement, and I don't really feel like I agree with it.
The controversy in the other issue was about more than just routes. It was about the future of info hooks.

Even if CMI might be the right choices for routes (which I doubt), I find it uncomfortable to be in a situation where CMI is the only accepted alternative to a once popular but now "legacy" pattern.

I could go on with this, but we might want to move into a separate issue "Future of info hooks".

Comment #5

donquixote CreditAttribution: donquixote commented 14 November 2012 at 23:14

Point 2: Realworld Benefits of CMI?
=========================

5. menu_rebuild() is an extremely expensive operation, for example views currently does this to register router items on behalf of views - loads all the views in hook_menu_alter() and asks each display if it provides menu items.

Views could very well (and should) fix that by itself, it doesn't need CMI for that.

What happens when a module updates its defaults? This needs to be solved for CMI anyway

There are existing models for this kind of issue - for example configuration files on Ubuntu - if the default changes during a version upgrade, it alerts you, lets you overwrite, keep yours, diff them etc.

Which is everything but convenient.
To be optimistic, let's say we find a solution that is 90% automatic, and 10% babysitting.
This can be ok for systems where CMI does bring an actual benefit, or is actually needed. But there is always a trade-off.
For routes (or rather, D7 hook_menu() items), I never felt the desire to interact with those things in a config way.

Comment #6

donquixote CreditAttribution: donquixote commented 14 November 2012 at 23:29

Point 3: State / History / Reproducibility
==========================
or: info hooks vs crud / hook_update_N()

In the other issue, we (or mostly sun/catch) had the argument whether a user could mess with the routes in config.
I agree with sun there, though I would extend "user" to developer, site builder, etc, and beyond that, any process (migration, update, etc) that can modify the configuration.

The heart of the problem is that with config we introduce a state that has a "history", or, "its own life".

-----

Anything we build from info and alter hooks can be deleted or messed around with as much as we want, any rebuild will return it to the exact state it was before. If a bug in a module caused some weird router items, we just fix that bug, rebuild, and are done. We don't need to care which modules or bugs have existed in the past, what matters is only the current code.

The typical development workflow with info hooks is
code - rebuild - see/test - code - rebuild - see/test

-----

With config, suddenly all we do depends on the previous state of the system.
A new module that works ok on site A, might break on site B, because site B has a different "history", and thus a different configuration to start with.

To make your module reliable, your workflow would need to
- code
- put the system into the state you expect to find before the module is installed / updated, your view saved, etc
- run your operation
- test the result
- code
- put your system back into the previous state
- run your operation
- test the result
- put your system back into another one of the states that you expect to find before the operation
- etc

Example:
- _menu_router_build() (D7) is pretty straightforward. We can just wipe the previous state, and start from scratch.
- _menu_navigation_links_rebuild() is hairy, because we need to respect the changes the user has made. And this is still easy, because we have the "customized" bit that tells us which links are modified and which are not.

Comment #7

Berdir

German

Switzerland

CreditAttribution: Berdir commented 15 November 2012 at 00:19

My opinion :)

- It's the first case of standalone yaml files but there is no reason that it stays like this. #1793074: Convert .info files to YAML is the most obvious example.

- Config is AFAIK designed to be idempotent. Meaning, a save/load should not change it through some kind of alter hook. Modules aren't supposed to mess with config of other modules, the only thing they do is eventually provide default configuration. If I want to provide a better /node default view than view.module then you don't alter it, you provide your own default config.

- Routes are not configurable IMHO. And I'm not talking about "UI-editable". All the config settings that we are adding to config but don't have a UI are still *meant* to be the default and can be changed *by design*. That's why they exist. If not, then we'd just hardcode that stuff :) What I mean is that site admins aren't supposed to get in those files and change them.

- Routes, however, are alterable. I think the difference is significant. alter is for other modules and allows them to change existing routes or completely replace them. Which is the reason views defines everything in hook_menu_alter(), so that you can replace an existing module's routes. I don't see how that would work if routes were config?

- Yes, multiple modules altering the same definition is complex and leads to problems. But it's sometimes totally valid and I don't see a better solution to this here? One of the biggest problems are replacing access callbacks, which are going to be a separate hook_node_access()-like events system anyway.

- Updating defaults: The current behavior of default exportables/features/things in 7.x contrib is that they replace as long as you are still on the default. A "do you want to revert your customized thing back to the new default" would be nice but is optional and can be done on top of the existing behavior. This makes totally sense for something like a default view, but for routes definitions? If node.module adds a new route on an update or even changed an existing one (different callback), then I *need* that route. Not having it will kill my site. Even if I altered something else.

- I agree that it's a problem that some routes are yaml and some aren't and are defined in *events*. catch's suggestion in IRC on having defaults and CRUD sounds tempting, especially for performance as it would essentially mean that router rebuild would just go away. But I'm *really* worried about DX of updating routes, in combination with keeping them alterable and then having to update altered routes again that are in a state that you don't know about.

EDIT: To provide some actual examples of the point above:

- What about you replacing /taxonomy/term/%term with a view and then taxonomy.modules renames the callback for that route and wants to update it?
- Or you remove the view again. there's no way we could go back to the original definition of /taxonomy/term/%term because you don't know if and how it has been altered and what it's supposed to be now. You can't unalter something if we can't just rebuild from scratch.

Comment #8

donquixote CreditAttribution: donquixote commented 15 November 2012 at 00:24

(@berdir: I think we basically agree)

talked about this on irc, and here is one example I find quite serious and disturbing.

A) System with info hook (hook_route_info, hook_menu, whatever):
1. you install a module with a little design flaw. (*)
2. the module alters the routes, so now you have some insecure access callbacks.
3. you uninstall the module
4. system restores itself to be safe again. Any trace of the module is wiped on rebuild.

B) System with CMI / Config / Crud
1. you install a module with a little design flaw. (*)
2. the module modifies the routes, so now you have some insecure access callbacks.
3. you uninstall the module
4. the insecure route is still there, because no module feels responsible for cleaning it up. And the user / developer / site builder doesn't know about it, because who seriously looks into the config.

(*) "module with a serious design flaw"
There are many examples how this could happen. It could even be a custom module, where the first version is quite poorly written. Or it could be something you install for a quick try, and then uninstall again.
It could even be that the operation by this module was safe at that time, but the route needs to be updated to remain safe - but the module did forget to remove it on uninstall, etc.

Point is, with (A) you are safe, with (B) you are not.
The Drupal security team won't help you either, because they don't know about the config on your site.

Comment #9

sun

German

Karlsruhe

CreditAttribution: sun commented 15 November 2012 at 00:41

Even though I can't fully make sense of what @Berdir said,

consider me +1 to whatever @Berdir says and will say.

I'm strongly biased, so I do not plan to comment any further on this proposal, unless strictly required. (I'm following.) @Berdir seems to have the necessary and right amount of architectural and human sense, so whatever he thinks is right is probably acceptable for me, too.

Comment #10

Crell CreditAttribution: Crell commented 15 November 2012 at 01:07

There's a subtle point with info hooks that I think needs to be made clear. *Most* info hooks actually turn into Plugins, not CMI. Info hooks have in the past served two subtly different purposes.

1) Declaring the existence of a thingie. In this case the info hook usually has a machine name, and some array of metadata. It doesn't say anything about how or where the thingie is used, just that it's there.

2) Wiring one thingie to another thingie. In this case, the info hook says that thingie X should (by default?) be used in place Y, or connected to thingie Z.

Those are two different tasks, yet we've been using the same tool for both, and sometimes at the same time without being consciously aware of the distinction.

Usage 1 is being replaced with Plugins, usually with annotations. You're declaring "behold, here is my text formatter! Use it wisely." That is not configuration. That the invisimail text formatter is available in the system is not configuration, it is simply an offered service. That the "Filtered HTML" text format makes use of that text formatter is configuration.

I would argue that "wiring thingies together" is configuration. The two thingies have an existence unto themselves. They can -- or perhaps should if done right -- be used independently of a given situation. Invismail's encoding logic can be used anywhere you can call a function with a string. (That's why it's also usable with email fields.)

The trick with menu/router items is that, by that distinction, page callbacks are a thingie that is provided, for which we have had no registration mechanism. They've not been explicitly declared to the system the way, say, text formatters have been. They've been implicitly declared by virtue of the file they live in being included. hook_menu() is a wiring tool to wire page callbacks to a given URL, not a discovery mechanism for the existence of a page callback service; the page callback itself, though, should be able to run at some other URL just fine as long as it gets the right parameters passed to it. (If not, then it's probably using arg() somewhere and that's a bug.)

(The irony is that hook_menu was, in a way, our first "info hook", yet it's the least info-hooky of them!)

Page callbacks are in the process of being replaced with controllers, which serve essentially the same purpose in the food chain. (Give or take _content sub-controllers.) That leaves the wiring question of connecting controllers to URLs. That is, fundamentally, a configuration question.

True, in the Drupal case they're module-provided configuration that is arguably less subject to user-manipulation than many other forms of configuration, but that's not a unique problem here. #1043198: Convert view modes to ConfigEntity is struggling with the idea that view modes really ought to be configuration, and make sense as configuration, and users treat them as configuration, but there's some that are hard coded in Drupal and if you delete those then things break quite badly. Image styles already face the same problem; it's just that deleting the thumbnail default style is sufficiently difficult that the problem rarely appears.

That's not to discount the potential complications that donquixote and Berdir have noted above. The concern about module updates is legitimate. However, hopefully this explains the "routes are config" mindset more fully, and the "it's the same problem elsewhere so we need to solve it anyway" response.

Comment #11

donquixote CreditAttribution: donquixote commented 15 November 2012 at 03:08

@Crell (#10),
your post makes a valid observation, and an interesting theoretical aspect.
I do not agree on some of the conclusions, though.

First, the distinction between (1) and (2) is not as clear in practice, as you make it.
Second, the step from "wiring things together" to "that's configuration" has an aspect I agree with, but that is not the same notion of "configuration" as we imply with CMI.
Third, I would argue that most of D7 info hooks, and especially the alter hooks, actually fall into the (2) category, not in (1) !

--------

Distinction of (1) and (2) (as in #10)

Ok, there are some things that are clearly within the (1) area.
If you implement each text formatter as a class (which you do in D8), and the machine key is the class name (which I don't know atm how that is in D8), then the annotations on those classes clearly fall into category (1). In this case, the annotation does not really do much.

In D7 this was a bit different: The info hook would register a formatter machine key, and associate it with a callback, a title, and a description (and maybe some more stuff). Other modules could go along and change this via an alter hook, and e.g. sneak in a different callback under the same identifier (and title etc). So here we already get into area (2).
So far this is all info + alter hooks. If any of that goes into the database, it can only be seen as cache. Because if you wipe it, it rebuilds itself to the same state as before.

The configuration aspect?
With those altered formatter info, D7 would now ask the user which formatter to use on which field / view mode. Or the same on views fields.
This mapping is then stored in the database - either in the field API settings, or in the views settings, or whatever other system wants to use formatters.

So, here is the entire process:
1) Modules provide the basic "this is who I am" information for their plugins.
2) Modules do some wiring-up and altering, e.g. associating identifiers, titles, descriptions and callbacks, and altering this information on each other. E.g. they could also provide two formatters which both use the same callback.
3) Modules provide very clearly defined configuration spaces, e.g. views fields config, or entity view mode config, where the user can choose from the (altered) plugins. Even at this step, the module that provides the configuration space, can still further constrain the available plugins.
4) Modules use that stored information, in combination with other data, to actually instantiate the plugins, and do with it what they need.

Only in step (3) do we actually save real "data" to the database. In (1) and (2), this was all just a "cache" of information that we could reproduce any time.
Also, the way we store this data in (3) is quite independent of the way those original plugins were provided. The views module decides how it wants its own field formatter info to be stored. The entity view mode config might do it in a different way.

So in fact we have
- some wiring up done on module / alter level, where only "cache" stuff goes to the database.
- some other wiring up on user / config level.

Usually we have a clear separation between those two. And wherever this separation does not exist, we either have a very clever mechanism to resolve possible conflicts, or we have a lot of suffering.

The essential thing is that any time we let the user configure something, we need to think very clearly about the constraints for this "configuration space". With any operation that allows the user (via UI) or anything else (migrate, import, export etc) to update this data, we need some babysitting to keep it all sane and legal.
And whenever we use this data, we need to consider all possible values the user (or the import, auto-generate etc) might have put into the system.

The price of configuration
=====================

So: Every "configuration space" comes at a price.
- The more freedom we allow
- the deeper it interferes with our system
- the more parties can mess with the information (where "parties" can be users or machine-controlled operations)
, the more painful it will be for us to maintain it.

Routes, if we put them all in CMI to be configured by any module that wants, will become a wide-open space, with all the pain we could ever wish for in our twisted imagination.

So what about config files in Symfony, or Java?
==================================

There are systems where the developer is supposed to provide or maintain a ton of config files, that define how the application works.
Instead of enabling or disabling modules, or visiting some UI admin page, the developer / site builder will just work on those config files (and custom code, of course).

Again, this has a price.
The developer has a lot of work to do on package updates etc.
Also, this guy is an experienced developer, who is used to this kind of work, and who is looking at those config files every day.
And hopefully those options are still kind of limited. E.g. the typical Symfony project probably has a lot less routes than the typical Drupal project.

Solution: Constrained config spaces
======================
Instead of making all routes configurable via one shared system, we could e.g. provide a system on top of that, which allows only to change the path, but not the callback, or even worse, the access settings.

Then when we store that, we only store the routes that are intentionally modified, e.g. where the user has ticked a checkbox "override this". Maybe with a note explaining why this modification is necessary.
This system can then use an alter hook implementation to apply the changes to the routes coming from modules.

This way it would be a lot easier for a site maintainer / developer to look into the config to see what has been modified, because this config file would be much smaller than if all routes would be config.

Comment #12

Crell CreditAttribution: Crell commented 15 November 2012 at 05:28

I think there's a subtle semantic distinction we're not on the same page about. I would not call an alter hook for a definition-info hook "wiring". It's altering the definition, not controlling where and how the thingie gets used. (And actually with easy sublcassing I'm not entirely sure we still need alters on plugin definitions in practice, but that's a separate discussion...)

hook_route_text_filter_alter() (if such a thing exists, I don't know) does not allow you in any way to decide from there which text formats make use of that text filter. It only lets you redefine what that text filter is, not where the system uses it. You're right that this is a distinction we've not always been good about making, which is not to our credit. (Just as we've used hooks for events, for definition/registration, visitor pattern, observer pattern, and a half dozen other things, because they were there.)

You're correct that in a pure framework like Symfony full stack the situation is different; it's not entirely a direct conceptual map. However, I still think that routes-as-config is valid even just within a Drupal context.

I don't think we can get away with restricting what a module is "allowed" to change about a route. I actually think changing the access control will be the most common task; OG will want to tack on its own "allow access if the user is in the right OG and has the right OG-relevant permission" checks, for instance, and well it should. If we don't allow that, we broke OG. :-)

Saving just "Overridden" routes, though, is interesting. I'm not sure how I feel about that at the moment. I'll have to sleep on it. Would the intent there be to replace the dynamic/alter setup now with a "static load, then CMI load, and if the CMI load blasts away a static definition due to the same machine name, that's by design"?

Comment #13

EclipseGc CreditAttribution: EclipseGc commented 15 November 2012 at 06:14

So, Crell asked me to come into this issue and discuss the plugin aspect of this for a moment. It took a lunch with Crell walking through different scenarios of "what is a plugin" before we got on the same page, so I'll attempt to do this conversation justice in light of that.

As Crell already mentioned, Plugins are a really great replacement for most info_hook style thingys today. We support altering plugin definitions (via very typical drupal_alter style calls), annotated classes, caching, and more. In addition to this we support a very extensible discovery process that could be made to read just about anything (info, yaml, database, whatever), and represent these things as plugins to the end user/developer.

In the case of routes, we have a few things to consider. The first is that the output that exists on a route, and the actual route itself are not intended to have a 1 to 1 relationship. i.e. no hard coding output to a particular page. Swapping out page callbacks in Drupal 7 and before today is ridiculously painful more often then not... worse still is reusing an existing page callback elsewhere. Given a url that can satisfy the needs of an output component (whatever that is) swapping one for another compatible component should be trivial in drupal 8 (if we all did our job right). This is "preaching to the choir" as I've been a heavy proponent of this for literally years and have been abstracting various portions of D7 to work that way in my own contrib modules. I bring this up because it is very easy to confuse "I put an annotation on a class" with "I hard coded a url to an output" and this is simply not the case. In a perfect world (and I realize that may not be realistic, but bear with me) all output components would be created equally. If you wanted to output a node create screen, you could do that with a single call to a simple component that really only needed to know what type of node. Or to take that even further, what entity type, and which bundle. Likewise, outputting a node view screen would be similarly trivial, as well as users online, help text, custom menus, etc. This is the ultimate goal of blocks as plugins, and building a robust enough platform upon which that can be done is the first step in explaining why I maintain that routes are indeed plugins.

Assuming for a moment some sort of simple interface for a "route plugins" that expects a response method or something, it's easy to imagine how one might extend route classes from blocks or inject block classes into a route, and then rely on that route's annotations in order to place the contents of that block on a particular route. Similarly, plugins supports providing a single physical class that can, for a given criteria, masquerade as many plugins. To equate it to something you might see in D7 typically, it's essentially a foreach loop in an info hook. This can allow a single plugin that might support say... any route that was configured through a user interface by the site builder. Whether THAT data is CMI or not is a completely separate discussion (ultimately plugins doesn't care, it's completely an implementation detail).

I've laid out my "perfect world" scenario, and it's not that far from reality today. Crell wants to move all html outputting page callbacks to some sort of class, I maintain that should be a block plugin. If we all agree on that, then my scenario becomes very real very quickly. This punts, to a certain degree on "is that configuration?" or not, but it also provides a system that drupal developers would be VERY used to, and rather than scattering things everywhere, all routes run through a centrally located plugin system. This means that developers are encouraged to:

Write your output in reusable chunks (blocks)
Use class annotations to register your blocks to routes.
Oh hey, if you like you could also export a route created through the UI into your module's /config directory, but really this is more like features, and less like programming.

This places a pretty clean delineation on WHEN we intend for one to be used over the other, and continues to flow the code in a way that is followable, and hopefully very familiar as we are converting as many things (as makes sense) to plugins, so the opportunity for consistency here is non-trivial.

Hopefully this makes sense of my position in all this, and how plugins might be helpful here. There's definitely a bit of work to do with regard to parameter upcasting, and passing those things through the plugin system, but I made a small proof-of-concept at BAD camp and got it working for non-dynamic pages relatively quickly.

Eclipse

Comment #14

donquixote CreditAttribution: donquixote commented 15 November 2012 at 16:42

@Crell (#12):

I don't think we can get away with restricting what a module is "allowed" to change about a route. I actually think changing the access control will be the most common task; OG will want to tack on its own "allow access if the user is in the right OG and has the right OG-relevant permission" checks, for instance, and well it should. If we don't allow that, we broke OG. :-)

Totally not what I intend.
What I say is, we allow modules to change that, but via an alter hook / event, not via "write to CMI".
As long as it is only modules which do that, an alter hook / event is clearly preferable, because it is a lot easier to reproduce and test.
Whenever you "write to CMI", you need to consider what was there before.
Whenever you "alter", you also need to know what other modules altered before you. But this is a lot more predictable than whatever might be leftover in config.

Saving just "Overridden" routes, though, is interesting. I'm not sure how I feel about that at the moment. I'll have to sleep on it. Would the intent there be to replace the dynamic/alter setup now with a "static load, then CMI load, and if the CMI load blasts away a static definition due to the same machine name, that's by design"?

The CMI could just be one more alter implementation.

Comment #15

donquixote CreditAttribution: donquixote commented 15 November 2012 at 16:48

Btw, I consider the current yml + alter + route build event as conceptually more or less equivalent with info hook + alter hook.
The issues with that are more about DX and aesthetics, I think. One reason is that currently this is the only place where we have this combination. Otherwise, yaml always "smells like CMI".

I personally would prefer an "array of doom" info event + alter event. But the difference is just aesthetics, it is conceptually equivalent to the yml + route info + alter.

Comment #16

donquixote CreditAttribution: donquixote commented 15 November 2012 at 18:20

@EclipseGc (#13)

Swapping out page callbacks in Drupal 7 and before today is ridiculously painful more often then not... worse still is reusing an existing page callback elsewhere.

Actually I didn't find it so difficult.
Write a custom module, hook_menu_alter(), done. Good enough for the occasional case where I need to do that.
(and even if I had the option to change that in a "config", I would prefer to do it in a module for reproducibility)

What is more difficult in D7 is changing the url of something else.
That alone would be ok, but that url will be hardcoded in so many places that are unreachable from my custom module (ok i never tried hook_url_outbound_alter()). In this regard, the move to machine keys for routes seems like a useful idea.

Comment #17

donquixote CreditAttribution: donquixote commented 15 November 2012 at 18:38

all output components would be created equally. If you wanted to output a node create screen, you could do that with a single call to a simple component that really only needed to know what type of node. Or to take that even further, what entity type, and which bundle. Likewise, outputting a node view screen would be similarly trivial

Any of these has a "signature", that is, some requirements on the expected parameters.
So, they are not exactly equal.
In theory, we already have this system in D7: Every page callback expects specific parameters, and if you provide those, it should just work, no matter from where you call it. Unfortunately, a lot of page callbacks in D7 contain stuff that break this concept. E.g. hardcoded arg(), or drupal_json().
DIC and page controller classes should make this easier, but D7 page callbacks basically attempt to do the same.

Treating blocks (partial page output) the same way is indeed new in D8. But this has very little to do with our new routing system. We could easily have provided a "block/%block" route with our good-old hook_menu().

Again: The new "machine key per route" does make a lot of this easier. But this can still be managed with info + alter, and does not require CMI.