Problem/Motivation

YAML discovery poses a problem for translations. Currently, YAML files that contain translatable strings (names, labels, etc.) are either translatable via config schemas (where the schema defines item types and which ones are translatable) or are handled by hardcoded parsing logic (such as routing.yml, where we get the title; info.yml, where we get descriptions; or the tabs YAML files, where we'll get tab labels). Once we let YAML structures loose outside of the config system without a set structure for them (and here we don't have one), we cannot extract translatable strings from them for the community. We may be able to apply config schemas to YAML files outside of the config directory, but that would be an exception, and we would still need to define how to identify those files so we don't run into other YAML files :/ It quickly gets confusing as to which files config schema would apply to :/

Proposed resolution

Figure this out.

Remaining tasks

Figure this out.

User interface changes

None.

API changes

Likely.

#2078405: Revert YamlDiscovery if it doesn't make sense as first-class discovery and #2065571: Add YAML Plugin discovery

Comments

Gábor Hojtsy’s picture

#2065571-44: Add YAML Plugin discovery @pwolanin:

Well, we do have a set structure for each plugin, just not as a general feature of this type of discovery.

In another issue we are talking about improving plugin documentation by documenting a defaultConfiguration() method that would list the possible keys. Is that a place we could e.g. have a doc annotation and maybe something that tells you what yaml file pattern to look for?

You are suggesting parsing a doc annotation to get a structure to parse another YML file? Yet one more way to define/explain a YML structure?!?

pwolanin’s picture

I'm just brainstorming since in general the plugins are not well documented yet.

So - what are some other possible approaches?

Should we put the string "plugin" into the YAML file name? e.g. node.local_tasks.yml could be node.local_task.plugins.yml. If we did that could we assume e.g. any "title" or "description" key has a translatable value?

Is there some notation we could use in the YAML file itself that would work like @Translation() ?
e.g.:

mymodule_list_tab:
  route_name: mymodule.list
  title: 'List'
  tab_root_id: mymodule_list_tab

becomes:

mymodule_list_tab:
  route_name: mymodule.list
  title: '@Translation(List)'
  tab_root_id: mymodule_list_tab

(Or something) - and the YAML discovery could then know to strip the string and apply t()?

Gábor Hojtsy’s picture

Is it safe for all plugins to assume that 'title'/'label'/'description' are going to have values that need to be / should be translatable? And are there not going to be other keys?

So far we don't have direct specification of what is translatable in a .yml file. We looked at it, but it seemed counter-intuitive and wrong for DX. If you need to repeat in each .info.yml file that the description is translatable, it is painful if we can just easily codify that somewhere else. So far all .yml files outside of the config/* dirs had defined structures, so we knew the structure of all of them, so we can code PHP to extract the strings that we knew we need. So far only .yml files in config/* had dynamic data structures, and that is where we have the config schema system to explain the structure, which can be used to discover all translatable strings. So you only need to define the node type config structure for example once but all node type .yml files can then be parsed for translatable names, labels, default values, etc.

I think one option would be to somehow make the schema apply to these .yml files even though they are outside of config/* but that would be a glaring and confusing exception. I'm not sure making up yet another way to define what is translatable in .yml is a good idea :/

neclimdul’s picture

I don't know if "all" is safe. Dealing in absolutes is dangerous. Plugin definitions are free form but they'll be well formed to each plugin system so this doesn't seem like an unsolvable problem.

Just so I'm sure I understand this correctly, we're discussing discovering the strings that need to be translated so they can be listed in translation UIs and the like?

Gábor Hojtsy’s picture

Just so I'm sure I understand this correctly, we're discussing discovering the strings that need to be translated so they can be listed in translation UIs and the like?

Exactly!

longwave’s picture

Title/label/description may be safe in many cases, but a plugin type implementation may supply any number of further strings that must also be translated. So, if we must provide a method for any other number of strings to be translated, then we just add title/label/description to that set as needed - however that set is defined.

neclimdul’s picture

Seems like if we're dealing with discovery on a system designed for discovery, we should use that system rather than writing our own discovery. (insert some "yo dawgs" if you must)

Gábor Hojtsy’s picture

For discovering what?

pwolanin’s picture

@neclimdul - as far as I understand, this needs to be compatible with purely static analysis.

pwolanin’s picture

So, it looks like the YAML spec supports a solution to this in the form of TAGS
http://www.yaml.org/spec/1.2/spec.html
http://en.wikipedia.org/wiki/YAML

Basically YAML *should* support application specific tags where we do like:

my_local_task:
  title: !translation Hello world
  route_name: my_hello
  tab_root_id: my_local_task

In the minimal YAML schema, any such thing would just be parsed as a string. Any yaml file could then be parsed, at least via static analysis, to find those lines.
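
To illustrate the static-analysis point: a minimal sketch, in Python, of how an extractor might find tagged lines without a full YAML parser. The `!translation` tag name comes from the example above; the function name `extract_tagged_strings` and the regular expression are assumptions made for illustration only, and the pattern handles just single-line `key: !translation value` scalars.

```python
import re

# Illustrative only: matches single-line "key: !translation value" scalars.
# This is a line scanner, not a YAML parser.
TAG_PATTERN = re.compile(
    r"""^\s*(?:-\s+)?           # optional list-item dash
        (?P<key>[\w.]+)\s*:\s*  # mapping key
        !translation\s+         # the application-specific tag
        (?P<value>.+?)\s*$      # tagged scalar, up to the end of the line
    """,
    re.VERBOSE,
)


def extract_tagged_strings(yaml_text):
    """Return (line_number, key, string) for every line tagged !translation."""
    found = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        match = TAG_PATTERN.match(line)
        if match:
            value = match.group("value")
            # Strip one pair of matching surrounding quotes, if present.
            if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
                value = value[1:-1]
            found.append((number, match.group("key"), value))
    return found


SAMPLE = """\
my_local_task:
  title: !translation Hello world
  route_name: my_hello
  tab_root_id: my_local_task
"""

print(extract_tagged_strings(SAMPLE))
```

Because the scan never executes module code or loads dependencies, something like it could in principle run on a third-party extraction service; multi-line scalars and string context would need more than this sketch covers.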

Sadly, Symfony's YAML parser is deficient and includes !translation as part of the string. EclipseGC suggests we try to get a patch into the upstream component - this is arguably a bug since they should at least be ignored.

lsmith in #symfony-dev suggested any patch would need to be in good shape by Oct 1 to make it into 2.4

Gábor Hojtsy’s picture

I'd like to reiterate that IMHO having very different solutions for certain .yml files (routing.yml, info.yml, etc), annotations, config .yml files and then plugin YML files for translation sounds like a *HUGE* mess. It is already complex enough with t(), $this->t(), automated string discoveries and config schemas. It would be great to be able to map this to one of the existing solutions instead of making up yet one more parallel system that core would need to support.

EclipseGc’s picture

Wouldn't the system pwolanin has outlined be a lot simpler than what currently exists in core? And just as flexible (if not more so)?

I'm not suggesting that's a good enough reason to try to get it in at this point, but if it is simpler all around, would there be interest in pursuing it for all of core yaml?

It's definitely a DX improvement. No?

Eclipse

Gábor Hojtsy’s picture

I think **purely from the point of extraction**, having local markers in YML files for each translatable string is a godsend. Localize.drupal.org would easily parse all config .yml files and any other .yml file without needing any special code. (Right now parsing of the config .yml files is still ahead of us and is going to be an uphill battle that may take a lot of magic, and possibly bogus behaviours along the way accepted as side effects, due to lack of config versioning).

HOWEVER from a human DX perspective, writing .info.yml files like:

name: !translation Menu
type: module
description: !translation 'Allows administrators to customize the site navigation menu.'
package: !translation Core
version: VERSION
core: 8.x
configure: admin/structure/menu
dependencies:
  - menu_link

sounds like a definite problem. If Drupal already knows those are going to be translatable, why make me work so much to write those obscure things? Also, for Views and all the config entities and every config handling code to be able to tell when saving a config .yml file which one to save with this marker would currently require that saving always go through the config schema and the translatability information is retrieved from there and put into the yml verbatim like this. So lots more upfront work on the .yml generator side as well for all the config entities and regular config.

I believe we considered this "!translation" option in Barcelona last summer and discarded it for the (a) lack of Symfony support and (b) DX problem with having this all around and needing to deal with it all the time.

What it would help with, though, is:

- it would make the locale code which parses default config for translatable strings *way* easier, it can ignore looking at the schema
- it would make the localize.drupal.org code to parse .yml files much smaller than it already is and make it effortlessly possible to parse all config .yml as well as any plugin .yml file

It would be at the expense of this ugly thing spreading everywhere though. So far we avoided this and kept nicer DX by encoding known patterns about .yml files in the parser as well as providing a description format (the config schema) for locale to parse the files (and config translation to generate forms).

The introduction of !translation would not make schemas useless, since (a) you would need schemas to be able to tell where to export !translation before values in the first place (or write even more code for each config entity / config specifically) and (b) the config schemas are still needed to generate a UI for translation.

Maybe bring this to the CMI and/or plugin discussion in Prague? It would definitely need much more of a consensus.

pwolanin’s picture

Well, or even !t to keep it terse, though I was thinking in parallel to the @Translation annotation.

If I export a config entity (e.g. a View) will there be any novel translatable strings in there? It would seem like novel strings would all be user generated?

For now let's try to at least get some minimal tag support into symfony. Here's the open issue: https://github.com/symfony/symfony/issues/9040

EclipseGc’s picture

@Gábor Hojtsy,

Your point about yaml dumping is well taken, however this would reduce the scope of that current need to JUST dumping. Currently it has to be involved in the parsing as well, and this would eliminate the need for that and treat strings in yaml very similarly to the way they are used everywhere else. Still, your basic argument here is accepted, but trying to push the basic functionality upstream so we have the option seems like a wise choice going forward.

Eclipse

Gábor Hojtsy’s picture

@EclipseGC: currently the schema is used only for locale module's parsing so it can identify the translatable strings without explicit local markers like !t or !translate. Several people on the config initiative feel strongly about not requiring schemas for basic config operation. Requiring schemas would also help with #1653026: [META] Use properly typed values in module configuration where several people have been pushing back on applying schemas for that. So if we'd use schemas for dumping .yml that would be a definite change. Schemas are not needed to read .yml in any way, except if you need to introspect the structure, which happens in locale module (and the coming config_translation module which builds forms based on the schema).

@pwolanin: yeah you may have a French-Spanish site, you create some views in French and then need to translate to Spanish. We need to identify the translatable strings in the French view. Locale actually does not interfere there because you custom-created those views, but config_translation needs this info (it gets it from the schema). *However* all the views, content types, fields, etc. that are shipped with modules and packaged on d.o will be originally in English and need those strings extracted and identified on localize.drupal.org. Currently, since there is no local marker for translatable strings, localize.drupal.org would need to parse the .yml files through the schema (which is pretty darn hard (but not impossible) due to module dependencies, plugins, etc). The localize.drupal.org parsing is a yet unresolved problem.

Once again I think this should be opened up as a larger scale DX question. If people are not freaked out with !t showing up in their YML and we can painlessly dump that in .yml generators, then it may gain acceptance. So far my understanding was people don't want this in their YML files :/

neclimdul’s picture

We could possibly provide a similar schema with plugin managers. This might be a way we could provide documentation on the expected YAML structure, which is also an outstanding issue.

Unless we're pretty smart I'm not sure if this helps map individual YAML files to a schema though. We'll have to be smart about that.

Could we use !t or !translate in the schema definition and avoid definition writers making mistakes?

pwolanin’s picture

The notion of yaml "Schema" seems pretty weird to me - I'd be much happier tagging the fields inline.

Crell’s picture

I haven't thought through in-file tagging that much, but at the training we ran at MWDS I had a hard time explaining the schema file. It felt redundant even to explain it, much less use it. Also, someone asked why we seemed to be duplicating the label between a form and the schema, and if there was a way to get the label from the schema to use in the form. I had no idea, and still don't.

That's one data point at least, which would lean toward !trans. Although as dawehner just noted in IRC that wouldn't help with select boxes that vary by locale, just pre-coded strings.

Gábor Hojtsy’s picture

The schema is currently only used in core by locale module to find all the translatable strings. It needs to traverse arbitrary nested containers like a view .yml which has arbitrary plugins to find all their translatable strings. https://drupal.org/project/config_inspector shows a few potential ways the schema can be used. E.g. it's useful for inspecting your config data (debug, find issues). Labels help find what your stuff is about there. It also shows how the schema can be used to generate a form from your data. This may be useful for quick input solutions or where otherwise the data only has a web service based entry point or to generate form scaffolding that can be further tweaked. Schemas may also be used to validate the data that comes in through a web service or migration and tell you where errors are based on labels and nesting information. None of this is happening in core or contrib (yet).

https://drupal.org/project/config_translation is a module that actually uses the config schemas to generate translation forms for *every* configuration in Drupal that may have translatable things (think site name through filter formats, roles and shortcut sets to views). It knows how to build a relatively useful form out of the nested structures using the labels and data types loaded from the schema. So if you have a path field or an integer field, it can know which simple widget to use for that.

The main driver behind the config schemas was (a) lack of another way to build such translation forms (b) lack of another way to figure out translatable things in .yml files with *static parsing*. So it was all multilingual driven yeah. But as you can see there are tons of potential uses outside of languages.

We could have decided that each configuration screen needs to be implemented twice, once for the main editing screen and once for the translation screen. E.g. Views would have needed to build an alternate form building pipeline for the plugins to build such a translation form, there would be two versions of the site info settings form, etc. We looked at automatically generating translation forms out of even simple forms like account settings or more complex forms like editor/input format settings or views, but we found no practical way to do that and ensure there is going to be no privilege escalation, etc. Also, a solution for forms would not have led to any solution for the static parsing of translatable strings, so *then* using !t or !translate would have been mandatory and all config save pipelines would need to know which values are translatable and mark them appropriately at all times. Currently config save does not need to know about this (and is therefore speedier for the default case). (There was another alternate route discussed where all config would be able to tell their translatables and export them elsewhere as well, so when you commit config exports to d.o, you'd include translatable text as copies in another file as well. People did not like that for the extra build step and again for the needed built-in know-how that may slow down config in regular operations).

Configuration schemas are an optional component at the moment and you don't need to implement them if you don't care about potential contrib advantages or working in a foreign language / multilingual environment at all.

In the end configuration schemas may be more complex than they need to be so the base config is simpler and can serve the base use case without more complexity in a self-contained way.

EclipseGc’s picture

It knows how to build a relatively useful form out of the nested structures using the labels and data types loaded from the schema. So if you have a path field or an integer field, it can know which simple widget to use for that.

Congrats, you just described fago's TypedData :-)

Eclipse

Gábor Hojtsy’s picture

Bingo, in fact config schemas map to typed data types (although the integration is not perfect): #1905230: Improve the typed data API usage of configuration schema

neclimdul’s picture

Plugins shouldn't have per-language strings in their definition. That's not really implementation metadata which is the point of plugin definitions. I can't speak to things outside plugins though.

As I noted, though maybe not clearly, schemas could have additional help for plugins. Providing a location to document the format of your plugin definitions is the clear reason, but they could also be a place for default values or other things.

Gábor Hojtsy’s picture

@neclimdul: so if tabs are plugins, where would tab labels go? Tab labels are inherently human readable text and will be in a specific human language. (For d.o hosted modules in English).

neclimdul’s picture

I think one of us misunderstood something. Probably me. There are lots of simple translations in plugins. They are not user content translations like a node title.

It feels like we're making this more complicated than it should be. Are there not 3 options? (1) Plugins give you strings, (2) we put something in all yaml like !t, or (3) we give you a schema that describes what's translatable. We seem to agree on the technical pros and cons, so what do people feel the best solution is?

(3) Seems to be the current CMI solution from what I'm understanding which should make it the front runner at this point in the freeze IMHO.

EclipseGc’s picture

Right, so there's actually a point here I've not seen raised. Plugin definitions (yaml meta-data) are never written to disk somewhere like config. It's why we didn't think about, or care about, a schema definition of our yaml. !translate is perfectly fine for us because Drupal is never going to have to reformulate and output our files somewhere. If someone has a translation for the string following !translate that's completely ok, and we don't care about anything beyond that.

CMI has a completely different mandate and need since it is manually or programmatically creating yaml files that have some translatable strings in the final dumped file. It needs to know which of those strings are translatable so that future dumps don't remove the ability to translate them. Plugin definitions don't have this problem. I don't know if that's justification, but I think it was worth saying.

Eclipse

Gábor Hojtsy’s picture

@neclimdul: re #25: how would your option (1) work if the source code of your d.o contributed project is dropped into a 3rd party system for parsing out strings (that cannot run your code at all)? I think the only way it would work is if (1) happens *before* committing to d.o and results in an exported list, so you commit the exported string list as a separate thing. IMHO one of four things has to happen:

1. The strings are marked explicitly in the files. This is how t(), $this->t(), @Translation(), etc. operate and !t could be similar.
2. The file structure is well-known so strings can be extracted based on those rules. This is how *.info.yml, *.routing.yml, hook_menu(), etc. work.
3. A parseable structure definition is included, so the data can be parsed with that structure to extract the strings. Config has schema for this.
4. The runtime code can identify the translatables in some way and exports in a static format. Then this method falls back on (2) since that static format would have a known structure to parse. No core examples for this because people seemed to despise "build systems" like this.

Once again the key is that we cannot run the code in question that has the strings, because we need to run it through a 3rd party system (and cannot just get all the dependencies, etc. involved) to identify the strings. So if there is nothing better than running the code, then it needs to be run on the dev environment and the strings exported ahead of commit to the community (4). Otherwise we need a static way to know how to find the strings, which is what 1-3 are about: either local markers, or well known formats, or an external description of structure.

@EclipseGC: I don't contest that these .yml files may be candidates for local markers like !t because the structure does not need to be explained necessarily. I want to point out that people will not see the reason why info.yml would not need this while the plugin .yml needs it. So far we managed to follow common rules around how .yml files should be structured.

pwolanin’s picture

For me, the very concept of the config schema files is pretty painful as far as DX goes - I'd find it much easier to explain a local tag/marker.

Even if we have a known structure for some files that we could also fall back on, having an explicit tag seems better to me.

I'm not sure about how to handle config that's written out though.

Gábor Hojtsy’s picture

@pwolanin: take a views or field instance YML and try to autogenerate a translation form based solely on the structure and knowledge about which items are translatable or not and you'll quickly see why we added schemas with types and labels. As said above, an alternative would have been that all plugins would provide their own translation forms separate from their editing form and that all save operations would know about translatable strings, so they would export local markers at all times. Definitely more work vs. writing a declarative structure definition.

Gábor Hojtsy’s picture

Also, once again the schema may be useful for data validation, in migration especially or with web services. It may easily allow to create a web service endpoint which even automatically documents itself :) So lots of potential side benefits that would not have come with writing more form api arrays.

Xano’s picture

Issue summary: View changes

Are there any alternative YAML parsers we can consider instead of Symfony's parser?

pwolanin’s picture

Unfortunately, there don't seem to be any good PHP ones, but please look again.

I discussed with chx a while back the idea of porting one - there is a JS one he thought was much more complete. But that would be a bunch of work. https://github.com/nodeca/js-yaml

Gábor Hojtsy’s picture

Nowadays schema is used for many more things, eg. it is used to cast values so they get proper typed values and are deployment safe. One option to resolve this is we introduce schema for the plugin YAML files, but that would be quite some twisting of the structure, and we may not be able to fully define schema for all plugin YML structures. That would mean we should apply schema for core plugin YML files as well.

pwolanin’s picture

Using YAML tags seems like really the *right* way to do this - we are hamstrung by the crummy Symfony parser that doesn't actually conform to a large part of the 1.2 spec.

If we can't get that working for 8.0, I'd consider falling back to YAML comments as a hack. E.g.

  title:  Foo # translatable

Instead of the desired tag:

  title:  !translatable Foo

The notion of the YAML schema is really confusing in general, so I would be loath to extend it to more files.

Gábor Hojtsy’s picture

Yeah that is a fine option with me. Just make sure to introduce this to all plugin YAMLs then, including menu items, local tasks, etc. for consistency. The fewer ways to do the same thing, the better. Let's not make it magically work in some cases and not in most others.

herom’s picture

I don't like the idea of using yaml tags or comments to mark translatable strings, because 1) it doesn't say anything about how to support/mark "context" for a string, and 2) it feels too redundant and error-prone, and it's going to end up everywhere (change records, yaml examples, etc).

I think we are trying to support a complex use-case that is really unnecessary. Instead, we can define a simple file named yaml_translatables.yml that lists the translatable keys for every plugin/yaml type. We could also allow contrib modules to define their own yaml_translatables.yml on top of the core list (to fix #2296219: Create schema for non-config YAML files?).

For the currently supported .yml files in core, the yaml_translatables.yml would look like this:

*.info.yml:
  - name
  - description
  - package
  - regions
*.routing.yml:
  - _title:
      context: _title_context
*.links.task.yml:
  - title:
      context: title_context
*.links.action.yml:
  - title:
      context: title_context
*.links.contextual.yml:
  - title:
      context: title_context
*.links.menu.yml:
  - title:
      context: title_context
  - description
config/schema/*.yml:
  - label
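
How an extractor might consume such a list can be sketched roughly as follows, in Python. This is a minimal sketch under stated assumptions: the `PATTERNS` dictionary is a hypothetical in-memory form of part of the list above (the `regions` and `config/schema` entries are left out for brevity), the names `translatable_keys` and `extract` are invented for illustration, and the YAML is assumed to have been parsed into plain dicts and lists already.

```python
from fnmatch import fnmatch

# Hypothetical in-memory form of the pattern list:
# file-name glob -> {translatable key: key holding its context, or None}.
PATTERNS = {
    "*.info.yml": {"name": None, "description": None, "package": None},
    "*.routing.yml": {"_title": "_title_context"},
    "*.links.task.yml": {"title": "title_context"},
    "*.links.menu.yml": {"title": "title_context", "description": None},
}


def translatable_keys(filename):
    """Return the key map of the first pattern that matches the file name."""
    for pattern, keys in PATTERNS.items():
        if fnmatch(filename, pattern):
            return keys
    return {}


def extract(filename, data):
    """Walk already-parsed YAML data and collect (string, context) pairs."""
    keys = translatable_keys(filename)
    results = []

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key in keys and isinstance(value, str):
                    # Look up the sibling context key, if one is configured.
                    context = node.get(keys[key]) if keys[key] else None
                    results.append((value, context))
                else:
                    walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(data)
    return results


parsed = {
    "mymodule_list_tab": {
        "route_name": "mymodule.list",
        "title": "List",
        "title_context": "Tab",
        "tab_root_id": "mymodule_list_tab",
    }
}
print(extract("mymodule.links.task.yml", parsed))
```

The point of the design is that the knowledge of what is translatable lives in one declarative list rather than in each YAML file, so the files themselves need no markers.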

Gábor Hojtsy’s picture

@herom: I think the good news is we can start using this NOW in potx, we can put this file into .potx and support it loading any other files from contrib projects as being parsed. We should just come up with a file naming pattern. I'm not happy about "yaml_translatables.yml" but we can iterate on that :D Should we move this issue to the potx queue or open yet one more issue?

herom’s picture

Opened an issue in potx #2322839: Unify YAML translation extraction, and allow extension by contrib., with a PoC patch to show the idea works.

Gábor Hojtsy’s picture

Status: Active » Closed (duplicate)

#2322839: Unify YAML translation extraction, and allow extension by contrib. resolves this in potx module. Contribs will be able to define their own files. It now uses the $your_module/yaml_translation_patterns.yml file name. Better suggestions welcome :)

markhalliwell’s picture

Status: Closed (duplicate) » Active
Related issues: +#3064854: Allow Twig templates to use front matter for metadata support

Not sure why this was closed.

The related issue, if I understand it correctly, is just about extracting translatable keys.

This issue is about core being able to return translatable objects instead of strings based purely on YAML data, yes?

If not, I think it should be so we don't have so much overhead of having to determine what data from YAML files to actually translate (#3064854: Allow Twig templates to use front matter for metadata support).

FWIW, tag support has been added to Symfony finally:

https://github.com/symfony/symfony/issues/21185
https://github.com/symfony/symfony/pull/21194

Well, or even !t to keep it terse, though I was thinking in parallel to the @Translation annotation.

As much as !t would keep it relatively simple and unobtrusive, it also risks creating (or rather continuing) our existing t() Drupalism.

There is absolutely no context as to what "t" means and can sometimes be a slight hurdle for newcomers.

Along the same grain of why, in our coding standards, we don't abbreviate variable names... we really shouldn't do that here either.

I think YAML tags should be explicit.

jhodgdon’s picture

This was closed because a viable solution was created:

a) POTX has a list of what items in YAML files should be extracted.

b) Config in YAML files can mark items as translatable in the schema, and the Config system then properly translates those things.

c) Plugin YAML files -- the Discovery classes, after parsing their plugin data, put the translatable parts of it into TranslatableMarkup to tell Drupal they are translatable via the standard UI string translation system.
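
The idea in (c) can be sketched in a language-neutral way. The following is a rough Python illustration, not Drupal's actual PHP code: the `Translatable` class is a stand-in invented here for a deferred-translation wrapper, and `wrap_definitions` with its `translatable_keys` argument is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Translatable:
    """Stand-in for a deferred-translation wrapper (not Drupal's class)."""
    source: str
    context: Optional[str] = None

    def render(self, translations):
        # Fall back to the source string when no translation is known.
        return translations.get((self.source, self.context), self.source)


def wrap_definitions(definitions, translatable_keys):
    """Wrap the configured keys of each parsed plugin definition.

    translatable_keys maps a key name to the name of its context key.
    """
    wrapped = {}
    for plugin_id, definition in definitions.items():
        new_def = dict(definition)
        for key, context_key in translatable_keys.items():
            if isinstance(new_def.get(key), str):
                new_def[key] = Translatable(new_def[key], new_def.get(context_key))
        wrapped[plugin_id] = new_def
    return wrapped


definitions = {
    "mymodule_list_tab": {"route_name": "mymodule.list", "title": "List"},
}
wrapped = wrap_definitions(definitions, {"title": "title_context"})
print(wrapped["mymodule_list_tab"]["title"].render({("List", None): "Liste"}))
```

In this arrangement the YAML file stays free of markers; the discovery step for a given plugin type knows which keys to wrap.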

Probably at this time we don't want or need to go back to the idea of putting t: or something like that into the YAML files?

markhalliwell’s picture