Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
In order to make #2506445: Replace !placeholder with @placeholder in t() and format_string() for non-URLs in tests approachable, we need to break it up into smaller chunks. This issue address !placeholder in the help text of all modules
Proposed resolution
Covers all the help text in hook_help() including a test that ensures there is no double escaping on any help text.
The issue will cause to translate all this help topics again for all languages
this would set back translators with 2.5%
see #16 & #28
Remaining tasks
User interface changes
API changes
Data model changes
Comment | File | Size | Author |
---|---|---|---|
#84 | 2560783-83.patch | 314.34 KB | stefan.r |
#84 | interdiff-80-83.txt | 12.41 KB | stefan.r |
#83 | 2560783-82.patch | 314.35 KB | stefan.r |
#83 | interdiff-80-82.txt | 13.96 KB | stefan.r |
#80 | interdiff.txt | 5.9 KB | lauriii |
Comments
Comment #2
joelpittetComment #3
lauriiiI enabled all the core modules and clicked through all the help pages (:D) and I found out that contextual Links help page has some markup escaped:
I didn't still review the actual patch for simple human errors. Will do that later on :)
Comment #4
joelpittetThanks didn't check to see if there were normal escaped
<
characters.Added to this patch and fixed that image.
Comment #5
joelpittetComment #6
lauriiiNow the translation string includes the img element. I think we need to still pass it as a variable.
Comment #7
joelpittetWhy does it need that? If I do that I'd have to use SafeString::create or render a img array to do that. This approach gives the alt context to the translator.
I'm no translation expert but was thinking this approach was better but could be wrong.
Comment #8
joelpittetHere is the same patch without the fix to show that the test picks that up.
Comment #9
jhodgdonCan I make a suggestion? Maybe in this issue you can just do all the straightforward cases where all you did is replace a !url with a @url in a hook_help(). Then do the other changes module by module so they can get more scrutiny. This patch is fairly large so it would be easier to review/commit if there was nothing controversial in it.
Comment #10
penyaskito@lauriii: I would say #7 is correct, see https://www.drupal.org/node/322774
Comment #11
joelpittet@jhodgdon I noticed that while doing the other ones the review was very straight forward for
hook_help()
to the point where it should be done be robots not people and it was easy enough to do an automated test with the existing test. Even if I had to write a new test it would have been easier.I'd rather take a big chunk of exactly the same pattern repeated and remove it from our painstaking review (so we can concentrate on the harder parts of the review).
The only "controversial" part in this patch is my fix to the bug that was found in how to place an image tag into
t()
.Comment #12
joelpittetThanks for the documentation backup on that change @penyaskito. @lauriii do you agree or is there another issue with the approach I took there we aren't seeing?
Comment #14
lauriiiI'm fine by the current solution :) I questioned it more because I didn't know which is correct! And thanks @penyaskito for your opinion.
Reuploading last passed patch because the test bot will automatically queue it to be tested later :)
Comment #15
alexpottThanks you
--color-words
. This looks good to go.The comment needs a fix up.
So the one question I have is with the new issue to admin filter. Is this really worth the disruption to translations?
Comment #16
Gábor HojtsyYeah each string modified here will appear as a new string to translate on localize.drupal.org. Each line is about one string change here AFAIS, so it would be over 200 strings for translators to translate again. Given Drupal 8 has about 8.2k strings (see https://localize.drupal.org/translate/projects/drupal/releases), this would set back translators with 2.5%. Its a sad technology fact that there is no translation memory and similar string suggestions on localize.drupal.org but at the same time, this change looks cosmetic given the ultimate filtering happening later on anyway, right?
Comment #17
andypostGreat point Gabor!
I see no any reason to escape this internal links and affect tremendous amount of strings to re-translate
PS: is there issue to follow in l10n_server like #587666: Add simple automated translation formatting checks
Comment #18
lauriiiThe reason itself is not that there is existing security problems but instead the fact that ! filter will be deprecated. It also shows bad example for people which by copying the code might cause security problems to themselves.
Comment #19
jhodgdonLooks like this is Needs Work for the comment problem in #15.
Comment #20
Gábor HojtsyIf we want to set an example and accept that over 200 strings will need to be translated again from scratch, we at least better do it sooner than later to give more lead-time to translators.
Comment #21
joelpittet@Gábor Hojtsy the parent change does another 200K of the same type of changes. This just does a bunch all at once.
hope not to put a halt on the whole operation, but if I can help en-mass convert them for the existing translations point me at an issue and I'll do my best to do so.
Comment #22
Gábor HojtsyRight, if we decide to do these, then it would also be useful to do them at once, not one in one beta and another in another (or in a/the RC). That would make the translator burden even bigger redoing and redoing things.
Comment #23
jhodgdonIs it possible to do a copy/replace on the translations somehow on Localize? I have no idea where they are actually stored.
Comment #24
joelpittetCreative drush sar even? https://www.drupal.org/project/sar
Comment #25
Gábor HojtsyLocalize.drupal.org takes each d.o packaged release and finds all the strings in it. If a string is different even in one character, it will be a different string. I am not sure what kind of automated tool would you envision, but its possible to take the .po files for the latest Drupal 8 beta, find the strings which were changed in this patch in there and modify their source / translations to have the new placeholders. Once this patch lands, upload that to localize.drupal.org as suggestions. For that one needs to do this for all languages which requires joining all the translation teams as well. Not impossible to do but not an hour's work I think :)
Comment #26
justAChris CreditAttribution: justAChris as a volunteer commentedPardon my ignorance here, but have the help texts addressed in the above patch actually been translated yet?
I downloaded a sample .po file (Italiano) to look for the some of the strings this patch is updating. Though I didn't try all of them, I was unsuccessful in finding any in the translation file.
Then tried on a install in Italiano via simplytest.me, trying the search help page:
Clicked around in the help section with similar results. I can't speak for all languages and all help texts, but do we know if these help text strings have been translated for any language?
Comment #27
joelpittet@Gábor Hojtsy I was thinking maybe fuzz the comparison operation to treat placeholder type changes as the same string if possible, even better would be strip the place holders out of the string before the comparison operation. That way renaming placeholders wouldn't break existing translations either.
Comment #28
Gábor Hojtsy@justAChris: Drupal 8 is about 56% translated to Italian so I would not be surprised if spot-checking these strings did not result in translations. See https://localize.drupal.org/translate/drupal8, Ukrainian, Spanish, Danish, Hungarian and Finnish lead the way.
@joelpittet: right, so localize.drupal.org has 590k source strings. For each incoming new string to fuzzy match it against the existing strings sounds like a performance concern. Here is the Drupal 7 alpha1 announcement from 5.5 years ago where we explained around it too: https://localize.drupal.org/node/712. Here is an 8 year old issue that was marked as duplicate of a 5 year old issue: #194141: Implement msgmerge type fuzzy matching :) Contributors welcome :)
Comment #29
joelpittet@Gábor Hojtsy thanks for the issue dig-up. If performance is a concern there may be some possible solutions there if they arise, I'll move that discussion over there.
Is this issue blocked on finding a solution or can we proceed @Gábor Hojtsy?
Comment #30
andypostLet's stop here
Comment #31
Gábor Hojtsy@andypost: is not a release manager or committer last I heard :)
@joelpittet: we don't have an automated way to tell the extent of the rework that translators will face, the spot-check with Italian was not very indicative as explained above. so I would not say this is blocked, it requires an executive decision weighting the benefits vs. (the hard to estimate) cost. 18 languages are more than 50% done with Drupal 8's translation that is at least 4k strings. There is no easy way to tell the overlap of those vs. these strings, unless someone has these strings in a .po file form and want to cross-merge with all the .po files from those 18 or so groups.
Comment #32
justAChris CreditAttribution: justAChris as a volunteer commentedYes, spot checking in Italian was a bad example, thanks for that completion % link. Ukrainian looks to be entirely translated on the strings touched in the patch. Trying to get a concrete number though (without counting by hand).
Comment #33
xjmTagging explicitly.
I think we could not make this change after RC. Strings are not supposed to be frozen yet, but I can also see how this would be disruptive now nonetheless (for hook_help() in particular).
However, I think we could consider a script that runs a regex and updates the translations for !placeholder to be equivalent to @placeholder as a one-off (not as an ongoing thing, because it could actually impact translations in other cases).
Comment #34
xjmFWIW, this is not merely a "cosmetic" change;
@
vs.!
has very real consequences in terms of the safeness of URLs etc. even with XSS filtering.Comment #35
Gábor Hojtsy@xjm: so you are saying the help text is not admin XSS filtered later on?
Comment #36
jhodgdon@Gabor - once you pass something through t() it is considered filtered/safe, and the render system no longer escapes or filters it.
Comment #37
justAChris CreditAttribution: justAChris as a volunteer commentedTo help with the scope of this, got a quick script together to check the strings being touched in the patch against the translation files. Probably not 100% accurate, but here's the summary of the top few languages (ordered per https://localize.drupal.org/translate/drupal8)
Ukrainian:
207 Strings213 StringsSpanish: 122 Strings
Danish: 36 Strings
Hungarian:
62 Strings65 StringsFinnish: 13 Strings
Dutch: 16 Strings
French: 1 String
Swedish: 23 Strings
Russian: 0 Strings
Obviously, more strings are going to be found in the rest of the languages (3 in Italian)
I've uploaded the scripts here:
https://github.com/justAChris/DrupalPlaceholderTranslations
The first script (stripHelpPatch.sh) only needs to be run if the patch changes. I've included the output of that script there (helpPatchStrip.txt), which uses the patch from #21
Comment #38
Gábor Hojtsy@justAChris: amazing, thanks for quantifying the side effects of this patch. I think if similar scripts can be used to help at least the top of these teams adapt, then we should be able to get this land without much disruption. Such a script would need to replace the placeholder marker and generate a .po file with those strings only to be uploaded as suggestions once a beta/rc release with these strings is out. Looks like the big ones affected are Ukrainian, Spanish, Danish and Hungarian.
Back to RTBC as per #14 then. It would be **very important** to commit coordinated changes to these strings within the same development period, so we don't have a set of changes for the upcoming beta and then again a new set of changes to the same strings again for another beta, etc.
Comment #39
justAChris CreditAttribution: justAChris as a volunteer commentedSaw some oddities in the search string text file created, so cleaned those and slightly revised the above totals and added a few languages.
For the subsequent translation cleanup, noticed this is one instance where we aren't simply replacing ! with @, not sure if there are others.
Comment #40
joelpittet@justAChris yeah that one is just going to have to break translation as it would normally. That is the proper way to do it as per the docs AFAICT.
Comment #41
Gábor HojtsyYeah we don't need to cover every single case in the update automation, just trying to manage the scale of the changes.
Comment #42
justAChris CreditAttribution: justAChris as a volunteer commentedSo, not to complicate things further, but bring awareness to the table, this hook_help() patch was broken apart from a larger patch in #2506445-96: Replace !placeholder with @placeholder in t() and format_string() for non-URLs in tests,97. Unfortunately, the parts of the original patch not in hook_help() also contain strings that are translated and have !placeholder being replaced similar to this patch.
I know @Gábor Hojtsy indicated we want to update the translations all at once if we go ahead with this, which makes a lot of sense. Having these separated out into different issues may make that difficult.
I've worked with my initial scripts to try to get an indication of the overall effect of the larger patch that replaces most occurrences of !placeholder throughout. It looks like the 213 Strings for Ukrainian becomes around 287 strings.
Though, I'm less confident with the accuracy of this number because the original patch is harder to parse.
Comment #43
Gábor Hojtsy@justAChris: localize.drupal.org does not parse dev releases exactly because the changing nature of strings; so the important part would be to make these changes between two tagged releases "at once" not between one tagged release and another and then another, etc.
Comment #44
jhodgdonRegarding the impact etc... Maybe we could separate the parent issue into:
a) Simple search-and-replace part
b) More complicated parts
Make sure to do (a), which I think is the bulk of the patches, all at once. This patch would be easy to review and easy to script on localize (I guess?). (b) would be a smaller number of strings and could be broken up into a few issues if needed.
Comment #45
justAChris CreditAttribution: justAChris as a volunteer commentedThis is postponed on #2558791: "!"-prefixed tokens should Xss::filterAdmin() but not affect safeness. Most of !placeholder in this patch occurs inside of
<a href=
. which may stay depending on the outcome of that issue.Comment #46
Sutharsan CreditAttribution: Sutharsan as a volunteer commentedRerolling patch for easy migration into single patch at #2506445: Replace !placeholder with @placeholder in t() and format_string() for non-URLs in tests.
Changing status for test bot. Revert status after test.
Comment #47
Sutharsan CreditAttribution: Sutharsan as a volunteer commentedStatus back to 'Closed (duplicate)'. Patch now included in #2506445: Replace !placeholder with @placeholder in t() and format_string() for non-URLs in tests.
Comment #48
justAChris CreditAttribution: justAChris as a volunteer commentedLet's keep this at postponed until we determine if !placeholder in URLs will be changed to a different placeholder. Suggesting that this could become "Replace !placeholder in URLs for hook_help() text" or similar if the placeholders need replacement.
Comment #49
chx CreditAttribution: chx commentedCurious why this would set back translators 2-2.5%. Aren't these strings held in the localize.drupal.org database? If so, can we write some fancy SQL queries? If you need help with them, I can give ideas :)
Comment #50
dawehnerYeah for sure, we discussed that. We then also discussed about adding a custom
hook_update_N()/hook_post_update_NAME()
to deal with that for each individual site.See some of the discussion on #2506427: [meta] !placeholder causes strings to be escaped and makes the sanitization API harder to understand
Comment #51
Gábor HojtsyFancy SQL queries are possible, as discussed above. We would need to keep the translations for pre this change in prior betas, because those of course used the old string (and have their review/submission history tied to that). SQL is possible to copy over some or all of that history or copy over values with a preset user or somesuch. Either way its more work to do than this patch alone, which is what's important to recognize and factor in.
Comment #52
justAChris CreditAttribution: justAChris as a volunteer commentedNote that the following has been placed in a separate issue #2568257: Add double escape test for hook_help(). Not necessary to update current patch here yet, still postponed.
Comment #53
effulgentsia CreditAttribution: effulgentsia at Acquia commentedSee issue summary of #2506427: [meta] !placeholder causes strings to be escaped and makes the sanitization API harder to understand for details as to the title and priority change. Still postponed on #2565895: Add a new :placeholder to SafeMarkup::format() for URLs that handles bad protocols, unless someone want to roll a patch in the meantime that also includes that one until that one lands.
Comment #54
effulgentsia CreditAttribution: effulgentsia at Acquia commentedComment #55
effulgentsia CreditAttribution: effulgentsia at Acquia commentedSorry for the noise, but trying to clarify the title more.
Comment #56
stefan.r CreditAttribution: stefan.r commentedposting a combined patch with #2565895-92: Add a new :placeholder to SafeMarkup::format() for URLs that handles bad protocols.
Did two simple search & replace in the previous patch and manually reviewed that we weren't changing any non-URL @placeholders.
Comment #57
Sutharsan CreditAttribution: Sutharsan as a volunteer commentedFound a few more hook_help strings.
Comment #58
kgoel CreditAttribution: kgoel at Forum One commentedI am reviewing patch#57I was going to review patch #57 but patch in #57 has some of the files that shouldn't be in this patch for example composer.json file. Reviewing #56 patch now.
Comment #59
kgoel CreditAttribution: kgoel at Forum One commentedPatch in #56 didn't apply. Re-rolled
Comment #62
kgoel CreditAttribution: kgoel at Forum One commentedApplied change from #57. Found some instances of "!placeholder" and replaced with ":placeholder".
Comment #65
stefan.r CreditAttribution: stefan.r commentedreroll of just the changes in #56 with all the consecutive interdiffs (in case that's what the test fails were about)
Comment #66
dawehnerIt is really helpful to be able to use
git diff --word-diff
So for static strings I think we wanna use :placeholder but we just need to be aware of it.
Comment #67
stefan.r CreditAttribution: stefan.r commentedSo for static strings I think we wanna use :placeholder but we just need to be aware of it.
@dawehner what do you mean here?
Also discussed with @kgoel how we don't technically /need/ to use
<a href=":cron">:cron</a>
as @cron might be fine outside the attribute, but this seems fine to me anyway.Comment #68
justAChris CreditAttribution: justAChris as a volunteer commentedRegarding #66, I think @dawehner is referencing the following:
We could remove the placeholder completely, but if we did, it would be more difficult to review and also would have a larger impact on translated strings.
Patch needs reroll now that #2568257: Add double escape test for hook_help() is in.
Comment #69
dawehnerYeah its pretty much this question. There are also examples of like "https://www.drupal.org/node/1" in there. Totally out of scope, but I really wonder why its then not just hardcoded.
Went through all the hook_help() implementations. There is one conversion left in contextual_help().
Comment #70
dawehnerdouble post
Comment #71
star-szrEasy reroll:
Comment #75
star-szrI took a quick look over the changes that are not from #2565895: Add a new :placeholder to SafeMarkup::format() for URLs that handles bad protocols and didn't find anything out of place. I used
git diff -U0 --color-words
.Here are the changes for contextual.module.
Comment #77
star-szrJust to back up what @dawehner said in #69 I also reviewed all hook_help() implementations and didn't find any implementations with ! placeholders not covered in this patch.
Strictly speaking this isn't a hook_help() implementation unless I'm missing something. Maybe out of scope?
These changes definitely seem out of scope for this issue.
Not sure if these points have been discussed but it seems like they haven't been. I'm under the assumption that other than tests this should only be changing .module files.
Comment #78
lauriiiWill take a stab on this one now that #2565895: Add a new :placeholder to SafeMarkup::format() for URLs that handles bad protocols is in
Comment #79
lauriiiJust a plain reroll
Comment #80
lauriiiRemoved changes out of the scope of this issue
Comment #83
stefan.r CreditAttribution: stefan.r commentedThis one was interesting to review, even with --color-words :)
Seems we have them all now?
Comment #84
stefan.r CreditAttribution: stefan.r commentedComment #86
googletorp CreditAttribution: googletorp as a volunteer and at Reveal IT commentedI made a manual review of all hook_help in Drupal core with the patch applied (It got a bit rough after a while so might have missed something). Anyways it all looks good, couldn't find a single used of !
On a side note. I noticed that wikipedia links are placed inside the t string (I guess the reason of this is to make the link translatable), but we don't do that with links for php.net which is also localized. Shouldn't we do that for these links as well? We could do this in a followup, since it's not really related to this issue.
Comment #87
dawehnerThat sounds something indeed great to discuss on a follow up: #2571845: Links to wikipedia/php.net should be part of the actual t() string so they can be localized
Comment #89
alexpottCommitted 7a25f51 and pushed to 8.0.x. Thanks!
Comment #90
ifrikThe Help text standard page on drupal.org needs to be updated accordingly as well now, so that future module hook_help texts are written correctly.
Comment #91
justAChris CreditAttribution: justAChris as a volunteer commentedAs a followup, a parsed list of updated strings has been posted here: #2572771-4: List translation string changes following replacement of !placeholder and @placeholder and discuss ways to assist translation teams