#193 | 1559310-193-remove_redirect_404_language_dependency.patch | 308 bytes | stella |
|
#184 | 404_language_aware-1559310-184.patch | 97.59 KB | tduong |
|
#184 | interdiff-1559310-182-184.txt | 1.55 KB | tduong |
#182 | r404_not_enabled_user_has_permission.png | 125.3 KB | tduong |
#182 | r404_enabled.png | 128.91 KB | tduong |
#182 | 404_language_aware-1559310-182.patch | 96.06 KB | tduong |
|
#182 | interdiff-1559310-179-182.txt | 2.93 KB | tduong |
#179 | 404_language_aware-1559310-179.patch | 94.03 KB | tduong |
|
#179 | interdiff-1559310-176-179.txt | 4.04 KB | tduong |
#176 | 404_language_aware-1559310-176-interdiff.txt | 995 bytes | Berdir |
#176 | 404_language_aware-1559310-176.patch | 93.63 KB | Berdir |
|
#175 | 404_language_aware-1559310-175.patch | 93.63 KB | tduong |
|
#175 | interdiff-1559310-173-175.txt | 758 bytes | tduong |
#173 | 404_language_aware-1559310-173.patch | 93.66 KB | tduong |
|
#173 | interdiff-1559310-172-173.txt | 6.57 KB | tduong |
#172 | 404_language_aware-1559310-172.patch | 91.51 KB | tduong |
|
#172 | interdiff-1559310-169-172.txt | 5.64 KB | tduong |
#169 | 404_language_aware-1559310-169.patch | 90.46 KB | tduong |
|
#169 | interdiff-1559310-168-169.txt | 2.5 KB | tduong |
#168 | 404_language_aware-1559310-168.patch | 88.04 KB | tduong |
|
#168 | interdiff-1559310-163-168.txt | 602 bytes | tduong |
#163 | 404_language_aware-1559310-163.patch | 88.04 KB | tduong |
|
#162 | 404_language_aware-1559310-161.patch | 82.38 KB | tduong |
|
#162 | interdiff-1559310-157-161.txt | 4.45 KB | tduong |
#157 | 404_language_aware-1559310-157.patch | 81.97 KB | tduong |
|
#157 | 404_language_aware-1559310-157-utf8_test_only.patch | 1.82 KB | tduong |
|
#157 | interdiff-1559310-154-157.txt | 1.52 KB | tduong |
#154 | Screen Shot 2016-10-07 at 14.57.23.png | 49.41 KB | tduong |
#154 | 404_language_aware-1559310-154.patch | 81.61 KB | tduong |
|
#154 | 404_language_aware-1559310-154_test_only.patch | 11 KB | tduong |
|
#154 | interdiff-1559310-150-154.txt | 7.39 KB | tduong |
#153 | redirect-invalid.patch | 652 bytes | Berdir |
|
#152 | diff_1559310_404lang_151.interdiff.txt | 4.03 KB | miro_dietiker |
#150 | interdiff-1559310-136-150.txt | 1.34 KB | toncic |
#150 | 404_pages_should_be-1559310-150.patch | 84.49 KB | toncic |
|
#146 | interdiff-1559310-142-146.txt | 534 bytes | toncic |
#146 | 404_pages_should_be-1559310-146.patch | 171.82 KB | toncic |
|
#145 | interdiff-1559310-136-142.txt | 1.05 KB | toncic |
#142 | 404_pages_should_be-1559310-142.patch | 171.82 KB | toncic |
|
#136 | interdiff.txt | 3.15 KB | larowlan |
#136 | lang-aware-1559310.136.patch | 84.46 KB | larowlan |
|
#132 | 404_language_aware-1559310-132.patch | 82.55 KB | tduong |
|
#132 | interdiff-1559310-129-132.txt | 1.96 KB | tduong |
#129 | 404_language_aware-1559310-129.patch | 82.54 KB | tduong |
|
#129 | interdiff-1559310-124-129.txt | 10.11 KB | tduong |
#124 | 404_language_aware-1559310-124.patch | 77.24 KB | tduong |
|
#124 | interdiff-1559310-123-124.txt | 798 bytes | tduong |
#124 | interdiff-1559310-120-124.txt | 2.43 KB | tduong |
#123 | 404_language_aware-1559310-123.patch | 77.19 KB | tduong |
|
#123 | interdiff-1559310-121-123.txt | 1.8 KB | tduong |
#123 | interdiff-1559310-120-123.txt | 2.46 KB | tduong |
#121 | redirect settings.png | 226.18 KB | tduong |
#121 | 404_language_aware-1559310-121.patch | 77.23 KB | tduong |
|
#121 | interdiff-1559310-120-121.txt | 2.9 KB | tduong |
#120 | 404_language_aware-1559310-120.patch | 77.39 KB | tduong |
|
#120 | interdiff-1559310-118-120.txt | 1011 bytes | tduong |
#118 | 404_language_aware-1559310-118.patch | 77.12 KB | tduong |
|
#118 | interdiff-1559310-115-118.txt | 28.29 KB | tduong |
#115 | 404_language_aware-1559310-115.patch | 70.71 KB | tduong |
|
#115 | interdiff-1559310-113-115.txt | 21.25 KB | tduong |
#113 | 404_language_aware-1559310-113.patch | 72.49 KB | tduong |
|
#113 | interdiff-1559310-112-113.txt | 984 bytes | tduong |
#112 | 404_language_aware-1559310-112.patch | 72.62 KB | tduong |
|
#112 | interdiff-1559310-109-112.txt | 787 bytes | tduong |
#109 | 404_language_aware-1559310-109.patch | 72.62 KB | tduong |
|
#109 | interdiff-1559310-106-109.txt | 4.08 KB | tduong |
#106 | 404_language_aware-1559310-106.patch | 71.7 KB | tduong |
|
#106 | interdiff-1559310-103-106.txt | 15.42 KB | tduong |
#103 | Screen Shot 2016-06-21 at 19.20.44.png | 86.57 KB | tduong |
#103 | 404_language_aware-1559310-103.patch | 69.06 KB | tduong |
|
#103 | interdiff-1559310-102-103.txt | 11.95 KB | tduong |
#102 | 404_language_aware-1559310-102.patch | 66.48 KB | tduong |
|
#102 | interdiff-1559310-100-102.txt | 49.2 KB | tduong |
#100 | 404_language_aware-1559310-100.patch | 66.44 KB | tduong |
|
#100 | interdiff-1559310-94-100.txt | 42.73 KB | tduong |
#99 | redirect assigned.png | 88.35 KB | tduong |
#99 | redirect.png | 95.64 KB | tduong |
#99 | add redirect.png | 115 KB | tduong |
#99 | view filters.png | 195.26 KB | tduong |
#99 | Fix 404 pages view.png | 170.21 KB | tduong |
#99 | Screen Shot 2016-06-17 at 17.48.07.png | 27.87 KB | tduong |
#99 | Screen Shot 2016-06-17 at 17.47.37.png | 30.58 KB | tduong |
#94 | 404_language_aware-1559310-94.patch | 65.87 KB | tduong |
|
#94 | interdiff-1559310-89-94.txt | 16.38 KB | tduong |
#89 | 404_language_aware-1559310-89.patch | 65.53 KB | tduong |
|
#89 | interdiff-1559310-86-89.txt | 2.65 KB | tduong |
#86 | 404_language_aware-1559310-86.patch | 65.64 KB | tduong |
|
#86 | interdiff-1559310-79-86.txt | 2.66 KB | tduong |
#79 | 404_language_aware-1559310-79.patch | 64.58 KB | tduong |
|
#79 | interdiff-1559310-75-79.txt | 2.38 KB | tduong |
#75 | 404_language_aware-1559310-75.patch | 63.52 KB | tduong |
|
#75 | interdiff-1559310-71-75.txt | 3.74 KB | tduong |
#71 | 404_language_aware-1559310-71.patch | 63.62 KB | tduong |
|
#71 | interdiff-1559310-68-71.txt | 3 KB | tduong |
#68 | 404_language_aware-1559310-68.patch | 61.34 KB | tduong |
|
#68 | interdiff-1559310-65-68.txt | 8.48 KB | tduong |
#65 | 404_language_aware-1559310-65.patch | 60.47 KB | tduong |
|
#65 | interdiff-1559310-64-65.txt | 23.71 KB | tduong |
#64 | 404_language_aware-1559310-64.patch | 42.21 KB | tduong |
|
#64 | interdiff-1559310-61-64.txt | 6.28 KB | tduong |
#61 | Screen Shot 2016-06-09 at 18.27.16.png | 136.37 KB | tduong |
#61 | 404_language_aware-1559310-61.patch | 40.96 KB | tduong |
|
#61 | interdiff-1559310-60-61.txt | 11.57 KB | tduong |
#60 | Screen Shot 2016-06-09 at 12.35.57.png | 88.56 KB | tduong |
#60 | 404_language_aware-1559310-60.patch | 45.92 KB | tduong |
|
#60 | interdiff-1559310-59-60.txt | 4.93 KB | tduong |
#59 | 404_language_aware-1559310-59.patch | 41.29 KB | tduong |
|
#59 | interdiff-1559310-58-59.txt | 7.71 KB | tduong |
#58 | 404_language_aware-1559310-58.patch | 32.74 KB | tduong |
|
#58 | interdiff-1559310-57-58.txt | 4.54 KB | tduong |
#57 | 404_language_aware-1559310-57.patch | 31.96 KB | tduong |
|
#57 | interdiff-1559310-53-57.txt | 6.31 KB | tduong |
#53 | 404_language_aware-1559310-53.patch | 31.23 KB | tduong |
|
#53 | interdiff-1559310-50-53.txt | 6.95 KB | tduong |
#50 | 404_language_aware-1559310-50.patch | 26.81 KB | tduong |
|
#50 | interdiff-1559310-49-50.txt | 1.81 KB | tduong |
#49 | 404_language_aware-1559310-49.patch | 26.6 KB | tduong |
|
#49 | interdiff-1559310-45-49.txt | 893 bytes | tduong |
#46 | 404_language_aware-1559310-45.patch | 26.57 KB | tduong |
|
#46 | interdiff-1559310-41-45.txt | 5.61 KB | tduong |
#41 | 404_language_aware-1559310-41.patch | 28.22 KB | tduong |
|
#41 | interdiff-1559310-36-41.txt | 10.34 KB | tduong |
#36 | 404_language_aware-1559310-36.patch | 35.07 KB | tduong |
|
#36 | interdiff-1559310-33-36.txt | 1.88 KB | tduong |
#33 | 404_language_aware-1559310-33.patch | 34.71 KB | tduong |
|
#33 | interdiff-1559310-30-33.txt | 40.17 KB | tduong |
#30 | fix 404 pages after.png | 113.28 KB | tduong |
#30 | fix 404 pages before.png | 107.58 KB | tduong |
#30 | settings after.png | 222.63 KB | tduong |
#30 | settings before.png | 210.55 KB | tduong |
#30 | 404_language_aware-1559310-30.patch | 23.88 KB | tduong |
|
#30 | interdiff-1559310-27-30.txt | 1.27 KB | tduong |
#27 | 404_language_aware-1559310-27.patch | 23.27 KB | tduong |
|
#27 | interdiff-1559310-24-27.txt | 3.18 KB | tduong |
#24 | 404_language_aware-1559310-23.patch | 23.28 KB | tduong |
|
#24 | interdiff-1559310-19-23.txt | 10.57 KB | tduong |
#19 | 404_language_aware-1559310-19.patch | 21.23 KB | tduong |
|
#19 | interdiff-1559310-16-19.txt | 4.7 KB | tduong |
#16 | 404_language_aware-1559310-16.patch | 19.93 KB | tduong |
|
#9 | redirect_error_155931_9.patch | 9.98 KB | miro_dietiker |
|
#6 | redirect_error_1559310.patch | 6.16 KB | miro_dietiker |
|
Comments
Comment #1
BerdirCan confirm this, had the same situation today on a site that confused the customer and resulted in him adding redirects for existing pages.
Not sure if it can be fixed, however. Core would have to log the original path before the language prefix was removed, redirect.module only displays whatever information is in the page not found log.
Comment #2
miro_dietikerI guess if data from core is missing, redirect should capture it on its own...
There's no specific 404 / page_not_found hook.
A 404 is created here finally via deliver_page:
http://api.drupal.org/api/drupal/includes%21common.inc/function/drupal_n...
The problem with fast404 is, we can NOT react on them so they're missing in the reports.
So let's create a new error table in redirect as submodule.
Then record error data for 404s via something like hook_page_build() if enabled (by default) including language.
Additionally, let's import apaches error.log in a cron. Thus we can drop complexity on 404s (no inserts) AND we can even import fast404's.
We need to check for logrotate cases (if growing, incremental. If smaller, restart. Or always check first line for change.)
Finally we can offer a batch import of error.log's if they where missing.
I'd be happy to hear from the maintainer for his opinion and if we should go ahead.
This all is also possible in a separate contrib.
Comment #3
Dave ReidHrm, I am wondering if redirect should implements it's own hook_watchdog() to record only 404 responses in it's own table rather than relying on dblog. It would make sense.
This is out of scope for the module since it's implementation specific and could live separately in another module that feeds in the redirect 404 log table. And wouldn't raw server logs also have the same problem of missing the language context?
Comment #4
miro_dietikerRaw server logs perfectly contain the language prefix (if it's a path prefix).
Drupal cuts this off in bootstrap with custom code.
hook_watchdog is a good idea. didn't find that...
We will end up with two db writes per 404. And if we not only append lines but e.g. try to select url & update counter, we will even end in a read-write-cycle, making most db's (especially with indexes) slow.
Then a variable to disable the 404 writes would be enough (so we can rely on the table and UI)
Comment #5
Dave ReidThere is no counter. It's only one write per 404 as far as redirect would be concerned. Not any different from having dblog do a write per 404 or 403. And yeah, making it controlled by a variable (enabled by default) so that other methods could populate the table would give the most flexibility.
Comment #6
miro_dietikerHere's a first step.
Introducing the schema table redirect_error, capturing all drupal_not_found as records.
Not really tested yet. Let's check testbots statuses.
Missing is
Also check the code @todo.
Comment #7
miro_dietikerHmm, testing postponed... It seems redirect has an exception in the tests now "Infinite redirect loop prevented."
Comment #8
BerdirThe table definition needs to be copied, a helper function doesn't work. The schema could change and then you need to change hook_schema() but not the initial update function.
No need to return anything if there is no relevant information.
Not sure if we need a data migration? This will get built up again and watchdog is only temporary data anyway.
description seems wrong?
What's this for?
Description also wrong I think.
Missing spaces around ==
Not sure I understand the @todo, there is always a $language object and ->language is also always set. This should be fine I think.
Comment #9
miro_dietikerSchema: Fully agree with the needed schema definition duplication. I was just looking a bit around and did a copy paste from ???
Most minor thingies. Most fixed already.
Migration dropped.
I added a status column to cover e.g. other 40x errors if needed...
This corresponds to the status code, a redirect sends (302, ...) and how redirect works.
In addition to 404 i thought about covering the case 403 (access denied). But note that a redirect doesn't really make sense there...
So yes, removed
'language' => $language->language,
The problem here is that $language has always a value.
If for example a callback comes WITHOUT a language prefix, we want to identify this case and add a language neutral redirect.
Since drupal has a default language, the default context is identified and only a redirect for that default language is added.
At least in case of switching the default language, all the added redirects don't work again.
I guess the right direction is to be found here:
http://api.drupal.org/api/drupal/includes%21language.inc/group/language_...
module_load_include('language', 'inc', 'language.negotiation');
$langcode = locale_language_from_url($languages);
http://api.drupal.org/api/drupal/includes%21locale.inc/function/locale_l...
Additionally added detection of multilingual case.
Missing is still
Comment #10
knalstaaf CreditAttribution: knalstaaf commentedBoth patches give me this raw code displayed on every page:
And these kind of errors:
I'm using Redirect 7.x-1.0-rc1+8-dev (2014-Jul-23).
Comment #11
Dave ReidComment #14
miro_dietikerAdded this to our 8.x port board.
Comment #15
Dave ReidComment #16
tduong CreditAttribution: tduong at MD Systems GmbH commentedYesterday redirect has been pushed to d.o, so changed version to 8.x-1.x-dev and the PR #98 will be moved here.
Uploaded current patch: first point of the PR is done + starting code for the RedirectNotFoundStorage service.
Comment #19
tduong CreditAttribution: tduong at MD Systems GmbH commentedFixed/completed RedirectNotFoundStorage service.
There are still a
\Drupal::database()
call in RedirectFix404Form (line 80) and redirect.install (line 230). Should we use a service also for them ?Comment #22
Dave ReidIf we're using our own table (which is good), I think we should write Views integration instead and use that to display this page, instead of hard-coded page output.
Debatable also if this should be moved to its own separate sub-module (a redirect_404 sub-module, and also splitting out the 'expire + counting + last access' into a redirect_stats sub-module).
Comment #23
Berdirthis is not an entity handler, just the service is enough. so you can remove this again.
and config factory should be injected too.
the storage shouldn't have to know about where the path is coming from. Make a method with specific arguments, something like logRequest($path, $langcode)
You should also inject the storage into this service.
this should not return a query but just plain result objects.
The idea of the service is that it must be able to implement it with a different backend that's not about sql at all.
So you need to pass the $search string to it as well.
$header is a bit tricky, the paging as well.
About the method name, I'd do something like listRequests() or so.
then this problem will resolve itself as it will be inside the service as well.
Crossposted with Dave, I'm OK with making this a separate redirect_404 module, instead of a setting.
Note that I suggested to make storage a service so that a faster backend like redis could be used. That conflicts with using views for it. Not sure.
Also note that the count/expire stuff has been removed already from redirect 8.x-1.x as it simply doesn't work with how we do caching now. So it would be re-implementing in a new module, not splitting :) And if it should work with page cache, it would have to be a middleware that runs before page cache.
Comment #24
tduong CreditAttribution: tduong at MD Systems GmbH commentedFixed points in the comment above, just "$header is a bit tricky, the paging as well" left.
Fixed RedirectUILanguageTest test + some refactoring.
Moving stuff into the sub-module will be done later as it would be hard to review.
Comment #27
tduong CreditAttribution: tduong at MD Systems GmbH commentedThe test fails because it is still using the old redirect_error table fields. Updates in
RedirectUILanguageTest
andRedirectFix404Form
.Comment #30
tduong CreditAttribution: tduong at MD Systems GmbH commentedFixed a line in RedirectUITest, but there is still some bugs that I don't know how to fix. It seems like when you want to add a redirect, it does not have a placeholder/pre-set a value in
redirect_source[0][path]
textfield, neither islanguage[0][value]
(default to - Not specified - (und)). So these are 2 failing assertions, plus since Path field is required and has no pre-filled path value, after saving it "redirects back" to the same page, thus another assert fail.Both of them appear in RedirectUITest and RedirectUILanguageTest.
Furthermore in the language test, from
count(db_select('redirect_404')->fields('redirect_404')->condition('path', 'testing')->execute()->fetchAll())
it returns 0, even though it should be 1 (I've tried also withcondition('count', 'testing')
).Then I've tested it manually and I'm not sure if it was meant to be so, but in the "Fix 404 pages" it has dropped the langcode from the path that was part of it if you see how it looks like without the patch applied (didn't check the past commits yet).
I'm not sure how to fix these bugs left. Will wait for some feedbacks and reviews/approval to move the redirect_404 stuff into a submodule.
Comment #33
tduong CreditAttribution: tduong at MD Systems GmbH commentedMoved (almost) every 404 related files/codes into redirect_404 submodule, fixed where was needed.
Comment #36
tduong CreditAttribution: tduong at MD Systems GmbH commentedDidn't run the tests before uploading the patch, but noticed that
Fix404RedirectUILanguageTest
takes more than 6 mins... what is wrong here ?Dropped invalid taxonomy permission in both tests.
Comment #39
tduong CreditAttribution: tduong at MD Systems GmbH commentedFor the fix 404 redirect form *not actually* pre-filling its fields, I think that it may be a bug form RedirectFix404Form::buildForm() which leads to RedirectForm::prepareEntity() with no value for
redirect_source[0][path]
andlanguage[0][value]
(set as und per default), but not sure what is exactly the bug...Comment #40
BerdirI think we no longer need the setting then. If you don't want this feature, don't enable the module.
we only need to specify redirect as dependency I think.
We don't need an update anymore. All sites will have to enable the new module, which will install the table.
lets add the backend_overridable tag here, see https://www.drupal.org/node/2306083
that means we don't need this check and also don't have to inject config.
this conditional can go away as well then, just use the first version.
That's not true anymore.
unrelated change.
no idea what this is :) There is no such hook.
Valid fix, but unrelated.
same
same. We should not have any changes anymore in redirect except removing code.
this can go too.
Comment #41
tduong CreditAttribution: tduong at MD Systems GmbH commentedAll done.
Comment #44
BerdirYou have those query arguments wrong.
See \Drupal\redirect\Form\RedirectForm::prepareEntity, you need to pass what's actually implemented. That means source and not path and language and not langcode.
Comment #45
tduong CreditAttribution: tduong at MD Systems GmbH commentedUpdated "path --> source" and "langcode --> language" in both redirect_404 tests and RedirectFix404Form, removed debugs previously put for testing and reverted some @file doc leftover (issue about Redirect cleanups will be committed here: #2717103: Removing @file docblocks).
Running Fix404RedirectUILanguageTest locally, there is still
testAutomaticRedirects
(this is in RedirectUITest) that fails at$this->repository->findMatchingRedirect('term_test_alias');
, it complaints about$this->repository
not being an object, butfindMatchingRedirect()
has been called several times before like that and there are problems only here (?)... Let's see what happens in testbot now...Comment #46
tduong CreditAttribution: tduong at MD Systems GmbH commentedAaand here the missing patch...
Comment #49
tduong CreditAttribution: tduong at MD Systems GmbH commentedFound the bug... Fixed.
Comment #50
tduong CreditAttribution: tduong at MD Systems GmbH commentedDiscussed with @miro_dietiker, we should drop the path that has already a redirect assigned form the Fix 404 pages list. Here I've just started adding the field to the redirect_404 table. Still work in progress...
Comment #51
miro_dietikerI reviewed the situation and defined:
- We should clean up the table, similar to dblog, based on cron: remove records with smallest count and oldest hit
- 404's that are resolved (aka adding a redirect) should disappear. Thus we add a "resolved" column
-- Hitting a record that was previously marked as resolved, removes the flag again (happens when a redirect is deleted)
-- Thus we add a resolved column
- The admin UI should be built with a view
-- With language as an exposed filter
- Yeah, the UI should not link to an inexisting path
Thus, we have low complexity and the new 404 module is focussed to its purpose with a good experience.
Should use a view (see above) and should be excluded with the filter when doing the query.
Finally, we have identified some follow-ups that possibly need discussions first. Please create the follow-up issues when we are near completion.
Comment #52
miro_dietikerWe need to decide if we still want to maintain a non-views fallback like #2716883: Make views dependency optional
To avoid the double complexity, we usually drop it and depend to views...
Also, do we need / can we add an option to disable the regular 404 dblog records to avoid record duplication?
Follow-up: Drupal can operate multiple domains and for 404's it's important to also know about the domain it originated.
(Domains could also be bound to languages through language negotiation.)
Comment #53
tduong CreditAttribution: tduong at MD Systems GmbH commentedAfter discussions, debug, fixing stuff, ... finally there is a patch...
What is done now:
2) 404's that are resolved (aka adding a redirect) should disappear.
-a- Hitting a record that was previously marked as resolved, removes the flag again (happens when a redirect is deleted)
-b- Thus we add a resolved column
4) Yeah, the UI should not link to an inexisting path
Next step, view (reason: more flexible custom modifications):
3) The admin UI should be built with a view
-a- With language as an exposed filter
As what we've discussed yesterday, this is part of the followup:
1) We should clean up the table, similar to dblog, based on cron: remove records with smallest count and oldest hit
Comment #54
miro_dietikerAwesome progress and great to already have test coverage!
The views conversion is important for us, because we need capability to clone the view and offer a varying 404 tab with specific filters.
From reviewing the situation, i think the dblog cleanup with cron can not be a follow-up. Because currently dblog can be cleaned up and doesn't fill the DB.
But with the new module, the DB is filled more and more with every 404. That's a regression that could end up in a critical issue if bots are scanning your website with invalid URLs...
Comment #55
slashrsm CreditAttribution: slashrsm at MD Systems GmbH commentedAgreed. Not just bots. It could also be used for a DoS attack.
@berdir mentioned in #23 that we might want to implement different backends (this is a perfect fit for Redis or MongoDB's capped collections). Views become a bit problematic in that case...
Wrong class name in the comment.
Should we create an index on this? It should speed up writes, specially in case of large datasets.
This seems a bit unfortunate name of the function to me. It is not updating the log item but marking it as resolved.
Should we index fields that can be used to sort? At least count, which is selected by default?
.
Wrong class name.
Comment #56
miro_dietikerDiscussed about pluggability.
When looking at past, we did past as API module defining the logging interface and event capture and past_db that does the db storage, plus offering the UI.
Current code is simple and database centric. As long as we are using the DB, i think we should following the core pattern / direction of using views as much as possible.
Once generalisation is done, we can reintroduce all the complexity of pluggability and configurability. For the time bing i think the only possibly healthy piece would be to rely to a StorageInterface in the EventSubscriber and not the Storage itself.
Comment #57
tduong CreditAttribution: tduong at MD Systems GmbH commentedSorry for the late update, discussions and supports took me a while...
For now, cleanuped code pointed by @shlashrsm (1/6, 3, 7). 2 and 4 has been discussed: in our case here, it is important to write/store our records fast. indexes make it faster to search tables, but longer to write to. having unused indexes (we would use it only to easily sort and select the records to cleanup) will end causing some unnecessary slow down. thus to have a fast write operation, we should not have indexes here.
As discussed, we will need the view thing first, so next: views! :D
Comment #58
tduong CreditAttribution: tduong at MD Systems GmbH commentedTiny update: implemented a StorageInterface
Comment #59
tduong CreditAttribution: tduong at MD Systems GmbH commentedSo, I've tried to find out how to provide our redirect_404 fields and filters to use in a view, but still don't get where I need to say "use redirect_404 table" without an entity.
Currently there is only the Redirect view setting, so only the redirect table is accessible, and only User can be selected as a relationship, which cannot be used to reach our redirect_404 table.
For now I've just provided Path and Language field/filter, and Add Redirect button field plugins, and a ViewsData class.
I need some help to continue working on this, but let's discussed it tomorrow :)
Comment #60
tduong CreditAttribution: tduong at MD Systems GmbH commentedFound it! Now I'm using a hook_views_data() and a wizard plugin to provide our fields and filters. Still work in progress, but here there is some first steps (not clean version).
Now I'm focusing on the operation buttons.
Maybe already one question: how can I convert a langcode (from the db) to a language to be displayed in the view ?
Comment #61
tduong CreditAttribution: tduong at MD Systems GmbH commentedYay! So now we have a nice view I think :D
Still need to clean up some code and add a test for our view filters. And the "dblog cleanups".
Comment #62
miro_dietikerLooks really nice. Some minor feedback.
I'm irritated, there is a language column in admin/content and core should provide a field handler already.
This looks non-D8-alike and risky. Plz recheck for a nice pattern.
What irritates me is that preg_replace() seems to get wrong parameters. Does this do anything real?
Comment #63
slashrsm CreditAttribution: slashrsm at MD Systems GmbH commentedIt needs to be populated.
Inject.
This shouldn't be needed. Since the key in views data matches the name of the column in DB it should work automatically.
You don't need to provide field if you're getting the default one.
ConfigurableLanguage is in language module. We either need to specify it as a dependency or find a way to make it optional. Maybe display a raw langcode if class does not exist?
Wrong comment.
Comment seems wrong.
Shouldn't we get those two from the redirect entity?
Comment #64
tduong CreditAttribution: tduong at MD Systems GmbH commentedDone points from #63.1-7.
#63.8 I need to do it like that because otherwise we will get the added redirect selected language instead of the original recorded one, which needs to match in the db.
Still need to do points of #62 after lunch and add tests.
Comment #65
tduong CreditAttribution: tduong at MD Systems GmbH commentedAdded view to be installed when the module is enabled, added test for Path and Language filter (not working now), and cleanup a bit of code. Will continue next week :)
Comment #68
tduong CreditAttribution: tduong at MD Systems GmbH commentedAdded small feature (in the 404 page, display only the languages of the paths that are still resolved = 0), improved the tests.
Comment #71
tduong CreditAttribution: tduong at MD Systems GmbH commentedAdded cron job to clean up redirect_404 rows (temporary set row_limit to 3 for test).
About the failing tests, why is testbot getting these /checkout from
Url::fromUri('base:admin/config/search/redirect/404')->toString()
?Comment #74
miro_dietikerJust a partial quick review.
This is very expensive. Note that we don't have a langcode index.
With views, it's common that options display ALL possible values and not just what has data. If you want this, you use something like facets and an index that can power it efficiently.
Nice idea, but let's switch to just list all languages.
This doesn't work.
Say the limit is 10 and 9 items have a count >1 and the 10th item has just the most recent hit with a count = 1. You would delete ALL data.
hmm? and there are many similar commented rows. :-)
Comment #75
tduong CreditAttribution: tduong at MD Systems GmbH commented1. reverted condition and fixed the test.
2. I was about to follow your hint but they said that in MySQL is not allowed to delete the table used in a subquery, so I thought to write something like that, but sure it is not fine...
3. From tests fail in #68 I thought it was because of that, so I've tried to just give the $destination without any
ltrim()
but apparently is not that either... reverted. I guess it will still fail :/Comment #78
BerdirI guess this is taken from dblog_cron(), which has a unique id which works better there.
Adding such an id would not help us, since we update records and don't have a strict order based on the ID.
Instead of row count, we could instead define an age and then delete records that were not accessed anymore since N days. Could still fill the logs if you do a huge a mount of requests over a short time, but the data we write does not use a lot of space and you'll have other problems too if you have such a high amount of requests over a short time?
That said, to be honest, I really don't care if we delete a bunch of entries a second too late or too early.
However, the logic needs to be a method in the storage service (purgeOld() or so)
About the views, views has a system for preventing to install default config if the storage for it is not present. See \Drupal\views\Entity\View::isInstallable(), if the views data is not there, the view is not installed. So what you can do is make sure that you only define your views data when the service is using our specific implementation (\Drupal::service('...') instanceof OurClassName).
Comment #79
tduong CreditAttribution: tduong at MD Systems GmbH commentedI'm closed to the solution, but there is still something trivial that leads me to more complex situation...
Since we've tried with
delete from redirect_404 where (path, langcode) in (select path, langcode from redirect_404 order by count desc, timestamp desc limit 2, 1);
and got the error
Error This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery’
so we've ended up to something that would simplify how to sort the rows and drop the irrelevant ones.
The idea now is to provide an extra field, relevancy, which value is updated based on the visit frequency of a redirect_404 record (radioactivity exponential decay concept). Given the formula
$new_energy = $old_energy * $decay
and$decay = pow(2, - $elapsed / $halflife)
(N(t) = N_0 * e^(-t/T)), in our case we have:$elapsed:
REQUESTED_TIME
- (last timestamp stored for this request)$halflife: i've set it as a day per seconds (60*60*24 = 86400), otherwise if it is smaller it decay faster, but this is tbd
We thought that this $decay should be multiplied to the relevancy of the requested record, but we want to "downgrade" all the other to "give" more relevance to this requested row, so:
$old_energy: last relevancy stored for all the relevancy != this request (1: max relevancy)
I've tried to do this in a subquery and get that value to update all the other rows in the Sequel Pro console:
but it is not working (same as delete):
You can't specify target table 'redirect_404' for update in FROM clause
...At the moment the uploading interdiff has only the new field schema and some attempts in
RedirectNotFoundStorage::logRequest()
to store/update relevancy. Once this works I'll move the cleanup logic in this service.About the view, how/where should I create that method/check without the entity ?
Comment #82
tduong CreditAttribution: tduong at MD Systems GmbH commentedOther possible way, something with MySQL triggers ?
Comment #83
BerdirI think you're overcomplicating this. I'll repeat my comment from above:
What you are describing sounds like a ton of complexity, slower writes on 404 requests and more complicated delete queries.
IMHO, your only problem in the first attempt was to sort by count first. Just don't do that. Get the oldest timestamp (timestamp is updated on each new request) after the amount of entries that should be kept, then delete everything of the same age and older.
AFAICS, there are only two problems with that:
1. We might keep 1003 records instead of the configured 1000 because record 1000-1003 happend in the same second. So what? On a site that is that busy, there are likely going to be new requests while we're doing this delete, so it will already have more anyway. Doesn't matter.
2. On a very busy site, with thousands of unique 404 requests, more important 404's with multiple requests might get deleted. I expect we'll set the default to 10k or so. That means more than 10k requests on different 404's need to happen to delete data for such a case. That's quite a lot, and if you're worried about that, just set it to 100k or so. We're storing very little data, much less than a watchdog entry, so storing 10 or 100k entries really isn't an issue.
Comment #84
miro_dietikerYeah, thx for trying to move forward with this idea @tduong.
Not sure if the energy / relevancy pattern is a good idea. I was just thinking out loud.
Here's the missing piece: The relevancy value is only updated when the row is updated. Thus, the relevancy stored is the relevancy at the time of the timestamp.
With this pattern, you usually never need to do bulk updates to those tables.
Instead you should order by calculating the current relevancy, something like:
ORDER BY pow(2, -(UNIX_TIMESTAMP(NOW()) - timestamp))/86400 DESC
IMHO the complexity looks pretty limited with this approach and we can filter out irrelevant things easily.
All other tries failed. We are open for better approaches.
(We are only using the DB to store raw data and have business logic in PHP. Triggers and stored procedures is not how Drupal usually does it. Also i don't see how it could simplify things here... ;-) )
Comment #85
miro_dietikerThat was a cross post with Berdir.
Note that when a bot like Google scans your site, every 404 page is receiving a single hit in a period.
If you have these kind of bots active and still over time cumulated items with currently very low frequency, you risk to loose them.
I agree to simplify, we can cutoff based on large limit numbers with timestamp order.
I discussed with tramy to figure out how much we can simplify the approach and if it's worth considering. If it's complicated it's not and we most likely can live with the limitation.
Comment #86
tduong CreditAttribution: tduong at MD Systems GmbH commentedCouldn't make it work even with two queries. Tomorrow I will try with the timestamp and make it work.
Comment #89
tduong CreditAttribution: tduong at MD Systems GmbH commentedFinally it works, but only using query() instead of select() and delete(). Still need to clean up some code and move the cron logic in the storage service, after lunch :P
Comment #92
tduong CreditAttribution: tduong at MD Systems GmbH commentedUh, something to mention: the relevancy values are stored wrong now because the most 404 page visited is *decreasing* instead of increasing, but I don't think that doing
$relevancy = $relevancy + $relevancy * $decay
is a good solution because it grows too fast (?)And also in the views.view.redirect_404.yml I've commented some lines because it breaks the view, but as soon as someone export the view, he/she gets these lines enabled again... Not sure how to fix it (count numeric in views data) and if doing what @Berdir suggested above will avoid this bug...
Comment #93
miro_dietikerYeah we missed the hit +1 here:
$relevancy = $relevancy * $decay + 1
Some quick reviews again. :-)
I think we should ship a better default. :-)
Double negation makes my mind melt. How about "There are no 404 errors to fix."
I would prefer a standard order of the keys.
(multiple) Missing final dots in description in schema.
Back again? Thought you confirmed removal of the language distinct query?
We should not purge on exception. And don't do a debug either. :-)
So we drop this whole class finally?
(otherwise view would need to go into optional...)
Please don't forget to add the dependency to views!
Comment #94
tduong CreditAttribution: tduong at MD Systems GmbH commentedOook, now there is something that works! :D
Discussions, several attempts and then finally fixed the cleanup cronjob, cleaned up some code, refactored points mentioned above.
About 6., it was just for me, to test/see the result (debug is too slow), and I've uploaded the patch in case there was someone that wanted to review it :P
Point 7., now I've added a @todo for that form and
listRequest()
. We will discuss it later or in a follow up :)For now I still have the view in install. Didn't check if the tests pass if it is in /optional. I'll check it tomorrow.
Unfortunately there is something wrong in the KernelTest that I've just written after the cronjob fix: somehow out
purgeOldRequests()
method inredirect_404_cron()
is not triggered. Anyone knows why and how to make it work ?(...go to watch the match :D)
Comment #97
miro_dietikerIf the table is empty or even if there are less records than limit, this looks to me like it will fail fatal. Also $select seems a bad name for a row - plus neither path nor langcode are needed here.
Comment #98
miro_dietikerIMHO the added complexity by this pattern is minimalistic. The functional part is 2 lines of code compared to timestamp sorting (plus schema lines). But we should add comments about the math so people can understand it. :-)
Comment #99
tduong CreditAttribution: tduong at MD Systems GmbH commented#97: again, it was just for testing as a "mid-solution". Sorry for uploading dirty patches ^^'
I'll post a comment with the explanation first, so next the uploading patch will be on the top of the list in the IS.
Now the Redirect 404 view works fine, and also the filters (our Language filter takes the default Drupal language filter) (see screenshots "Fix 404 pages view" and "view filters"), after assigning a redirect to a record ("add redirect") it appears in the Redirect page ("redirect"), if you look for that path in the Fix 404 page it is not displayed anymore since it is marked as resolved ("redirect assigned").
Mysql math performance information:
We've tested with ~400'000 rows through
drush sql-cli
command line.The time it took to run the select query to order by the effective relevancy at the current time was 2.69 sec (the same query with the limit condition to get the cutoff value should last more or less the same because it has to run through every row to compute the math expression anyway but it doesn't have to display each rows, so we got 1.09 sec for this), then for the delete query it took 3.06 sec (see the first 2 screenshots).
Comment #100
tduong CreditAttribution: tduong at MD Systems GmbH commentedNow this patch should be clean enough :P
- moved views.view.redirect_404.yml in /optional, and added check in
redirect_404_views_data()
to only define our views data when the service is using our specific implementation- cleaned up some codes (in RedirectFix404Form there are only array syntax refactoring)
- added comment/math explanation in the RedirectNotFoundStorage doc block, and some further comments in some methods and test
- cleaned/shortened/break in multiple lines some queries that were too long (hope it doesn't hurt the readability too much ^^')
- fixed web and kernel tests
Comment #101
Berdirit doesn't have to be optional config. there are no additional dependences, so it can't ever not be installed, except when the storage is not there, but that happens on a different level.
That's not the description of the file but from the purge old requests method.
This should simply say something like "Module file for redirect_404."
remove this.
wondering what makes more sense, having this in the redirect settings or in the loggin settings. Not sure...
use chaining to make this more readable
\Drupal::configFactory()
->getEditable()
->set()
->save()
not sure if we really need bot the view and the controller?
a comment for this would be good.
ah, it was discussed here.
I would avoid "we", try to write it from a third person (In case someone uses an alternative backend).
field plugins have a special access method, then it won't show up at all, this would give a column with empty and possibly broken operations.
You should use getLanguage() from the language manager. otherwise you might get problems with UND or deleted languages. Even then, make sure you really get a language object back, in case a language gets deleted.
not sure if having a wizard for this is worth it, only very, very few people will create additional new views.
this is not a controller: Sql implementation for ...
Might make sense to also include that part in the class name: SqlRedirectNotFoundStorage
does ORDER BY cutoff not work?
limit like this is not standard compliant I think.
it would be better to use a select() query with addExpression()
same here, even though it is safe here, never put dynamic variables directly into queries if it can be avoided. this should be easy with ->delete() and a condition?
a nicer pattern for what?
comment seems very repetitive to the one on the class and doesn't tell me much. This pattern was used in 7.x as well, not sure why we have to replace it or how that is related to redis/mongodb.
this should contain documentation about which characters are expected to be treated as wildcards.
queries should not have ; at the end.
this should use db_update(), and this doesn't seem to be a sufficient test for the relevancy stuff.
I'd expect to see a test that simulates a situtation where a record with a higher count and older timestamp would get deleted first, but the relevancy prevents that.
Also the between check is too much magic for me, I'd expect the scenario above will require multiple, specific update queries (one with count 1/relevancy NOW() - 600 and timestamp x, one with count 5 and timestamp xy - 1200 and so on.
you can use fetchField to get a single value.
might be possible to do this in redirect_404 instead. redirects are an entity which trigger an hook_redirect_insert() hook. you can listen to that and then call this there.
Comment #102
tduong CreditAttribution: tduong at MD Systems GmbH commentedPoints 1-3, 5-10, 12, and 18 done. Further feedback:
4: if we want to move this into redirect, then the not found storage service needs to be moved too, otherwise if the user does not enable the redirect_404 sub-module the service call in hook_cron() won't work, will it ? I guess it is fine to leave it there.
11: not fully sure how this magic works, but afaik without this wizard plugin we don't point the queries to any table. after past discussions with @miro_dietiker, he said we don't need an entity for redirect_404, so I'm not sure where else we can tell our view to look into the redirect_404 table.
15: from @miro_dietiker comment #62
19: how ?
Currently doing: 13-17
Comment #103
tduong CreditAttribution: tduong at MD Systems GmbH commentedMisread point 4. forget about what i wrote above.
Otherwise is there already any decision about what to do for these points left ?
In this patch:
13.-14. edited queries.
16. added comment for wildcard, but i guess i still need to improve it (?)
17. done. about the extended test, i've just updated the relevancy field in one line and afaik a test should not have too many randomised values, otherwise it hurts readability (?) also added helper methods to specifically assert which rows are there and which are not. also added the $resolved parameter in case we will need it for other tests in the future.
19. hook added, fixed related code.
Also noticed:
Instead of getting the path with
$this->currentPath
, we should use$this->requestStack
to allow storing the arguments as well (tested with /foo?foo=1 and /foo?foo=2, with the previous code only /foo is stored for both cases), but now if we have /de/foo?foo=3, it stores the langcode correctly, but in path it is stored as /de/foo?foo=3. I will check/fix it tomorrow.Comment #106
tduong CreditAttribution: tduong at MD Systems GmbH commentedUploading patch:
redirect_404_redirect_presave()
: when a redirect has been assigned, in case you had a 404 path in 'en' and you want the redirect to be in 'de', if you use the redirect entity to get the langcode you will get 'de', but you want 'en' because you ve stored that path in 'en'. with the previous code then you wont be able to mark this path to resovled because "un-matching langcode". changed it.$this->currentPath
or$this->requestStack
, we've combined both of them to get the cleaned path and append the (evtl.) URL's arguments.Comment #109
tduong CreditAttribution: tduong at MD Systems GmbH commentedDiscussed with @Berdir about #106.1: redirecting to another language does not resolve the path. if the user wants to have a redirect to another language, that should be up to the user, not our problem. thus reverted it and fixed the Fix404RedirectUILanguageTest.
About the failing tests, calling
$redirect->getSourceUrl()
then it callstoString()
which generates a different Url string for testbot. now this should be fixed (added a new method in Redirect.php to get the source url with its arguments), but i leave the debug statements in the test to evtl. see what else is the bug. thus, this is a dirty patch!Refactored Redirect404Subscriber::onKernelException() to the D8 std.
Comment #112
tduong CreditAttribution: tduong at MD Systems GmbH commentedForgot to fixed a test line.
Comment #113
tduong CreditAttribution: tduong at MD Systems GmbH commentedGreat! Now i can upload a clean patch! :D
Comment #114
BerdirLast round of cleanup.
Lets discuss this first.
dblog_row_limit is a weird name, has nothing to do with dblog. Should be redirect_404_row_limit.
Also, we still need to discuss if this should be in the redirect settings or this form.
wrong comment, no longer true. Just remove it.
Too much technical details. It's a service, someone could write a service that uses redis as storage. Then there is no table.
Suggestion: Mark a potentially existing log entry for this path as resolved.
Lets look at this wizard thing together. There has to be a better way IMHO.
I don't see one in statistics_views_data() for example
That doesn't sound like proper english:
Only log page not found (404) errors.
I still don't think this comment will work.
Otherwise to what? Either we actually have a @todo to do something, with an issue. Or we simply document it as a matter of fact.
Which would be something like this:
This is a fallback for the provided default view.
This is something that the view won't do yet, correct?
We can actually make it, by implementing access() with the same logic.
We discussed a second operation for later: Manually resolve/ignore a path.
So, lets rename this to RedirectOperations (also for machine names, keys, ..), then we can easily add more operations later.
Have a look at \Drupal\views\Plugin\views\field\FieldPluginBase::init. I think we can put this into the views data definition, then we don't need this method.
The interface is *not* SQL specific. Only our implementation is. So.. no Sql prefix.
Same here and in other cases here. As far as the interface is concerned, there is no redirect_404 table. Use a generic description instead, like 404 requests log.
we add french here and the other languages above?
there is no reason to specify every single tag here. It will break for no reason. Especially since all you do then is just look for "French" in the whole table?
What is that for exactly? If that's all you do, why not just do an assertText()?
Some of those options here don't make sense. This should only list:
* All
* unspecified (maybe labeled as "Any language")
* the actual langauges..
for implementation, it might be easiest to get the real languages and then just preprend the two others.
Same here. There's only one one table, //table/tbody/tr/td is specific enough.
Same for those additional language checks. why use this for those but assertText() for everything else?
Getting there :)
Why dont you just do 10 db_insert() queries in the first place instead of a loop and then update them?
Comment #115
tduong CreditAttribution: tduong at MD Systems GmbH commentedDone as suggested. Points 1, 6 need discussion, 13, 14, 16, 17 still need work, i ll do it tomorrow.
Comment #118
tduong CreditAttribution: tduong at MD Systems GmbH commentedFixed and cleaned code in the
Fix404RedirectUITest
andFix404RedirectUILanguageTest
, createdRedirect404TestBase
AssertRedirectTrait
which are imported in the 404 webtests, and cleanedFix404RedirectCronJobTest
kerneltest as suggested.Still have points 1 and 6 left.
Comment #119
Berdirdblog should no longer be required now.
Otherwise I think we're done, with the exception of the last two points there that we need to discuss with @miro_dietiker.
Comment #120
tduong CreditAttribution: tduong at MD Systems GmbH commentedRight, done.
Comment #121
tduong CreditAttribution: tduong at MD Systems GmbH commentedFixed last points (1 and 6), changed that docblock in the
RedirectFix404Form
and moved the cron settings into the redirect settings.Sorted! :D
Comment #122
Berdirtoo much change, you can implement this just like before, with a FORM_ID based alter. you just need to do a search replace from one form id to the other.
Comment #123
tduong CreditAttribution: tduong at MD Systems GmbH commentedSorted :P
Comment #124
tduong CreditAttribution: tduong at MD Systems GmbH commentedNow for real.. ^^'
Comment #125
BerdirOk, I'm out of things to complain about.
I think @larowlan plans to review/test this soon as well.
Comment #126
larowlanOne issue and some general observations
Redirect module still has a menu local task pointing to this, it need to be moved too. See redirect.links.task.yml
Should this typehint a new interface instead of the concrete implementation?
nit:These could all be injected
There's a hidden dependency here that could be injected too
Comment #127
miro_dietikerYay, awesome. :-)
Just for sake of completeness, potential follow-ups discussed are:
- Implement a redis backend, because it seems to be a great fit and is popular in D8 enterprise environments, at least for us.
- Add an "ignore" action on a 404 record that makes it hide permanently, even if repeating again.
- Figure out how we can get rid off the watchdog 404 entries.
Comment #128
Berdir1. Missed that, nice find. Yes, that needs to be fixed.
2. That's on purpose. The views data is tied to that specific implementation, a different implementation, even if it is sql based, might have a different table structure, then they need to provide their own views integration.
3. They could, but we're more or less just moving that form around. But yeah, lets fix it while we're touching it.
Comment #129
tduong CreditAttribution: tduong at MD Systems GmbH commentedFixed as suggested. About point 1, moved 404 related stuff from redirect.links.menu.yml as well.
Comment #132
tduong CreditAttribution: tduong at MD Systems GmbH commentedForgot to update the route name in other places..
Comment #133
BerdirInject all the things :)
Comment #134
larowlanGah, I thought this was committed and filed a new issue for an issue it creates #2780985: Ignore 404 paths longer than the maximum column length
Do we want to leave this RTBC and postpone the other one?
Comment #135
miro_dietikerHm, i would update this issue here and we can quickly review and RTBC it, so we have a healthy single feature issue.
No reply by Dave since 4 months. Hope we are still aligned with his expectations.
I also updated the issue about co-maintaining redirect: #2778701: Offer to co-maintain the redirect 8.x branch
Comment #136
larowlanOk - here is combined patch.
Comment #140
Berdirthis was something that we discussed. we can't make the key longer and keep it as primary/unique key.
Path may include query arguments, there is an argument to be made that we could cut them off. Or add an additional column and store the full path there.
Looks like the new test is missing a constructor argument?
But that's something that we can do later. For now, the most important part is to not throw an exception.
Comment #141
BerdirAh, I guess you wrote the patch against an older version?
Comment #142
toncic CreditAttribution: toncic at MD Systems GmbH commentedFixed test failing.
Comment #145
toncic CreditAttribution: toncic at MD Systems GmbH commentedAnd interdiff.
Comment #146
toncic CreditAttribution: toncic at MD Systems GmbH commentedFixed type mismatch.
Comment #147
larowlanThe patch size seems to have doubled?
Comment #148
BerdirYeah, there's a patch in there again, only checked the interdiffs...
Comment #149
miro_dietikerPlease clean your patch, it is double the size of the previous #136
You again uploaded a patch in a patch - see lang-aware-1559310.136.patch
I recommend to never use a "git add -A" or "*" and also scroll through every patch you create before you upload it.
Comment #150
toncic CreditAttribution: toncic at MD Systems GmbH commentedCleaning patch. I hope that is better now.
Comment #151
BerdirNow it looks good again.
Comment #152
miro_dietikerJust for completeness, providing interdiff to #133 when this was RTBC before.
Comment #153
BerdirWe had a problem on one site which receives a lot of invalid/weird input, including invalid UTF-8.
This resulted in a lot of exceptions. The attached interdiff adds a check to skip those. please update the patch and extend our unit test coverage for this case.
Comment #154
tduong CreditAttribution: tduong at MD Systems GmbH commentedUpdated patch including the patch above. Forgot about the test, I'll do it asap.
Discussed with @Berdir and he was considering a new feature (allow the user to set some paths to be ignored by the 404 redirect error listener) and was wondering if it should be in a followup, but he would like to have this in as soon as possible, so added this new feature + test.
Comment #157
tduong CreditAttribution: tduong at MD Systems GmbH commentedAdded a test for the UTF-8 thingy. Still need to figure out why the previous patch doesn't work and improve the test_only patch (but local test passes...).
Comment #160
Berdirwe should use the path matcher service here, $this->pathMatcher->matchPath($path_alias, $pages) as RequestPath does.
Comment #161
BerdirThe test-only patch shows by not failing that the test isn't executed at all. phpunit tests currently need to be in top-level tests folder or they are not executed by testbot.
I also think my feedback, has not been addressed yet. Also, the default config and schema must cover the existing setting too.
Comment #162
tduong CreditAttribution: tduong at MD Systems GmbH commentedDone as suggested above (comment #160).
Comment #163
tduong CreditAttribution: tduong at MD Systems GmbH commentedNew patch including the deleted file lines, since it doesn't apply otherwise.
Comment #168
tduong CreditAttribution: tduong at MD Systems GmbH commentedFixed test.
Comment #169
tduong CreditAttribution: tduong at MD Systems GmbH commentedDiscussed with @Berdir and he would like to add another feature here:
- add an "Ignore" operation in the "Fix 404 pages" page
- when "Ignore" an entry, mark this record as resolved, 'temporary page' to show what is happening (with path and langcode as query arguments), redirect to the Settings page, and add the deleted entry into the "Pages to ignore" textarea
- display a message saying that the entry has been deleted and ask to save the settings (so that the user could evtl. edit the new path to ignore)
Started implementing the controller. Still need to figure out how to add the Ignore operation in a view.inc file. I'll check it tomorrow.
Comment #170
miro_dietiker@Berdir, @tduong Is this only about static pathes as exact matches, or will we also apply globbing to match wildcards in excludes?
Comment #171
BerdirIt supports wildcards.
I think just ignorePath is enough here.
the language should not be part of the path to ignore, we strip that off from the language before we match it, no?
No, don't save it here. Pass it along as a query parameter to the settings page, like ?ignore=some/path.
Then in the form UI, read that, and append it. For example, the user might want to convert it into a wildcard or so.
Comment #172
tduong CreditAttribution: tduong at MD Systems GmbH commentedDone as suggested, feature still in progress. Renamed the controller to "Fix404IgnoreController" and it results as a new file.
Comment #173
tduong CreditAttribution: tduong at MD Systems GmbH commentedOk, fixed last leftovers here:
To be able to save the 'Path to ignore' configurations, the form names had to be changed and improved the test coverage for this 'Ignore' feature.
Comment #174
Berdirthe !empty isn't required· PHP is intelligent enough to not execute the part after || if the first condition returns true and if the first condition is FALSE, then the second is always true.
Comment #175
tduong CreditAttribution: tduong at MD Systems GmbH commentedDone.
Comment #176
BerdirTested the ignoring, the UI works well. However, the wildcard matching only worked for the first line. That's because you tried to unify the paths by removing the / on both, what you need to do is to have exactly one leading / on the incoming path.
The attached patch worked for me in manual testing. We need to check why our tests are not failing on this. Maybe we only tested the first pattern?
Comment #179
tduong CreditAttribution: tduong at MD Systems GmbH commentedThe problem is because in the test the stored paths to be ignored are without leading slash, which then don't properly match in the listener class (Redirect404Subscriber) and don't actually ignore the pages.
Fixed the test and added a check in redirect_404_logging_settings_submit() to make sure that we store the 'pages to ignore' with leading slash.
Also changed a check in Redirect404Subscriber::onKernelException() to be consistent and have the same thing as Drupal\system\Plugin\Condition\RequestPath::evaluate()
instead of "leading".
Comment #180
BerdirOne part that we are missing is hook_help().
#2725577: Improve redirect_help() updated the help text, we need to figure out a way to handle this. We could make it dynamic, and show a text to enable the optional submodule to do that if it's not enabled and the user has permission to do so. Or just move that part to redirect_404_help().
Comment #181
BerdirAlso, 99% sure this needs a reroll after my recent commits.
Comment #182
tduong CreditAttribution: tduong at MD Systems GmbH commentedRerolled and added dynamic text for redirect_404 in redirect_help():
1) submodule enabled: show description, link to "Fix 404 pages" page
2) submodule not enabled, user has permission: show description, link to "Extend" page to enable it
3) submodule not enabled, user doesn't have permission: don't show anything about "Fix 404 pages" in /admin/help/redirect
Opted for the dynamic text because otherwise we should place 2) in redirect_help() and add redirect_404_help() only to place 1) since atm we don't have/need any help text for redirect_404.
Comment #183
BerdirAs discussed, lets add a update function to enable the module.
Comment #184
tduong CreditAttribution: tduong at MD Systems GmbH commentedAdded update function to enable the submodule.
Renamed the last update function committed 2 days ago, fixed it wrapping its content in a check, as an exception message would appear (entity ID 'redirect_delete_action' already exists).
Tested locally, it works (supervised by @Berdir).
Comment #185
BerdirOk, I think this is it.
Will commit this soon.
Comment #189
BerdirOk, committed!
We've been using this for months on big NP8 sites and it has been working well for us after those initial fixes about weird broken UTF-8 characters and lenght problems. The wildcards are pretty new but well enough tested I think.
Comment #191
stella CreditAttribution: stella at Annertech for Glanbia commentedDoes the redirect_404 module have to have a dependency on the language module? I don't have a multilingual site, but I want the functionality of redirect_404 without having to turn that module on.
Comment #192
BerdirI think that's an oversight, try to just remove that dependency. We use the language manager, but that's not in language.module. Patches welcome :)
Comment #193
stella CreditAttribution: stella at Annertech for Glanbia commentedI patched the module to remove the dependency and disabled the language module, and didn't encounter any errors. Patch attached.
Comment #194
BerdirThanks, can you open a new issue for that, though? so we can propery let it be tested and credit you :)
Comment #195
stella CreditAttribution: stella at Annertech for Glanbia commentedBerdir, I did wonder what was correct protocol here. New issue created for it at #2851810: redirect_404 module has a dependency on language module