Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
I would like to be able to add multiple URLs with the same path but different url query strings to the sitemap.xml file.
Steps to reproduce
- Create a basic page with the url of
/testpage
. - Go to
/admin/config/search/simplesitemap/custom
and in the custom url area put in/testpage?id=111
and/testpage?id=222
. Check the “regenerated sitemap” as you save the page. - Validate that the sitemap has only the first URL listed from above and not the other.
Proposed resolution
- Make a patch that fixes it to accept multiple URLs with the same path but different url query strings. Treat those urls as unique.
Remaining tasks
- Create the patch
Comment | File | Size | Author |
---|---|---|---|
#7 | urls-with-different-query-3178229-7.patch | 3.26 KB | WalkingDexter |
Issue fork simple_sitemap-3178229
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #2
minorwm CreditAttribution: minorwm at Isovera for Acquia commentedComment #3
minorwm CreditAttribution: minorwm at Isovera for Acquia commentedComment #4
rgristroph CreditAttribution: rgristroph at Acquia commentedI tested this patch. I applied it, created a basic page node with the url "/aaaaa", and then I went to the custom URL portion of the sitemap settings, and I added "/aaaaa?a=aaaa" and "/aaaaa?a=bbbb" and then I regenerated the sitemap. The checkbox for checking for duplicates was not checked. After re-generation, the urls "/aaaaa?a=aaaa" and "/aaaaa?a=bbbb" and "/aaaaa" were all in the sitemap.
If the checkbox for checking for duplicates is checked, then "/aaaaa?a=aaaa" and "/aaaaa?a=bbbb" are there, and "/aaaaa" with no url arguments is not there.
Comment #5
rgristroph CreditAttribution: rgristroph at Acquia commentedReading the patch and thinking through this further, I have a couple of thoughts:
* It could check the url argument for equality instead of just the presence of the "?"
* There could be a test
* We could make this behavior configurable
* But for my use case this is working well, I need to allow a site admin to enter custom URLs that are the same except for different arguments, due to how a custom embed on some of our pages works, and then make that appear in the sitemap for google search indexing reasons, and this solves our problem
Comment #6
gbyte CreditAttribution: gbyte at gbyte commentedGood point, two URLs with different query parameters should not be seen as duplicates. This has less to do with custom links - de-duplication is used with all URL generators. I agree that #3 is not enough, but I wouldn't like to make the de-duplication feature much heavier - either this can be implemented cleanly, or a warning should be shown underneath the de-duplication checkbox informing that the feature does not work with query parameters.
Comment #7
WalkingDexter CreditAttribution: WalkingDexter at Initlab commentedComment #8
WalkingDexter CreditAttribution: WalkingDexter at Initlab commentedAlso fix a de-duplication bug for the case when there is more than one entity in the data-set (
EntityUrlGenerator::processDataSet()
).Comment #9
WalkingDexter CreditAttribution: WalkingDexter at Initlab commentedComment #10
gbyte CreditAttribution: gbyte at gbyte commented@WalkingDexter Thanks for the patch. I propose the following change:
https://git.drupalcode.org/issue/simple_sitemap-3178229/-/commit/8767f63...
Reason for using this in other generators is: Menu links can be saved with query strings as well and we don't know what contrib modules may do to content entity canonical URLs.
Reason for storing the path with the query in
['meta']['path']
instead of using the additional['meta']['query']
is just general simplicity. If you feel strongly about this being a bad idea, let me know why.Comment #13
gbyte CreditAttribution: gbyte as a volunteer and at gbyte commentedMerging it for now; feel free to object before June where I intend to release.
Comment #14
WalkingDexter CreditAttribution: WalkingDexter as a volunteer and at Initlab commentedCurrent code breaks the URL variants feature, because variants have the same system path. This problem was fixed in my original patch. However, I'm not sure about that solution. Maybe it would be better to check the final URL string?
Another thing - duplicate lines (364 and 367) in
SimplesitemapTest::testRemoveDuplicatesSetting()
.P.S. Tests are failing (not related to this issue). I tested locally with Drupal 9.3.13.
There were 2 errors:
1) Drupal\Tests\simple_sitemap\Functional\SimplesitemapTest::testEmptySitemap
Behat\Mink\Exception\ExpectationException: Current response status code is 200, but 404 expected.
2) Drupal\Tests\simple_sitemap\Functional\SimplesitemapTest::testSetEntityInstanceSettings
Behat\Mink\Exception\ExpectationException: The string "node/2" appears in the HTML response of this page, but it should not.
There were 2 failures:
1) Drupal\Tests\simple_sitemap\Functional\SimplesitemapTest::testGenerationResume with data set #1 (1000, 500, 3, array('de'))
Failed asserting that 667 matches expected 334.
2) Drupal\Tests\simple_sitemap\Functional\SimplesitemapTest::testGenerationResume with data set #2 (1000, 500, 5, array('de', 'es'))
Failed asserting that 600 matches expected 200.
Comment #15
gbyte CreditAttribution: gbyte as a volunteer and at gbyte commentedGood point about URL variants. So I reverted
removeDuplicates()
to what it was before and it is working correctly. I don't know why tests are failing now; it appears 509172800164a0467c1f9c532b9cf5c4d61d07da broke them - maybe you can have a look?Comment #16
gbyte CreditAttribution: gbyte as a volunteer and at gbyte commented@WalkingDexter
Thanks for looking into the tests. Any chance you can check why testRemoveDuplicatesSetting() is still failing? I found no problems when testing manually.
Comment #18
WalkingDexter CreditAttribution: WalkingDexter as a volunteer and at Initlab commented@gbyte I fixed the tests and improved the
removeDuplicates()
method, please take a look.Comment #19
gbyte CreditAttribution: gbyte as a volunteer and at gbyte commentedThank you, that works.