Hello,

With the current state of the module, it is impossible to have the settings for a bundle set to "not index" and then the possibility to index per content.

I will upload a patch for that.

I currently don't have the time to make a better patch.

CommentFileSizeAuthor
#45 simple_sitemap-entity_not_indexed_by_default-2887373-45.patch6.45 KBkbrodej
#44 simple_sitemap-entity_not_indexed_by_default-2887373-44.patch6.36 KBethomas08
#42 simple_sitemap-entity_not_indexed_by_default-2887373-42.patch6.51 KBmarco-s
#37 simple_sitemap-entity_not_indexed_by_default-2887373-37.patch7.27 KBqubel
#36 simple_sitemap-entity_not_indexed_by_default-2887373-36.patch6.75 KBqubel
#23 interdiff-2887373-23-22.txt1.42 KBGrimreaper
#23 simple_sitemap-entity_not_indexed_by_default-2887373-23.patch5.88 KBGrimreaper
#22 simple_sitemap-entity_not_indexed_by_default-2887373-22.patch6.59 KBGrimreaper
#19 interdiff-2887373-19-17.txt11.21 KBGrimreaper
#19 simple_sitemap-entity_not_indexed_by_default-2887373-19.patch6.7 KBGrimreaper
#17 simple_sitemap-entity_not_indexed_by_default-2887373-17.patch7.4 KBGrimreaper
#4 simple_sitemap-allow_entity_override-2887373-4.patch3.83 KBGrimreaper
#2 simple_sitemap-allow_entity_override-2887373-2.patch896 bytesGrimreaper
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Grimreaper created an issue. See original summary.

Grimreaper’s picture

Assigned: Grimreaper » Unassigned
Status: Active » Needs review
FileSize
896 bytes

Here is the patch, thanks for the review.

gbyte’s picture

Component: User interface » Code
Priority: Normal » Minor
Status: Needs review » Needs work

After applying your patch, entities marked to be indexed but belonging to non-indexed bundles are not added to the sitemap. The reason is in StimeapGenerator.php on line 176:

        foreach ($bundles as $bundle_name => $bundle_settings) {
          if ($bundle_settings['index']) {
            $data_sets[] = [
              'bundle_settings' => $bundle_settings,
              'bundle_name' => $bundle_name,
              'entity_type_name' => $entity_type_name,
              'keys' => $keys,
            ];
          }
        }

So only bundles are added to $data_setswhich are to be indexed. Later on in EntityUrlGenerator.php, getBatchIterationEntities() does not fetch the entities of non-indexed bundles.

I do not see many use cases for this feature, hence marking low priority. I will gladly commit it when we get this to work though.

Grimreaper’s picture

Status: Needs work » Needs review
FileSize
3.83 KB

Hello,

Thanks for the review here is another patch.

The use case is to have indexation enabled for a bundle but the entities not indexed by default.

gbyte’s picture

Thanks for the patch. I have read through it without applying it and I do not understand why you are implementing a new parameter "active" considering the parameter "index" is already in place? The "index" parameter is supposed to define if an entity type is to be indexed, why the need for another parameter?
Tests also need to pass or need to be adjusted.

gbyte’s picture

Status: Needs review » Needs work
Grimreaper’s picture

Hello,

Thanks for your reply.

Tell me if I am wrong.

The goal is to have the possibility to enable the indexation per bunble of enabled entity type. And for each entities (node, taxonomy term, etc.) where the indexation is enabled, having the possibility to have the entity not indexed by default. With a priority of 0 the entity is still present in the sitemap.

As I had very little time to update the patch I end up with this patch.

I totally agree that the tests needs to pass before the patch will be commited. But I don't know when I will have time to update the patch.

gbyte’s picture

Status: Needs work » Closed (works as designed)

Tell me if I am wrong.

The goal is to have the possibility to enable the indexation per bunble of enabled entity type. And for each entities (node, taxonomy term, etc.) where the indexation is enabled, having the possibility to have the entity not indexed by default

The above is actually already implemented; let me check if we are on the same page: You can enable the sitemap settings for entity types and then enable indexation for certain bundles of that entity type. Then you can override this setting for a single entity of that bundle.

With a priority of 0 the entity is still present in the sitemap.

The priority doesn't have anything to do with whether the links end up in the sitemap. That's just info for search engines.

In regards to the feature of having bundles set to 'no index' and then the possibility to override this on a per-entity basis, this would be doable but would require some major refactoring and UI changes. I don't think there is enough interest from the community for me to implement this, but I am happy to review working patches.

Grimreaper’s picture

Title: Allow to have entity not indexed by default » Allow to have entity indexed by default
Category: Feature request » Bug report
Priority: Minor » Normal
Issue summary: View changes
Status: Closed (works as designed) » Active

Hello,

Since my last comment there has been some feature change (the behavior is almost the opposite as it was). The module does not behave like it did when I opened the issue.

Changing the title and the description accordingly.

I just made a test on a fresh Drupal 8.4.2 install with the last dev version of the simple XML sitemap module.

Steps :

  1. Enable simple XML sitemap module
  2. Go to the basic page add form => do NOT see the simple XML sitemap fieldset on the right: OK
  3. Go to the basic page content type edit form
  4. Enable indexation for this bundle
  5. Go to the basic page add form => SEE the simple XML sitemap fieldset on the right: KO, default value is "not indexed" (one good point is that now even if the bundle is set to indexed, you can remove individual entities)
  6. Go to the basic page content type edit form
  7. Disable indexation for this bundle
  8. Go to the basic page add form => SEE the simple XML sitemap fieldset on the right: KO (before if the bundle was not indexed, there wasn't the fieldset for each individual entity of the bundle), default value is not indexed

So now I see two points:

  1. If the bundle is set to not indexed, the simple xml sitemap fieldset must not be present on the node form
  2. if the bundle is set to indexed, the default value set for the bundle must be used on the node creation form
gbyte’s picture

@Grimreaper

The functionality we are discussing here has not changed a bit since you created this issue.

However you just found a bug I introduced in the last couple of commits. If you go a few commits back you will see everything is working as it used to. This is due to some refactoring I've been doing while preparing for a new release.

Thanks a ton for catching this, apparently no test covers this UI element. Will take a look now.

gbyte’s picture

Title: Allow to have entity indexed by default » 'Indexed' checkbox on entity/bundle edit pages wrong value
Assigned: Unassigned » gbyte

  • gbyte.co committed 900c08e on 8.x-2.x
    Issue #2887373 by Grimreaper: 'Indexed' checkbox on entity/bundle edit...
gbyte’s picture

Status: Active » Fixed

That's fixed.

As this issue came to life yesterday in the dev version and now it's fixed, I don't think we need a record of it, so feel free to revert the title/opening post/status/category of this issue to what it originally was.

Grimreaper’s picture

Hello,

Thanks @gbyte.co for your reply and the quick fix.

Here is my test result with the last dev version.

Steps :

  1. Enable simple XML sitemap module
  2. Go to the basic page add form => do NOT see the simple XML sitemap fieldset on the right: OK
  3. Go to the basic page content type edit form
  4. Enable indexation for this bundle
  5. Go to the basic page add form => SEE the simple XML sitemap fieldset on the right: OK, default values are the default values from the content type edit form.
  6. Go to the basic page content type edit form
  7. Disable indexation for this bundle
  8. Go to the basic page add form => do NOT see the simple XML sitemap fieldset on the right: OK

So now I see one point which is the original feature request.

Would it be possible to have the bundle indexable, but having the entities not indexed by default?

Do you want me to create a new issue?

gbyte’s picture

Assigned: gbyte » Unassigned
Category: Bug report » Feature request
Priority: Normal » Minor
Issue summary: View changes
Status: Fixed » Closed (works as designed)

Reverting the issue to its original topic and values. I am closing this for now because as mentioned earlier, I don't think there is enough interest from the community for me to implement this. I am however going to review and possibly commit patches.

gbyte’s picture

Title: 'Indexed' checkbox on entity/bundle edit pages wrong value » Override a non-indexed bundle on a per-entity basis
Grimreaper’s picture

Status: Closed (works as designed) » Needs review
FileSize
7.4 KB

Hello,

Here is a patch to allow to have a bundle indexed but the entities not indexed by default.

It also makes usage of CSS classes for the CSS.

Unfortunately, I ran into memory limit errors in my local environment when running the automated tests. Even at 2GB memory. So let's trigger the Testbot and I will fix errors.

gbyte’s picture

Status: Needs review » Closed (works as designed)

Thanks for your patch. I've tested it, unfortunately it didn't work as expected.

  • Unticking 'Index by default' does nothing. All entities of that bundle are still getting indexed. There seems to be still logic missingin EntityUrlGenerator::getDataSets(). EntityMenuLinkContentGenerator::getDataSets() would have to be adjusted as well.
  • UI is a bit confusing, as there now are 'index' and 'index by default' checkboxes. Maybe there should be something like 'Enable for bundle' and 'Index by default'.

Having said all that, I am still not sure if this feature should be implemented. The use case for having entities of a bundle not indexed by default is rare. But I will look into it if we have a working patch.

Grimreaper’s picture

Hello,

Here is new patch. There was a problem of not detecting default values. Also the previous patch was not handling existing config and entities.

Notes:

  1. Update path to be done for existing websites: but it will be done when we will agree of the solution
  2. Updating tests (if it fails): I can't run simpletest tests on my environment, opening a new issue to convert tests to PHpUnit is required before

The use case for having entities of a bundle not indexed by default is rare

I totally agree that it will be a rare case, that's why default value is to have bundle indexable. But I have the case with a client so that's why I am making patches. Don't worry that if I won't have the case I won't be losing my (free) time making patches on this issue...

Status: Needs review » Needs work

The last submitted patch, 19: simple_sitemap-entity_not_indexed_by_default-2887373-19.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

Grimreaper’s picture

Opening issue to convert tests to PHPUnit tests.

Grimreaper’s picture

Rebasing patch from comment 19.

Tests does not pass.

I will upload a new patch with tests passing soon.

Grimreaper’s picture

Assigned: Grimreaper » Unassigned
Status: Needs work » Needs review
FileSize
5.88 KB
1.42 KB

And here is a new patch.

Grimreaper’s picture

Assigned: Unassigned » Grimreaper
Status: Needs review » Needs work

I have a problem locally.

A content that should be indexable and not indexed by default does not have the fieldset on the node form.

Grimreaper’s picture

Assigned: Grimreaper » Unassigned
Status: Needs work » Needs review

OK the problem I had when manually testing was that on a content type form (admin/structure/types/manage/page), due to https://cgit.drupalcode.org/simple_sitemap/tree/src/Form/FormHelper.php#...

    Simplesitemap::supplementDefaultSettings('entity', $settings, ['index' => 0]);

The default values were "indexable" and "not index", so as it matched default form values, no settings were saved for this bundle so when simple_sitemap/src/Simplesitemap.php->getBundleSettings() was triggered it returned FALSE. Maybe if the settings does not exist, it should return default bundle settings.

But the "['index' => 0]" in:

    Simplesitemap::supplementDefaultSettings('entity', $settings, ['index' => 0]);

does not display default bundle settings to the webmaster when settings its content types. So it can cause confusion I think.

The last submitted patch, 22: simple_sitemap-entity_not_indexed_by_default-2887373-22.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

Grimreaper’s picture

Hello,

@gbyte.co: can you please review patch in comment 23?

gbyte’s picture

Status: Needs review » Closed (works as designed)

Hi Grimreaper,

sorry for getting back to you this late. The number of subscribers to this issue tells me that this is a very esoteric use case. This is why I have decided to leave it out of the 2.x branch of the module which just received its final feature release. I am now focusing on branch 3.x and #2961233: Allow for multiple sitemap types on one Drupal instance. I do not plan to add this feature to 3.x either, but you are welcome to port this patch to 3.x for others to benefit from your work.
I Hope you will understand.

gbyte’s picture

Grimreaper’s picture

Hello gbyte.co,

I don't think estimating the interest in a feature by checking the number of followers on the issue is a good approach.

For Core, surveys and studies are made to ask people what they want.

As for bugs, for one person reporting a bug, there are a lot of other people that says nothing and they got the bug. Same for feature requests.

Also when settings on your drupal.org user account notification for all issues on a project, you don't appear in the followers.

If I had written those patches it is because I have a client that needs this feature, so I made a first patch on my paid time. I didn't created the feature request without reason.

And as I want patches merged (to have it integrated and other people can benefit from it and no need to update the patch constantly), I have made other patches on my free time.

Since I can't convince you of the usefulness of this feature request and as I have other things to do on my free time, I won't made any more patch about that and so no more free time and energy loss. If my client want the module updated, I will made patches on my paid time and keep it in the client project.

gbyte’s picture

Status: Closed (works as designed) » Closed (won't fix)

Hey Grimreaper,

I do believe that in a project of this size, the community voices their interests through participating in an issue, like you did. While I appreciate the work you have put into this patch, I have not heard another voice confirming the usefulness of this feature. Furthermore, I did not find the end result elegant. This is not because of your coding, but probably because of the constraints of the current data model. On top of that the patches did not work for me as expected and required additional time on my part.

Regarding the last bit, I believe I know what it means to use my free time in order to keep this project relevant, as I am not getting paid to participate in any of the issues in this project. I also believe I am doing a good job considering there are no open bug reports. I am sure you can appreciate, that being he only maintainer, you have to pick your battles and do what seems to be most relevant to most.

Having said this, the project benefited from you patching the tests and I am thankful for this contribution!

In regards to updating this patch for your client, there should be no struggle as long as you stay on the 2.x branch. If no critical bugs get discovered, 2.12 will have been the last release for 2.x.

Grimreaper’s picture

Hello gbyte.co,

I completely agree with your points.

Maintaining a project is hard between managing your time (paid or not) and satisfying users' requests or not if it will negate the project in the long term.

And thanks for your positive feedbacks.

rbayliss’s picture

Just wanted to voice a +1 for this feature. We would like to switch a large site from XMLSitemap to this module, and this is going to be necessary for us to make the transition.

gbyte’s picture

Thanks for your input. Could you please elaborate on the why? Just so we are on the same page in terms of the necessity of this feature.

rbayliss’s picture

Sure, yeah. Our situation is that we have a single content type that for the most part we don't care whether Google indexes it or not, but certain pieces of content are much higher value. So we would like the default to be not to include it in the sitemap, with an option to include it. Using XMLSitemap, we can do it. With Simple Sitemap, it's currently impossible.

Version isn't terribly important to us since we don't have any existing config - we're fine with using 8.x-3.x as long as there's a timeline for getting that version to a stable release.

qubel’s picture

Hello,

Updating the patch to be appliable on the 8.x-3.x branch.

I know that the issue is closed but we still have the need on the project.

Thx for the review.

Regards, Quentin.

qubel’s picture

Sorry I forgot the modification in schema.yml file.

Regards,

drup16’s picture

Priority: Minor » Normal
Status: Closed (won't fix) » Needs review

Looks like there is a patch that can be reviewed by the community. I don't agree that number of followers is a good indication of the impact to the community. Some people may also look the other way because it is marked as closed (won't fix).

At the end of the day, it seems like the community is asking for a default setting as opposed to completely locking down the content type, which is the current problem.

comment 35 describes our use case as well.

Status: Needs review » Needs work

The last submitted patch, 37: simple_sitemap-entity_not_indexed_by_default-2887373-37.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

jennypanighetti’s picture

I agree, and I am surprised this issue hasn't been resolved yet. Here's my problem.

Default content type setting: "include items in sitemap"
-- individual nodes can be set to be excluded

Default content type setting: "exclude items from sitemap"
-- individual nodes CANNOT be set to be included, despite the actual widget specifically saying "Settings for this entity can be overridden here."

If you want to make this editable on a per-node basis, then you have to make it editable no matter what the default setting is.

kbeck303’s picture

I have tested the patch in comment #37 running Drupal 8.7.6

I also added granular permissions with a patch on this other issue https://www.drupal.org/project/simple_sitemap/issues/3033792

marco-s’s picture

I've updated the patch to be applicable for 3.5.0. #37 didn't work for me.

orlando.thoeny’s picture

When the sitemap gets updated by cron runs, only the frontpage, and entities of a non-indexed bundle, where a per entity override is in place are indexed.
Very likely connected to this patch, because the issue didn't occur previously.
Also tests are failing.

ethomas08’s picture

Re-rolled patch for D8.8.0 core and 3.7.0 simple_sitemap release.

kbrodej’s picture

Re-rolled patch. Did not notice issues reported in #43

osopolar’s picture

The UI for this fix (workaround) is not intuitive to me. I would have expected something like:

If the Simple XML Sitemap bundle setting "Do not index entities of type @bundle in variant Default" is selected,
the sitemap entity settings radio-buttons shouldn't be disabled anymore to allow the selection of "Index this @bundle entity in variant Default".

Probably this would have required some mayor refactoring as mentioned by gbyte at #8, otherwise it would be much cleaner. At least there should be a description to explain how this Option "Enable indexing for this bundle" option works and what is the difference compared to "Index entities of type @bundle category in variant Default".

gbyte’s picture

Version: 8.x-3.x-dev » 4.x-dev
EmersonWeb’s picture

For what it is worth -- we have a lot of interest in this functionality.

Here is a specific use case that without a viable solution/patch, a lot of extra manual work is created and we are exposed to an SEO penalty for thin content.

We mantain a faculty/staff directory which has nodes created and unpublished via an integration with Workday. That directory allows for users to submit bios. If a faculty/staff member doesn't have a bio, their node essentially is a blank page. This is regarded as "thin" content by Google and an excess of these could negatively affect our SEO.

We could solve this very easily by defaulting the content type's inclusion in a sitemap to NO and then when users add their bios, manually allowing that inclusion. But this module only works in the reverse. You include by default and then manually remove.

The net effect of this is to have to monitor an automated process constantly for new data -- in effect defeating the purpose of the automation in the first place.

Hopefully you'll reconsider this in your 4.x development.

chucksimply’s picture

Any updates here? Allowing individual entities to be indexed even if the bundle is excluded is quite a common need. Seems like the latest patch breaks module functionality.