Problem/Motivation

Fix robots.txt to allow search engines access to CSS, JavaScript and image files

Steps to reproduce

View robots.txt: it allows CSS, JS and image files from the core and profiles folders, but not from the modules and themes folders.

Proposed resolution

Allow CSS, JS and image files under the modules and themes folders:
#modules
Allow: /modules/*.css$
Allow: /modules/*.css?
Allow: /modules/*.js$
Allow: /modules/*.js?
Allow: /modules/*.gif
Allow: /modules/*.jpg
Allow: /modules/*.jpeg
Allow: /modules/*.png
Allow: /modules/*.svg

#themes
Allow: /themes/*.css$
Allow: /themes/*.css?
Allow: /themes/*.js$
Allow: /themes/*.js?
Allow: /themes/*.gif
Allow: /themes/*.jpg
Allow: /themes/*.jpeg
Allow: /themes/*.png
Allow: /themes/*.svg
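
For anyone reviewing the rules above, here is a minimal sketch of how such patterns are commonly interpreted. It assumes the wildcard semantics described in RFC 9309 (`*` matches any run of characters, a trailing `$` anchors the end of the path, anything else is a prefix match). The helper name `rule_matches` and the sample paths are hypothetical, and individual crawlers may behave differently.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path.

    Assumed semantics (RFC 9309 style): '*' matches any run of
    characters, a trailing '$' anchors the end of the path, and
    without '$' the pattern only has to match a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces, then join them with '.*' for each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match only matches at the start of the path.
    return re.match(regex, path) is not None


# Hypothetical paths, for illustration only.
print(rule_matches("/modules/*.css$", "/modules/contrib/foo/css/foo.css"))
print(rule_matches("/modules/*.css$", "/modules/contrib/foo/css/foo.css?v=1"))
print(rule_matches("/modules/*.css?", "/modules/contrib/foo/css/foo.css?v=1"))
```

Under these assumptions, the `$` rule covers the bare file and the `?` rule covers the same file when a cache-busting query string is appended, which would explain why each extension is listed twice.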

Remaining tasks

User interface changes

Introduced terminology

API changes

Data model changes

Release notes snippet

A similar issue was fixed in Drupal 7: https://www.drupal.org/project/drupal/issues/2364343

But that fix also did not include sites/all/modules or sites/all/themes (the path for contributed and custom modules/themes in Drupal 7).
Was that done for security reasons?

Issue fork drupal-3489435


Comments

hash6 created an issue. See original summary.

cilefen

Version: 10.5.x-dev » 11.x-dev
Category: Bug report » Task
Status: Needs work » Active

This seems to be a task or a feature request rather than a bug. robots.txt is editable by site operators.

saurav-drupal-dev

Assigned: Unassigned » saurav-drupal-dev

Working

anmolgoyal74

Issue tags: +DrupalCon Singapore 2024, +Bug Smash Initiative
Attached file (status: new, size: 239.86 KB)

I think the CSS, JS and images are already allowed to be crawled. In D7, the modules, themes, profiles and misc directories are not allowed to be crawled, which is why the Allow rules were added to the D7 robots.txt. I think we can close this.

https://git.drupalcode.org/project/drupal/-/blob/7543eae758df91a217e6464...

screenshot of D7 robots.txt

binnythomas made their first commit to this issue’s fork.

stborchert

Issue tags: -DrupalCon Singapore 2024 +Singapore2024

saurav-drupal-dev

Status: Active » Needs review

I have updated the robots.txt file; please review.

saurav-drupal-dev

Assigned: saurav-drupal-dev » Unassigned

smustgrave

Status: Needs review » Needs work
Issue tags: +Needs framework manager review

I think we need some research first. Has anyone checked the history to see why these were added? Also, this seems like something a project can do itself. There's also https://www.drupal.org/project/robotstxt

poker10

Status: Needs work » Closed (cannot reproduce)

robots.txt does not seem to block the /modules or /themes directories; see: https://git.drupalcode.org/project/drupal/-/blob/11.x/robots.txt?ref_typ... . This differs from the D7 behavior, which blocks these directories by default.
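
To illustrate the difference, here is a small sketch using Python's standard `urllib.robotparser`. The two rule lists are abbreviated, hypothetical excerpts in the style of each version, not verbatim copies of the shipped files, and the paths are made up. Note that `urllib.robotparser` does plain prefix matching and does not interpret `*` or `$` wildcards, so only directory-level rules are used here.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_lines, path, agent="*"):
    """Parse robots.txt lines and report whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, path)


# Hypothetical D7-style excerpt: /modules/ and /themes/ are disallowed.
D7_STYLE = [
    "User-agent: *",
    "Disallow: /modules/",
    "Disallow: /themes/",
]

# Hypothetical excerpt in the style of newer versions: no rule mentions
# /modules/ or /themes/, so those paths fall through to the default (allowed).
NEWER_STYLE = [
    "User-agent: *",
    "Disallow: /core/",
    "Disallow: /profiles/",
]

print(can_crawl(D7_STYLE, "/modules/node/node.css"))
print(can_crawl(NEWER_STYLE, "/modules/contrib/foo/css/foo.css"))
```

A path that matches no Disallow rule is crawlable by default, so explicit Allow rules are only needed to carve exceptions out of a disallowed directory.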

I agree with #4 that there is probably nothing needed here. Feel free to reopen if you still think anything is needed, but that will need an issue summary update. Thanks!