I have a number of other sites appearing in my cache folder. They are all WAP sites, such as wap.artyphoto.net and wap.artwheel.com.

They follow the normal Boost cache directory structure, e.g. cache/wap.artyphoto.net/0/index.html

This is quite worrying: I'm running on a Serverpoint VPS with a unique IP address. I haven't seen any other security issues on my site until this.

Anyone know what could be going on here? I have the boost cache permissions set to read/write for webserver user and group.

Thanks

G

Comment  File                Size       Author
#27      boost-495290.patch  988 bytes  mikeytown2

Comments

mikeytown2’s picture

Are you running a multi-site, with these other sites?

plan9’s picture

No - Just the one site.

mikeytown2’s picture

Are you an admin of the other site in any way, shape or form?

mikeytown2’s picture

Status: Active » Postponed (maintainer needs more info)
dsnydernc’s picture

I experience this as well. I have a dedicated server with many domains. Boost somehow detects the host domain of my server and creates a cache directory based on it, as well as for a few subdomains. I suspect this may have something to do with search engine crawlers reaching my site via hostdomain.com/~user rather than the actual domain of my site. This isn't an issue for me as I recognize the domain name, but I can see how it could be very alarming for someone on shared hosting to see an unrecognized domain name in the file caches. I can't be sure this is the same issue as mentioned above, but hosting administrators often use sub and parent domain naming in shared hosting environments, unnoticeable to the end user but apparent to crawlers, which appears to create file caches.

mikeytown2’s picture

@davidsnyder is this for 5.x as well? Drupal has to boot up & boost code has to run in order for this to happen; thus if this is happening, it's very bad SEO due to duplicate content.

Here's a thread about the exact opposite: #665772: How to make boost work with external pages

mikeytown2’s picture

Status: Postponed (maintainer needs more info) » Closed (fixed)

Closing all 5.x issues; will only reevaluate if someone steps up #454652: Looking for a co-maintainer - 5.x

Reason is 6.x has 10x as many users as 5.x; also last 5.x dev was over a year ago. The 5.x issue queue needs to go.

davidwhthomas’s picture

Sorry to post to a closed thread, but I solved this same problem of other mysterious cache dirs appearing in the cache folder by adding $base_url to settings.php

e.g. $base_url = 'http://www.example.com';

hth,

DT

giovassi’s picture

Version: 5.x-1.0 » 6.x-1.17
Status: Closed (fixed) » Active

I have the same problem, but in my case the directories are for spam websites like sina.com.cn! I found them while investigating why, after weeks of running smoothly at around 470 MB, my website on the VPS suddenly rocketed to almost 800 MB. I'm afraid my website is under attack by spammers; I found a dozen users on my site who are listed on Stopforumspam.com as spammers.

Can you explain to me what the function of $base_url is? Is it supposed to allow only your own website to be cached?
Thanks

mikeytown2’s picture

@giovassi
It's in settings.php

/**
 * Base URL (optional).
 *
 * If you are experiencing issues with different site domains,
 * uncomment the Base URL statement below (remove the leading hash sign)
 * and fill in the URL to your Drupal installation.
 *
 * You might also want to force users to use a given domain.
 * See the .htaccess file for more information.
 *
 * Examples:
 *   $base_url = 'http://www.example.com';
 *   $base_url = 'http://www.example.com:8888';
 *   $base_url = 'http://www.example.com/drupal';
 *   $base_url = 'https://www.example.com:8888/drupal';
 *
 * It is not allowed to have a trailing slash; Drupal will add it
 * for you.
 */
# $base_url = 'http://www.example.com';  // NO trailing slash!

If you have some way of replicating this bug, I would appreciate hearing it. All that is created is empty directories, correct?

giovassi’s picture

In the CACHE/normal folder there was just an empty folder.
In the CACHE/perm folder it looked like my website folder, with sites, modules and misc subfolders, and Boost files pointing to those websites and IPs.
I'm very upset!!

mikeytown2,
I'm asking what exactly the function of "$base_url = 'http://www.example.com'" is.
Thanks

mikeytown2’s picture

Category: support » bug

@giovassi
If you want this issue fixed, I need a lot more info than what you're providing. Technically Boost doesn't need $base_url to be set; it should work with it not set, thus this appears to be a bug. I've encountered a similar bug before and thought I took care of it, but I could be wrong. This bug should be harmless; all you will see is a lot of empty directories being created.

What version of PHP/Apache are you using?
Can I get a full directory tree of what shouldn't be there?
Are there any access logs so I can see what the URL looks like?

mikeytown2’s picture

Status: Active » Postponed (maintainer needs more info)
v1p3r1c3’s picture

I found that one website that appeared in my cache folder has an article that links to my website, which means they shared my article on their website. But I still need to check why the other sites appeared in that cache folder.

deepesh’s picture

this might be the explanation you are all looking for- http://drupal.org/node/842756

kriss683’s picture

I have a similar problem with my site. Domain names other than my own have appeared in the cache/normal folder.

I also have folders that are similar to my domain name, but not exact, appearing. For example, a folder called www.mydomain.com. (with a period after it) is in the cache/normal folder and has cached files. In addition, a folder with the IP address of my server is there.

I have global redirect installed, but it has not helped.

Any help would be appreciated.

Anonymous’s picture

The cache folder was set to 775 when I installed it, which means group write access. This could possibly be the cause of the problem on (some) shared server setups. Shouldn't it be 755 at least?

plan9’s picture

Sorry to have neglected this issue after having started it.

My situation is exactly as described by kriss683 in #17. I'm going to try setting $base_url = 'http://www.example.com' and will report back.

comat0se’s picture

Status: Postponed (maintainer needs more info) » Active

I've also got this same issue...

PHP 5.2.13
Apache/2.2.3 (Red Hat)

/var/www/html/cache/normal
drwxrwxr-x 2 apache apache 4096 Aug 4 21:02 www.qq.com
drwxrwxr-x 2 apache apache 4096 Aug 5 00:46 www.sina.com.cn

both dirs have a _.html in them

I can supply some access logs if you tell me more specifically what you need.

As a data point, I've always had the base_url set in settings.php, so that doesn't help here.

mikeytown2’s picture

What's inside _.html? Your homepage?

comat0se’s picture

Yes, it is my homepage, with all the URLs that would normally contain my ServerName filled in with the odd domain name instead.

mikeytown2’s picture

Your site is run from the sites/default directory, correct? It's a little odd that you would be getting this even with the base_url set. Long story short, Drupal doesn't care what the hostname is; if you hit the site with the "wrong" hostname it will still generate the correct output, thus Boost will cache it. Core will cache it as well; it's just put into a database table so you don't notice it.

Map your server's IP to any domain name and that domain name will show up in Drupal's cache_page table, if all you're using is the core cache. If using Boost then you get a directory named after that domain. The best solution to this is to not use the default directory in your sites folder.

comat0se’s picture

You mean is my settings.php in sites/default? If so, then yes... the site itself is run from the root httpd directory. It seems really strange that I would have served up pages for qq.com and sina.com.cn, because wouldn't that mean that the DNS for those sites was pointing to my webserver? Odd that someone else in this thread mentioned sina.com.cn too.

mikeytown2’s picture

In your hosts file, if you put

127.0.0.1 sina.com.cn

assuming 127.0.0.1 is the IP of your server, drupal will process the request fully. I have no reliable way to tell if this is a fake or real request; I could make a whitelist, but core doesn't deal with this issue.

Anonymous’s picture

How about checking whether the file requested comes from the domain the Drupal installation is on? One could use the URL specified in settings.php for the check, or set one in the module. If the complete URL does not match the domain and the path, then it does not get added to the cache folder.
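
For illustration only, here is a minimal sketch of that kind of check, written as a fragment for settings.php. It is not the patch attached to this issue and not part of Boost. The `$allowed_hosts` list is a hypothetical name that would have to be maintained by hand, and the choice of a 400 response is an assumption. Because the request is refused before Drupal builds a page, neither Boost nor the core page cache should get anything to store for an unknown hostname.

```php
// Fragment for settings.php (sketch, not part of Boost or Drupal core).
// Hypothetical whitelist: the hostnames this site is expected to answer to.
// If the site runs on a non-standard port, list the host with its port too.
$allowed_hosts = array('www.example.com', 'example.com');

// HTTP_HOST is supplied by the client, so treat it as untrusted input.
// The comparison is exact, so variants such as "www.example.com." are refused.
$host = isset($_SERVER['HTTP_HOST']) ? strtolower($_SERVER['HTTP_HOST']) : '';

// Skip the check on the command line, where there is no Host header.
if (PHP_SAPI !== 'cli' && !in_array($host, $allowed_hosts, TRUE)) {
  header('HTTP/1.1 400 Bad Request');
  exit;
}
```

The trade-off is that every legitimate alias has to be added to the list, otherwise real visitors arriving through it are turned away.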

mikeytown2’s picture

Version: 6.x-1.17 » 6.x-1.x-dev
Status: Active » Needs review
Attached file: boost-495290.patch, 988 bytes

This patch requires the latest dev. This only works if $base_url has been set in settings.php.

deepesh’s picture

With reference to a similar issue I posted here, http://drupal.org/node/842756, I would also like to point out that in my case Boost is not obeying its cache directory setting and is writing to the Drupal root.

Anonymous’s picture

Title: Other sites appearing in boost cache folder » Other sites appearing in boost cache folder or root folder Drupal installation

@deepesh

Is there any way for you to test the patch in a dev environment?

comat0se’s picture

For the record, I haven't added the Chinese websites to my hosts file... something else is going on.

mikeytown2’s picture

Someone else can though. If I know your IP I can add it to my hosts file and get the same effect.

deepesh’s picture

"Is there any way for you to test the patch in a dev environment?"

Sadly no. I guess the bug can be reproduced by searching a Drupal site as described in my thread and seeing whether it produces the same effect.

mikeytown2’s picture

Status: Needs review » Active

Committed #27.
Leaving this open because there is probably more that needs to be done.

Terko’s picture

Hello!
I had the same issue 3 weeks ago. Visitors from a social network site told me that I had a virus in my pages; their antivirus had started to alert them. I found that an intruder had somehow injected a malicious script into my cached pages. I stopped Boost for a while and almost forgot about it.
The permissions of my cache folders are 777. I run php-fpm and nginx.

mikeytown2’s picture

teri@uhaaa.com
Can I see the script's contents? Use the contact form to send me more details, like one of the bad pages from the Boost cache and the URL that was called, if you know it.
http://drupal.org/user/282446/contact

Terko’s picture

I deleted them, but I will try to find out which virus it was. I vaguely remember that it was some worm, maybe called Wordpress.... I will write again soon.

Terko’s picture

The virus was HTML:Iframe-inf, according to one of my users, for this page of my site (only for reference):
http://uhaaa.com/rakata-na-buda
I saw the JavaScript injected in the cached copy of the page. When I cleared the cache, the same user told me that the worm was gone. He alerted me because his antivirus program alerted him. It's strange, but on the social network site where people discussed the link, this user reported the worm while other users said that their antivirus was silent.

mikeytown2’s picture

#764494: boost-gzip-cookie-test redirection
Sounds like a false positive from this issue that I've now fixed in the latest dev. Iframes are not reliable so I switched over to ajax.

Danny Englander’s picture

Subscribing

mikeytown2’s picture

@highrockmedia
Do you have anything to add? From what I can tell, this is simply how Drupal works. What files are in the other dirs? Also read #23.

Danny Englander’s picture

mikeytown2, no, I don't have anything to add. I was interested in the issue so I subscribed. Cheers.

plan9’s picture

Just reporting back - as the original poster - that enabling base URL has definitely fixed the problem for me. I'm running Drupal 5.x from sites/default directory on a VPS (this was opened as a 5.x issue).

I don't know if it's related, but I also had an issue with expired cache files not being deleted on cron, and this seems to be fixed now as well.

Does having Base URL enabled require anything to be added to the .htaccess file?

I'm guessing not...

hanoii’s picture

I also noticed this. One thing that I am not sure was clear in the readme, and that also led to more domains, is that the Boost .htaccess rules should go below the www redirection. That's right, isn't it?

Anyway, I will monitor this issue, but what about the patch: does it make sense to have it? I even saw a www.yahoo.com there; how it got there, I have no idea!

bgm’s picture

Status: Active » Fixed

Per #42, seems like it is resolved, so I am closing the issue.

From reading the other comments, it seems like these are Apache servers with a default vhost pointing to their Drupal site ("apachectl -S" should confirm this). So unless your base URL is explicitly set, this is how Drupal works (cf. #23).

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

tebdilikiyafet’s picture

Is this the reason why users can access my site via:

www.mydomain.com
pop.mydomain.com
mail.mydomain.com
smtp.mydomain.com

Google bots have also indexed these pages.

Anonymous’s picture

Because your web host/ISP has set up all the domain names to point to a single IP address, and your web server is configured to send all the requests for that IP to a single folder that holds your Drupal installation.

tebdilikiyafet’s picture

So how can I solve this problem? I use DirectAdmin and my DNS configuration is the same as others'. You can see my DNS configuration here

Anonymous’s picture

You can use a rewrite rule to redirect all of the domains to a blank folder. Or redirect all of the domains that you don't want cached to your main web domain which would at least keep whatever google has already stored. There's an example in the normal drupal .htaccess for redirecting www which can be modified but this is not a boost issue.
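
As a rough alternative to the rewrite rule, the same redirect can be sketched in PHP as a settings.php fragment. This swaps the .htaccess approach described above for a PHP one and is only an illustration: `$canonical_host` is a hypothetical name, and the sketch assumes the site is served over plain http on the default port.

```php
// Fragment for settings.php (sketch; a PHP stand-in for an .htaccess rewrite rule).
// Hypothetical canonical host: requests for any other hostname are sent here.
$canonical_host = 'www.example.com';

// HTTP_HOST comes from the client; compare it case-insensitively.
$host = isset($_SERVER['HTTP_HOST']) ? strtolower($_SERVER['HTTP_HOST']) : '';

// Skip the redirect on the command line, where there is no Host header.
if (PHP_SAPI !== 'cli' && $host !== $canonical_host) {
  // REQUEST_URI keeps the path and query string, so deep links survive.
  $uri = isset($_SERVER['REQUEST_URI']) ? $_SERVER['REQUEST_URI'] : '/';
  // Assumes plain http; a permanent redirect lets search engines drop the aliases.
  header('Location: http://' . $canonical_host . $uri, TRUE, 301);
  exit;
}
```

With something like this in place, pop.mydomain.com/some/page would answer with a permanent redirect to the same path on the canonical host instead of rendering a duplicate copy of the page.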

gittosj’s picture

Ok - I'm getting the same problem as the OP again in my cache directory. I've deleted everything in the directory, since whatever it was seemed to be causing my version of Apache to segfault.

I'd guess it has something to do with the vulnerabilities mentioned here, but I've turned off Boost until I understand quite what's happening.

gittosj’s picture

Version: 6.x-1.x-dev » 7.x-1.0-beta2
Status: Closed (fixed) » Active
Anonymous’s picture

Could you provide more details? Also, have you set the $base_url variable? Do you have any proxy module enabled? What are the contents of the cached pages that you mention? Please see comment #46 and below.

gittosj’s picture

No - you're right, and it was my mistake. $base_url was not set on the parent site (there are a couple of child multi-sites). I've now corrected that and will let you know.

The domains and pages stored in the cache are weird. The domains have names like smarkincentivetravel.com and www.youdown.me, all of which resolve to real pages. Within the cache are cached copies of the pages of my main (parent) site, but with injected links to JS scripts on the pages of the domain site. I'm sure someone brighter than me can work out what these are meant to do, but I presume it's some sort of malware redirect. None of the rest of my site has been touched afaik, since it is all locked down with SELinux etc.

I'll let you know if I have more problems or see anything interesting, so I'm setting this to closed for the moment, and thanks for the help. I presume it's something to do with this:
http://drupalscout.com/knowledge-base/your-drupal-site-pretending-be-ano...

gittosj’s picture

Status: Active » Closed (fixed)
Anonymous’s picture

Could you email me personally with the URL of one of those cache files? I'd like to examine it. Boost is just caching the site's output, and though the article you link to states

"other caching mechanisms may not be as robust against this problem."

I suspect that one of the modules including JavaScript needs looking at, as it could be a security issue that needs to be brought to someone's attention.

jvieille’s picture

I have this problem too.
There is only one Drupal site, installed as multi-site: one actual folder and one redirect (not using default).
Boost is caching all (non-Drupal) sites of the server.

Base_url is set
The Drupal site uses a different IP address from the other sites.
I tried to whitelist the Drupal site with no effect.

What can I do ?
Thanks for help

jvieille’s picture

Status: Closed (fixed) » Active

had to re-open it

suit4’s picture

Same here.
Today I found lots of other folders in the Boost cache pointing to other sites.
I think this is the result of bots or scripts accessing non-existent URLs on the site, thus triggering Boost to cache the requested page result.

$base_url is set in settings.php

If I can provide more information, feel free to ask.

jvieille’s picture

I found my problem:
Apache considered the problematic Drupal site as the default site for its dedicated IP address.
I suppose that Google bots found their way to many urls of other sites accessible from that IP in the context of the default drupal site.
There was also a misconfigured server that had an alias that was also set for the Drupal site.

The resulting erroneous crawling populated the boost cache.

I added a dummy virtual server, alphabetically lower than the Drupal site, for the IP and removed the wrong server alias, which seems to have fixed the issue.

jvieille’s picture

Category: Bug report » Support request
Priority: Critical » Normal

Does not seem to be a bug, more a support discussion