When very long text is entered in the body of a node and the input format includes the Line break converter filter, the content is not displayed in the View tab, although it is still available in the Edit tab.
This seems to happen inside the "_filter_autop" function.
Steps to reproduce:
1- Go to Create content > Page (or any other node type)
2- Enter a title
3- For the body, enter text that is longer than 40000 characters
4- Submit
5- Now in the view tab, body is not displayed
If you edit the body and enter less content (under 30000 characters), it is displayed.
My configurations:
- Windows XP
- Apache 2.2.3
- PHP 5.2.0
- Drupal 5.1
Comment | File | Size | Author |
---|---|---|---|
#63 | autop-pcre-limit.patch | 1.28 KB | John Morahan |
#41 | autop-pcre-limit.patch | 1.47 KB | John Morahan |
#35 | autop-pcre-limit.patch | 1.66 KB | John Morahan |
#33 | test_regex_133188.php_.txt | 3.72 KB | frega |
#27 | autop-pcre-limit.patch | 2.17 KB | John Morahan |
Comments
Comment #1
chx commented:
Hardly critical. I would like to see this reproduced on various OSes and PHP versions before trying to find which of the many regexes runs out (if it is indeed autop).
Comment #2
Cherrr commented:
I have the same problem as described. This is a very disagreeable bug. I traced the problem on a FreeBSD dedicated server. On my home Windows 2000 PC (with Apache, PHP, MySQL) everything works fine.
Comment #3
Cherrr commented:
The problem is not in the Drupal filter module but in the PHP settings.
Find and uncomment these lines in php.ini:
;pcre.backtrack_limit=100000
;pcre.recursion_limit=100000
then set them to
pcre.backtrack_limit=1000000
pcre.recursion_limit=1000000
for example.
Comment #4
artem_sokolov commented:
Confirming this issue on Drupal 5.3 (PHP 5.2.4).
The recipe from #3 worked for me when placed in settings.php:
Comment #5
ricabrantes commented:
I entered 147000 characters and it works very well.
D.5-dev, Apache on Windows XP with MySQL, browser Firefox 2
Comment #6
gpk commented:
OK so it appears from the above that the problem is with the PHP configuration (actually only in PHP 5.2.0 or later, which introduced the PCRE limits http://uk3.php.net/manual/en/ref.pcre.php). See also http://bugs.php.net/bug.php?id=40846.
Solution appears to be to increase the limits as per #4 in settings.php. The link to the PHP bug above actually suggests 10,000,000 as a more sensible limit (i.e. 100 times the PHP default of 100,000, and 10 times the suggested 1,000,000 at #4). Might want to check that we are in fact increasing the system's current limits...
Would probably need to be addressed in 6.x/7.x first and then backported to 5.x but as there's no patch yet I'll leave it against 5.x since that's where most people will be hitting this problem at the moment.
@ricabrantes: what version of PHP are you using? What are the values of pcre.backtrack_limit and pcre.recursion_limit (e.g. from phpinfo)? Are you saying that you reproduced this bug, and that #4 fixed it?
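The check suggested above (only raise the limits when they are currently lower) could look roughly like this in settings.php. This is a minimal sketch: `raise_pcre_limit()` is a hypothetical helper, not part of Drupal, and the 10,000,000 target is simply the figure mentioned in the PHP bug report.

```php
<?php
// Hypothetical helper (not part of Drupal): raise a PCRE ini limit only if
// the current value is lower than the requested minimum, so that a higher
// server-wide setting is never reduced.
function raise_pcre_limit($name, $minimum) {
  $current = (int) ini_get($name);
  if ($current < $minimum) {
    ini_set($name, (string) $minimum);
  }
  // Return the value that is actually in effect afterwards.
  return (int) ini_get($name);
}

raise_pcre_limit('pcre.backtrack_limit', 10000000);
raise_pcre_limit('pcre.recursion_limit', 10000000);
```

Returning the effective value makes it easy to log or compare against what phpinfo() reports.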
Comment #7
ricabrantes commented:
My versions are: Windows XP SP3 (beta), PHP 5.2.5, MySQL 5.0.51, Apache/2.2.6, pcre.backtrack_limit 100000 and pcre.recursion_limit 100000.
I tested on Firefox 2.0.0.12, IE6, Opera 9.26 and Safari 3.0.4 for Windows.
I can't reproduce the bug; the text is shown correctly.
Comment #8
hubris commented:
Upping the limits to:
ini_set('pcre.backtrack_limit', 10000000);
ini_set('pcre.recursion_limit', 10000000);
doesn't seem to solve the problem. I've entered these values in settings.php, and the changes are confirmed in phpinfo().
I have a 70,535 character count node and it will not display under the view tab...
Drupal 5.7
PHP 5.2.5
Apache (Unix)
Shared hosting environment
-Chris
Comment #9
hubris commented:
Some additional information:
I've found some errors listed in my hosted site's control panel error log regarding these long character nodes that I'm trying to edit/submit:
[Mon Mar 31 14:27:33 2008] [error] [client 70.137.148.72] ALERT - configured request variable value length limit exceeded - dropped variable 'field_body_of_chapter[0][value]' (attacker 'IP.address', file '/example_home_directory/index.php'), referer: http://www.examplesite.com/en/node/2170/edit
[Mon Mar 31 13:51:30 2008] [error] [client 70.137.148.72] ALERT - configured request variable value length limit exceeded - dropped variable 'field_body_of_chapter[0][value]' (attacker 'IP.address', file '/example_home_directory/index.php'), referer: http://www.examplesite.com/en/node/1481/edit
After some searching I found this error is related to the Suhosin Hardened PHP extension, specifically the suhosin.request.max_value_length value of 65000. My problem node/post of 70,535 characters exceeds this limit, so the field 'field_body_of_chapter[0][value]' is being dropped. I didn't fully determine when it's being dropped (when I submit the node, when I view the node, etc.), but somewhere in the process of creating/viewing the node, it's getting killed by this limit.
I've tried increasing these limits via ini_set, but they don't take hold, phpinfo() returns the same 65000 limit:
(tried:
ini_set('hphp.post.max_value_length', 180000); <--- how I've seen it described on other forums
ini_set('hphp.request.max_value_length', 180000);
and
ini_set('suhosin.post.max_value_length', 180000); <--- how the variable actually appears in my phpinfo()
ini_set('suhosin.request.max_value_length', 180000);
)
My host provider recently upgraded to PHP 5.2.5 (which may have included Suhosin Hardened PHP). I have existing long nodes in the database, and they are displayed under the View tab. But any edits I try to submit to these existing long text nodes are not saved, so the problem appears to be occurring during the submit phase.
So, for those people (like myself) who first try to solve the problem with
ini_set('pcre.backtrack_limit', 10000000);
ini_set('pcre.recursion_limit', 10000000);
and still don't see the 'unable to view long text nodes problem' go away, try looking at your PHP setup to see if the same Hardened PHP restrictions are in place.
-Chris
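To tell whether a runtime change "took hold" without reading through phpinfo(), the return value of ini_set() can be inspected: it is FALSE when the directive is unknown or cannot be changed at runtime, which would match the unchanged 65000 reported above. A minimal sketch; `try_ini_set()` is a hypothetical helper introduced only for illustration.

```php
<?php
// Hypothetical helper: attempt a runtime ini change and report what happened.
function try_ini_set($name, $value) {
  $before = ini_get($name);
  // ini_set() returns FALSE when the directive is unknown or cannot be
  // changed at runtime; otherwise it returns the previous value.
  $result = ini_set($name, (string) $value);
  return array(
    'before'   => $before,
    'after'    => ini_get($name),
    'accepted' => ($result !== FALSE),
  );
}

print_r(try_ini_set('suhosin.request.max_value_length', 180000));
print_r(try_ini_set('pcre.backtrack_limit', 10000000));
```

If 'accepted' is FALSE, the setting has to be changed in php.ini or by the hosting provider instead.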
Comment #10
catch commented:
I ran into this because I kept getting completely empty node contents displayed on my site seemingly at random.
http://drupal.org/node/225335 was a duplicate. This is a nasty one.
Comment #11
catch commented
Comment #12
gpk commented:
@hubris: Just to clarify/confirm: I conclude that in your case the problem you were having is unrelated to the original PCRE limits problem and is instead a specific restriction in your server's PHP setup, which I can't imagine Drupal should try to work around (i.e. there is no way on your server of POSTing more than 65k in the node body). Thanks for the update since that does at least clarify the situation.
Also just to note that the problem is much less likely to occur prior to PHP 5.2.0 since the PCRE limits were essentially reduced with this version of PHP.
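To illustrate why the body silently disappears instead of producing a visible error: when a preg_* call exceeds pcre.backtrack_limit, it returns NULL, and preg_last_error() reports PREG_BACKTRACK_LIMIT_ERROR. The following is a minimal sketch under stated assumptions: the lazy pattern is an illustrative stand-in, not quoted from filter.module, and the limit is lowered artificially so that a short string is enough.

```php
<?php
// Keep the demonstration independent of the JIT compiler where it exists
// (PHP 7+); on versions without this directive the call has no effect.
ini_set('pcre.jit', '0');
// Artificially low limit so a short string triggers the failure.
ini_set('pcre.backtrack_limit', '1000');

$text = str_repeat('a', 5000);

// Illustrative lazy pattern (an assumption, not the module's code): the
// engine has to extend ".+?" one character at a time until "\z" matches.
$result = preg_replace('/\n?(.+?)(?:\n\s*\n|\z)/s', "<p>$1</p>\n", $text);

if ($result === NULL && preg_last_error() === PREG_BACKTRACK_LIMIT_ERROR) {
  echo "preg_replace() gave up: backtrack limit reached\n";
}
```

Code that passes the NULL result on as the filtered text ends up with an empty body, which matches the symptom reported here.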
Comment #13
catch commented:
Yeah I should note my install is standard Debian etch on PHP 5.2, and we have a bunch of articles which break this limit. Bumping this back to critical since it's a pig to track down and we had a load of visitors asking about 'missing pages' etc.
Not to mention everyone will be running 5.2+ when we release.
Comment #14
hubris commented:
@gpk: You are correct on the clarification/confirmation: the problem I'm having is not the PCRE limits. The 65k POST limit I'm experiencing is due to the Suhosin Hardened PHP POST limits as set up by my host provider, and isn't something that Drupal development needs to take into account.
-Chris
Comment #15
yngens commented:
I guess I am having the same issue here. #4 did not help.
Comment #16
gpk commented:
@yngens: what are the values of pcre.backtrack_limit and pcre.recursion_limit reported by phpinfo() on your server? Also, is it running Suhosin (again, this should be reported by phpinfo())?
Also what is the size of the post you are trying to make?
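Besides phpinfo(), the values asked for here can be read with a few lines of PHP. A minimal sketch; it assumes the Suhosin extension registers itself under the name "suhosin".

```php
<?php
// Print the two PCRE limits discussed in this issue.
foreach (array('pcre.backtrack_limit', 'pcre.recursion_limit') as $name) {
  echo $name . ' = ' . ini_get($name) . "\n";
}
// Assumption: the Suhosin extension registers under the name "suhosin".
echo 'suhosin loaded: ' . (extension_loaded('suhosin') ? 'yes' : 'no') . "\n";
echo 'PHP version: ' . PHP_VERSION . "\n";
```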
Comment #17
yngens commented:
gpk, I don't remember, but I am sure I tried even bigger numbers than the ones recommended here. I'm not sure about Suhosin either. As a workaround, I decided to require users to divide big posts into chapters instead of putting everything into one post. But the problem is still there, and when I have a little more time I will test again and report here. Thanks.
Comment #18
gpk commented:
OK awaiting your input ...
Comment #19
Renirtor commented:
I have had this empty node problem after upgrading to PHP 5.x from PHP 4.x.
I put this code in my php.ini file in the root directory:
The longest node of my site (53896 characters including spaces, 55704 bytes) is now shown again.
Since it solved my needs, I did not set a higher limit because I read here about some side effects:
http://de.php.net/manual/en/pcre.configuration.php
but it's good to know that it worked and that a higher value can solve the problem for longer nodes.
Thanks,
Renirtor
Comment #20
ajayg commented:
Alternative solution
Just want to confirm that I saw this issue when I upgraded from PHP 4.x to PHP 5.2.6.
drupal 5.12
php 5.2.6
Linux Fedora FC8
Resolved the issue by trying solution in comment #3.
But you can also use the paging module which solved the problem without making changes to PCRE limit.
Comment #21
John Morahan commented
Comment #22
ajayg commented:
@John Morahan
Do you mean that by applying the patch you don't need to update the PCRE limits?
Comment #23
John Morahan commented:
That's the idea, yes.
Comment #24
John Morahan commented:
The idea is that the new regex just replaces the \n\n and variants without trying to remember the bits in between.
moving this issue back to filter.module
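The approach described in #24 can be sketched as follows: match only the paragraph separators and wrap the whole string once, so the engine never has to capture the text between separators. This is a simplified illustration, not the patch itself; the pattern and the artificially low limit are assumptions made for the demonstration.

```php
<?php
// Same artificially low limit as a stress condition for the demonstration.
ini_set('pcre.backtrack_limit', '1000');

// Two long "paragraphs" separated by a blank line.
$text = str_repeat('a', 5000) . "\n\n" . str_repeat('b', 5000);

// Replace only the separators; nothing between them is captured, so the
// amount of backtracking does not grow with the paragraph length.
$html = '<p>' . preg_replace('/\n\s*\n/', "</p>\n<p>", $text) . "</p>\n";

echo strlen($html) . " bytes of output\n";
```

Because the regex only ever matches a few newline and whitespace characters at a time, the length of the surrounding text does not affect how much work each match needs.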
Comment #25
John Morahan commented:
With test.
Comment #26
mr.baileys commented:
Regarding the test:
Comment #27
John Morahan commented
Comment #29
John Morahan commented:
Apparently an installer change confused the testbot.
Comment #30
ajayg commented:
I made a mistake (sorry, I don't know why) and resubmitted the patch in #25 for retesting as well. I hope this does not affect the testing started previously for the patch in #27. If it conflicts, my apologies; I should be more careful next time. I suspect that even if the tests do not conflict, the system message about the result may, since it only reports on the "last patch submitted" rather than on the time/date the retesting was requested. In that case my request would be the last one.
Comment #31
chx commented:
Nicely done.
Comment #32
webchick commented:
Eh. Can we please have a couple of the 20 or so people who reported having this issue test the patch?
Comment #33
frega commented:
Hmm, chx "assigned" me this issue to review ... but unlike chx I can be (and was) distracted ... so I am not sure whether my input is still relevant ...
Well ... the last patch replaces a backtracking regex with a simpler regex. Testing strings of pcre.backtrack_limit length is kind of superfluous now, as there is no backtracking or recursive regex left in the _filter_autop function that could run into that "limit". I would suggest removing the addition to filter.test, and can re-roll the patch if needed.
Yet the new regex also leads to slightly different output than the old regex: there is whitespace in the last <p> tag (which has no impact in HTML). This could be trivial, but as I am no regex ninja, there could also be other implications I don't see ... I have attached a demo script illustrating the (trivial?) difference.
Comment #34
John Morahan commented:
Thanks for the review frega!
Yeah, I forgot to handle the ending \n's as a special case like the beginning (and also dropped a \n from the final </p>). Will fix later.
I do think the test (or something like it) should stay, so that it will fail if someone later makes a change that unintentionally runs into these limits again. It's not always immediately obvious from looking at a regex how it will behave in these situations.
Comment #35
John Morahan commented
Comment #36
John Morahan commented
Comment #37
cburschka commented:
Good patch. Assuming it is enough to test the string exactly at the limit, rather than a longer string...
Comment #38
John Morahan commented:
Well, $this->randomName() adds a short prefix too.
Comment #39
Dries commented:
Committed to CVS HEAD. Thanks.
Comment #40
catch commented
Comment #41
John Morahan commented:
Untested backport.
Comment #42
abu3abdalla commented:
Thank you.
Comment #43
Damien Tournoud commented:
#537788: "Urgent : When there are many unicode words node shows up as just blank rendering the site useless" was a likely duplicate.
Comment #44
Dave Reid commented:
This fixed a problem I had on my local install, which had a backtrack limit of 1000. I also tested that increasing the backtrack limit solves the problem, but this is a good fix. It took me 30 minutes to debug that it was the line break filter, which led me to this issue.
Comment #45
tuffnatty commented:
+1 for the patch in #41.
Comment #46
soxofaan commented:
I can also confirm that the patch from #41 fixes the problem on my setup.
(marked duplicate: #711056: Max Number of Lines in Body)
Comment #47
lilyzm commented:
I have had this problem since updating from 6.15 to 6.16.
The patch fixes the problem.
Comment #48
mr.baileys commented:
@lilyzm: thanks for testing and confirming that the patch works. Version needs to remain at 6.x-dev though, as that's where fixes are applied.
Comment #49
varkenshand commented:
Does this mean (April 4 today) that D 6.16 has a problem that won't let me save long nodes? That is what has been happening to me for the last week, and I can't get it solved.
I also tried using the filter module from the 6.x-dev version, to no avail.
Comment #50
varkenshand commented:
The long nodes problem has gone away. In my case I narrowed it down to a combination of PHP 5 and MySQL 4. Upgrading to MySQL 5 solved the issue. I also changed the PCRE settings just to be on the safe site :o)
Comment #51
joachim commented:
Confirming this patch fixes the problem and marking #794256: Page "Body" limit? as a duplicate.
Ready to commit! :D
Comment #52
sanduhrs commented:
The patch in #41 is working well.
Comment #53
Gábor Hojtsy commented:
There is not much talk about the output differences before and after the patch. Who tested that, apart from just looking at whether PHP chokes or not? Changed output could break sites, themes, etc.
Comment #54
Dave Reid commented:
I tested it locally before I used an .htaccess solution and the patch was working just fine. Seeing as the exact same fix went into D7 and we haven't had any regression problems, I'd say it's back to RTBC.
Comment #55
Gábor Hojtsy commented:
Uhm, given that D7 is not out and the upgrade path is spotty in places, people did not update their D6 sites either, right? Are you sure we can consider that proof of regression testing?
Comment #56
Dave Reid commented:
I've said I've tested it manually with my D6 install and I'm using the patch currently on several sites with no problem. I'm not sure what more you need. :/
Comment #57
Gábor Hojtsy commented:
Taking a quick look at the regular expression being replaced, I think it did not always add paragraph tags, for example, while the new one adds at least one paragraph wrapper. This made me suspect we are breaking some backwards compatibility here. Am I missing something?
Comment #58
John Morahan commented:
Yes that's correct, if you pass it an empty string it will add an empty <p></p>. Originally the very next regex immediately removed that paragraph tag, so I didn't think it was a problem. Now three new rules have been added in between. Still, they all search for specific tags (li/blockquote), so I don't think they should affect this special case.
Comment #59
joachim commented:
> Yes that's correct, if you pass it an empty string it will add an empty <p></p>
Won't that break themes that test on the content of the body text being empty?
Comment #60
John Morahan commented:
No, because it's removed again before it gets anywhere near the theme.
Comment #61
John Morahan commented:
Let me clarify that.
First, this creates the empty <p></p>:
Next, these three fix up some wrongly nested tags, none of which occur in our <p></p> chunk:
Finally, this removes the <p></p>:
So that we are left with an empty string, as before.
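The code quoted in this comment did not survive in this copy of the thread. The sequence it describes can still be traced with a small sketch. The patterns below are illustrative assumptions chosen to match the description (one wrapping step, three li/blockquote fix-ups, one cleanup step); they are not the lines from the patch.

```php
<?php
$chunk = '';

// Step 1 (assumed pattern): wrapping an empty string yields "<p></p>\n".
$chunk = '<p>' . preg_replace('/\n\s*\n/', "</p>\n<p>", $chunk) . "</p>\n";
$after_wrap = $chunk;

// Step 2 (assumed patterns): fix-ups for wrongly nested li/blockquote tags.
// None of them match "<p></p>\n", so the chunk passes through unchanged.
$chunk = preg_replace('|<p>(<li.+?)</p>|', '$1', $chunk);
$chunk = preg_replace('|<p><blockquote([^>]*)>|i', '<blockquote$1><p>', $chunk);
$chunk = str_replace('</blockquote></p>', '</p></blockquote>', $chunk);

// Step 3 (assumed pattern): remove paragraphs that contain only whitespace.
$chunk = preg_replace('|<p>\s*</p>\n?|', '', $chunk);

var_dump($after_wrap, $chunk);
```

Under these assumptions the intermediate value is "<p></p>\n" and the final value is an empty string, consistent with the explanation above.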
Comment #62
Gábor Hojtsy commented:
Ok, looks like other filter related critical issue commits invalidated this patch recently:
Comment #63
John Morahan commented:
It still applies with -F3, which is what I used for my description above. Here is a clean reroll.
Comment #64
Gábor Hojtsy commented:
Ok, this was already reviewed and explained before so committed, thanks.
Comment #66
Ludo.R commented:
The solution in #4 solved the problem of displaying the node.
However, the content of the body is not indexed in the Drupal search.
Is this issue fixed in version 6.16 or 6.17?
I may consider upgrading from 6.15 then.
UPDATE: Correction, I just forgot to re-index the search! There is no problem anymore!
Comment #67
ajayg commented:
Next time could you please close the issue that you activated, if it is no longer happening?