Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Hi,
I got this error :
Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 1788416 bytes) in /home/vbe/issycom/sites/all/modules/contrib/browscap/import.inc on line 104
I think it happend when browscap try to parse the new configuration files that he get automatically.
Comment | File | Size | Author |
---|---|---|---|
#62 | 62_vs_44.patch | 1.22 KB | AohRveTPV |
#62 | browscap-7.x-2.x-import_by_divisions-2418473-62.patch | 10.26 KB | AohRveTPV |
#59 | files_for_vbn.zip | 4.41 KB | AohRveTPV |
#44 | 44_vs_39.patch | 641 bytes | AohRveTPV |
#44 | browscap-7.x-2.x-import_by_divisions-2418473-44.patch | 10.03 KB | AohRveTPV |
Comments
Comment #1
AohRveTPV CreditAttribution: AohRveTPV commented.
Comment #2
AohRveTPV CreditAttribution: AohRveTPV commented.
Comment #3
AohRveTPV CreditAttribution: AohRveTPV commentedA workaround is to increase the PHP setting
memory_limit
to say 192M or 256M.Comment #4
AohRveTPV CreditAttribution: AohRveTPV commented.
Comment #5
AohRveTPV CreditAttribution: AohRveTPV commentedLooked into this and the problem is that PHP
parse_ini_string()
andparse_ini_file()
are not very memory efficient.The official php-browscap extension has also had this memory usage problem (see https://github.com/browscap/browscap-php/issues/20). Their current release branch uses
parse_ini_file()
. In their 3.x branch, though, they implement their own INI parser that is very memory efficient. The Browscap module could potentially use this parser.However, I found a simpler way to reduce memory usage some: browscap.org provides a JSON download, and PHP's JSON decoder seems to be more memory efficient than their INI parser.
This patch reduces memory usage from 115MB to 81MB with Browscap data version 5308. (Your peak memory usage may vary if the import occurs within a cron run.) Please review and/or test.
Comment #6
AohRveTPV CreditAttribution: AohRveTPV commentedSummary of possible solutions:
- Increase
memory_limit
inphp.ini
.- Apply patch in #5 which decreases memory usage by about 30%.
- Re-use memory-efficient custom INI parser from php-browscap 3.x.
- Something else I haven't thought of.
Comment #7
gregglesIs this work merged into patch in comment #26 of #2305223: Optimize database access in _browscap_import()?
Comment #8
gregglesNeeds a reroll after the changes in 2305223.
Comment #9
AohRveTPV CreditAttribution: AohRveTPV commentedUpdated patch, reduces memory usage from 115MB to 86MB.
After rerolling, memory usage was only reduced to 100MB. The user agent de-duplication code introduced in #2305223: Optimize database access in _browscap_import() became the peak usage. So I changed some of that code around and got the memory usage back down to 86MB. It could probably be reduced back to 81MB with further effort but I think this patch may be good enough.
Import time is not noticeably different. I did not yet verify that the loaded data is the exact same after this change.
Comment #10
gregglesThanks I'll try to review this soon. Some quick questions:
Does GJK mean something to you? Those are my initials and I sometimes use them in variable names for something that I want to change for testing but don't want to commit. I don't see them in the existing code, so I think we should use something more descriptive of the purpose.
What are you using to measure the memory used?
Comment #11
AohRveTPV CreditAttribution: AohRveTPV commentedIt is in the Browscap JSON data--not sure what it means. Maybe a coincidence on the initials.
By the way, this line (and only this line) looks to be ineffectual and could be removed:
$version
is never used. The unpatched code assigns$version
but apparently never uses it.print memory_get_peak_usage();
at the end of_browscap_import()
. Not sure if there's a better way.Comment #12
AohRveTPV CreditAttribution: AohRveTPV commentedGary Keith was an early maintainer (the first?) of the Browscap data. Maybe an initials collision.
Comment #13
AohRveTPV CreditAttribution: AohRveTPV commentedCompared browscap table produced by #9 vs. browscap table produced by HEAD, and there is at least one difference. One has string "true" and "false" values, the other boolean 1 and 0 values. Need to investigate further.
Comment #14
AohRveTPV CreditAttribution: AohRveTPV commentedbrowscap table produced by this patch is identical to table produced by HEAD. Same peak memory usage as #9, no change in speed. Not sure if there is a simpler way to convert TRUE and FALSE to "true" and "false".
Comment #15
gregglesGreat, committed that as well.
Thanks!
Comment #17
gregglesWhoops, that comment was meant to be posted on #2305223: Optimize database access in _browscap_import(). I'd still like to review this a bit more.
Comment #18
lias CreditAttribution: lias commentedThis memory exhausted error is currently preventing a brand new install of browscap from fetching the data using ver. 7.x-2.2+5-dev. So unable to get to work with 128M of memory.
Comment #19
AohRveTPV CreditAttribution: AohRveTPV commentedIsabug, I just checked the peak memory usage of latest 7.x-2.x with Browscap data version 6000: 89MB when using
drush browscap-import
, and 83MB through the web interface.That memory usage is close to 128MB so I suspect if your site itself uses a decent amount of memory, a 128MB limit will be exceeded.
Would be interested to know (if you have time):
- Is the 128MB memory limit is also exceeded on your system with a new, minimal Drupal installation?
- Is the memory usage is better or worse than the 7.x-2.2 stable release with Browscap data version 6000?
You might also try importing the data using
drush browscap-import
. Not sure but that may reduce the memory usage in your environment.Comment #20
AohRveTPV CreditAttribution: AohRveTPV commentedIsabug, I just realized that 7.x-2.x-dev does not yet include the patch in #14. Please try that patch if you can, it should improve the memory usage.
Comment #21
AohRveTPV CreditAttribution: AohRveTPV commentedIf you are running PHP 5.2 or above, this patch for Browscap 6.x-2.x should improve memory usage. I measured 75MB when using the "Refresh browscap data" button to import Browscap data version 6000.
Unfortunately, it is not suitable for commit because Drupal 6 requires only PHP 5, but the patch needs PHP 5.2 for
json_decode()
.Comment #22
gregglesIt's fine for a contrib to require a higher version of php, especially since older versions of php are no longer supported by the php project or by many hosting companies. Just add a php = requirement to the .info file.
Comment #23
peterx CreditAttribution: peterx commentedhttps://github.com/salsify/jsonstreamingparser is probably the next step. One test of the streaming parser cut 12 MB down to 1 MB. You could update the database row by row or in small batches.
Comment #24
AohRveTPV CreditAttribution: AohRveTPV commentedThanks, did not know this.
At least RHEL 5 (and derivatives like CentOS 5) are still supported and are packaging/supporting PHP 5.1, so it seems realistic that some sites may still be on PHP 5.1. Since
json_decode()
is introduced in 5.2, perhaps the safe solution would be use it conditionally on PHP version.Comment #25
peterx CreditAttribution: peterx commentedThe jsonstreamingparser does not use json_decode(), removing another problem.
Comment #26
AohRveTPV CreditAttribution: AohRveTPV commentedA recent branch of the Browscap PHP extension has a custom INI parser that is very memory efficient (as mentioned in #5). This JSON parser is a good suggestion but I wonder if the INI parser might be a better choice since the Browscap project will presumably make sure it works with their data. However, using either parser would probably be a welcome improvement over the current code.
If we use a third-party parser, it should be installed as libraries rather than directly including it in the module, I guess? So the user installs the module then
hook_requirements()
tells them to install the library? Maybe the code could fall back to using the less efficient parser.The Composer file indicates jsonstreamingparser requires PHP 5.3, so PHP version seems like it would still be an issue.
Comment #27
lias CreditAttribution: lias commented@AohRveTPV thank you for the info re: patching 7x-2x-dev. I have done so and when I refreshed browscap data it worked! It still took a while but at least the memory required was less than the 128 it was initially breaking under. I will keep auto updates disabled at this time.
FYI -I am unable to use drush due to drupal on a windows box.
Thank you for the assistance.
Comment #28
peterx CreditAttribution: peterx commentedDecoding a .ini file with the ini_string function should be slightly easier than decoding json. You can read the file line by line, collect a section, then decode and update the database. The database update will be the slowest bit. Aggregating several rows in each update will make the process faster. Given that it runs in the background, the main problem with a really slow process will be timeouts.
Would be nice if the Browsecap people provided a set of smaller files or just a weekly delta file.
Comment #29
peterx CreditAttribution: peterx commentedRan a test reading a local file one section at a time through parse_ini_string():
Read the browscap file php_browscap.ini.
Memory usage: 1,838,624 bytes after 0.028544 seconds and 0 lines.
Memory usage: 1,840,584 bytes after 1.284847 seconds and 398955 lines.
Read the browscap file at full_php_browscap.ini.
Memory usage: 1,839,296 bytes after 0.027104 seconds and 0 lines.
Memory usage: 1,841,512 bytes after 2.769346 seconds and 994676 lines.
Read the browscap file at http://browscap.org/stream?q=Full_PHP_BrowsCapINI.
Memory usage: 1,838,632 bytes after 0.031708 seconds and 0 lines.
Memory usage: 1,844,136 bytes after 23.075820 seconds and 994676 lines.
Comment #30
AohRveTPV CreditAttribution: AohRveTPV commentedProperties inheritance is a challenge for parsing the INI file in parts. To load all the parent properties for a browser, those parents (currently) have to be in memory. With all the properties in memory, I think the memory usage might still be high. One solution would be to change Browscap to not inherit properties at import as suggested in #2411405: Reduce browscap table size, which would seem to allow loading only one browser's properties into memory at a time.
Comment #31
peterx CreditAttribution: peterx commented_browscap_save_parsed_data() batches the import inserts for efficiency. We can use that.
Cache default properties because it is used frequently.
Read the sections until you hit a section with the default properties as parent.
Process that batch.
The insert code could still aggregate a 1000 rows.
Comment #32
AohRveTPV CreditAttribution: AohRveTPV commented#31 sounds like a pretty good approach to me. If you are implementing this I would be happy to review/test.
Comment #33
peterx CreditAttribution: peterx commented@AohRveTPV, almost there.
Read the browscap file at php_browscap.ini.
Memory usage: 1,732,768 bytes after 0.029620 seconds and 0 lines.
Memory usage: 27,764,280 bytes after 23.525675 seconds and 398955 lines.
Comment #34
peterx CreditAttribution: peterx commentedIf anyone wants to experiment, here is my code and browscap in the same lump. My code is run from a test page containing:
$file = '/home/peter/Downloads/browscap/php_browscap.ini';
$import = new browscap_import(FALSE, $file);
Testing from the browscap site is a problem because they switch off access after a small number of accesses. You then have to wait 24 hours.
The database inserts are not batched to 1000. They are batched based on a common parent. You could add extra code to check for 1000 entries then detect the change of parent.
PHP < 5.3 will still require batch .ini decoding through a temp file.
Comment #35
izmeez CreditAttribution: izmeez commentedThanks patch in #21 works well for 6.x-2.1
Comment #36
AohRveTPV CreditAttribution: AohRveTPV commentedThanks, peterx. I did not realize before that each part of the INI file delimited by header comments was self-contained. (It contains all parents for the sections except for Default Properties.) That makes parsing in parts fairly straightforward.
This patch re-implements your basic approach (I think) on top of
import.inc
. We could use your code too, but I was not sure how to integrate it with the existing code, and it seems a ways off from complying with the coding standards. If you want to integrate your code and change it to follow the coding standards, I would be glad to work on it instead.Before patch:
Execution time: 55s
Peak memory usage: 123MB
After patch:
Execution time: 95s
Peak memory usage: 44MB
Comment #37
AohRveTPV CreditAttribution: AohRveTPV commentedPatch in previous comment was misnamed.
Comment #38
peterx CreditAttribution: peterx commentedThank you AohRveTPV for looking at my code as a proof of concept instead of a patch. My other code additions were purely to make testing easier by reading a local file.
The code format style is, in part, Drupal 8. I prefer some of the changes in Drupal 8 because they are closer to the most readable code styles. It is hard going back to the D7 style.
Comment #39
AohRveTPV CreditAttribution: AohRveTPV commentedThis patch fixes problems with #37 and is ready for review/testing.
The INI file is parsed by "divisions". (Divisions are what the Browscap project calls the parts of the INI file it generates that are delimited by header comments.) We read one division at a time, parse it as an INI string/file using the built-in PHP functions, then save it.
This reduces memory usage to 42MB from 123MB with Browscap data version 6001, but increases import request time from 55s to 85s on my system with MySQL InnoDB. I think this is a good trade-off because it will avoid the PHP memory limit in common configurations, but the long request time will probably not cause any problems it was not already.
A downside of this patch is it makes the INI parsing a bit more fragile. Rather than just relying on a well-formatted INI file, we add some assumptions about the file:
- It is organized in divisions which are separated by header comments prefixed with 40 semicolons.
- The default properties division is the third division.
- Each division after the third inherits only from parent sections within its division, other than the default properties section.
I added some assertions for the first two of these assumptions so that importing will hopefully fail gracefully if the Browscap project changes how they are formatting the INI file. (The Browscap project has no reason to expect external code is relying on this organization.)
A problem is that when reading in pieces it was not easily possible to remove case duplicates globally. So I included the code from #2310537: 'useragent' field not case sensitive, which changes the
useragent
column of the Browscap table to be case sensitive.Chunking database inserts 1000 user agents at a time was removed as peterx did because it does not help performance noticeably--processing by divisions effects chunking.
Comment #40
gregglesI thought we decided in #2310537 that case-sensitivity was a bad idea?
Comment #41
AohRveTPV CreditAttribution: AohRveTPV commentedYes. The case duplicates are seemingly useless because
browscap_get_browser()
does a case-insensitive query (as doesget_browser()
). So we decided to just remove case duplicates.The problem though is if the INI file is being processed in parts as in #39, there is no way to know whether a user agent in the current part has a case duplicate in a later part of the file. (Parsing the entire file at once in order to remove duplicates as we are currently doing would defeat the purpose of parsing in parts to reduce memory usage.) Some possible solutions:
1. Keep a list of all user agents encountered and only insert user agents not in that list.
2. Query the database to make sure user agent is not already present before inserting.
3. Preprocess the raw INI string and remove case duplicates before parsing.
4. Change
useragent
column to binary so case duplicates are allowed.#1 would increase memory usage and increase execution time. (How much I'm not sure.) #2 would increase execution time. #3 would work but requires us to write code to parse/edit the INI string. #4 is simpler to implement than #3 and avoids having to deal with case duplicates at all.
Even with the
useragent
column being binary,browscap_get_browser()
is still case insensitive. So there seems to be no change in behavior. Are there any downsides to the column being binary?Comment #42
peterx CreditAttribution: peterx commentedI tried 1 and 2. 2 increased the import to hours. For one, I stored only the lower case versions and compared with lower case. The time was not much different.
Comment #43
AohRveTPV CreditAttribution: AohRveTPV commentedpeterx, I just quickly implemented the approach of keeping track of all inserted user agents and only inserting user agents that are not case duplicates of members of that list.
Before:
Peak memory usage: 42MB
Execution time: 85s
After:
Peak memory usage: 60MB
Execution time: 85s
Indeed the execution time is no different, but the memory usage is worse. Maybe I am doing something inefficiently.
This patch shows changes made versus #39.
Comment #44
AohRveTPV CreditAttribution: AohRveTPV commentedMinor code simplification versus #39. No noticeable performance impact.
Comment #45
AohRveTPV CreditAttribution: AohRveTPV commentedI looked into reducing the memory usage of #44. Peak memory usage is 27MB with the downloaded INI string in memory, but increases to 42MB once that string is passed to
strpos()
. PHP seems to allocate a buffer for the entire string passed tostrpos()
. So I do not see a simple way to reduce the memory usage further.#33 reported 27MB. I suspect this lower number is due to either:
1. Measuring memory usage using
memory_get_usage()
instead ofmemory_get_peak_usage()
. With #44, for instance, memory usage spikes to 42MB duringstrpos()
but returns to 27MB immediately after.2. Reading from an INI file instead of an INI string in memory.
Comment #46
AohRveTPV CreditAttribution: AohRveTPV commentedRegarding the case issue, making the useragent column case sensitive currently seems to me like the ideal solution. The Browscap data is case sensitive, so if we make the Browscap module storage of it case sensitive, the problem is solved. It adds complexity to the code (and inefficiency, apparently) to manipulate the case.
Maybe there are drawbacks to making the column "binary"? I do not know of any. It is supported by the Schema API and is not a hack:
Edit: I suppose it could change whether user agent look-ups are case sensitive with database systems other than MySQL. Will test.
Comment #47
AohRveTPV CreditAttribution: AohRveTPV commentedI tested case sensitive look-ups with several database systems:
Before patch in #44:
MySQL: insensitive
SQLite: insensitive
PostgreSQL: sensitive
After patch in #44:
MySQL: insensitive
SQLite: insensitive
PostgreSQL: sensitive
So the change of the user agent column to binary does not affect the case behavior for those database systems.
Comment #48
AohRveTPV CreditAttribution: AohRveTPV commentedRe #47:
We could, for consistency, make PostgreSQL also do a case insensitive look-up if we used
db_select()
instead ofdb_query()
. TheLIKE
condition gets replaced withILIKE
by the PostgreSQL implementation of theQueryConditionInterface
.Alternatively, since
db_query()
supposedly gives better performance, we could just decide we do not care that PostgreSQLbrowscap_get_browser()
is case sensitive while PHPget_browser()
, MySQLbrowscap_get_browser()
, and SQLitebrowscap_get_browser()
are case insensitive.Whichever, the column being binary seems like it should not matter. Even if the useragent column is case sensitive, the LIKE condition can still be case insensitive.
Comment #49
peterx CreditAttribution: peterx commentedUsed some new code to download the browscap file to a local temp file:
Reading http://browscap.org//stream?q=PHP_BrowsCapINI
Memory usage (real) at start: 1835008
Memory usage (real) at end: 2097152
The file can then be read row by row for local processing.
https://www.drupal.org/sandbox/peter/2475021
Comment #50
AohRveTPV CreditAttribution: AohRveTPV commentedThat is impressive! My understanding is this approach also requires parsing the INI string in divisions (as done in #36 or #44). I propose that we try to get the parsing by divisions committed first, then we could extend to further reduce the memory usage using row-by-row file reading. Parsing by divisions is already a complex change that will need review/testing.
Comment #51
peterx CreditAttribution: peterx commentedI have #44 running on two sites including:
https://petermoulding.com/device-detection-in-drupal-using-browscap
Comment #52
VBN CreditAttribution: VBN commentedI have a question about browscap.
Was running fine but after the automatic update i had the same error as topic starter.
Even worse, because i could not run cron anymore...
So I wanted to install a previous version again.
The strange thing is that i disabled the module, and the error disappeared.
My site seems to be running fine without the browscap module (witch i installed with the themekey module)
I did some test and the only thing what is not working right now is the automatic detection and theme-switch for mobile devices.
- What else does it do? Is it really necessary besides themekey/Mobile switch?
- Would it be ok to delete the module and reinstall to receive previous update?
Comment #53
peterx CreditAttribution: peterx commented@VBN, it is not the browscap software, it is the downloaded file growing larger every day. There is no easy way to request an older smaller file. There are three alternatives for mobile detection.
* Use the browscap patches to use less memory. Hopefully this will become a proper dev release soon.
* Replace themekey with another module that does not use browscap.
* Replace Browscap with another module that feeds into Themekey.
I have a site that needs Themekey only for mobile and I wrote a module that uses just a few lines to detect the word "mobile" in the user agent string. At another site, I produced something to feed into Themekey. The two could be combined. You would not get all the other information returned by Browscap.
Comment #54
VBN CreditAttribution: VBN commented@Peter,
Thx for your interest.
Strange that a downloaded file keeps messing with memory.
But i must admit, I don't really understand what browscap does....never mind that.
Is it possible to remove that downloaded file and just start over again?
(Like you should create as a new install.)
I'm very interested in that little workaround you created.
I just want users to have the mobile theme automatically selected if they use a mobile device.
But i really like to keep Themekey, it's a nice and easy feature.
Since it's the same purpose with me, maybe you would care to share your solution?
Maybe better to contact me in PM if you think it's appropriate to reply here.
Thx in advance.
Comment #55
AohRveTPV CreditAttribution: AohRveTPV commentedYes, you can just disable, uninstall, then enable the module. That will remove all downloaded Browscap data. However, you may get the "memory exhausted" error when you try importing (i.e., refreshing) the data again.
Note: The downloaded Browscap data is stored in the database and not a file.
The module just provides a way to look up browser information for a given user agent string. For example, Browscap can translate
Mozilla/5.0 (Windows NT 6.1; rv:31.0) Gecko/20100101 Firefox/31.0
to information about that browser and its capabilities.So the module is mainly used as a dependency for other modules that need to be able to do such a look-up.
Comment #56
AohRveTPV CreditAttribution: AohRveTPV commentedWas hoping for one more person to review/test #44 before setting the issue to RTBC. Thanks for reporting back it is working for you, peterx.
Comment #57
peterx CreditAttribution: peterx commented@VBN, testing the revised Browscap module would help push the current change into a dev or production release.
After you test it and report the results back here, you might like to test the little alternative I put in a sandbox while experimenting with ways to shrink Browscap.
https://www.drupal.org/sandbox/peter/2484347
The Browscap file contains more information, is used on tens of thousands of sites, and will be better maintained.
Comment #58
VBN CreditAttribution: VBN commentedHi Peterx,
I updated my site with the 7.x-2.2+5-dev version and added the path from #44.
No problems when running cron at this point.
But when i push the "Refresh browscap data"-button the status remains on Never fetched?
PS: Could you upload the patched browscap.install and import.inc files here?
Maybe i did something wrong...:)
Comment #59
AohRveTPV CreditAttribution: AohRveTPV commentedAny errors in the Drupal watchdog? PHP errors in the web server log?
The request could be timing out before completion.
Done.
Comment #60
VBN CreditAttribution: VBN as a volunteer commentedOk AohRveTPV,
Thank you so very much.
I replaced my files in the module with your uploaded ones.
Working like a charm! i refreshed and i have version 6002 now.
So it was my faulty patch replacement...(tsss, i really need to use Cygwin)
I can confirm that patch #44 is working.
You should throw in production stage.
Great stuff you share here!
Comment #61
AohRveTPV CreditAttribution: AohRveTPV commentedReviewing #44 I found at least two mistakes in comments. Will fix.
Also want to add type hinting to the new functions.
Comment #62
AohRveTPV CreditAttribution: AohRveTPV commentedChanges:
1. Clarify/fix a comment.
2. Fix apparent error in parsing data with PHP <5.3.
Need to test the second change.
Comment #63
AohRveTPV CreditAttribution: AohRveTPV commentedGreat, thanks for reporting back that it is working.
The Git for Windows installer includes MinGW, which provides a Unix-like shell. In that shell you can apply a patch using
git apply -v foo.patch
. It probably also includesdiff
, which can also be used to apply a patch.Comment #64
VBN CreditAttribution: VBN as a volunteer commentedThx, meanwhile i found that Netbeans provides a nice easy way to patch.
Comment #65
AohRveTPV CreditAttribution: AohRveTPV commentedWould like to just remove the PHP <5.3 code because it seems useless and I cannot find an easy way to test it:
#2509122: Remove PHP 5.2 or earlier support
Comment #66
izmeez CreditAttribution: izmeez commentedRegarding 6.x further to comment #35 and patch in #21, now after 2 months cron is choking with memory exhausted.
Comment #67
AohRveTPV CreditAttribution: AohRveTPV commentedThanks for the update, izmeez. It looks like the Browscap project changed the JSON file to have the same data as the full INI file:
https://github.com/browscap/browscap/pull/594
This change approximately doubled the size of the file and proportionally increased the memory usage. Previously the JSON file contained the same data as the standard INI file, which is currently used by unpatched Browscap.
So, the patches in #14 and #21 are no longer useful. The approach used by the 7.x-2.x patch in #62 should still work. It could be ported to 6.x-2.x.
As a hack, you could change the URL used in
import.inc
to that of the "lite" INI file:Change:
http://browscap.org/stream?q=PHP_BrowsCapINI
To:
http://browscap.org/stream?q=Lite_PHP_BrowsCapINI
That should work if the "lite" file contains the properties you need.