Meta split from #2448765: Element::children sort order undefined and slower than it could be - This makes tests fail in PHP7.
Problem/Motivation
- Core uses uasort in quite a few places, but uasort does not preserve the sort order
Hence while:
$test = [
'foo:1' => [ '#markup' => 'foo - first' ],
'bar:2' => [ '#markup' => 'bar - second' ],
];
works
$test = [
'foo:1' => [ '#markup' => 'foo - first' ],
'bar:2' => [ '#markup' => 'bar - second' ],
'first:0' => [ '#markup' => 'little things', '#weight' => -1000 ],
];
does not work when run via Element::children($test, TRUE); (until #2448765: Element::children sort order undefined and slower than it could be - This makes tests fail in PHP7 is committed)
It returns:
Array
(
[0] => first:0
[1] => bar:2
[2] => foo:1
)
which is wrong.
This affects Drupal 7 as well as 8 and is the reason why explicit weights are needed so often in 7 even though it _should_ work without.
Proposed resolution
- Add a generic drupalUASort functionality that does preserve the weights.
- Convert all users of uasort to this helper class to ensure we don't have inconsistency in core with very weird weights bugs later.
Remaining tasks
- Discuss
- Fix generically
User interface changes
API changes
| Comment | File | Size | Author |
|---|---|---|---|
| #126 | 2466097-nr-bot.txt | 155 bytes | needs-review-queue-bot |
| #72 | 2466097-72-uasort-integrity-tests.patch | 55.8 KB | Anonymous (not verified) |
| #43 | 2466097-test-uasort-fail.patch | 4.11 KB | Anonymous (not verified) |
| #14 | core_uasort-2466097-14.patch | 52.58 KB | dpopdan |
| #10 | interdiff-2466097-7-10.txt | 18.37 KB | goz |
Comments
Comment #1
sher1 commentedYou are not explicitly listing what the sort method is that is being used. We need to know the exact uasort example that is failing so that we can determine what approach will be necessary to solve the problem. The uasort takes an array and a function as arguments so if the function isn't correct, the sort order will be affected.
Comment #2
rumburak commentedComment #3
pingers commented@sher1 Drupal\Component\Utility\SortArray::sortByWeightProperty was the affected sort function in the linked issue, but this is a meta issue to investigate all uses of uasort(), not just one specific case.
Comment #4
dpopdan commentedComment #5
fgmOne simple way of handling this in the wrapper drupaluasort could be to add the array index to each element, so that comparison functions callbacks could be wrapped and apply the array index as a secondary comparable when the primary callback returns equality. This would ensure stable sorting.
Comment #6
dpopdan commentedAs I see, we have to use a helper function as @fabianx suggested, I tried to modify the sort callback for uasort, but probably this solution will not work 100% correctly, so I will try to create a helper function that assigns value the comparison key before it is passed to uasort.
Comment #7
dpopdan commentedComment #8
dpopdan commentedComment #10
goz commentedAdd
use \Drupal\Component\Utility\SortArrayand then callSortArray::drupalUASort.Should be on 2 lines to respect line break.
Comment #12
goz commentedThis patch needs tests for drupalUASort.
This changes make tests failed.
Comment #13
fabianx commentedNote:
https://github.com/vanderlee/PHP-stable-sort-functions
is a generic library with replacement functions like suasort - which are 100% BC compatible.
I think it would be best to use that one as it seems to have quite heavy tests for all cases :).
Comment #14
dpopdan commentedI added the library to composer.json and vendor, it is really hard to work with the vendor directory because it is included in the repository. That really should be removed. Some test failed because 'static::sort' callback was sent and the suasort function couldn't find the function, so I replaced this callbacks. No interdiff is required because it was a totally new solution.
Comment #15
dpopdan commentedComment #17
dpopdan commentedI don't really understand what the CI error is.
Comment #20
fabianx commentedI think we might be better off copying and forking suasort for now into a static function in the Utility class.
I forgot that new libraries need project governance approval and I don't think the code duplication is a problem.
Can then do:
uasort -> Drupal\Component\Utility\SortArray::suasort()
globally and should work way better with unit tests, too.
Comment #21
joelpittet@Fabianx tried a quick port to a utility class with static functions with the PHPUnit tests + travis still working. On the right track?
https://github.com/joelpittet/PHP-stable-sort-functions
Comment #22
fabianx commented#21: Looks great to me - I would keep the s- prefix though or call the class Utility\ArrayStableSort::uasort.
But overall having that in a static class is better than lots of new functions.
Comment #23
joelpittet@Fabianx thanks, the original developer was thinking of a class name change like that too, I think I'll go with that so there is an easier grep for PHP equivalent.
Comment #24
Anonymous (not verified) commentedThis patch uses the suggested stable_uasort for
uasort()on the PHP issue https://bugs.php.net/bug.php?id=53341 which also solves issue https://bugs.php.net/bug.php?id=50688 where PHP'suasort()will throw warnings while debugging.I've swapped out all calls to
uasort()for\Drupal\Component\Utility\SortArray::suasort()Comment #25
Anonymous (not verified) commentedI've just been clicking around and found I'd overlooked a few instance of
uasort($array, 'static::sort')this patch correctly addresses thoseComment #26
Anonymous (not verified) commentedComment #28
Anonymous (not verified) commentedWhat would be the acceptable procedure regarding protected methods for the helper class static method? The only viable option that springs to mind would be for all sorting methods to be public.
Comment #29
Anonymous (not verified) commentedRetesting patch with public visibility on
\Drupal\views\ViewsDataHelper::fetchedFieldSort()Comment #30
Anonymous (not verified) commentedComment #31
Anonymous (not verified) commentedComment #32
oriol_e9gI would be usefull to evaluate a possible performance regression with this aproatch.
Comment #33
joelpittet@GroovyCarrot the original author merged my changes upstream into classes. We could use this library I think, no?
https://github.com/vanderlee/PHP-stable-sort-functions
Comment #34
Anonymous (not verified) commented@joelpittet I did have a glance at that library however the
uasort()replacement method calls PHP's copy, which also has the related issue #2567035, which is causing problems while profiling. The library also isn't PS-4 compliant, so I imagine it would not pass code review?The library would need namespacing and documenting, in addition to the
\StableSort::uasort()method implementation being changed to either:error_reporting(E_ERROR)on affected versions of PHP (all versions of PHP 5 at the moment).@operator on theuasort()line.I am aware that there are performance regression issues with changing error reporting level at runtime with either of the above suppression options.
Thinking forward, the advantage to committing the
uasort()replacement to Drupal core, is that it is likely the implementation for this function will need to change in when both of these PHP bugs are fixed; the method can decide whether or not to use PHP's copy, depending on the version.Comment #35
joelpittet@GroovyCarrot ok I thought I'd ask, he seemed keen on getting PS4 setup so I'm sure a pull request for that would suffice.
Proving this is a bug through a test should likely be top priority before profiling how slow it has become.
I wrote a quick start of a test (which you can run with
drush scr test.phpfrom your drupal root after applying the patch. I tried to prove the Issue summary correct but got wildly different results...Comment #36
joelpittetAlso not a meta anymore.
Comment #37
joelpittetComment #38
Anonymous (not verified) commentedThe method sorting there was by the 'weight' key and not '#weight' - try replacing
array('Drupal\Component\Utility\SortArray', 'sortByWeightElement')witharray('Drupal\Component\Utility\SortArray', 'sortByWeightProperty')The test cases should ideally test that all of the keys are in the expected order,
uasort()appears to pass the second test there, however it is the elements with the same weight (empty) that have lost their indexes.Comment #39
Anonymous (not verified) commentedI've just wrote 4 test cases that cover
::sortByWeightPropertyand::sortByTitlePropertyWith
SortArray::suasort():Substituting with
uasort():Test cases are found in ./core/tests/Drupal/Tests/Component/Utility/SortArrayTest.php
Comment #40
joelpittet@GroovyCarrot ah thanks, totally spaced on the method name. Because this is a bug we can split the tests out into a test only patch for the core committers to see our tests are doing what they said. That way we can show it's broken before.
Upload the failing tests (test only patch) first and passing one second. Would you mind doing that @GroovyCarrot?
Comment #41
joelpittetOh and I guess that would be a regression tests-only patch because the current test just makes sure the new method works correctly.
Comment #42
joelpittetCurrent tests look great BTW, much better than I could write;)
Comment #43
Anonymous (not verified) commented@joelpittet no problem, that makes sense. Attached test only patch, demonstrating
uasort()is not functioning as expected. As above, 3 of these tests will fail:Patch in comment #39 implements
SortArray::suasort()and replaces all instances ofuasort()in core. Test cases demonstrate behaviour is as expected.Comment #44
joelpittetThat is great! it would be nice even to have that as a regression test showing that they are 'not equal' to keep with this issue so people don't remove it without due process.
I'll help with the profiling with xhprof-kit if I can get a decent scenario where this gets called frequently. Any scenario suggestions in your travels on this patch?
Comment #46
Anonymous (not verified) commentedI'm not sure if there's a standard on performance testing particular methods in Drupal, but from my own tests I estimate that this version is about 1/4 the speed of the native PHP
uasort().I did a crude test to see how many times the function is called on a few pages; the most calls was in the administration at <500 calls. This code should in theory give us the expected impact on performance on a page with 500 calls, using a test case that both functions will pass.
The impact is ~50ms for 500 arrays with 3 keys, it stands to reason that significantly larger arrays would incur a larger performance hit.
To get a better idea I've tested with 100,000 cycles, which better demonstrates speed difference:
To get a more realistic impact on site performance, a test with apache benchmark with 100 requests by 20 individual users, on a page on the front-end with a small view, and caching disabled resulted in:
I wouldn't say 0.07 is a particularly significant difference for this kind of test. Unfortunately I don't have a relatively heavy site in Drupal 8 to benchmark with.
I did try using the webprofiler module to profile the page without the patch, but using
uasort()puts me back to square one - with error message output on the pageWarning: uasort(): Array was modified by the user comparison functionas in the related ticket. In fact, kint rendering a backtrace was the only code that had any noticeable impact to performance.Comment #47
joelpittetActually you're right xhprof would be bad for this anyway. Just like you did is good to narrow it down and do one with many arrays and one with large arrays.
I'll give this a test as well this evening. The profiling is secondary to fixing the bug but it's nice to know the hit we are taking fixing this.
Comment #48
joelpittet@GroovyCarrot other than the minor ending commas on the array items in the test I reviewed the code with
--color-wordsand everything looks good. The needed change to change protected to public on that one method may bump this up to 8.1.x but I'll let the committer decide that. Feel free to post a clean-up patch for the array items commas and leave it RTBC, you can also use short array syntax in your new test class method.Your profiling is showing a slight performance improvement, which actually I confirmed as well while profiling the functions. As I added bigger arrays though that performance started to go south.
With 100,000 weights there gets a bit of a performance regression:
With only 10,000 weights it's a performance improvement and is more likely use-case to be less.
I don't think a test showing PHP is poor at this will serve us well (especially if they fix it) now that I think about it some more.
drush scr perf.phphttps://gist.github.com/2978037a059f0bcd7dd5
Comment #49
Anonymous (not verified) commentedOkay cheers. I'm used to working on Drupal 7 and wasn't quite sure on whether or not the short array syntax was acceptable. We should be good on the last array key commas though, unless they've changed it for Drupal 8 (https://www.drupal.org/coding-standards#array).
For most cases, I can't really think of an instance where the array being sorted is going to have a relatively high number of keys. I would imagine this function is frequently sorting small arrays, as it only operates on the top level. I think you are right, fixing the bug should be priority, and then look to optimise - I think we've at least demonstrated that the expected impact is negligible.
What will likely need to happen is the function checks to see which version of PHP is running, and then decide whether or not to let PHP do the sorting with
uasort(). I believe both this and the related bugs have been fixed in PHP 7, so in that instance it may be worth letting PHP do it. I think PHP 7 compatibility is a separate issue, but a patch like this should help manage the difference in behaviour for the function in one place.Comment #50
swentel commentedNitpick: shouldn't go over 80 chars
This should probably have a change record.
Comment #51
joelpittetThanks @swentel, I start one, @GroovyCarrot feel free to correct my Change Record.
@GroovyCarrot these little niggles should be take care of, shouldn't take you long, I'd patch these for you but testing out the non-upload commit credit thing;)
As @swentel mentioned, 80 chars, and if I remember it should be on one line if possible.
Missing commas
Comment #52
joelpittet@GroovyCarrot et al. Have a peek and make sure the CR works for you.
https://www.drupal.org/node/2661308
Comment #53
Anonymous (not verified) commentedNo problem. Good spy - fixed missing commas, added short array syntax. I believe it is 1 line - which I'm more picky about than the 80 char limit. I've corrected to both standards. Looks good to me @joelpittet.
Comment #54
Anonymous (not verified) commentedComment #55
joelpittetThanks for knocking those little things out. Back to RTBC.
Comment #58
Anonymous (not verified) commentedComment #59
alexpottSorts an array stably using a user-defined comparison function.is the only way I could get in under 80 chars and mention the key fact that the sort is stable.These two lines should be together and describe what a stable sort is. Something like:
Stable sorting algorithms maintain the relative order of records with equal keys (i.e. values). Additionally, this implementation does not call PHP's uasort() to avoid warnings while debugging.I have to honest though I always think that PHP warning is sensible. Sorting should not alter the array and if it does something is wrong. So are we losing a safeguard here?
A list has
-for each item.Can typehint here...
@param callable $cmp_functionGenerally Drupal prefers method names that are less cryptic. How about
stableUasort()oruasortStable(). My preference is for the first one.Given the way the method works can we add some tests with more than 3 elements? And some with 2 and some with 1.
Comment #60
Anonymous (not verified) commented@alexpott that makes sense, I'll improve documentation and add more tests tonight.
Regarding the warning being generated; you are correct, the sorting function should not alter the array. The replacement implementation here does not pass the original array by reference to the sorting function anyway - in which case it would be impossible to modify the original array. The real issue is highlighted better here: http://stackoverflow.com/a/10985500/2448139. The bug is not necessarily that a warning is thrown when it is modified, but rather the warning is thrown when the array is simply observed by
debug_backtrace(). I found this causes issues with the webprofiler module enabled, with all errors and warnings handled by devel during development.Comment #61
alexpott@GroovyCarrot can we add a test proving that user-defined comparison functions can't alter the array?
Comment #62
Anonymous (not verified) commented@alexpott sure I'll put together a test and write a malicious comparison function that requests the array by reference, and
unset()'s some keys, to be sure the original array isn't affected by the sorting.Comment #63
Anonymous (not verified) commentedI did begin writing a test for array modification, but it proved impossible, as requesting the arrays to compare by reference throws a fatal error. This is because the values are passed directly from either
reset(),end()orcurrent(), which do not return references.One thing to note that came up in testing was that empty values were lost by the last stable sort implementation, which is another undesired behaviour. I've corrected the sorting function to retain empty values (ie.
[],FALSE,NULL), and added relevant tests.I've made all the recommended documentation changes and renamed to
SortArray::stableUasort().Comment #64
Anonymous (not verified) commentedComment #65
Anonymous (not verified) commentedComment #66
joelpittetUpdated the method name in the CR too. I diffed the previous RTBC'd patch #53 an with #65 and nothing stood out. Now with more coverage;)
Comment #67
alexpottThere's no test of sorting an array with a single element and there's no test of sorting an array with a user-defined function that changes values.
Comment #68
Anonymous (not verified) commentedAdded tests for array integrity with modifying comparison function.
Comment #69
alexpottThis should have failed as
$thisdoes not have this method.Comment #70
alexpottWe can actually use the
callabletype hint... https://secure.php.net/manual/en/functions.arguments.php#functions.argum...Comment #71
Anonymous (not verified) commentedThe comparison function never gets called if there's less than 2 elements in the array. My bad there, I didn't notice since the test didn't fail.
Comment #72
Anonymous (not verified) commentedAh I actually didn't realise
callablewas a valid typehint, corrected that.Comment #73
reekris commentedIs there any update to this issue? The last patch passed the tests so does that mean it is ready to be merged or is there more work to be done?
I currently have a quite nasty issue using the IEF module: #2653574: Unable to keep nested IEF data separate with multivalue fields.
IEF overwrites referenced entities in the table listing when creating new references. The issue only happens in PHP 7 and I suspect this could be the underlying problem.
Comment #74
Anonymous (not verified) commented@reekris I had thought that the PHP bug with
uasort()was fixed in PHP 7. If you apply the patch above, and swap usages ofuasort()withSortArray::stableUasort(), does it fix the issue?Comment #75
reekris commented@GroovyCarrot tried applying the patch but it did not help unfortunately. It was kind of a wild guess after all..
But even so, what is keeping this from being merged?
Comment #76
joelpittetThe concerns look to be addressed. Back to RTBC.
Comment #77
alexpott@GroovyCarrot Sorry for not reviewing sooner... I think we need to trigger a warning if a user comparison function alters the array - along the lines of what happens if you do this for usort().
PHP Warning: usort(): Array was modified by the user comparison function in /home/jcampbell/usortExceptionWarning.phpSilently succeeding and not telling the user feels problematic. I think the stable usort should be as close as possible in functionality to PHP's usort.
Comment #78
Anonymous (not verified) commented@alexpott I wrote a test (as you asked above) to show that a comparison function could not alter the array; even if it tried to. The only way that a warning that the array was modified could be given here, is if the function were changed, in order to allow the comparison function to modify the array in the first place - which doesn't make much sense?
Comment #79
damien tournoud commentedImplementing a sort in user space like this sounds like a terrible idea. I'm surprised there is no proper benchmark of this, especially on non-cached pages, or large forms like the permission form.
The code itself is obviously based on this comment by "php at clement dot hk" in the PHP documentation (including all the weird kinks, like using a float to slice an array...). This is probably a non-started as, as far as I can see, the documentation is released under Creative Commons Attribution 3.0, which might or might not be compatible with the GPL.
But really this whole thing is a solution in search of a problem. We have had sort stabilization code for years, it's not that complicated. You just need to make sure that if you care about sort stability, your comparison function never returns 0. Doing so for the simple case of
#weightonly requires to go through each item and set the weight if it is not set:If you want to be pedantic about it, wrap the sort function into something that throws an exception if the result is zero, and fix what breaks.
Comment #80
Anonymous (not verified) commented@Damien Tournoud, the code was posted on the PHP bug ticket (https://bugs.php.net/bug.php?id=53341) which was referenced in the phpdoc comment on the method. This was the suggested code to use until the bug in PHP was fixed. I have since fixed a bug with that piece of code anyway, however.
I understand that achieving stable sort is not of huge concern, and there are plenty of work-arounds. Unfortunately, there is a still a bug in PHP which causes a warning when the array is observed (ie. during profiling and debugging) due to the internal pointer of the array being moved. I did post a link above to a more detailed answer, the open ticket at php.net is: https://bugs.php.net/bug.php?id=50688. This is of high concern to me as a developer, as we should not be expected to switch off warnings and errors, just to profile the site while in development - we should be able to do both at the same time, throughout the process. There was discussion here around the issue; however, the only solution being utilised was suppressing all warnings and errors with
@uasort(...), which can potentially cause more grief that it's worth down the line, and I doubt would go down well as a change record.The profiling issue has been fixed in PHP 7, however it seems, from the php.net ticket, that the stable sort is only fixed in PHP 7 for small arrays. I suggested having a method in Drupal core, that could just decide whether or not to use PHP's
uasort(), depending on the version of PHP being used. There has been some attempt to test performance regression on this issue, and there has been little impact using this solution.I should also mention, developers that build applications on Symfony are accustomed to developing while profiling. You can replicate the issues with
uasort()warnings by enabling the webprofiler module in the devel project and just clicking around.Comment #81
damien tournoud commented@GroovyCarrot: If you want to push it forward, this issue does need a reboot. Between the dubious origin of the code and the massive performance impact of the approach (implementing a merge-sort in PHP), there is not much to salvage from the original design.
Note that using
uasort()to sort is a pretty bad idea in general anyway. The way this is implemented in PHP the comparison callback should be expected to be called in the range ofO(n*log(n))times for each sort operation.A much better approach is to build a key that is a simple PHP type, and use the internal PHP implementation to sort on that. Once you have a key, making the sort stable is trivial.
A trivial implementation of a stable sort by a
"#weight"key that might or might not be there:Even on small arrays, this is vastly faster than even straight
uasort()(about 4x for 10 items, about 10x for 100 items in my own quick benchmark), so let's not talk about the custom implementation of the merge sort.So, no, our usages of
uasort()should not be fixed. They should just die. There are very few good reasons to useuasort()in the first place.Comment #82
Anonymous (not verified) commented@Damien Tournoud, it probably is overkill to compare every array element with a comparison function, to simply order by a single numeric key. I would be happy to push this forward, if I can get a clear proposal; I have to admit, it is a difficult way to work by uploading patches to issues. Has GitLab not been considered? Or moving over to BitBucket/JIRA?
Based on your comments, it seems that the proposed changes would be:
SortArray::sortByKeyInt()(and implementing comparison callback methods) and write an implementation similar to above, inSortArray::sortBySubKeyInt(&$array, $key).In your code, you've suggested using a fraction in order to stabilise the array sort. I know in Drupal 7 there are a few instances where fractional weights are used (ie. in the JS array, very small fractions are used); the assumption here is that all instances in Drupal 8 core, where
SortArray::sortBySubKeyInt()is used, is actually using integers, as the current implementation will still work for floats (even though the method name suggests otherwise). Alternatively, perhaps a very small fraction could be used to stabilise items of the same weight?SortArray::sortByKeyString(), similar to above, and replace usages with direct call toSortArray::sortBySubKeyString(&$array, $key).uasort()in core, in order to warrant a merge-sort implementation inSortArray::stableUasort(). You could still argue that this method should still exist for contributed modules to be able to call, should a merge-sort function be required? Otherwise developers would have to be expected to suppress warnings (with@)SortArray::sortByKeyInt(), andSortArray::sortByKeyString(). Also a version in which they will be completely removed should really be stated, both on the change record, and in the code. State that developers are expected to useSortArray::sortBySubKeyInt()andSortArray::sortBySubKeyString()directly, for simple sub-key array sorting. Suggest ifuasort()is absolutely necessary, to instead useSortArray::stableUasort().I don't mind doing the work, as long as a clear end-goal can be agreed.
Comment #83
fabianx commented#81: Thanks for chiming in, Damien.
Because there are performance implications I think we really need fast helper functions or at least best practices.
Currently there is a variety of sorting functions hanging around:
uksort(), uasort(), array_multisort, asort(), etc.
I am sorry if I wasted anyone's time with the library. I did not realize performance was that bad.
TL;DR: We _need_ best practices for stable sort instead of 1000x different solutions.
Related: #2717633: PHP 7 hook_rdf_mapping() ['mapping']['rdftype'] failing in rdf.test: RDF type is present on post. (from where I did go research this again).
Comment #84
donquixote commentedIf all we are doing is to provide stable alternatives to PHP sort functions, then this should not be part of Drupal core.
Either a 3rd party library already exists, or we create one.
(EDIT: Sorry, didn't read enough. Apparently an existing 3rd party library was already proposed, but decided against. I would prefer seeing this outside of Drupal, but I don't want to restart arguments that are already finished.)
Comment #85
donquixote commented#82
There is another option. I would bet it is faster, but it is also more robust:
Option A: Keys by weight.
EDIT: I was going to call this a "bucket sort" approach, but I think this label can be misleading.
EDIT: Just saying, we need to be careful when using item weights as array keys. E.g. [0.1 => 'x'] is the same as [0 => 'x'].
Option B: Weights by key
(EDIT:) Option C: Items by weight
We need some benchmark to see which option is faster.
How to get item weights
To avoid unnecessary callbacks, we should have different versions of the sort function, depending how the item weight is determined.
One for
$item[$weight_key], one for$item->$weight_method(), one for$weight_callback($item). Hardcoding is usually good for performance.(EDIT:) Array-by-reference or return $sorted?
This is something we can discuss..
function
sort(array &$items) : voidvsfunction sort(array $items) : array.Render arrays
Render arrays are a special case.
Besides the sorting, they also require to distinguish element children and element properties ('#').
We should have a dedicated implementation for those, and not worry about them here.
#2522712-26: Simplify Element::children() stable sort.
There is also another special issue about references.
How to move on
Imo this should be treated as two separate steps / problems:
This should be in a dedicated issue, not here, imo. With clear specs in the issue summary.
Or, for my taste, in a 3rd party library. But I heard we have policy that makes this not as easy.
This should *only* introduce these functions, but not use them anywhere. Except in unit tests. And maybe a proof-of-concept patch which will be tested, but not committed.
This should be a separate issue. Maybe reuse this one, or start a new.
@GroovyCarrot:
Please do not consider me an authority on whether or not to open new issues. I am just stating what I personally find reasonable. Others might disagree.
Comment #86
donquixote commentedSortArray::sortBySubKeyInt()vsSortArray::sortBySubKeyString():Is this distinction even necessary? I think not, If we use one of the approaches in #85, and accept a
$sort_flagsparameter.My own preference for method names would be:
Comment #87
donquixote commentedhttps://github.com/donquixote/stable/tree/drupal-org-2466097-87-benchmarks
(Please look at this specific tag. The rest of the library may change over time.)
This is sandbox code to evaluate which of the options in #85 and #81 is faster.
Output of
php bench/run.php:This means that "items by weight" (Option C in #85) is the fastest in the given scenario.
Option B in #85 was not tested, because I do not currently see a promising candidate for the asort_stable() this would require.
-----
Having the weight as an array key to group by is ok for numeric weights.
For string weights or more exotic weight types there are limitations:
E.g. what if two weights are different, but considered equally great?
What if the weights are not even strings, but objects, arrays, or something else?
I think it is ok to limit ourselves to two cases:
SORT_NUMERIC.(string)$weightgives us the array key. This would cause weird results if $weight is e.g. '' or ' ' or ' ' or some other string. But we said these values are not allowed, so we don't have to worry.SORT_NUMERIC, something suited for strings.I think this is sufficient. So we can use the same code for all cases.
Comment #88
donquixote commentedIf we can agree whether or not to open a new issue "Introduce stable-sort-by-weight functions", I can/will start working on patches.
Comment #89
fabianx commentedI overall really like the DX of https://github.com/backdrop/backdrop/pull/514.
It shows nicely how most of core functions can be replaced with one function call.
Of course we can keep the helpers, too.
And we probably need a function like that for Drupal 7 as well - to fix all the PHP 7 issues as most uasort() calls could be sorting in a non-stable way.
Can we also compare how these solutions fare performance-wise against array_multisort(), which can also be used for stable sorting. (e.g. with a 1..count($items) generated array)?
Comment #90
donquixote commented@Fabianx:
I am looking at backdrop_sort(), and I am not really convinced..
The implementation relies on comparison callbacks, which can be expected to be considerably slower than anything proposed in #85 ff.
It has multiple cases in the same function, which should be split up into subroutines.
The public API of the backdrop_sort() function also tries to cover different cases at once. I would much rather see separate functions for separate cases. This way, when you see a function call, you know which case this is. Also this generally allows for better optimization.
The sorting by more than one key can be useful, but in the majority of cases relevant to us there is only one key.
Poor, I bet..
(numbers will follow)
How do you imagine this array_multisort() solution?
Comment #91
donquixote commentedOh, and the backdrop sort is not even stable.
Here is a new version of the sandbox which also benchmarks the backdrop sort.
https://github.com/donquixote/stable/blob/drupal-org-2466097-91-benchmar...
(Again: Look at this specific tag. The rest of the library may change over time.)
Benchmark result of
php bench/run.php:So, depending how you calculate it, the backdrop sort is around 36x slower than the "items by weight" sort.
(27 / 0.75 === 36)
What's missing is the array_multisort(), which I have yet to understand how this is supposed to work...
Comment #92
donquixote commentedI think I do understand the idea with array_multisort() now.
https://github.com/donquixote/stable/commit/954360603477fb2aece698aa2b16...
One problem is that array_multisort does not preserve numeric indices. (says the php manual.) To circumvent this limitation, we need to pass the array keys as a 3rd array, which probably slows things down.
Here are the numbers. I removed backdrop sort from the benchmark, it took too long.
Much better than the backdrop sort in #91, but still slower than the other functions.
Comment #93
fabianx commentedThanks for checking.
I like itemsByWeight the most - performance wise.
Can this also be applied to arbitrary keys?
Comment #94
donquixote commentedYou mean more than one key?
I see
twothree options:I think the second option is more promising.
Code + numbers will follow.
Comment #95
donquixote commentedTag: https://github.com/donquixote/stable/tree/drupal-org-2466097-95-sort-keys
Commit: https://github.com/donquixote/stable/commit/08cbaf25c217ce2b2cd181cca1ca...
No profiling numbers, but I think it will be ok.
In most cases, only a few items will have the same non-zero weight. E.g. 10 items without a weight (so the default weight is used), and then 5 items with distinct non-zero weights. This means the recursive call typically happens for only one group, the one with weight = 0. I think it is ok to optimize for this case.
I did not add any magic conversion of array value as sort key, as in backdrop.
If we really want this, it can be done in a separate function on top of this one.
I still think we should have distinct sort functions for different use cases. sortByWeightKey() accepts one weight key, sortByWeightKeys() accepts multiple weight keys. And then there is sortByWeightMethod() and sortByWeightCallback().
This should be enough for a start. We can add more, e.g. sortByWeightMethods(), when the need arises.
Comment #96
fabianx commented+1 to #95.
Even just sortByWeightKey() and sortByWeightKeys() should be enough for 95% of Drupals needs (as it was enough for backdrop).
Comment #97
donquixote commented@Fabianx: In the patch in #72, there is one case with a $language->getWeight() method. This is why I proposed sortByWeightMethod().
But we can also have a hardcoded stable sort only for languages. This would be even faster, because then the method name can be hardcoded. And we don't need any method_exists() check.
Or we make an interface with a getWeight() method.
--------
Should we open a new issue "Introduce stable-sort-by-weight functions"?
This way, newcomers or people who want to understand the git history don't have to read 80 comments about an approach that was already discarded.
Comment #98
donquixote commentedAny opinion on this? What is the correct neutral value for other sort types?
Neutral value is the weight that will be assumed for items without an explicit weight.
Comment #99
fabianx commented#98: It does not need to be all-encompassing.
As I said before, the backdrop examples shows that a function that can sort by either one key or several ones is enough for most of Drupal already.
But yes we can add a sortByWeightMethod() function. That is fine with me, as its technically the same as $array['weight'].
Comment #100
donquixote commentedThe problem is: If the method accepts a $sort_flags parameter, and we implement the "neutral weight" in a specific way, then changing it later would be a BC break.
E.g. if now we say that the "neutral weight" for SORT_REGULAR is 0, but we later change it to '', then this would be a BC break.
Comment #101
donquixote commentedSince there was no other opinion on whether to open a new issue, I made the call myself.
#2757207: Introduce "stable sort by weight" functions to sort arrays (and objects)
Comment #102
donquixote commentedI am currently working on this in other issues.
#2757207: Introduce "stable sort by weight" functions to sort arrays (and objects)
#2759479: Proof of concept: Replace uasort() with dedicated stable-sort-by-weight functions, where possible
@joelpittet, others:
I need a relevant use case with Drupal 8, where a call to
uasort()is expensive enough that it shows up in xhprof, and is worth optimizing. Otherwise I don't know what to profile, except the raw functions.Any hints or instructions are appreciated!
Comment #103
joelpittet@donquixote I put a link of how I tested in #48 is that what you are refering to? In a real/heavy use-case I think the biggest and slowest render call is simpletest UI. Building that table (IIRC we build it twice for #type table). (Edit, not sure if uasort is used lots, I just know it's the slowest rendered thing I recall)
Comment #104
donquixote commented@joelpittet: Ok, so if I understand correctly, #48 (the gist) is just a raw functions test, similar to the benchmarks in #2757207: Introduce "stable sort by weight" functions to sort arrays (and objects), but not really something we know is really happening in Drupal.
Ok I had a look.
Drupal\simpletest\TestDiscovery::getTestClasses(), if not cached, usesuksort()withstrnatcasecmp()for the groups and for the tests within each group.For me (core with very few contrib) this entire code snippet uses ~9.6 milliseconds (tested with microtime(TRUE)). Not very impressive, imo.
This could be replaced with
ksort(*, SORT_NATURAL | SORT_FLAG_CASE);.Doing this speeds it up to ~3.15 milliseconds. So 3x as fast.
But does it really matter?
Stability in ksort() / uksort():
One could think this is completely pointless, because all keys are distinct.
But if the (strict) order we are using is not a strict *total* order, then two distinct keys can be considered as equal for the sake of ordering, making the ksort() / uksort() less than 100% predictable.
There are some other calls to usort() and uksort() which I did so far neglect in the two issues I opened..
Also other sort functions without a callback, where stability might become relevant.
The other case where sort performance and stability does matter is sorting render arrays by weight.
But these should, imo, be dealt with in a separate issue, this one: #2522712: Simplify Element::children() stable sort.
Comment #105
joelpittetIt's to test the performance of the proposed solution at scales to see how it holds up vs the native sort.
Comment #106
joelpittetI'm sure the author of the patch will be more capable of understanding your analysis/feedback than I. I'm taking it at face value, testable failure on an issue that seems like it could cause is problems in the wild.
Comment #107
donquixote commentedThe proof-of-concept patch in #2759479-25: Proof of concept: Replace uasort() with dedicated stable-sort-by-weight functions, where possible passes.
It is time for some DX review.
The main question to discuss at this point is whether the public API of the StableSort::*() methods is pleasant enough to use, or if we want something else.
(It is not a time for code style review, just saying.)
Comment #109
cilefen commentedThank you everyone for all the work done on this issue.
catch, alexpott, effulgentsia, xjm, Cottser and cilefen discussed this issue on a triage call. It was decided this issue should remain at major priority. I've tagged it for a title (it's a meta issue, right?) and summary update. We should remove the obsolete examples from the summary (because #2448765: Element::children sort order undefined and slower than it could be - This makes tests fail in PHP7 was fixed) and consider setting this back to active while alternate approaches are considered. These could even take the form of a documentation of best practices or a custom solution.
Notes from the discussion include:
Comment #111
David_Rothstein commentedAdding a link to #2834022: drupal_sort_weight sort order undefined and possibly inconsistent between PHP 5 and PHP 7, which can essentially be a Drupal 7 backport of this issue.
It has some code that may make sense to borrow for the Drupal 8 patch here.
Comment #113
David_Rothstein commentedComment #117
liam morlandComment #118
joseph.olstadpatch 72 probably needs a reroll, let's see what the testbot says.
Comment #119
mustanggb commentedWhat is the decision here?
Comment #126
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It either no longer applies to Drupal core, or fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".
Apart from a re-roll or rebase, this issue may need more work to address feedback in the issue or MR comments. To progress an issue, incorporate this feedback as part of the process of updating the issue. This helps other contributors to know what is outstanding.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #127
avpaderno