Document hard limit behavior for equal results [#2834730]

Comment	File	Size	Author
#23	interdiff-23.txt	3.58 KB	dermario
#23	fix_random_test_fail-2834730-23.patch	1.07 KB	dermario
#22	interdiff-22.txt	7.21 KB	dermario
#22	fix_random_test_fail-2834730-22.patch	4.69 KB	dermario
#18	fix_random_test_fail-2834730-18.patch	3.32 KB	dermario
#15	random-facets.mp4	1023.2 KB	dermario
#12	facets-sort-2.png	17.58 KB	dermario
#12	facets-sort-1.png	20.69 KB	dermario
#7	fix_random_test_fail-2834730-7.patch	551 bytes	borisson_
#4	facets-fail.png	87.9 KB	dermario

Comment #1

12 December 2016 at 09:22

borisson_ created an issue. See original summary.

Comment #2

borisson_

Dutch

Mechelen, 🇧🇪

CreditAttribution: borisson_ commented 12 December 2016 at 20:23

I ran the test locally with --repeat 24 --class "Drupal\facets\Tests\IntegrationTest::testHardLimit" and didn't get any red tests.
Because that didn't cause any failures, I ran php core/scripts/run-tests.sh --repeat 12 --concurrency 4 --class "Drupal\facets\Tests\IntegrationTest". That came back green as well.

This is with my local checkout of search_api, which is on HEAD. Because it freaks me out that there's such a big difference, I went back and ran the first test again with beta-3; that also ended up green.

Comment #3

dermario

German

Zürich

CreditAttribution: dermario as a volunteer and at Unic commented 13 December 2016 at 23:24

I get some failures on my Vagrant box:

$ php core/scripts/run-tests.sh  --repeat 4 --color --class "Drupal\facets\Tests\IntegrationTest::testHardLimit"

Drupal test run
---------------

Tests to be run:
  - Drupal\facets\Tests\IntegrationTest::testHardLimit

Test run started:
  Wednesday, December 14, 2016 - 00:18

Test summary
------------

Drupal\facets\Tests\IntegrationTest::testHardLimit            45 passes
Drupal\facets\Tests\IntegrationTest::testHardLimit            44 passes   1 fails
- Found database prefix 'test34573080' for test ID 5.
Drupal\facets\Tests\IntegrationTest::testHardLimit            45 passes
Drupal\facets\Tests\IntegrationTest::testHardLimit            44 passes   1 fails
- Found database prefix 'test32834027' for test ID 7.

I'll try to find out more now.

Comment #4

dermario

German

Zürich

CreditAttribution: dermario as a volunteer and at Unic commented 14 December 2016 at 08:17

File	Size
facets-fail.png	87.9 KB

My assumption is that these failures are caused by the facets being sorted in a somewhat random order. As you can see in the screenshot, we have 3 facets, but strawberry (2) is shown instead of apple (2).

Comment #5

borisson_

Dutch

Mechelen, 🇧🇪

CreditAttribution: borisson_ commented 14 December 2016 at 09:36

My assumption is that these failures are caused by the facets being sorted in a somewhat random order. As you can see in the screenshot, we have 3 facets, but strawberry (2) is shown instead of apple (2).

This means that we can probably resolve this issue by adding alphabetical sorting as well as count sorting on the facet? That would make sense.

I'll give that a shot later.
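
To make the idea of combining count sorting with alphabetical sorting concrete, here is a minimal sketch in Python rather than in the module's own PHP. The facet data and the function name sort_facets are hypothetical; only "apple" and "strawberry" with 2 results each come from this thread. It is an illustration of a tie-breaking sort key, not the facets module's implementation.

```python
# Hypothetical facet results as (label, count) pairs.
# "apple" and "strawberry" tie on count, as in the failing test.
facets = [("strawberry", 2), ("grape", 6), ("apple", 2), ("orange", 5)]


def sort_facets(results):
    """Sort by count descending, then by label ascending to break ties."""
    return sorted(results, key=lambda item: (-item[1], item[0]))


ordered = sort_facets(facets)
print(ordered)
```

With the label as a second key, the relative order of tied facets no longer depends on the order in which they arrive.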

Comment #6

dermario

German

Zürich

CreditAttribution: dermario as a volunteer and at Unic commented 14 December 2016 at 09:42

Yes, sorting explicitly is an option. A different approach could be to make sure that apple has a different number of results in the facets than strawberry. Currently they both have 2 results.

Comment #7

borisson_

Dutch

Mechelen, 🇧🇪

CreditAttribution: borisson_ commented 14 December 2016 at 21:13

Status:

Active

» Needs review

File	Size
fix_random_test_fail-2834730-7.patch	551 bytes

1 file was hidden/shown/deleted

File	Size
facets-fail.png	87.9 KB

Comment #8

borisson_

Dutch

Mechelen, 🇧🇪

CreditAttribution: borisson_ commented 14 December 2016 at 21:26

Status:

Needs review

» Fixed

Committed #7, that should work in theory.

Comment #9

14 December 2016 at 21:26

borisson_ committed 750e57a on 8.x-1.x

Issue #2834730 by borisson_, dermario: Fix random test fail

Comment #10

dermario

German

Zürich

CreditAttribution: dermario as a volunteer and at Unic commented 14 December 2016 at 21:45

With the patch applied I still get these random test failures on my local machine: 2 out of 8 test runs fail. It seems my assumption regarding the sort was not correct. I would like to debug the issue on my local machine to find the root cause. Should we reopen this issue?

Comment #11

borisson_

Dutch

Mechelen, 🇧🇪

CreditAttribution: borisson_ commented 14 December 2016 at 21:46

Status:

Fixed

» Active

Sure, reopening.

Comment #12

dermario

German

Zürich

CreditAttribution: dermario as a volunteer and at Unic commented 15 December 2016 at 23:59

File	Size
facets-sort-1.png	20.69 KB
facets-sort-2.png	17.58 KB

I just want to give an update and say that I am still working on it. One thing that is blocking me is that I cannot enable the hard limit due to this check:

    if (strpos($facet->getFacetSourceId(), 'search_api') === FALSE) {
      $form['facet_settings']['hard_limit']['#disabled'] = TRUE;
      $form['facet_settings']['hard_limit']['#description'] .= '<br />';
      $form['facet_settings']['hard_limit']['#description'] .= $this->t('This setting only works with Search API based facets.');
    }

Maybe I am doing it wrong, but my facet source is views_page:testindex__page_1. Is my problem related to #2772745: Search API integration doesn't check/define feature support of backends?

What is weird is that the sort order of the facets is correct without the hard limit:

And is incorrect with the hard limit set:

One assumption (I have no proof for it) is that the limiting is done before the sorting. I would like to dive in deeper and debug the actual hard limit (and not the test). If I could solve the problem with the facet source ID check, I could verify that.
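
To illustrate why the order of the two steps would matter if that assumption held, here is a small Python sketch. The rows, HARD_LIMIT and the helper by_count_then_label are all hypothetical and only model the hypothesis; they do not reflect how facets or search_api actually process results.

```python
# Hypothetical (label, count) rows in the arbitrary order a backend might return them.
rows = [("strawberry", 2), ("banana", 4), ("apple", 2), ("cherry", 3)]
HARD_LIMIT = 3


def by_count_then_label(item):
    """Sort key: highest count first, ties broken alphabetically."""
    return (-item[1], item[0])


# Limit first: which rows survive depends on the incoming order.
limit_then_sort = sorted(rows[:HARD_LIMIT], key=by_count_then_label)

# Sort first: the limit is applied to an already ordered list.
sort_then_limit = sorted(rows, key=by_count_then_label)[:HARD_LIMIT]

print(limit_then_sort)
print(sort_then_limit)
```

In this sketch the first variant drops "cherry" (3) even though it has more results than the two facets with 2, simply because it arrived last.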

Comment #13

borisson_

Dutch

Mechelen, 🇧🇪

CreditAttribution: borisson_ commented 16 December 2016 at 07:41

The check should've been fixed in #2835112: Incorrect search api check for hierarchy option.

Comment #14

dermario

German

Zürich

CreditAttribution: dermario as a volunteer and at Unic commented 16 December 2016 at 09:59

Assigned:	Unassigned	» dermario
Status:	Active	» Needs work

Comment #15

dermario

German

Zürich

CreditAttribution: dermario as a volunteer and at Unic commented 16 December 2016 at 11:41

File	Size
random-facets.mp4	1023.2 KB

I could reproduce it, but I still do not know the root cause. :-)

I created the following facets scenario:

Atom (2)
Bar (3)
Beer (2)
Clown (3)

I temporarily disabled the cache in Drupal\views\ViewExecutable::execute:

    // Temporary debugging change: "FALSE &&" short-circuits the cache lookup,
    // so the else branch (a fresh query) always runs.
    if (FALSE && $cache->cacheGet('results')) {
      if ($this->pager->usePager()) {
        $this->pager->total_items = $this->total_rows;
        $this->pager->updatePageInfo();
      }
    }
    else {
      $this->query->execute($this);
      ....

After that i reloaded my search page several times and got this result:

https://www.drupal.org/files/issues/random-facets.mp4

For some reason \Drupal\search_api_db\Plugin\search_api\backend\Database::getFacets() returns these facets in a random order on my machine when the hard limit is enabled.

Comment #16

dermario

German

Zürich

CreditAttribution: dermario as a volunteer and at Unic commented 16 December 2016 at 16:08

I could narrow it down to the SQL query that generates the facets. In my case it's this query, generated and called in \Drupal\search_api_db\Plugin\search_api\backend\Database::getFacets:

SELECT t_2.value AS value, COUNT(DISTINCT t.item_id) AS num
FROM
(SELECT DISTINCT t.item_id AS item_id, '1000' AS score
FROM
search_api_db_test t) t
INNER JOIN search_api_db_test_field_tags t_2 ON t.item_id = t_2.item_id
WHERE t_2.value IS NOT NULL
GROUP BY value
ORDER BY num DESC
LIMIT 3 OFFSET 0

This query returns the facets that have the same count in an order that varies from run to run:

[vagrant@vagrant facets]$ drush sql-query "SELECT t_2.value AS value, COUNT(DISTINCT t.item_id) AS num FROM (SELECT DISTINCT t.item_id AS item_id, '1000' AS score FROM search_api_db_test t) t INNER JOIN search_api_db_test_field_tags t_2 ON t.item_id = t_2.item_id WHERE t_2.value IS NOT NULL GROUP BY value ORDER BY num DESC LIMIT 3 OFFSET 0"
3	3
5	3
1	2
[vagrant@vagrant facets]$ drush sql-query "SELECT t_2.value AS value, COUNT(DISTINCT t.item_id) AS num FROM (SELECT DISTINCT t.item_id AS item_id, '1000' AS score FROM search_api_db_test t) t INNER JOIN search_api_db_test_field_tags t_2 ON t.item_id = t_2.item_id WHERE t_2.value IS NOT NULL GROUP BY value ORDER BY num DESC LIMIT 3 OFFSET 0"
5	3
3	3
1	2
[vagrant@vagrant facets]$ drush sql-query "SELECT t_2.value AS value, COUNT(DISTINCT t.item_id) AS num FROM (SELECT DISTINCT t.item_id AS item_id, '1000' AS score FROM search_api_db_test t) t INNER JOIN search_api_db_test_field_tags t_2 ON t.item_id = t_2.item_id WHERE t_2.value IS NOT NULL GROUP BY value ORDER BY num DESC LIMIT 3 OFFSET 0"
3	3
5	3
1	2
[vagrant@vagrant facets]$ drush sql-query "SELECT t_2.value AS value, COUNT(DISTINCT t.item_id) AS num FROM (SELECT DISTINCT t.item_id AS item_id, '1000' AS score FROM search_api_db_test t) t INNER JOIN search_api_db_test_field_tags t_2 ON t.item_id = t_2.item_id WHERE t_2.value IS NOT NULL GROUP BY value ORDER BY num DESC LIMIT 3 OFFSET 0"
5	3
3	3
1	2

To find a solution for this problem, we must decide how the hard limit should behave in such cases. If there is a hard limit of 3 but there are 4 facets with the same number of results, which one should be cut off? The sort options defined in the facet settings do not help here, as that processing (sorting) is applied much later. For example, we do not have the translated entity labels at this point. I am not a facets pro, but I see the following solutions for this problem:

1. Sort by the value as a second sort dimension in the query above (maybe only if a limit is set), e.g. ORDER BY num DESC, value ASC. This would prevent randomness, but might not be transparent to the user, as the selection is based on IDs. It is still better than a random order.
2. Perform the limit later in the process, maybe after all processors (sort) have run.
3. Leave it as it is and inform the user that there might be edge cases on DB backends when there are facets with the same number of results.

I would try to go for option 1 + user info, but would be happy to get feedback on this.
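
As a rough check of what the second sort dimension does, the sketch below runs a simplified version of the query against an in-memory SQLite database from Python. The table field_tags and its rows are hypothetical stand-ins for search_api_db_test_field_tags, and SQLite is only an assumption here for the sake of a self-contained example; the thread does not say which database the failing site uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE field_tags (item_id INTEGER, value INTEGER)")

# Hypothetical data: values 3 and 5 have 3 items each, values 1 and 7 have 2 each.
rows = [(1, 1), (2, 1), (3, 3), (4, 3), (5, 3),
        (6, 5), (7, 5), (8, 5), (9, 7), (10, 7)]
conn.executemany("INSERT INTO field_tags VALUES (?, ?)", rows)

# "value ASC" is the tie-breaker: equal counts are ordered by the raw value,
# so the LIMIT always cuts off the same rows.
QUERY = """
SELECT value, COUNT(DISTINCT item_id) AS num
FROM field_tags
WHERE value IS NOT NULL
GROUP BY value
ORDER BY num DESC, value ASC
LIMIT 3 OFFSET 0
"""
top = conn.execute(QUERY).fetchall()
print(top)
```

With only ORDER BY num DESC, SQL leaves the order of the tied rows unspecified, which matches the alternating output shown above. With the tie-breaker, value 7 is always the one cut off in this data set.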

Comment #17

borisson_

Dutch

Mechelen, 🇧🇪

CreditAttribution: borisson_ commented 16 December 2016 at 18:59

Thanks for putting so much work into figuring this out @dermario, really awesome work!

I think we should go for option 1 + a note in README.txt.

Comment #18

dermario

German

Zürich

CreditAttribution: dermario as a volunteer and at Unic commented 17 December 2016 at 19:37

Status:

Needs work

» Needs review

File	Size
fix_random_test_fail-2834730-18.patch	3.32 KB

Thank you @borisson_, I really like to help here and try to finish the things I started :) Thank you as well for always giving such good and quick feedback.

I created #2836994: Fix random test fail in facets module in the search_api queue, with a patch that adds the second sort dimension. The patch attached here contains a new method to assert a certain sort order of facets. That code helped me debug this issue and could be useful for other tests. With the patch from #2836994: Fix random test fail in facets module applied, I had 200 test runs without a single failure.
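
For readers who want a feel for what an order assertion can look like, here is a minimal Python sketch. It is not the method from the attached patch; the function name assert_labels_in_order and the sample page text are hypothetical, and the approach (comparing string positions in the rendered output) is just one simple way to do it.

```python
def assert_labels_in_order(page_text, labels):
    """Fail unless every label occurs in page_text, in the given order."""
    position = -1
    for label in labels:
        # Only search behind the previous match, so order is enforced.
        found = page_text.find(label, position + 1)
        assert found != -1, f"{label!r} not found after offset {position}"
        position = found


# Hypothetical rendered facet block.
page = "Bar (3) Clown (3) Atom (2)"
assert_labels_in_order(page, ["Bar", "Clown", "Atom"])
```

A call such as assert_labels_in_order(page, ["Atom", "Bar"]) would fail, because "Bar" does not appear after "Atom" in the text.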

There is also a text for the README.txt, which might be improved (as I am not a native speaker).

Comment #19

dermario

German

Zürich

CreditAttribution: dermario as a volunteer and at Unic commented 17 December 2016 at 19:39

Comment #20

borisson_

Dutch

Mechelen, 🇧🇪

CreditAttribution: borisson_ commented 17 December 2016 at 19:55

4 files were hidden/shown/deleted

File	Size
fix_random_test_fail-2834730-7.patch	551 bytes
facets-sort-1.png	20.69 KB
facets-sort-2.png	17.58 KB
random-facets.mp4	1023.2 KB

+++ b/README.txt
@@ -35,6 +35,25 @@ After adding one of those, you can add a facet on the facets configuration page:
+KNOWN ISSUES
+------------

The text here looks good, it clearly details the problem.

+++ b/README.txt
@@ -35,6 +35,25 @@ After adding one of those, you can add a facet on the facets configuration page:
+first and then sorting by the raw value of the facet (e.g. entity-id) in the second
...
+"Clown" will be cut off due to its higher internal value (entity-id). For further
+details see: https://www.drupal.org/node/2834730

This passes 80 cols, and should be reformatted.

Comment #21

17 December 2016 at 19:50

Status:

Needs review

» Needs work

The last submitted patch, 18: fix_random_test_fail-2834730-18.patch, failed testing.

Comment #22

dermario

German

Zürich

CreditAttribution: dermario as a volunteer and at Unic commented 17 December 2016 at 21:15

Status:

Needs work

» Needs review

File	Size
fix_random_test_fail-2834730-22.patch	4.69 KB
interdiff-22.txt	7.21 KB

Oops, it seems my new assertions do not work with the restructured test framework, so I reverted them again. I also corrected the line length in README.txt.

Comment #23

dermario

German

Zürich

CreditAttribution: dermario as a volunteer and at Unic commented 17 December 2016 at 21:22

File	Size
fix_random_test_fail-2834730-23.patch	1.07 KB
interdiff-23.txt	3.58 KB

2 files were hidden/shown/deleted

File	Size
fix_random_test_fail-2834730-22.patch	4.69 KB
interdiff-22.txt	7.21 KB

This should be the right patch. Messed up my local repo a bit. Sorry for that :-)

Comment #24

17 December 2016 at 22:08

The last submitted patch, 22: fix_random_test_fail-2834730-22.patch, failed testing.

Comment #25

borisson_

Dutch

Mechelen, 🇧🇪

CreditAttribution: borisson_ commented 20 December 2016 at 19:40

Title:	Fix random test fail	» Document hard limit behavior for equal results
Issue summary:	View changes

Updated title to state more clearly what's actually going on + small update to IS.

Comment #26

dermario

German

Zürich

CreditAttribution: dermario as a volunteer and at Unic commented 28 December 2016 at 22:49

#2836994: Fix random test fail in facets module was committed yesterday. So we could add the description in #23.

Comment #27

borisson_

Dutch

Mechelen, 🇧🇪

CreditAttribution: borisson_ commented 29 December 2016 at 07:58

@dermario: everything is currently waiting on reviews for #2772745: Search API integration doesn't check/define feature support of backends. The last time I had to reroll that issue it took me ~25 days to get everything back to green. Even though this is a very small change, I'd prefer not to have to go through rerolling that again. I'd love to get as much feedback as possible on that issue (does the upgrade path work, do the facets still work after that, do the other changes make sense?).

Comment #28

borisson_

Dutch

Mechelen, 🇧🇪

CreditAttribution: borisson_ at Dazzle commented 28 January 2017 at 12:45

Status:

Needs review

» Fixed

Committed, thanks!

Comment #29

28 January 2017 at 12:47

borisson_ committed dfd1737 on 8.x-1.x authored by dermario

Issue #2834730 by dermario, borisson_: Document hard limit behavior for...

Comment #30

11 February 2017 at 12:54

Status:

Fixed

» Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.
