Store only diff of results with branch for issue tests [#2575797]

Before DrupalCon Barcelona, there was an issue with max_allowed_packet while dumping pift_ci_job_result. While Basic increased that variable, it didn't solve the underlying issue:

mysqldump: Error 2020: Got packet bigger than 'max_allowed_packet' bytes when dumping table `pift_ci_job_result` at row: 147068065

Suffice it to say all dev environments are blocked from having new data until this table can be fixed. As an interim measure, I've committed ee9c69f55274c86e2d1c18e8d2b95f439a4d26ba, which will skip exporting data from the table altogether. However, this is a larger issue with the architecture of the pift_ci_job_result table itself, which probably needs to be re-architected.

Some ideas / needs raised during the extended sprints:

"We need the passing result data because when we make a new test, we need to see that the test actually executed"
"What about making the whole page a static HTML blob and storing that in the database?"
"It'd be great if someday we could see which tests are failing on the branch randomly. This would require some type of semantic setup, which couldn't be achieved by storing an HTML blob"
"Export the data to another file, like JSON"

Right now the table is at 169272738 rows. Marking Critical because of the rapid growth of this table.

Comments

Comment #1

27 September 2015 at 14:09

japerry created an issue. See original summary.

Comment #2

japerry

KOMK

commented 27 September 2015 at 14:10

Issue summary:

View changes

Comment #3

japerry

KOMK

commented 27 September 2015 at 14:11

Issue summary:

View changes

Comment #4

japerry

KOMK

commented 27 September 2015 at 14:11

Issue summary:

View changes

Comment #5

pwolanin commented 27 September 2015 at 14:26

Putting the HTML blob in a row or (better) in a file loaded in the page callback, exporting JSON, CSV, or something else with the full data per result for later/offline processing, and keeping just file references would be much more scalable. We'd then have 1 row per test run in the DB for now.
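A minimal sketch of the file-reference idea, assuming a hypothetical layout in which the full per-test data goes to one JSON file per job and the database keeps a single row pointing at it. The function names, the `results_file` column, and the file naming are illustrative assumptions, not anything in the existing module:

```python
import json
from pathlib import Path
from typing import Dict

# Hypothetical shape of one job's full results: test name -> status string.
Results = Dict[str, str]


def export_results(job_id: int, results: Results, directory: Path) -> Dict[str, object]:
    """Write the full results to a JSON file and return the one row to store."""
    path = directory / f"job-{job_id}.json"
    path.write_text(json.dumps(results, sort_keys=True))
    # Only this small row would live in the database (assumed columns).
    return {"job_id": job_id, "results_file": str(path)}


def load_results(row: Dict[str, object]) -> Results:
    """Read the full results back from the file referenced by a stored row."""
    return json.loads(Path(str(row["results_file"])).read_text())
```

Keeping the data as JSON rather than a rendered HTML blob leaves it available for later analysis, such as spotting tests that fail randomly on a branch.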

Comment #6

drumm

he/him

NY, US

commented 27 September 2015 at 20:09

Priority:

Critical

» Normal

While Basic increased that variable, it didn't solve the underlying issue:

mysqldump: Error 2020: Got packet bigger than 'max_allowed_packet' bytes when dumping table `pift_ci_job_result` at row: 147068065

Suffice it to say all dev environments are blocked from having new data until this table can be fixed.

This was fixed by http://cgit.drupalcode.org/infrastructure/commit/?id=94adf1c86fbd8bbcfba... and new dev sites do work. mlhtest-drupal.redesign.devdrupal.org was built out today and has recent data in pift_ci_job_result. The dev site DB on disk did increase in size from 26G to 28G, probably a result of increased activity from DrupalCon in both testing and issues. It took 1h to build out, which is slightly faster than dev sites in the last couple of weeks, which took 1h10m.

The root cause of the previous dev site not being built out was ERROR 2013 (HY000) at line 24681: Lost connection to MySQL server during query, most likely a simple connection blip between devwww and devdb.

Comment #7

drumm

he/him

NY, US

commented 27 September 2015 at 20:11

mlhtest-drupal.redesign.devdrupal.org can be used for sprinting-related work. Please add URLs to /var/www/dev/mlhtest-drupal.redesign.devdrupal.org/comment as issue(s) are worked on, so we know what all is going on there.

Comment #8

drumm

he/him

NY, US

commented 27 September 2015 at 20:30

Title:	pift_ci_job_result table gets too large	» Is pift_ci_job_result table getting too large?
Component:	Development Environments	» Servers

That leaves the question of whether the row-per-test data storage makes sense. This isn't a new style of data storage; QA.Drupal.org does the equivalent. We do have a faster pace of testing happening now, and more tests in core than ever.

According to New Relic, SELECT queries on the pift_ci_job_result table are taking 28.4ms on average over the last 7 days; the tallest spikes on the response time graph are 67ms. It does not make the top 20 most time-consuming queries for the site. For now, I think this shows it isn't a critical problem, if there is a problem.

When first launched, I did have to go through a few iterations on getting the table & keys right, but we are in at least an okay spot for now. The table has a not-too-large covering index for the query that hits it. We should keep an eye on the table's growth. A large number of rows alone isn't necessarily a problem, but is indeed worth investigating.

Comment #9

drumm

he/him

NY, US

commented 30 March 2016 at 18:51

Rudy switched this table to be stored in a compressed format a while ago. That saved us 50% on disk and, as far as I know, has otherwise been doing well.

I implemented trimming of test results from issues that have been Closed (fixed) for 6 months. That has only been run once. We can run it more frequently, but the savings aren't significant.

This still leaves us with a lot of rows. A possible next step for issue results would be to not store successes matching the branch job’s results, basically storing only the differences. That would align well with the UI, since you are really only interested in the difference.
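As a rough illustration of storing only the differences, the sketch below drops any issue-test result that is a pass where the branch job also passed. The data shape (a mapping from test name to a status string) and the function name are assumptions made for the example, not the module's actual schema or code:

```python
from typing import Dict

# Hypothetical shape: test name -> status string such as "pass" or "fail".
Results = Dict[str, str]


def diff_against_branch(issue: Results, branch: Results) -> Results:
    """Return only the issue results worth storing.

    A result is dropped when it is a pass and the branch job also passed
    the same test. Failures, fixes, and tests missing from the branch
    (for example, newly added tests) are kept.
    """
    stored: Results = {}
    for test, status in issue.items():
        if status == "pass" and branch.get(test) == "pass":
            continue  # identical success; nothing new to record
        stored[test] = status
    return stored
```

Under this rule a passing test that exists only in the patch is still stored, which covers the need raised in the summary to confirm that a newly added test actually executed.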

Comment #10

drumm

he/him

NY, US

commented 12 May 2017 at 21:40

Title:	Is pift_ci_job_result table getting too large?	» Store only diff of results with branch for issue tests
Project:	Drupal.org infrastructure	» Project issue file test
Version:		» 7.x-3.x-dev
Component:	Servers	» Code
Assigned:	Unassigned	» drumm
Category:	Bug report	» Feature request

We are running into disk space issues on staging now. While we can provision more disk, we can’t do that forever.

Comment #11

15 May 2017 at 14:21

drumm committed 3518f7f on 2575797-result-diff

Issue #2575797: Store only diff of results with branch for issue tests,...

Comment #12

15 May 2017 at 20:45

drumm committed 8ef1b40 on 2575797-result-diff

Issue #2575797: Clear the schema cache

Comment #13

15 May 2017 at 20:45

drumm committed 3518f7f on 7.x-3.x

Issue #2575797: Store only diff of results with branch for issue tests,...

drumm committed 8ef1b40 on 7.x-3.x

Issue #2575797: Clear the schema cache

Comment #14

drumm

he/him

NY, US

commented 16 May 2017 at 14:51

One more cache clear is needed as the last update starts: DELETE FROM cache WHERE cid LIKE 'entity_%';

This is running slowly and successfully on staging. I’ll be deploying to production shortly.

Remaining work:

Store new test results as diffs.
Update results in email notifications to show the diff.

Comment #15

19 May 2017 at 15:24

drumm committed 60157c3 on 7.x-3.x

Issue #2575797 by drumm: Adjust OOM protection

Comment #16

19 May 2017 at 19:51

drumm committed dffa027 on 7.x-3.x

Issue #2575797 by drumm: Refactor result fetching for UIs to use common...

Comment #17

19 May 2017 at 20:50

drumm committed d2ec7ec on 7.x-3.x

Issue #2575797 by drumm: Store diff for new results

Comment #18

drumm

he/him

NY, US

commented 19 May 2017 at 20:56

This has been running for quite a while on production. Over 5G has been cleared with only ~6% processed.

With the latest commits, only the diff will be stored for new jobs.

Currently processing is still crawling along with a hook_update_N() running for a very long time. The last part here is making a drush process to keep processing without blocking deployments. And the drush process can load each branch job result once, which should cut up to 1/3 off the processing time.
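The saving from loading each branch job result once can be sketched as follows: group the issue jobs by their branch job, load the branch results a single time per group, then trim every issue job in that group. The `load` and `save` callables and the `(job_id, branch_job_id)` pairs are hypothetical stand-ins for whatever the drush process would use:

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Results = Dict[str, str]  # assumed shape: test name -> status string


def trim_jobs(
    issue_jobs: Iterable[Tuple[int, int]],
    load: Callable[[int], Results],
    save: Callable[[int, Results], None],
) -> int:
    """Trim each issue job against its branch job.

    Returns the number of branch-result loads performed, which is one per
    distinct branch job rather than one per issue job.
    """
    by_branch: Dict[int, List[int]] = defaultdict(list)
    for job_id, branch_job_id in issue_jobs:
        by_branch[branch_job_id].append(job_id)

    branch_loads = 0
    for branch_job_id, job_ids in by_branch.items():
        branch = load(branch_job_id)  # loaded once for the whole group
        branch_loads += 1
        for job_id in job_ids:
            full = load(job_id)
            # Keep everything except passes that the branch also passed.
            kept = {
                test: status
                for test, status in full.items()
                if not (status == "pass" and branch.get(test) == "pass")
            }
            save(job_id, kept)
    return branch_loads
```

Running outside `hook_update_N()` also means the work can proceed in the background without holding up a deployment.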

Comment #19

drumm

he/him

NY, US

commented 25 May 2017 at 18:31

Status:

Active

» Fixed

This is working well. We’ve freed 15G, and are on track to free another 60G.

Comment #20

8 June 2017 at 18:34

Status:

Fixed

» Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.
