Problem/Motivation
I'm trying to fix D7 tests with SQLite in #3172878: Fix SQLite tests in Drupal 7.
It looks like enabling WAL in SQLite may be essential to allow the level of concurrency we use in drupalci (see below); it's a good thing to do anyway (#3172877: Enable WAL journal mode by default for SQLite database [D7 backport]).
However, when we do that in D7 core, it seems that testbot can no longer open the results database to generate the report at the end of a run.
I believe the problem is due to permissions on the directory; when using WAL the directory that the SQLite database is within needs to be writeable for the client to be able to run queries. This is because SQLite will create some temp files that sit alongside the db file, and if the directory is not writeable it can't do so.
See, for e.g. https://stackoverflow.com/a/20372658
The error we're getting is, e.g.:
https://dispatcher.drupalci.org/job/drupal_d7/94138/console
18:39:45 JavaScript 117 passes, 0 fails, and 0 exceptions
18:39:46
18:39:46 Test run duration: 2 min 16 sec
18:39:46
18:39:46 Attempting to connect to database server.
18:39:46 Database is active.
18:39:46 chown -R 1001 /var/www/html
18:39:46 chown -R 1001 /var/www/.composer/cache
18:39:46 chmod -R 777 /var/www/html
18:39:50
18:39:50 In JunitXmlBuilder.php line 47:
18:39:50
18:39:50 SQLSTATE[HY000]: General error: 14 unable to open database file
Without WAL enabled in SQLite I think the tests quickly hit locking problems e.g. https://www.drupal.org/pift-ci-job/1830033
AJAX forms on cached pages 33 passes, 0 fails, and 0 exceptions
PDOException: SQLSTATE[HY000]: General error: 5 database is locked in /var/www/html/includes/database/prefetch.inc:168
Stack trace:
#0 /var/www/html/includes/database/prefetch.inc(168): PDOStatement->execute(Array)
Steps to reproduce
That error happens when \DrupalCI\Build\Artifact\Junit\JunitXmlBuilder::generate() tries to run a SELECT query on the results database.
https://git.drupalcode.org/project/drupalci_testbot/-/blob/production/sr...
public function generate(array $test_groups, DatabaseInterface $results_database) {
$mapped_results = [];
$db = $results_database->connect($results_database->getDbname());
$q_result = $db->query('SELECT * FROM simpletest ORDER BY test_id, test_class, message_id;'); // <= line 47
...and I believe that's because the directory containing the file is not writeable by the current user at that point.
Some more context; I put some debug code in just before this SELECT query and copied the db file elsewhere to make sure it existed and contained the expected data.
If I open that file with the sqlite3 client and run a SELECT, some temp files are created - e.g.:
testbot@drupalci:/var/lib/drupalci/workspace$ ll
total 24712
-rw-rw-rw- 1 testbot testbot 12636160 Sep 25 11:03 db.sqlite
-rw-rw-rw- 1 testbot testbot 32768 Sep 25 11:06 db.sqlite-shm # <== temp file
-rw-rw-rw- 1 testbot testbot 0 Sep 25 11:06 db.sqlite-wal # <== temp file
If we change the permissions on the directory so that it's not writeable by testbot, and try to run a SELECT:
testbot@drupalci:/var/lib/drupalci/workspace$ chmod a-w .
testbot@drupalci:/var/lib/drupalci/workspace$ sqlite3 db.sqlite
SQLite version 3.8.7.1 2014-10-29 13:59:56
Enter ".help" for usage hints.
sqlite> SELECT * FROM simpletest ORDER BY test_id, test_class, message_id;
Error: unable to open database file
Proposed resolution
I believe one way to fix this would be to ensure that the directory the db sits in is writeable before we try to query it for the final results.
There are probably different ways to do this - you could move the file to somewhere writeable like /tmp perhaps?
Or there are other places in the code that try to "open up the permissions" e.g. \DrupalCI\Build\Build::cleanupBuild():
// Open up permissions on containers.
$uid = posix_getuid();
$environment = $this->container['environment'];
$commands = [
'chown -R ' . $uid . ' ' . $environment->getExecContainerSourceDir(),
'chown -R ' . $uid . ' ' . $environment->getContainerComposerCacheDir(),
'chmod -R 777 ' . $environment->getExecContainerSourceDir(),
];
$result = $environment->executeCommands($commands);
$db_container = $environment->getDatabaseContainer();
$db_dir = $this->container['db.system']->getDataDir();
$commands = [
'sudo chown -R ' . $uid . ' ' . $db_dir,
'chmod -R 777 ' . $db_dir,
];
if (!empty($db_container)) {
$result = $environment->executeCommands($commands, $db_container['id']);
}
Maybe we could make those permissions changes on the db_dir earlier?
I think the fix needs to happen in the testbot code rather than in D7 core if we're going to be able to use WAL at all (and I think we need to).
Remaining tasks
- Figure out an acceptable way to make the results db accessible to generate the final report, and implement it.
User interface changes
n/a
API changes
n/a
Data model changes
n/a
| Comment | File | Size | Author |
|---|---|---|---|
| #2 | 3173146-2.patch | 1.01 KB | mcdruid |
Comments
Comment #2
mcdruid commentedNot sure if this is the ideal solution, but seems to have got it working for me.
Comment #3
mcdruid commentedComment #4
mcdruid commentedComment #5
mcdruid commentedAFAICS this works with just the
chown- seems thechmod -R 777is not necessary.I think the directory is owned by www-data rather than testbot, and that's the problem.
Comment #6
joseph.olstadI believe this is the issue queue that is most active for this type of ci testing issue
Comment #7
joseph.olstadComment #8
MixologicI reviewed and tested this, deployed it to production. Drupalci now has this fix.
Thanks McDruid!
Comment #9
joseph.olstadGo @McDruid Go!
lots of core patches to go yet! I was hoping for the variable_get stale cache fix to go in (asap) gotten bit too many times with this issue:
#973436-136: Overzealous locking in variable_initialize()