Hello :)

So, let's see if it's possible to write code that will automatically (based on certain configs) purge webform submissions.

I do it mostly for #2838423: Drafts for anonymous users. However, I prefer to file it as a separate ticket for the following reason:
well, it's just an independent task :) while it goes along with the anonymous drafts, it may also be used outside of the anonymous drafts scope

I propose to do it:

  • as an independent webform handler, so each webform may have it's own auto-purging settings
  • the actual purging should happen in cron. Ideally we should marry somehow batch processing and crons. Because chances are some Drupal instances will have SOOO many submissions to purge that a single cron run won't be able to process it and will time out or error due to memory exhausting. So we should see if it's possible to delete submissions in certain (presumably manageable in a single cron run) batches.
  • configurable about which submissions should be deleted:
    • Amount of submissions to purge per single cron run (see the position above about possible time outs/memory exhausting)
    • Draft status: purge only drafts, purge only full submissions, purge any
    • Owner status: purge logged in submissions, purge anonymous submissions, purge any
    • Purge anything that is older than ____ (5 days... 1 month... 10 hours and so on)
    • Purge any submission which is more than Nth submission (sort all submissions per date and remove anything that goes after Nth submission, i.e. something like SELECT * ... FROM ... LIMIT 100, Nth)

This way we should have a rather sharp and multi-applicable swiss knife whenever it comes to auto-purging and while such set up perfectly covers anonymous drafts purging, it also can be applied in other scenarios as end user may please.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

bucefal91 created an issue. See original summary.

jrockowitz’s picture

@bucefal91 I am convinced we need to add this feature and I think your proposal is accurate but I would suggest we just make this a fully integrated feature and not a handler. The 'Automatic purging of webform submissions' is not applicable if submissions are not saved to the database. Handlers don't currently support being hidden when submissions are not saved to the database.

We definitely need to limit how many submissions are being purged per cron task and we must execute WebformSubmission->delete() so that all pre_delete() hooks are triggered.

I see exactly how to do this and it would only take me an hour or two to setup. I am very open doing the work especially if this would allow you to put more time in to the Webform Views integration module. I would create a feature branch which you can review.

I am also okay if you want to take on this task on yourself and get your hands dirty.

bucefal91’s picture

mmm, I basically need this purging for a website my company develops. So I am somewhat under time pressure to have it. I would at least write a patch myself and you will see whether it goes along with your vision/plans.

we just make this a fully integrated feature and not a handler.

Do you mean just adding more settings and basically working in WebformEntitySettingsForm class instead of a new handler where those purging settings are edited?

The 'Automatic purging of webform submissions' is not applicable if submissions are not saved to the database.

Whoups, I wasn't aware that non-DB storage is possible for webformsubmissions (of course, unless you override its storage handler.) But even if we consider such case there shouldn't be a problem: the auto-purging handler will simply query for necessary webform submissions via something like entityQuery or using existing methods of WebformSubmissionStorageInterface and then will remove the submissions via invoking WebformSubmissionStorageInterface::delete(). So the auto-purging handler will respect encapsulation of storage, i.e. the auto-purging handler will only invoke methods of the storage interface leaving the specifics of actual storage (DB or whatever else) to the storage class itself. So if the default storage handler is swapped with some other implementation (as long as that other implementation is bug-free and complies with WebformSubmissionStorageInterface) the thing just keeps working.

I see it beneficial to do it as a handler because it's also possible to set up more than 1 auto-puring handler, so something like "purge anonymous drafts that are older than 1 day" and "keep only latest 100 logged-in drafts while purging all other" kind of purging can be set up.

Do you see my reasoning? Though if you do prefer to do it via settings of a webform, then I'll follow your vision, my point here is to get this feature coded :)

jrockowitz’s picture

Yes, the feature should be added to WebformEntitySettingsForm and include hide/show logic to hide it when saving of submission is disabled.

A few implementation tasks and notes....

  • Look at how webform's draft and preview settings is setup.
  • Add 'purge' and 'purge_days' settings to \Drupal\webform\Entity\Webform::getDefaultSettings
  • Purge settings should use a string value ('' => 'None'; 'draft' => 'Draft'; 'completed' => 'Completed'; 'all' => 'Draft and Completed')
  • Default number of days should be at least 2 weeks
  • Make sure to implement hide/show
  • Feature can have dedicated details element called 'SUBMISSION PURGING' and come after 'SUBMISSION LIMITS'
  • Number of submissions purged per cron task needs to be added to Webform admin settings. (after BATCH SETTINGS)
  • Write update hook that calls _webform_update_admin_settings() and _webform_update_form_settings() to sync existing installed config.
  • You must run this update hook on all the existing webform.module config and export the updated config (I can help with this)
  • Writes tests. Might be able to trigger cron purge by manually setting days to a negative number.
  • Purge business logic should live in the \Drupal\webform\WebformSubmissionStorage or a new service.
bucefal91’s picture

Gotcha! Today I've already started working on other things but on Monday I'll post my patch here that follows your instructions.

bucefal91’s picture

I am nearly finished. I've done all the coding and then somewhat got stuck with tests. Initially I wanted to do it as a Unit test but it was nearly impossible to mock entityQuery objects (at least for my level of understanding and experience of unit tests). So I had to switch to Kernel level of testing and haven't fully polished them yet.

jrockowitz’s picture

@bucefal91 I just started to move from some tests from SimpleTest to PHPUnit Kernel tests. The more immediate priority is that the code is maintainable and does have a reasonable amount of test coverage.

At some point we should setup a WebformKernelTestBase class that has some webform specific helper methods and assertions.

bucefal91’s picture

Status: Active » Needs review
FileSize
30.86 KB

Here we go. I attach a patch. I followed your steps as close as I could understand them :)

I eventually did the test coverage as a Kernel test, a rather simple one but it does the job. I must admin I still haven't looked throughout the tests, so I cannot say anything meaningful about test architecture. But my past experience does say that it's often useful to have some abstract class with a toolkit (CRUD methods for tested concepts, assertions, and so on).

Let me know if this looks good. Locally my phpunit did not pass (my own new test did pass) but the rest of --group webform had errors:

27) Drupal\Tests\webform\Unit\Breadcrumb\WebformBreadcrumbBuilderTest::testType with data set #8 ('webform_source_entity', 'entity.{source_entity}.webfor...ssions')
@covers method does not exist \Drupal\webform\BreadCrumb\WebformBreadcrumbBuilder::applies: Drupal\Tests\webform\Unit\Breadcrumb\WebformBreadcrumbBuilderTest::testType with data set #8

/var/www/localhost.com/htz/web/core/tests/Drupal/Tests/Listeners/DrupalStandardsListener.php:24
/var/www/localhost.com/htz/web/core/tests/Drupal/Tests/Listeners/DrupalStandardsListener.php:133
/var/www/localhost.com/htz/web/core/tests/Drupal/Tests/Listeners/DrupalStandardsListener.php:148
/var/www/localhost.com/htz/vendor/phpunit/phpunit/phpunit:52

and also:

44) Drupal\Tests\webform\Unit\Breadcrumb\WebformBreadcrumbBuilderTest::testBuildUserSubmission
ReflectionException: Class Mock_WebformBreadcrumbBuilder_f2b7def7 does not have a constructor, so you cannot pass any constructor arguments

/var/www/localhost.com/htz/web/modules/contrib/webform/tests/src/Kernel/Breadcrumb/WebformBreadcrumbBuilderTest.php:117
/var/www/localhost.com/htz/vendor/phpunit/phpunit/phpunit:52

I had a bunch of such errors of each kind.

jrockowitz’s picture

First off, this code is really clean and good. It is very maintainable and could easily be committed AS-IS.

I think we should move the 'purge' method to the WebformSubmissionStorage service instead of creating a new WebformSubmissionPurge service. The webform module already has too many services.

The PURGE_NONE, PURGE_DRAFT, etc.. constants should be moved to the WebformSubmissionStorageInterface.

The 'purge_days' should default to an empty value use Drupal's #states API to hide/show and require this element when based on 'Automatically purge' value.

The 'purge_days' number input should be suffixed with 'days''

All @var Declaration should have a description.

Any usage of \Drupal::service('webform.purge') should have a @var declaration to help with refactoring via PHPStorm. @see webform_cron().

There is a minor typo ' puring' in WebformAdminSettingsForm.

I really don't want to allow someone to enable automated purging for every form via the UI. This could wipe out all submissions. I think we should remove the 'SUBMISSION PURGING DEFAULT SETTINGS'. Keep in mind that someone can easily write custom code to automatically purge submissions site wide, if they really want this feature.


All your tests and my tests are passing. How are you executing the PHPUnit tests? CLI, PHPStorm, or UI?

bucefal91’s picture

All of your feedback is implemented. :)

I run the tests via cli: vendor/bin/phpunit --group webform. My phpunit.xml is nothing special - just edited

    <!-- Example SIMPLETEST_BASE_URL value: http://localhost -->
    <env name="SIMPLETEST_BASE_URL" value="http://localhost.com/htz/web"/>
    <!-- Example SIMPLETEST_DB value: mysql://username:password@localhost/databasename#table_prefix -->
    <env name="SIMPLETEST_DB" value="mysql://root:1@localhost/htz"/>
    <!-- Example BROWSERTEST_OUTPUT_DIRECTORY value: /path/to/webroot/sites/simpletest/browser_output -->
    <env name="BROWSERTEST_OUTPUT_DIRECTORY" value="/var/www/localhost.com/htz/web/sitse/simpletest/browser_output"/>

It must be something in my setup, I guess, since D.org test bot is happy about the tests.

bucefal91’s picture

FileSize
18.29 KB

I couldn't attach more than 1 file for some reason in the above comment. Here's the interdiff against #8

jrockowitz’s picture

Some very, very minor tweaks...

Change 'purgeWebformSubmission' to just 'purge' because this method is associated with the WebformSubmissionManager


I have never used the '!value' setting in the #states API. To be consistent with the Webform module and Core can you change...

'visible' => array('select[name="purge"]' => array('!value' => WebformSubmissionStorageInterface::PURGE_NONE)),
'required' => array('select[name="purge"]' => array('!value' => WebformSubmissionStorageInterface::PURGE_NONE)),

to...

'invisible' => array('select[name="purge"]' => array('value' => WebformSubmissionStorageInterface::PURGE_NONE)),
'optional' => array('select[name="purge"]' => array('value' => WebformSubmissionStorageInterface::PURGE_NONE)),

All the exported config needs to have purge_days: 14 changed to purge_days: null or ''. I am not really sure if purge_days should be NULL or an empty string.


I still need to do a final round of manually testing but this is very close to done.

The final step would be to update the FEATURES.md file and the web page.

bucefal91’s picture

Tell me you're happy now :)

I've updated FEATURES.md to the best of my knowledge. The purge_days should be NULL because the schema dictates that it's an integer, so we can't put an empty string.

Also, I've updated all default (the YAML-dumped) webforms purge settings to "none", because if we have purge_days = NULL then purge must be "none", otherwise it's inconsistent. And as you said, the "safest" default setting about purging is to do no purging.

The #states support the notion ![operator] to negate it. I've redone it through 'invisible' and 'optional' - you see, I didn't remember about those 2 available states, so did it the other way around.

jrockowitz’s picture

@bucefal91 I am super happy.... you just implemented an awesome feature and all I had to do was nit-pick.

This was super easy to manually test using the below steps...

  • Execute drush webform-generate contact and generate 50 submissions
  • Open the 'Contact' form and confirm that there are 50 submission. (admin/structure/webform/manage/contact/results)
  • Enable submission purging after 7 days for the 'Contact' form. (admin/structure/webform/manage/contact/settings)
  • Execute drush cron.
  • Confirm that the old submissions were removed from the 'Contact' form. (admin/structure/webform/manage/contact/results)

Three minor changes I made....

  • Removed WebformEntitySettingsForm::validateForm because none of the other #states conditional settings are currently being validated. Just one less thing to support for the time being.
  • Changed WebformPurgeTest to WebformSubmissionStorageTest to encourage more tests to be written.
  • Made setting.purge condition use explicit purge constants. This guarantees that someone manually tweaking webform config won't accidental trigger submission purging.

As long as my changes pass, I will commit this ASAP. If I broke your patch... I will fix it.

  • jrockowitz committed c9d0411 on 8.x-5.x authored by bucefal91
    Issue #2843400 by bucefal91, jrockowitz: Automatic purging of webform...
jrockowitz’s picture

Status: Needs review » Fixed

  • jrockowitz committed 4c600cc on 8.x-5.x authored by bucefal91
    Issue #2843400 by bucefal91, jrockowitz: Automatic purging of webform...
  • jrockowitz committed df74cfe on 8.x-5.x authored by bucefal91
    Issue #2843400 by bucefal91, jrockowitz: Automatic purging of webform...
bucefal91’s picture

Nice :) I am happy too - now I can use the dev snapshot instead of patching webforms in the project I am developing :) The current dev has no big known issues, right? Basically I can choose current 8.x-5.x-dev instead of 8.x-5.0-beta4 + this patch?

jrockowitz’s picture

@bucefal91 Dev is relatively stable for Beta software. If you are using the Webform module on a production website, you should probably just use the patch, if it can be applied to beta4.

I am aiming for tagging releases at the beginning for every month.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

ressa’s picture

Awesome feature, just what I was looking for! It can now be found at admin/structure/webform/manage/contact/settings/submissions.

ElusiveMind’s picture

We were getting more web forms than could be purged on a cron, leaving many forms un-housekept. I could not find a way to increase the cron size in the settings forms, so I could add this to settings.php to make it process more. Added in case anyone stumbles on this to find this answer:

$config['webform.settings']['purge']['cron_size'] = 5000;

5000 can be the upper limit or you can change it according to your site.