In #2931611: Honeypot key value entries never expire (effectively), a bug that allowed an unbounded number of honeypot_time_restriction entries to be written was fixed. However, any Drupal 8 site that ran Honeypot prior to the version with that fix likely has hundreds, if not thousands or more, rows in the database that are set to never expire.

An update hook should be written that deletes all entries without an expiration set. The update hook could take a long time to run on sites with a huge number of entries, so it may need to process them using the Batch API.
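
For illustration, a minimal sketch of what such a batched update hook might look like, using the update $sandbox and a select-keys-then-delete approach (the hook number, chunk size, and the 2147483647 "never expires" sentinel check are placeholders/assumptions, not committed code):

/**
 * Deletes stale honeypot_time_restriction entries in batches.
 */
function honeypot_update_8101(&$sandbox) {
  $connection = \Drupal::database();

  // On the first pass, count how many stale rows there are.
  if (!isset($sandbox['total'])) {
    $sandbox['total'] = (int) $connection->select('key_value_expire', 'kve')
      ->condition('collection', 'honeypot_time_restriction')
      ->condition('expire', 2147483647)
      ->countQuery()
      ->execute()
      ->fetchField();
    $sandbox['deleted'] = 0;
  }

  // Fetch one chunk of primary keys, since DELETE queries cannot be ranged.
  $names = $connection->select('key_value_expire', 'kve')
    ->fields('kve', ['name'])
    ->condition('collection', 'honeypot_time_restriction')
    ->condition('expire', 2147483647)
    ->range(0, 1000)
    ->execute()
    ->fetchCol();

  if ($names) {
    // Delete the whole chunk with a single query, keyed by name.
    $connection->delete('key_value_expire')
      ->condition('collection', 'honeypot_time_restriction')
      ->condition('name', $names, 'IN')
      ->execute();
    $sandbox['deleted'] += count($names);
  }

  // Finished once no more stale rows are found.
  $sandbox['#finished'] = empty($names) ? 1 : min(0.99, $sandbox['deleted'] / max(1, $sandbox['total']));
}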

Issue fork honeypot-2997609


Comments

geerlingguy created an issue. See original summary.

C-Logemann’s picture

We have a customer system where we already have more than 6.7 million honeypot entries, blowing up the DB by several GB. I will create a direct database command to delete them. But I agree that there should be a solution for all the unneeded honeypot data on all systems.

geerlingguy’s picture

@C_Logemann - Thanks for the information! Once you have that DB query, could you share it here in case someone else needs to do the same thing until this update hook is written?

C-Logemann’s picture

@geerlingguy Yes, I already planned to share my SQL snippet once it's tested.
What's your idea for the cleanup in the update hook? Maybe also a database call deleting all honeypot entries in key_value_expire with too long a time to expiration? This is something I can develop during the upcoming Drupal Global Contribution Weekend, January 25-27, 2019.

C-Logemann’s picture

All of the following SQL statements are wrapped in Drush 8 commands.

It's good to find out how many table rows you have before deciding what to do next. Maybe we can also analyze the data in the update hook and not delete everything at once if the count is too high (see below):

drush sql:query 'SELECT collection, COUNT(*) AS num FROM key_value_expire WHERE collection = "honeypot_time_restriction";'

Because all "expire" values are identical in my case I removed an additional condition to filter time based. But if you like to operate more granular you can calculate a unix timestamp and use it in this way:

drush sql:query 'SELECT collection, COUNT(*) AS num FROM key_value_expire WHERE collection = "honeypot_time_restriction" AND expire > "1147483647";'

Before you decide which deletion strategy to use, it's better to test it first, because deletion can take a long time, and this may affect the normal usage of your system.
In our case the following command ran for some minutes, so I decided not to use it on the production system, but it was helpful on the dev system for analyzing the data. Because of this performance impact, I think we need to warn people if we put this in an update hook. It should only be run with drush, without a PHP time limit, and decoupled from the SSH connection (screen, tmux, byobu ...):

drush sql:query 'DELETE FROM key_value_expire WHERE collection = "honeypot_time_restriction";'

After analyzing the data that was not deleted by the command above, I decided to truncate the whole table, because that ran much faster and freed disk space very quickly.
(Edit:) Maybe take a look at this post on "Drupal Answers": https://drupal.stackexchange.com/questions/221985/what-is-the-purpose-of....

But this is nothing for an update hook, and it should only be used if you can afford to lose the data in key_value_expire that is not from the Honeypot module:

drush sql:query 'TRUNCATE TABLE key_value_expire;'

Anybody’s picture

Whoa, key_value_expire grew to 2.3 GB with honeypot entries on a customer site... So this is still quite important. ;)

Manuel Garcia’s picture

Status: Active » Needs review
FileSize
876 bytes

Had a go at this; it turns out we can't do it in batches, as far as I know - though let me know if you have ideas:

  • key_value_expire does not have any id-like columns
  • you can't use range/limit on delete queries

So here is the update hook deleting all entries with an expiry date older than what we set by default. I also removed an unused use statement while I was at it :)

Manuel Garcia’s picture

+++ b/honeypot.install
@@ -73,3 +73,14 @@ function honeypot_update_8100() {
+    ->where('collection = honeypot_time_restriction AND expire > ' . \Drupal::time()->getRequestTime() - 3600*24)

Actually, this should be a <, I believe.
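
For reference, the corrected condition written with the query builder and placeholders might look like this (a sketch only; note the quoted line also needs parentheses around the arithmetic, since the concatenation/subtraction mix is otherwise ambiguous in PHP):

// Delete entries whose expiry timestamp is more than a day in the past.
$cutoff = \Drupal::time()->getRequestTime() - 3600 * 24;
\Drupal::database()->delete('key_value_expire')
  ->condition('collection', 'honeypot_time_restriction')
  ->condition('expire', $cutoff, '<')
  ->execute();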

geerlingguy’s picture

Status: Needs review » Needs work

It looks like this adds 3 new coding standards issues... https://www.drupal.org/pift-ci-job/1265461

Manuel Garcia’s picture

Status: Needs work » Needs review
FileSize
743 bytes
869 bytes

Addressing #8 & #9

Manuel Garcia’s picture

Forgot one.

Anybody’s picture

Wonderful, @Manuel Garcia, RTBC+1 for #11. Works perfectly for me. Thanks!!

C-Logemann’s picture

Status: Needs review » Needs work

The current patch has no solution for situations where there are too many wrong entries in the database. There is a big risk that this update hook can fail in such situations, especially when triggered by update.php. So I strongly recommend adding a strategy for communicating with the site admin before starting a big cleanup. Maybe it's better to move the cleanup away from the update hook into the main code with a UI, e.g. a warning and a corresponding "cleanup link" on the status page.

Spokje’s picture

Status: Needs work » Needs review
FileSize
2.58 KB

Added a patch that uses the Batch API to delete the incorrect/never-expiring entries for honeypot.

If there are a _lot_ of those entries, this update will take a _long_ time, but it _shouldn't_ take down the site doing the cleanup.

I think the decision to move this to the UI as suggested in #13 could be better; we (with around 20 D8 sites) managed to get away with the batched hook_update_N approach as in the patch, since we didn't manage a D8 site with "mega enormous" amounts of the mentioned entries.

Spokje’s picture

Hiding previous patches, since they _will_ take a site down due to the DB becoming unresponsive while running the update.

Spokje’s picture

Note to self: first make sure you uploaded the correct patch, and then click 'Save'...

Also another reason to do this as a UI operation: since triggering a hook_update (at least through the command line with drush) automatically puts the site into Maintenance Mode, and this can be a _long_ update, the Maintenance Mode "down time" can be more than just annoying.

Spokje’s picture

Sorry, incorrect index line in previous patch

geerlingguy’s picture

The proper way to do this cleanup is via an update hook, though. If we don't automate it and instead leave it to the site owner, in 90%+ of cases it will never get done... I haven't had issues with the Batch API in a well-tested update hook on many different sites, often operating on hundreds of thousands or millions of database rows.

Maybe I'm missing something major, but what is the main risk with this being in an update hook?

Since triggering a hook_update (at least through the command line with drush) puts the site automatically in Maintenance Mode, and this can be a _long_ update, the Maintenance Mode "down time" can be more than just annoying.

Is that true? I don't believe running updates via drush up or even via Drupal's update.php ever puts the site into maintenance mode, unless you explicitly do that _before_ the updates...

Spokje’s picture

@geerlingguy

If we don't automate it or leave it to the site owner to do it, in 90%+ of cases it will never get done

QFT

I haven't had issues with batch API on a well-tested update hook on many different sites, often operating on hundreds of thousands or millions of database rows.

Maybe I'm missing something major, but what is the main risk with this being in an update hook?

Same here. I just ran the update with the patch in #18 to wipe out around 525,000 rows; it took around 10 minutes, and the site behaved normally the whole time.

Is that true? I don't believe running updates via drush up or even via Drupal's update.php ever puts the site into maintenance mode, unless you explicitly do that _before_ the updates...

That would be the only caveat I can see.

Also, I never noticed this before, but this hook_update_N took much longer than the "normal" ones; the reason for not noticing before could be that "normal" ones finish before my usual browser refresh to check on the site is done.

I couldn't remember seeing a CR for this so I did some research:

Turns out when using the Batch API in a hook_update_N and using drush, Maintenance Mode is triggered: https://drupal.stackexchange.com/a/270795

So basically I would say: hook_update_N should be the way to go, and batching seems the only way to keep the database from choking whilst doing the purge, so my patch in #18 seems, at the very least IMHO, the correct way of tackling the purge.

Manuel Garcia’s picture

Glad to see there's a way to batch this; nice work figuring that out.

The proposed batch update is going to be very slow, since it does a delete query per row. This would be solved by using LIMIT on the delete query per batch, but unfortunately, since Postgres does not support this feature, neither does Drupal.

Spokje’s picture

@Manuel Garcia

The proposed batch update is going to be very slow since it does a delete query per row. This would be solved by using LIMIT on the delete query per batch, but unfortunately since Postgres does not support his feature, neither does Drupal.

I agree on the slowness, but like you, I found there's no LIMIT available on a DELETE query in Drupal.
I sadly think this is the fastest way possible with the tools available, but I would love to be proven wrong.

gambry’s picture

I'm not sure - but I presume - that a DELETE WHERE name IN ([..]) query is faster than looping and deleting 500 single rows per batch?
i.e.

$database = \Drupal::database();

// Collect the primary keys (name) of one batch of stale entries.
$query = $database->select('key_value_expire', 'kve');
$query->fields('kve', ['name']);
$query->condition('collection', 'honeypot_time_restriction');
$query->condition('expire', 2147483647);
$query->range(0, $deletable_honeypot_time_restriction_entries_per_batch);
$result = $query->execute();

// Delete the whole batch with a single query, keyed by name.
$database->delete('key_value_expire')
  ->condition('collection', 'honeypot_time_restriction')
  ->condition('expire', 2147483647) // Not needed.
  ->condition('name', $result->fetchCol(), 'IN')
  ->execute();

It runs in ms on mine, so the limit can probably be bumped up to 1k, and I presume even higher.

UPDATE: to optimise and speed up the delete query, we can at this point also remove the ->condition('expire', 2147483647) bit, as the condition has already run above and the primary key of the table is collection + name.

jwilson3’s picture

+++ b/honeypot.install
@@ -6,7 +6,6 @@
-use Drupal\Core\Link;

@@ -48,7 +47,7 @@ function honeypot_schema() {
-    \Drupal::messenger()->addMessage(t(
+    drupal_set_message(t(

These unrelated lines should be removed from this patch.

See https://www.drupal.org/node/2774931

jwilson3’s picture

Status: Needs review » Needs work

marthinal’s picture

Status: Needs work » Needs review
FileSize
2.04 KB

Rerolled

Manuel Garcia’s picture

Thanks for the reroll @marthinal

I had a thought about what @gambry said in #22, and to me it makes a lot of sense, so I went ahead and updated the patch to use that approach. I've not actually tested this, so please kindly do so on your environments and report back :)

C-Logemann’s picture

Triggering maintenance mode is not a caveat for me, because it's recommended to use maintenance mode when running updates (e.g. as mentioned in /core/UPDATE.txt). Especially on sites where user data is collected, a backup (taken after activating maintenance mode) is highly recommended, so admins can switch back when an update fails. I just had this situation some days ago.

geerlingguy’s picture

I'm willing to merge this into the next release if I can get someone to review it with a live site's backup and make sure it doesn't cause any weird problems on running it.

aleksip’s picture

Status: Needs review » Reviewed & tested by the community

I tested the latest patch (#26) on a real production site (Drupal 8.8.1, Honeypot 1.30). The size of the key_value_expire table was 1,101 MB, and it contained 4,059,873 honeypot_time_restriction entries that needed to be deleted.

The update did put the site in maintenance mode. It took 3 hours 46 minutes on a relatively modest DigitalOcean Droplet setup and finished successfully. The size of key_value_expire went down to 5.55 MB.

The patch looks fine to me and seems to work as intended so marking as RTBC.

Manuel Garcia’s picture

Thanks @aleksip for reporting on that, great to hear it worked fine for you.

+++ b/honeypot.install
@@ -73,3 +73,52 @@ function honeypot_update_8100() {
+  $deletable_honeypot_time_restriction_entries_per_batch = 500;

(re #29) I think 3 h 46 min is a very long time for an update hook to run. I'm guessing we could do much better here, as @gambry mentioned in #22. We may want to bump this to 1000 or even higher.

Any chance you still have that DB from before running the upgrade, @aleksip? It would be great to test whether it runs fine with 1000 or 2000 etc.

geerlingguy’s picture

Yikes; deleting 1 GB of db records shouldn't take 3 hours. I agree a larger batch size may be appropriate: at least 1000, maybe even 5000? It's a matter of balancing RAM usage against time per batch.

gambry’s picture

If it took 226 min with 500 deletions per batch, increasing this to 5000 (x10) should reduce the time to about 22 min, which is much shorter, but still an eternity for some websites to stay in maintenance mode... Have we investigated the possibility of doing this cleanup in a cron job?

Another option could be to make $deletable_honeypot_time_restriction_entries_per_batch configurable, e.g. using the Settings or State API. We could set a default of 1000 (about 2 hours) or 5000 (about 22 min), and site managers could test the batch on staging websites and decide on a better value for them. I presume websites that cannot accept 22 min of downtime have servers that should easily support 7.5k - 10k entries per batch.
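
If we went the Settings API route, the read side could be as simple as this (a sketch; the setting name is a placeholder):

use Drupal\Core\Site\Settings;

// Falls back to 1000 when settings.php does not override it.
$batch_size = Settings::get('honeypot_cleanup_batch_size', 1000);

Site managers could then put e.g. $settings['honeypot_cleanup_batch_size'] = 10000; into settings.php after testing on staging.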

geerlingguy’s picture

@gambry - We could also, in the release notes for the next version of the module that has this fix, add a warning like "if you would like to avoid having your site in maintenance mode for a while, you can manually remove all the entries using this process: (link here)." That might sidestep the issue entirely.

gambry’s picture

@geerlingguy that is a nice option we could do anyway. But I'm wondering how many site managers follow the best practice of reading release notes before upgrading a module? 🤔

Also, an additional option - one that could sidestep the issue entirely AND is more likely to be seen by site managers - is to add a hook_requirements() check that queries the key_value_expire table and, if any "bad" record is found, raises a warning or error and links to a documentation page, or directly shows the query to clean up the table.
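
For illustration, a minimal sketch of such a check (names, wording, and severity are placeholders, and it assumes the key_value_expire table exists):

/**
 * Implements hook_requirements().
 */
function honeypot_requirements($phase) {
  $requirements = [];
  if ($phase === 'runtime') {
    // Count the "never expiring" entries left behind by the old bug.
    $count = \Drupal::database()->select('key_value_expire', 'kve')
      ->condition('collection', 'honeypot_time_restriction')
      ->condition('expire', 2147483647)
      ->countQuery()
      ->execute()
      ->fetchField();
    if ($count > 0) {
      $requirements['honeypot_stale_records'] = [
        'title' => t('Honeypot'),
        'value' => t('@count stale time restriction entries found.', ['@count' => $count]),
        'description' => t('Clean up the key_value_expire table to reclaim disk space.'),
        'severity' => REQUIREMENT_WARNING,
      ];
    }
  }
  return $requirements;
}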

Then the work done in here can still be added:
a. as an API/helper for anyone to extend and use
b. as a custom batch form, for non-technical site managers
c. as a cron job task that site managers can opt in to or out of.

Thoughts?

Manuel Garcia’s picture

I think we should go MVP on this and first write the update hook to clean up people's sites before they run into (more) trouble.

We can always do follow ups if people are interested in the ideas on #34.

jonathanshaw’s picture

The point of #34 is to not use an update hook at all, but to flag the problem and make the solution opt-in. It seems like a good direction to me. People won't read the release notes; they will see their site is offline for some time, they will try to restart their server partway through the update, and the issue queues here will get flooded with various kinds of weirdness...

Hook_requirements and a custom batch form might be safer.

geerlingguy’s picture

@gambry - I think I like your suggestion, specifically:

as an API/helper for anyone to extend and use

We could just put the functionality into a helper function and say:

Run the following drush command to clean up the table: drush ev "honeypot_clean_up_old_entries();"

(Not that exact command/name, but basically if someone can just run one drush command (or one database query), it would be sufficient.)
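
A hypothetical helper along those lines (the name and chunk size are placeholders, per the caveat above):

/**
 * Deletes never-expiring honeypot_time_restriction entries in chunks.
 *
 * Hypothetical; run via: drush ev "honeypot_clean_up_old_entries();"
 */
function honeypot_clean_up_old_entries() {
  $connection = \Drupal::database();
  do {
    // Fetch a chunk of primary keys, since DELETE cannot be ranged.
    $names = $connection->select('key_value_expire', 'kve')
      ->fields('kve', ['name'])
      ->condition('collection', 'honeypot_time_restriction')
      ->condition('expire', 2147483647)
      ->range(0, 10000)
      ->execute()
      ->fetchCol();
    if ($names) {
      $connection->delete('key_value_expire')
        ->condition('collection', 'honeypot_time_restriction')
        ->condition('name', $names, 'IN')
        ->execute();
    }
  } while ($names);
}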

PapaGrande’s picture

Status: Reviewed & tested by the community » Needs work

We could just put the functionality into a helper function and say:

Run the following drush command to clean up the table: drush ev "honeypot_clean_up_old_entries();"

I agree with this approach and with pulling the code out of an update hook.

As a benchmark, I changed $deletable_honeypot_time_restriction_entries_per_batch to 10000 and ran drush updatedb on a Pantheon Multidev server. 7,584,423 rows were removed in 6 minutes 53 seconds.

If I had deployed the update unawares and a production site was in maintenance mode for almost 7 minutes, I'd have probably had an aneurysm.

pfrenssen’s picture

Assigned: Unassigned » pfrenssen

I'll do an iteration on this. I'll start by implementing the hook_requirements() entry.

pfrenssen’s picture

Version: 8.x-1.x-dev » 2.0.x-dev

Switching to the latest version so I can make an issue fork.

pfrenssen’s picture

Status: Needs work » Needs review

Created a batch process that can be run in the UI through a link in a hook_requirements() warning on the status report. I did several tests, and indeed, the higher the number of items deleted in a single batch, the faster the process goes. I settled on 50000 records per batch, which roughly matched the refresh time of the batch UI on my machine; if I set it any lower, it processes multiple batches per page refresh. RAM usage seemed OK; we are retrieving 50000 integer values from the database after all, and these don't take up much space.
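
The wiring for such a batch boils down to something like this (a sketch; the function names are placeholders and the actual merge request may differ in details):

// Assemble and trigger the batch from the confirmation form submit.
$batch = [
  'title' => t('Deleting stale Honeypot records'),
  'operations' => [
    ['honeypot_delete_expired_records_operation', []],
  ],
  'finished' => 'honeypot_delete_expired_records_finished',
];
batch_set($batch);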

Next step is to create a drush command to trigger the batch process.

pfrenssen’s picture

Title: Write update hook to clean up old honeypot_time_restriction values » 2997609-delete-expired-records
FileSize
25.99 KB
36.35 KB
18.21 KB

Here's how it looks in the UI:

Status report:
Screenshot of the status report

Confirmation form:
Screenshot of the confirmation form

Batch operation:
Screenshot of the batch operation

pfrenssen’s picture

Assigned: pfrenssen » Unassigned

Added a drush command that triggers the batch cleanup; it goes something like this:

$ ./vendor/bin/drush honeypot:delete-expired-records
>  [notice] Deleted 50000 of 2068589 records (2.41711% complete)
>  [notice] Deleted 100000 of 2068589 records (4.83421% complete)
>  [notice] Deleted 150000 of 2068589 records (7.25132% complete)
>  [notice] Deleted 200000 of 2068589 records (9.66843% complete)
>  [notice] Deleted 250000 of 2068589 records (12.08553% complete)
>  [notice] Deleted 300000 of 2068589 records (14.50264% complete)
>  [notice] Deleted 350000 of 2068589 records (16.91975% complete)
>  [notice] Deleted 400000 of 2068589 records (19.33685% complete)
>  [notice] Deleted 450000 of 2068589 records (21.75396% complete)
>  [notice] Deleted 500000 of 2068589 records (24.17107% complete)
>  [notice] Deleted 550000 of 2068589 records (26.58817% complete)
>  [notice] Deleted 600000 of 2068589 records (29.00528% complete)
>  [notice] Deleted 650000 of 2068589 records (31.42239% complete)
>  [notice] Deleted 700000 of 2068589 records (33.83949% complete)
>  [notice] Deleted 750000 of 2068589 records (36.25660% complete)
>  [notice] Deleted 800000 of 2068589 records (38.67370% complete)
>  [notice] Deleted 850000 of 2068589 records (41.09081% complete)
>  [notice] Deleted 900000 of 2068589 records (43.50792% complete)
>  [notice] Deleted 950000 of 2068589 records (45.92502% complete)
>  [notice] Deleted 1000000 of 2068589 records (48.34213% complete)
>  [notice] Deleted 1050000 of 2068589 records (50.75924% complete)
>  [notice] Deleted 1100000 of 2068589 records (53.17634% complete)
>  [notice] Deleted 1150000 of 2068589 records (55.59345% complete)
>  [notice] Deleted 1200000 of 2068589 records (58.01056% complete)
>  [notice] Deleted 1250000 of 2068589 records (60.42766% complete)
>  [notice] Deleted 1300000 of 2068589 records (62.84477% complete)
>  [notice] Deleted 1350000 of 2068589 records (65.26188% complete)
>  [notice] Deleted 1400000 of 2068589 records (67.67898% complete)
>  [notice] Deleted 1450000 of 2068589 records (70.09609% complete)
>  [notice] Deleted 1500000 of 2068589 records (72.51320% complete)
>  [notice] Deleted 1550000 of 2068589 records (74.93030% complete)
>  [notice] Deleted 1600000 of 2068589 records (77.34741% complete)
>  [notice] Deleted 1650000 of 2068589 records (79.76452% complete)
>  [notice] Deleted 1700000 of 2068589 records (82.18162% complete)
>  [notice] Deleted 1750000 of 2068589 records (84.59873% complete)
>  [notice] Deleted 1800000 of 2068589 records (87.01584% complete)
>  [notice] Deleted 1850000 of 2068589 records (89.43294% complete)
>  [notice] Deleted 1900000 of 2068589 records (91.85005% complete)
>  [notice] Deleted 1950000 of 2068589 records (94.26716% complete)
>  [notice] Deleted 2000000 of 2068589 records (96.68426% complete)
>  [notice] Deleted 2050000 of 2068589 records (99.10137% complete)
>  [notice] Deleted 2068589 of 2068589 records (100% complete)
>  [notice] Message: Successfully deleted 2068589 expired records.

pfrenssen’s picture

Title: 2997609-delete-expired-records » Provide ways to clean up old honeypot_time_restriction values
idimopoulos’s picture

Status: Needs review » Reviewed & tested by the community

To get the easy part out of the way: the code looks fine. I tested the UI part and it works fine. The status report reports fine.
I must say I don't really enjoy this approach. We have some stale data, and we are creating a form to delete it. This form is a one-time form which we will have to ditch later, as it only works on records with the expiration time set to 2147483647. But then again, we don't know when to remove it, because we cannot know when everyone will have removed their stale data.

However, given
A) the comments starting from #34,
B) that I tried to manually expire the records on our side - it seems that system_cron is what takes care of expired records, but not in batches; system_cron will attempt to delete them all at once. Since key_value_expire is a standalone table (not a fieldable entity or anything complex), it is quite fast, but not fast enough nonetheless. It took me 331 seconds for < 3M records to be deleted (I also had XDEBUG enabled), and
C) that creating a separate cron run would more or less do the same thing - it would work on stale data until no more stale data exists, and then would be called forever without purpose,

I am going to give the approach a green.

joseph.olstad’s picture

we have 225 megabytes in this table .... just noticed!

***EDIT*** our entries were actually from access_unpublished

joseph.olstad’s picture

Status: Reviewed & tested by the community » Needs work

Patch #26 is OK, but the latest merge request doesn't provide a drush command for Drush 10.x.

It does not work with Drush 10.x, so I'm going with patch #26 instead as a temporary workaround.

joseph.olstad’s picture

Status: Needs work » Needs review

OOPS,

honeypot was not enabled; that's probably why I couldn't find the drush command.
Need to review.

gambry’s picture

Status: Needs review » Reviewed & tested by the community

I think this was RTBC before your comment @joseph.olstad?
Move it back if I'm missing something.

joseph.olstad’s picture

Yes, the drush command seems to work fine, thanks. My problem was actually access_unpublished expired keys, not honeypot.

loopy1492’s picture

@joseph.olstad I think I'm having the same issue. Did you just use the same command, except with 'access_unpublished' instead of 'honeypot'?

Anybody’s picture

Is there any active maintainer? Would anyone like to contact the maintainer and tell them about this issue, or offer co-maintainership?

TR’s picture

@Anybody: WTF is that supposed to mean?

This may be RTBC, but it's a terrible solution, which is acknowledged by the person who RTBC'd it above.

It's great to have a patch here for those who need to use it for a one-time cleanup, but IMO it's not something that should be added to Honeypot permanently.

Anybody’s picture

Status: Reviewed & tested by the community » Needs work

@TR: No reason for strong language. After reading my comment again, I totally understand that it was too short and should have pointed out why I thought it would make sense to have a maintainer look at this, as it's RTBC. But that still doesn't justify your unfriendly reply.

I read the comments, had a look at the status, and had the impression it would be useful to have a maintainer look at this. And it seemed to me that there had been no maintainer activity in this issue for a long period of time.

So let's set this back to NW, as it should be per your comment. Anyway, thanks for your reply... but please let's stay friendly.

To be honest, it was just meant to ask for help with maintainership, but as I see now, you probably read it as criticism.

claudiu.cristea’s picture

If I run drush core:requirements immediately after a clean site install, I get:

  SQLSTATE[42S02]: Base table or view not found: 1146 Table '9ToAslWwz5.key_value_expire' doesn't exist: SELECT COUNT(*) AS "expression"
  FROM
  (SELECT 1 AS "expression"
  FROM
  "key_value_expire" "kve"
  WHERE ("collection" = :db_condition_placeholder_0) AND ("expire" = :db_condition_placeholder_1)) "subquery"; Array
  (
      [:db_condition_placeholder_0] => honeypot_time_restriction
      [:db_condition_placeholder_1] => 2147483647
  )

In StatementWrapper.php line 145:

  SQLSTATE[42S02]: Base table or view not found: 1146 Table '9ToAslWwz5.key_value_expire' doesn't exist

The key_value_expire table is created on demand. This explains the failure.
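
The fix is presumably to guard the count query, e.g.:

// Skip the check when the table has not been created on demand yet.
if (!\Drupal::database()->schema()->tableExists('key_value_expire')) {
  return $requirements;
}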

claudiu.cristea’s picture

Status: Needs work » Needs review
FileSize
993 bytes

Trying the "test only" changes again as a patch.

TR’s picture

Version: 2.0.x-dev » 2.1.x-dev