Problem/Motivation
All use of md5 was removed for Drupal core since it's considered deprecated and should never be used in new applicaiotn code. It's more collision-prone, not any faster, and leads to hassle for Drupal implementors when lazy security audits flag it.
Proposed resolution
Change the hash function to a sha2 family - best choice may be a truncated sha512, since that hash is actually faster than sha256 on 64 bit cpus.
Add an update hook to change the DB schema, if needed, and to re-create a hash for existing queue items.
Remaining tasks
patch and updated tests
User interface changes
n/a
API changes
n/a
Data model changes
n/a
Comments
Comment #2
pwolanin commentedHere's the basics of a patch to fix this.
Comment #3
andypostGreat job!
would be great to inject serializer here to allow igbinary
Comment #4
andypostComment #5
e0ipsoThanks for this. My only additional feedback is that given we are renaming the db column here, should we rename it to something generic like
hashedto save ourselves some trouble in the future if we want to change algorithms again (or leave the algorithm up to the application)?Comment #6
pwolanin commentedYes, good point about the column name - I had that thought too. Possibly the value saved in the hash column should have a prefix indicating the hash used? That would allow a more seamless transition in the future to a different hash algo?
The problem with injecting the serialization service is that the \Drupal\Core\Queue\DatabaseQueue::claimItem() method would then also need to be reimplemented. I think that should be postponed to another issue and/or proposed for the core queue class?
Comment #7
pwolanin commentedHere's a quick update to make the column and method naming more generic. If you wanted to change to e.g sha3, it could be done with no update hook if you are willing to accept possible duplicate items in the queue from before the module update.
Would probably be helpful here to figure out a test of the update hook, or at least I'll want to verify it.
Comment #8
e0ipsoSome additional feedback. I am happy to implement this if you feel this is taking too long.
Instead of asking for a hash algorithm identifier to be of the same length we could make the database column size of fixed length (keeping 32 should be fine). After that we can take
32 - strlen($algo_name)characters from the base64. In the end the composed hash will always be 32 chars. This will help avoid potential PDO exceptions in the future caused by invalid lengths after improperly updating the hash algorithm identifier constant.Comment #9
pwolanin commented@e0ipso - sure, but as a general standard you want about 256 bits of the hash for really robust collision resistance, so 32 chars is too short. Thus, I'd keep the column length at 47 or 48 chars (so about 43 chars of 6 bit hash).
Comment #10
pwolanin commentedQuick update
Comment #11
pwolanin commentedPartial test of update hook - needs more to check conversion of existing items.
Comment #12
e0ipsoThis is fair. My usage of the module has been on not very busy queues, so I never really worried about collissions. Other usages may differ, so I take your point.
I am amazed about the dedication you have shown so far.
Comment #13
pwolanin commentedHere are improvements to the 1st test method. It had one actual bug fixed. Still need to wrap the new test method up.
The fixed bug in the test was this line at the end:
It's passing in
$item_idinstead of$dataComment #14
pwolanin commentedOk, I think this is complete enough in terms of test coverage.
Comment #15
rstaylor commented#14 looks good to me. I've tried feeding it sample data including duplicates and it is working well for me.
Comment #16
pwolanin commentedtrivial change - removes unused use statement from the test class.
Updated patch and interdiff.
Comment #18
e0ipsoThanks a lot! This was committed and released.
Comment #19
pwolanin commentedexcellent, thanks!