Currently when an entity has never been published, it has a publication date set to Tuesday, January 19, 2038 at 03:14:07 UTC, which is the value of the PUBLICATION_DATE_DEFAULT
constant. According to the documentation the reason for this is to make sure that unpublished nodes appear at the top of lists when sorting by newest date:
/**
* Define the value stored in the database when a node is unpublished and no
* publication date has been set. We use the largest number that the database
* field can hold so unpublished nodes will appear newer than published nodes
* when sorted by publication date.
*
* @note: This is going to trigger the Year 2038 problem.
*/
define('PUBLICATION_DATE_DEFAULT', 2147483647);
I think this is not a good approach, and the sorting of unpublished content to the top is a niche use case that can be solved in other ways which do not warrant to violate basic expectations of the data model.
Because of this approach the following features are not working as expected:
-
Checking if the field is empty using
FieldItemInterface::isEmpty()
. If I would call$node->get('published_at')->isEmpty()
for a node that has never been published, I should getTRUE
, but instead I now getFALSE
because there is a fake value present. -
We can not use conditions in entity queries in the normal way. If I want to get recently published content I cannot do
$query->condition('publish_at', strtotime('1 week ago'), '>')->execute()
since this will unexpectedly include unpublished content. - When a never before published node is rendered it is expected that no publication date is shown. This doesn't work, instead the date Tuesday, January 19, 2038 is shown. This is shown both in the frontend, and in administration interfaces.
- Sorting of nodes by publication date is broken. Unpublished items are shown always at the top, pushing all recently published content away. If the user orders by "publication date descending" it means that they want to see the most recently published nodes at the top, not several pages of random unpublished nodes. In large projects it is not uncommon to have thousands of unpublished nodes in the database.
Comment | File | Size | Author |
---|---|---|---|
#13 | 3066446-13-8.x-2.0-beta2.patch | 7.78 KB | szeidler |
| |||
#12 | interdiff-3066446-8-12.txt | 630 bytes | markdorison |
#12 | 3066446-12.patch | 8.54 KB | markdorison |
| |||
#8 | 3066446-8.patch | 8.56 KB | webflo |
|
Issue fork publication_date-3066446
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #2
pfrenssenComment #3
idimopoulos CreditAttribution: idimopoulos at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commented@pfrenssen I see you also have https://www.drupal.org/project/publication_date/issues/3066317, which computes some extra values for this field.
You should note that if the field is set to empty, the
$entity->published_at->published_or_now
fails with an exception of requesting a property on null.This is because the
$entity->published_at
retrieves the field item list and then attempts to access the first element from the list of values. The first element is null so it tried to fetch a null value.It is the same using the
::get()
and the same with the magic method__get()
.This kinda removes the possibility of using the computed properties when the publication date is not set so there is no point having them if we don't fix it.
Comment #4
idimopoulos CreditAttribution: idimopoulos at Randstad Digital for European Commission and European Union Institutions, Agencies and Bodies commentedPlease, disregard my previous comment. A default value is already being set. The problem I had lies elsewhere, I will update the proper ticket.
Comment #5
claudiu.cristeaFully agree. This
PUBLICATION_DATE_DEFAULT
inserts crap in the database. The "sorting issue" should be fixed by the code that is building the listing. However, this module could provide a special sort handler to cover this case for Views. This is a critical as perverts the DB content.Comment #6
claudiu.cristeaThis is already fixed in D7, see #3074177: Replace unpublished content sorting hack with real Views handlers.
Comment #7
szeidler CreditAttribution: szeidler at Ramsalt Lab commentedI second the issue. When using the publication_date token in Pathauto it will create to undesired aliases in the database. When using the redirect module on top of it it will create a redirect.
Comment #8
webflo CreditAttribution: webflo at UEBERBIT GmbH commentedI am on board with removing PUBLICATION_DATE_DEFAULT.
Comment #9
jstollerPUBLICATION_DATE_DEFAULT was an old D7 hack. The current D7 version of Publication Date sets four node properties:
published_at
,published_at_or_now
,published_at_or_changed
andpublished_at_or_created
. When the publication date is empty, those last three default to the current date, last changed date and created date, respectively. The views handlers for sorting and filtering then let you choose how you want empty publication dates to be handled: 'NULL', 'Current time', 'Content created date', or 'Content last modified date'. I'd suggest adding something similar to the D8/D9 version.Comment #10
donquixote CreditAttribution: donquixote commentedWe also need to fix publication_date_tokens() in the D8 version to allow for NULL values.
This was already fixed in D7 in #3081923: Published token should support NULL value and fallback dates.
I think for D8 we should do a more minimal fix, without introducing the additional properties.
Comment #11
jonathanshawI don't see why we need this isPublished() check in isEmpty(). It seems like preSave() adds a value if isPublished(), so parent::isEmpty() will never be true if isPublished() is true?
Comment #12
markdorisonPer #11, is there a scenario where we need to override
isEmpty()
? Removing the override would solve the following item from the issue summary:Comment #13
szeidler CreditAttribution: szeidler at Ramsalt Lab commentedIn case someone needs an applying patch for 8.x-2.0-beta2. Based on #12.
I'm wondering: Do we need a update hook, that NULLs the existing 2038 data?
Comment #16
star-szrThanks all for working on this!
#8 works well (and still seems to apply), for #12/#13 please pay attention to the test failures - you can see that code is being added back in the merge request and tests are passing again. From my manual testing without that code the module doesn't really work, it doesn't fill in the published on value.
As for the merge request, it seems to introduce a lot of unrelated code style/linting type changes so it may be hard to get that in. I'm not a maintainer of this project, but I do have a lot of experience contributing to other projects (including core). I would recommend scaling it back to the task at hand, especially since this is marked as critical. Thanks!
Also, I'm going to be bold and mark this as a bug since it really does cause a lot of unexpected behaviour and gotchas.
Comment #18
webflo CreditAttribution: webflo at UEBERBIT GmbH commentedRebased the MR from @saidatom - new MR: https://git.drupalcode.org/project/publication_date/-/merge_requests/8
Comment #19
saidatomComment #20
bvoynickI noticed this MR adds Drupal 10 to the compatibility list for the module. Might be a little early to do that with D10 still in alpha?
Comment #21
bvoynickOne outstanding issue I just pushed a changes for is that a fatal error is very possible in the Timestamp display formatter if it gets a non-integer value such as NULL. This is easy to accomplish by viewing or previewing a never-published node set up to display the publish date.
Drupal Core generally seems to make the assumption that timestamps will never be empty, so this is a tricky one. The approach I pushed to the MR is to add an override on __get() for PublicationDateItem, to reroute ->value magic property to actually return the published_at_or_now property. This is a big change that will require integrating code to change to using isEmpty() to detect the lack of a date - but I think it's no less of a change than what this is already doing by switching from the 2038 value to NULL.
I think one remaining necessity is that there is no update hook to change all those 2038 values on existing sites.
Comment #22
webflo CreditAttribution: webflo at UEBERBIT GmbH commentedThanks for the patch. I worked on it today. The behaviour of ChangedItem and CreatedItem is very difficult. I think its better to use TimestampItem instead and put the necessary in the FieldItemList class.
https://git.drupalcode.org/issue/publication_date-3066446/-/tree/3066446...
Comment #25
webflo CreditAttribution: webflo at UEBERBIT GmbH commentedComment #27
osopolarWhat about the update hook? See also #13 and #21
I guess it would be fine to set the published_at to NULL where it currently is 2147483647: