By default no content types should be enabled for scheduled publishing and unpublishing. The node add/edit form should not have the input date fields unless the admin has enabled this for the specific content type. However, somewhere along the 7.x to 8.x conversion this check has been missed. We currently get the date input fields regardless of the content type setting, and cron processes the requested actions on any content type which had data in the fields.

I have added a new set of tests testNonEnabledNodeType() in #2594615: Automated testing in 8.x [meta] #55 and these currently fail as intended.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

jonathan1055 created an issue. See original summary.

joekers’s picture

It seems as though scheduler_entity_base_field_info() is adding the publish_on and unpublish_on fields to all node types regardless of the scheduler setting. I think we could add these fields when the scheduler settings are added to the content type instead -unless there's a better way?

jonathan1055’s picture

I don't think we can conditionally add the fields when a realtime action of enabling the content type for scheduler is performed. My suggestion is that instead of hook_entity_base_field_info we use hook_entity_bundle_field_info as this has knowledge of the 'bundle' which we can use to retrieve the enabled setting.

On the API page it says

Bundle fields either have to override an existing base field, or need to provide a field storage definition via hook_entity_field_storage_info() unless they are computed.

So one idea might be to leave the entity definition as it stands, but just turn it on or off in the bundle function. Or maybe that's making too much work?

Something else to be aware of: This hook will be changed in https://www.drupal.org/node/2346347

pfrenssen’s picture

This sounds familiar, I believe this is fixed by the patch in #2597233: The FALSE value for #group FAPI property does not have expected behaviour.

jonathan1055’s picture

Amazing timing! I am working on this right now and discovered that we cannot simply move the definition of the fields from hook_entity_base_field_info to hook_entity_bundle_field_info. When I tried that, the fields quite rightly did not show for non-enabled types, but for enabled types the data could be entered but was not saved.

Then I spotted that $form['publish_on']['#access'] was set in _scheduler_form_alter which is only called if one of the options was turned on. The fields are correctly hidden if at least one of the options is on, it is when both options are off that we get the default date entry fields showing up, in plain form just from TimestampDatetimeNoDefaultWidget.php with no custom descriptions, etc.

It would seem that the default #access value is TRUE. So I was just trying out setting $form['publish_on']['#access'] = FALSE in scheduler_form_node_form_alter for the case when we do not call _scheduler_form_alter. But I see from the changes in #2597233: The FALSE value for #group FAPI property does not have expected behaviour that if we call the helper form_alter for all cases then that might also solve the problem. I will test it both ways and let you know.

pfrenssen’s picture

Status: Active » Closed (duplicate)
Related issues: +#2597233: The FALSE value for #group FAPI property does not have expected behaviour

I just ran the test with the patch from #2597233: The FALSE value for #group FAPI property does not have expected behaviour and it indeed fixes the issue!

The test fails now at this point:

    // Attempt to create a node with a scheduled publishing date in the future.
    $body = $this->randomMachineName(30);
    $edit = [
      'title[0][value]' => $this->randomMachineName(10),
      'publish_on[0][value][date]' => \Drupal::service('date.formatter')->format(time() + 3600, 'custom', 'Y-m-d'),
      'publish_on[0][value][time]' => \Drupal::service('date.formatter')->format(time() + 3600, 'custom', 'H:i:s'),
      'unpublish_on[0][value][date]' => \Drupal::service('date.formatter')->format(time() + 7200, 'custom', 'Y-m-d'),
      'unpublish_on[0][value][time]' => \Drupal::service('date.formatter')->format(time() + 7200, 'custom', 'H:i:s'),
      'promote[value]' => 1,
      'body[0][value]' => $body,
    ];

    $this->drupalPostForm('node/add/' . $name, $edit, t('Save and publish'));

Since the 'publish on' and 'unpublish on' fields are not there, the form cannot even be posted, and the test fails with "Failed to set field publish_on[0][value][date] to 2016-01-05".

The part in the test leading up to this is certainly sufficient for testing the current problem, so I suggest to just remove the latter part of the test.

Marking this as a duplicate of #2597233: The FALSE value for #group FAPI property does not have expected behaviour.

pfrenssen’s picture

@jonathan1055 oops, cross-posted!

jonathan1055’s picture

Status: Closed (duplicate) » Needs review
FileSize
2.52 KB

The change in #2597233: The FALSE value for #group FAPI property does not have expected behaviour does fix the problem of making sure the date input fields are not accessible for non-enabled content types. However, as explained in the first paragraph of https://www.drupal.org/node/2594615#comment-10725014 there are still a couple of fixes required in the cron process to make sure we do not act on non-enabled nodes. I have tested these changes both manually and via automated test results at https://www.drupal.org/pift-ci-job/131554

The attached patch is for reviewing and commiting here. The tests will not pass on this run, as the commit to the test files has not been done yet.

Status: Needs review » Needs work

The last submitted patch, 8: 2625572-8-cron-ignore-non-enabled-content-type.patch, failed testing.

The last submitted patch, 8: 2625572-8-cron-ignore-non-enabled-content-type.patch, failed testing.

jonathan1055’s picture

Status: Needs work » Needs review

I have just re-tested, and the patch in #8 still applies and fixes the remaining problems. Now that #2597233: The FALSE value for #group FAPI property does not have expected behaviour has been fixed we get 42 passes and 0 fails, and the test class is all green

3		Scheduler.Drupal\scheduler\Tests\SchedulerNonEnabledTypeTest
✓		- setUp
✓		- testNonEnabledNodeType
✓		- checkNonEnabledTypes

I'd like someone else to check my patch, before I commit, as it is bad practice to make changes without another reviewing it.

jonathan1055’s picture

I have addressed the @todo relating to how to cope with a node where the relevant date has been removed or somehow the db table is out-of-sync with the entity cache. We need to do this to avoid a cron run crash because $node->set('changed', $publish_on); fails badly but silently if the date is null. In the patch in #8 I had substituted the current time, but I now think that this is wrong. We should simply skip processing the node if something unexpected has happened to the data. Should we add a logger message to say that the node has been skipped? This would assist the author/admin to discover what has gone wrong.

The tests will need adjustment, because currentlty they fail even with this modified patch. I will look at that next.

Status: Needs review » Needs work

The last submitted patch, 12: 2625572-12-cron-ignore-non-enabled-content-type.patch, failed testing.

  • jonathan1055 committed fe9b63e on 8.x-1.x
    Issue #2594615, #2625572: by jonathan1055: Amended tests for non-enabled...

The last submitted patch, 12: 2625572-12-cron-ignore-non-enabled-content-type.patch, failed testing.

jonathan1055’s picture

Having fixed the tests we now get a full set of passes in SchedulerNonEnabledTypeTest using the altered cron code in #12 which abandons processing when the date is missing.

Here is an updated patch which does the same processing as in #12 but writes a log message to explain what has been done.

Status: Needs review » Needs work

The last submitted patch, 16: 2625572-16-cron-ignore-non-enabled-content-type.patch, failed testing.

The last submitted patch, 16: 2625572-16-cron-ignore-non-enabled-content-type.patch, failed testing.

jonathan1055’s picture

Status: Needs work » Needs review

Good, we get 14 class passes and only 4 fails as expected. I have also tested each of the scenarios manually to check the dblog messages. All OK.

Setting back to 'needs review' to encourage others to review the patch before I commit.

pfrenssen’s picture

Assigned: Unassigned » pfrenssen
Issue tags: +SprintWeekend2016

Going to review this as part of the Sofia team of the global sprint weekend.

jonathan1055’s picture

OK. I have worked out how to get the proper field label. This was a @todo in the latest patch, can't upload a new patch right now, but will do later. It won't affect any review you do now though, that is worth doing, thank you.

jonathan1055’s picture

Here's an updated patch showing the type name and field name instead of the machine names. Also interdiff from 16 to 22 in case you have started looking at the patch in #16

Status: Needs review » Needs work

The last submitted patch, 22: 2625572-22-cron-ignore-non-enabled-content-type.patch, failed testing.

pfrenssen’s picture

Assigned: pfrenssen » Unassigned

This review is for the patch in #16. I am stopping for today, didn't have time yet to look at the updated patch.

  1. +++ b/scheduler.cron.inc
    @@ -35,8 +35,26 @@ function _scheduler_publish() {
    +  $logger = \Drupal::logger('scheduler');
    

    It's stuff like this that really presses the point that we need to provide a proper OO service for the Scheduler API, as proposed in #2651338: Create a service for the Scheduler API. Then we can use dependency injection for the logger.

  2. +++ b/scheduler.cron.inc
    @@ -35,8 +35,26 @@ function _scheduler_publish() {
    +    // The above query and api calls can return nodes of types which are not
    +    // enabled for this scheduler action. Do not process these.
    

    This is absolutely correct. I'm thinking that we can mitigate this a bit by adding a condition to the query to exclude all node types that do not have scheduling enabled. This is out of scope for this issue though, I created a followup: #2659824: Exclude node types that do not have scheduling enabled.

  3. +++ b/scheduler.cron.inc
    @@ -35,8 +35,26 @@ function _scheduler_publish() {
    +    if (!$node->type->entity->getThirdPartySetting('scheduler', $action . '_enable', FALSE)) {
    +      $logger->warning('"%title" not published because @type node type is not enabled for Scheduler', $logger_variables);
    +      continue;
    +    }
    

    Not 100% sure if it is a good idea to log this. If a node was once scheduled and then scheduling was disabled for the content type it would mean that this error would be logged on every cron run for all eternity.

    It would also take some tricky work for a site maintainer to fix this error: they would have to re-enable scheduling for the content type so that the publish on field appears again, fix the offending nodes, then turn off scheduling again.

  4. +++ b/scheduler.cron.inc
    @@ -46,6 +64,13 @@ function _scheduler_publish() {
    +    // Ensure we have a valid date to avoid a cron crash if third-party modules
    +    // have altered the node list or node data via the API calls above.
    +    if (empty($node->publish_on->value)) {
    +      $logger->error('@type "%title" not published because %field field has no value', $logger_variables);
    +      continue;
    +    }
    

    Wouldn't it be better to throw an exception? This is a programming error after all. People that implement this hook should ensure that the content they provide is valid.

jonathan1055’s picture

Thanks for the review, Pieter. In reply

  1. Yes, I will add a @todo in the code and reference the API service issue
  2. Good idea to improve the query, I will add @todo
  3. "Not 100% sure if it is a good idea to log this. " When the query is improved, this code can be removed and we will not be logging it on every cron run. But I suggest that we leave the current patch code as-is for now, until the query is changed. Or alternatively we silently skip the node without any message
  4. How would thowing the exception work? It is only during a cron run that we detect the bad data. Or can we do it a different way? Happy to learn what you had in mind

Jonathan

jonathan1055’s picture

I've thought a bit more about the conflicting problem of (a) alerting the user/admin that the node will not be processed and explaining why, so that they understand and can fix it without raising support issues, versus (b) clogging up the dblog with a message every time, and leaving the node scheduled for eternity.

A way to solve this could be to log the error then remove the publish-on date from the node, so it does not keep getting selected. After all, the date should not be on the node in the first place. If the date got left on when the node-type had been enabled in the past this would be a relatively easy way to tidy up the bad data and continually add dblog rows. A possible danger of this is if the admin accidentally disabled scheduling and before they noticed their error a cron run cleared all the dates. Maybe the removing of dates should only be done if the scheduled date is a week in the past, that way the admins gets plenty of warnings in the log but it does not continue for ever.

Just putting ideas out there ...

pfrenssen’s picture

I've thought a bit more about the conflicting problem of (a) alerting the user/admin that the node will not be processed and explaining why, so that they understand and can fix it without raising support issues, versus (b) clogging up the dblog with a message every time, and leaving the node scheduled for eternity.

Indeed the Drupal log is intended for the site admins. If this error occurs it means that there is an implementation error - the API was not correctly used, and nodes that are not scheduled are being altered into the list of scheduled nodes. There's nothing a site admin can do about this, it needs to be fixed in code. There is another problem: the dblog module is optional, it is not guaranteed to be enabled. It is even considered a best practice for high performance / high traffic sites to disable database logging on production. Instead the log will be written to a file which means that the data is not available for site admins (unless the log files are exposed in some way).

Exceptions are the standard way of dealing with logic errors. Exceptions tend to be very visible to developers - they will usually notice on their dev machines that the cron job doesn't complete. Every uncaught exception will be logged in the PHP error log, so even during cron jobs these errors are still logged, and in a more reliable way. Fatal errors in the PHP log are a big alarm bell for developers and sysops. On a typical hosting setup I think the chances are higher that the PHP log will be monitored by the sysops rather than the dblog.

A way to solve this could be to log the error then remove the publish-on date from the node, so it does not keep getting selected. After all, the date should not be on the node in the first place. If the date got left on when the node-type had been enabled in the past this would be a relatively easy way to tidy up the bad data and continually add dblog rows. A possible danger of this is if the admin accidentally disabled scheduling and before they noticed their error a cron run cleared all the dates.

I don't think we need to babysit the way people use our API, and try to pre-empt potential bugs that might exist in their hook implementations. Removing the publish-on date is a form of data loss, that would not be appreciated in some cases :)
I really think an exception would be sufficient to cover it on our end.

jonathan1055’s picture

Yes, I see now why we should not rely on the dblog and should throw exceptions instead. Also for the distinction between site admins and developers messages. Thank you Pieter for the detailed explanation. You are also right that we should not remove the dates as that is defintely a form of data-loss.

Here's a re-roll addressing all the points in #24 and #27. Here are the differences from the patch in #16 that you have reviewed. I've also added an interdiff for 16 to 28

  1. Added @todo about converting logger to service, and referencing #2651338: Create a service for the Scheduler API
  2. Added @todo about improving the entity query, referencing #2659824: Exclude node types that do not have scheduling enabled
  3. Changed both error traps to exceptions. The first one will still be required even when the entity query is improved, as the API calls may add in un-enabled nodes. I used sprintf() to format the text as suggested here. This seemed cleaner than concatenating the text values because I needed to include quotes in the message
  4. Moved the creation of logger_variables array to be next to the single logger call. This will all be changed later
  5. Repeated all the above for the second function

I'm not sure if the exception class definitions needs any more documenting. The examples seem fairly brief, but let me know if you think it need more. Also, is it correct to define them in the file they are used, i.e. cron.inc, or should they be somewhere else common to any other exceptoins we might add in the future?

Status: Needs review » Needs work

The last submitted patch, 28: 2625572-28-cron-ignore-non-enabled-content-type.patch, failed testing.

jonathan1055’s picture

Status: Needs work » Needs review

Good - as expected this test class is fully green with 58 passes and no fails:

3		Scheduler.Drupal\scheduler\Tests\SchedulerNonEnabledTypeTest
✓		- setUp
✓		- testNonEnabledNodeType
✓		- checkNonEnabledTypes
pfrenssen’s picture

Priority: Normal » Major

Raising priority. This is a hard blocker for the alpha release.

jonathan1055’s picture

That's OK. I think I have covered all your points, and I'm happy that the exceptions are working OK. Would appreciate your review of the patch in #28. I've given the interdiff from 16.

jonathan1055’s picture

Assigned: Unassigned » jonathan1055

This issue will fix the final non-passing test case. It will be good to have the committed codebase passing all green, as that will assist our general workflow on all other issues.

  • jonathan1055 committed 7c3ef62 on 8.x-1.x
    Issue #2625572 by jonathan1055: Admin setting 'Enable content type for...
jonathan1055’s picture

Status: Needs review » Fixed
Issue tags: -SprintWeekend2016

As there have been no objections (yes ok, no reviews either) in the month since I posted the patch in #28 on 8th Feb, in the interests of keeping things moving I decided to commit this. I will happily make amendements if the programming needs it, or to bring in line with coding standards.

jonathan1055’s picture

... and now we have the committed codebase passing all test classes :-)

all tests pass

pfrenssen’s picture

As there have been no objections (yes ok, no reviews either) in the month since I posted the patch in #28 on 8th Feb, in the interests of keeping things moving I decided to commit this. I will happily make amendements if the programming needs it, or to bring in line with coding standards.

I'm fine with this approach. If an important issue stagnates for too long because of a lack of reviews and you are sure about the approach then by all means go ahead and commit it. If any problems would occur we can still address it in a follow-up issue.

pfrenssen’s picture

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

jonathan1055’s picture

Assigned: jonathan1055 » Unassigned

Unassigning.