Problem/Motivation

I noticed two random fails in HEAD in the past couple weeks for DateRangeFieldTest:
https://www.drupal.org/pift-ci-job/535798
https://www.drupal.org/pift-ci-job/521653

They are not failing on the exact same assertion, but they are failing in the same test scenario within the method (when the start and end date are the same).

I ran the tests locally and was actually able to reproduce the first fail once (in more than 200 test runs). For the failing assertion, the output looks like this:

i3fgAed6
Monday, December 31, 2012 - 12:00 THESEPARATOR Monday, December 31, 2012 - 12:00

It's supposed to look like this (from a different run, with a different machine name):

d0vNZOvq
Monday, December 31, 2012 - 12:00

I think the timezone when it failed locally was the edgecase timezone Pacific/Tongatapu, but I'm not 100% sure. I don't know for the two fails on the bots. (See #2829845: Record timezone in DateTime test assertions and verbose output.)

I think this might actually be a bug in DateTime or DateTime Range, since it appears to be related to an edgecase where the module is supposed to detect that the start and end times are the same and therefore not display the separator nor end date. No hypotheses about the other fail for the formatter, which I haven't reproduced locally yet.

Proposed resolution

TBD

Remaining tasks

TBD

CommentFileSizeAuthor
#57 interdiff-54-57.txt2.9 KBmpdonadio
#57 2829848-57.patch5.18 KBmpdonadio
#54 2829848-2-54.patch5.3 KBalexpott
#54 2829848-2-54.test-only.patch3.25 KBalexpott
#50 interdiff-39-50.txt1004 bytesmpdonadio
#50 2829848-50.patch2.04 KBmpdonadio
#39 interdiff-10-39.txt2.06 KBmpdonadio
#39 2829848-39.patch2.01 KBmpdonadio
#35 datetime-2829848-usleep-formatter-PASS.patch10.15 KBxjm
#33 datetime-2829848-usleep-PASS.patch7.65 KBxjm
#33 datetime-2829848-usleep-FAIL.patch6.79 KBxjm
#29 2829848-usleep.patch3.71 KBmpdonadio
#28 datetime-2829848-1000-PASS.patch8.98 KBxjm
#21 datetime-2829848-1000-FAIL.patch4.23 KBxjm
#19 datetime-2829848-2000-FAIL.patch3.15 KBxjm
#18 datetime-2829848-500-FAIL.patch3.15 KBxjm
#12 interdiff-10-12.txt3.36 KBmpdonadio
#12 2829848-12.patch4.76 KBmpdonadio
#10 2829848-10.patch1.4 KBmpdonadio
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

xjm created an issue. See original summary.

xjm’s picture

xjm’s picture

xjm’s picture

Issue summary: View changes
xjm’s picture

Issue summary: View changes
xjm’s picture

Issue summary: View changes
xjm’s picture

I ran a shortened version of the test overnight and got no fails in 1000 runs. A couple possibilities:

  • I was just really lucky to see the fail locally yesterday afternoon.
  • Some of the earlier testing I removed from the method for these latter runs affected the result.
  • It's related to the actual time of day, or the time of day relative to the timezone in question, or something. The verbose output from the local test run that failed said it was modified atNov 22 16:39 (CST). That is around noon in what I think was the timezone that failed, but not quite, and there were also test runs a minute before and a minute after, so seems a stretch.
  • Some edgecase at 0 seconds. Confirming or ruling that out is another reason to record the timestamp.

Probably we should ask infra to search the CI logs for similar fails.

Anonymous’s picture

mpdonadio’s picture

99% sure I know what is happening here. Will post a patch later.

mpdonadio’s picture

Status: Active » Needs review
FileSize
1.4 KB

#8, is there anything in particular about that article that you think may be causing this problem?

OK, I think this is what is happening. In working on #2823847: DateRange formatters should compare rendered dates instead of raw timestamps, I noticed that when in the daterange formatters, the timestamps in this hunk

        /** @var \Drupal\Core\Datetime\DrupalDateTime $start_date */
        $start_date = $item->start_date;
        /** @var \Drupal\Core\Datetime\DrupalDateTime $end_date */
        $end_date = $item->end_date;

        if ($start_date->format('U') !== $end_date->format('U')) {
          // ...
        }

weren't always the same (look at some of fails on the earlier patches) for date-only items. The Y/m/d portions where the same, but the time wasn't. This is because we are deferring the call to `datetime_date_default_time()` to the build methods (that method needs to go; followup to come). I suspect we are having a second-rollover between when the start and end computed values are occurring, and that is causing the spurious.

This patch adds the default time logic to DateTimeComputed so that the computed value on the field always has the proper time set for date-only items, regardless of where you are using them.

Two notes:

1. Think we also need this in the DateTimeComputed::setValue() method, but I am seeing some weirdness there that I don't quite understand.

2. In the hunk above, we can't compare timestamps in the formatters. Since the field is based around \DateTime at the core, values can be outside the Unix timestamp range. The referenced issue handles this by accident, but that also introduces a change that I don't think we want in 8.2.x. My next patch will have a fix for this.

xjm’s picture

Mixologic found these other test results going back a couple months (for patches, so they might be affected by the patch):

So this fail seems to be a lot less frequent than it appeared if anything. Hmmmm.

mpdonadio’s picture

Here is an updated patch to the formatters to not use timestamps, but see also the recent comments on #2795489: 2038 bug with PHP timestamps on 32-bit systems - warn users?.

Calgon, take me away!

mpdonadio’s picture

And regards to the second fail that @xjm couldn't reproduce, it is essentially the same problem: rendering out a range w/o the separator when the dates are the same. That particular assertion is making sure hook_entity_prepare_view() can run and is attaching the attribute to the proper element. When the range is displayed, it is the div that wraps the two time elements. When it is the same, and no separator is present, it attaches to the time element directly. See #2811725: Error when render Datetime Range field: Error: Unsupported operand types.

The last submitted patch, 10: 2829848-10.patch, failed testing.

Status: Needs review » Needs work

The last submitted patch, 12: 2829848-12.patch, failed testing.

mpdonadio’s picture

Status: Needs work » Needs review

Back to NR based on latest runs on fixed bots.

xjm’s picture

I've now reproduced the second fail locally (twice in maybe 300 runs of the full method); both times, it failed for Calcutta, so it's not just tied to the edgecase-y +1300 timezone. The fails were also 3h10 apart, so it's also not tied to the actual IRL time of day. Same visual bug:

Asia/Kolkata vhaLLBJ8
full | Asia/Kolkata vhaLLBJ8
Monday, December 31, 2012 - 12:00 THESEPARATOR Monday, December 31, 2012 - 12:00

I wonder why it did not ever fail when I shortened the method...

I'll look into adding a test-only patch that runs the test a few zillion times to see if we can demonstrate @mpdonadio's patch fixing it, but the ideal thing would be to have a 100% fail rate patch that demonstrates the underlying cause.

xjm’s picture

Status: Needs review » Needs work

The last submitted patch, 19: datetime-2829848-2000-FAIL.patch, failed testing.

xjm’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 21: datetime-2829848-1000-FAIL.patch, failed testing.

xjm’s picture

Not quite sure how 500 runs passes in 16 min but 1000 times out after 2h... anyway. 500 doesn't show nearly the fail rate I'd hoped (nor the much higher fail rate I'm seeing locally, which is about 0.25%), but I guess with enough queued runs it could work.

xjm’s picture

Status: Needs work » Needs review
Anonymous’s picture

rerun patch-2000 with php7 and 3/4 fails.

mpdonadio’s picture

This will exacerbate the hypothesis of the problem is rollover between seconds when the computed datetime objects are created. It has high failure rate for me locally, but really lengthens the test time.

Status: Needs review » Needs work

The last submitted patch, 29: 2829848-usleep.patch, failed testing.

xjm’s picture

Nice @mpdonadio.

The last submitted patch, 28: datetime-2829848-1000-PASS.patch, failed testing.

xjm’s picture

Combining the half-second sleep and test patch with @mpdonadio's fix.

Edit: This is the initial patch from #10 for computed dates, not the whole thing.

The last submitted patch, 33: datetime-2829848-usleep-FAIL.patch, failed testing.

xjm’s picture

For comparison, here is the formatter fix only, without the computed datetime fix.

xjm’s picture

Alright, so it looks like either approach will fix the random fail. #2823847: DateRange formatters should compare rendered dates instead of raw timestamps looks reasonable on its own merits, but this "rollover" bug for computed dates could trickle down to other places as well.

For the first approach:

  1. +++ b/core/modules/datetime/src/DateTimeComputed.php
    @@ -18,7 +18,7 @@ class DateTimeComputed extends TypedData {
    -   * @var \DateTime|null
    +   * @var \Drupal\Core\Datetime\DrupalDateTime|null
    

    Is this just correcting existing documentation to the more specific class that is actually used, rather than its parent? I think it's probably out of scope here?

  2. The question is, is hardcoding a time of 12:00:00 universally applicable and the correct fix? It looks like the same technique is used in datetime_date_default_time():
    /**                                                                             
     * Sets a consistent time on a date without time.                               
     *                                                                              
     * The default time for a date without time can be anything, so long as it is   
     * consistently applied. If we use noon, dates in most timezones will have the  
     * same value for in both the local timezone and UTC.                           
     *                                                                              
     * @param $date                                                                 
     */
    function datetime_date_default_time($date) {
      $date->setTime(12, 0, 0);
    }
    

    Does the "most" in "in most timezones" get us into any trouble as an assumption? (The tests here prove it's fine this particular case for (e.g.) the +1300 timezone, but speaking in general.) Does the default of noon prevent a majority of such bugs at the cost of obscuring bugs when things are being compared between timezones, or when someone takes a date-only timezone and tries to add or subtract ~12h from it?

    Overall, it probably makes computed dates more robust by reducing unpredictability.

    At the least, I think we probably want an inline comment. Maybe we want a method somewhere that is intended to replace the procedural datetime_date_default_time().

    Finally, are there other places that date-only dates could have a similar bug?

xjm’s picture

+++ b/core/modules/datetime/src/DateTimeComputed.php
@@ -40,14 +40,20 @@ public function getValue($langcode = NULL) {
+    $storage_format = $datetime_type === 'date' ? DATETIME_DATE_STORAGE_FORMAT : DATETIME_DATETIME_STORAGE_FORMAT;

When checking for other places that might use date-only dates, I noticed there is a constant DateTimeItem::DATETIME_TYPE_DATE for this 'date' string.

mpdonadio’s picture

Assigned: Unassigned » mpdonadio

#37, yeah, should use the item constant there (see also #2826404: Create DateTimeItemInterface and deprecate global constants in datetime.module).

#36-1, yup, that should go.

#36-2, long answer here. The 12:00 thing is a legacy thing that has been in core for a long time. We opened an issue to #2830094: Deprecate and remove usages of datetime_date_default_time()., but haven't figured out how we want to remove all usages.

The 12:000 UTC time is chosen because it will work with the date-only portion for most cases, except around the edge cases. In other words, when it is noon UTC it is the same day just about everywhere. We have to pick something so it is consistent, and that is the most reasonable option. We fixed a bunch of those bugs (#2778083: Default value for Date-Only fields is broken when UTC date is different than user current date, #2739290: UTC+12 is broken on date only fields), and in the process now do that TZ loop whenever we are testing date-only things (may need to add some more, or they are in-progress patches). Using 12:00 in the patch is consistent for how we use this in core, we are just setting it earlier in the process. I suspect doing this in the computed item will eliminate the need for many (or most) of the non-test uses of datetime_date_default_time().

I would be happy with touching up #10 with the feedback above and handling #36-2 with a comment above the ->setTime() with why it is needed. #12 is slightly different than #2823847: DateRange formatters should compare rendered dates instead of raw timestamps; the later is a BC break, the former uses the storage strings in the formatter comparison instead of timestamps in addition to the change in the computed item. #10 is the least invasive way to fix the known problem. I am hesitant to say we are bug-free, but until we have other fails to look at, I think potential problems are speculation.

I'll post a new patch in a bit.

mpdonadio’s picture

Assigned: mpdonadio » Unassigned
FileSize
2.01 KB
2.06 KB

Updated patch for commit review. Interdiff is against #10, which was the starting point.

xjm’s picture

We could probably add a test for this that calculates a date-only date for two times that are a full second or more apart, and ensure that the dates produced are considered the same?

xjm’s picture

I just noticed that the test method already includes a few lines like this in HEAD:

      datetime_date_default_time($start_date);

The first time, there is the comment:

// Formats that display a time component for date-only fields will display
// the default time, so that is applied before calculating the expected
// value.

So probably we should also remove those lines from the test?

Edit: there is also a similar usage in core/modules/datetime/src/Tests/DateTimeFieldTest.php so we probably should look at that test too.

mpdonadio’s picture

Assigned: Unassigned » mpdonadio

#40, should be pretty easy to do in new DateTimeItemTest test. Will post tomorrow.

#41, those lines adjust the expected values used in the tests. DrupalDateTime objects get created and formatted to mimic the expected formatter output; there is never a trip to the fields or the computed objects. That is a refactor that I would like to do in #2830094: Deprecate and remove usages of datetime_date_default_time(). when we can nuke that method. In the tests, I'm hoping to centralize that behavior in a DateTestBase::massageDateObject() method. So, for now they need to stay.

mpdonadio’s picture

And I just remembered that DateTimeItemTest is invalid. See the changes to it in #2824717: Add a format constraint to DateTimeItem to provide REST error message that are unrelated to the constraint.

@xjm, can you look at that patch and let me know how you want to proceed?

xjm’s picture

Not sure I follow the question -- are you saying testing it correctly requires changes to the test that are also in #2824717: Add a format constraint to DateTimeItem to provide REST error message? Do you think they could or should be in their own issue?

mpdonadio’s picture

#44 testing this in DateTimeItemTest (which is where it really belongs) is problematic b/c that test is invalid. Right now, if you use the Entity API, you can store the wrong format for the datetime_type, and the test doesn't even use the proper format. This never got noticed until we tried adding the constraint. We can get that issue in, create a separate issue to fix that test, or create a new test class for this issue.

xjm’s picture

Yeah, let's split the necessary test changes out into their own issue so they can be reviewed in their own context. I'd say at major priority since it is a soft blocker for this issue.

mpdonadio’s picture

xjm’s picture

Issue tags: +Triaged D8 critical

@catch, @cilefen, @alexpott, @effulgentsia, @Cottser, and I discussed this issue today and agreed to keep it critical. While the test has a fairly low failure rate and belongs to an experimental module, the underlying code affected is in stable DateTime. (In some cases, we might disable a randomly failing test in an experimental module, but that also makes it harder to fix and more likely to regress. We agreed that is not necessary in this case and we should just go ahead with the fix.)

It looks like some additional review and feedback on #2832264: DateTimeItemTest is not valid could help this get fixed sooner.

Thanks @mpdonadio!

dawehner’s picture

To be honest I'm not entirely convinced by choosing 12:00:00 UTC instead of 00:00:00 UTC. If you think about it at least the start of the unix timestamp was not 12:00:00 UTC, and days usually start at midnight :)

+++ b/core/modules/datetime/src/DateTimeComputed.php
@@ -40,14 +41,27 @@ public function getValue($langcode = NULL) {
+        // time will be set from the current time instead. For consistency, we
+        // we the time to 12:00:00 UTC. This is use so that the local date

double we in this sentence :) This is also related to #2842409: Selecting the same day in a date between filter returns no results and many more issues.

mpdonadio’s picture

Assigned: mpdonadio » Unassigned
FileSize
2.04 KB
1004 bytes

Adjusted the comment in #49.

12:00:00 UTC is what it does now, for better or worse. It we make it 00:00:00 UTC, then we have other problems. Changing it, though, would be a BC break. We can discuss it in #2830094: Deprecate and remove usages of datetime_date_default_time()..

alexpott’s picture

Status: Needs review » Needs work
Issue tags: +Needs tests

Shouldn't the fix \Drupal\Component\Datetime\DateTimePlus::createFromFormat() - shouldn't that set the time consistently using the same logic if the supplied format does not have a time?

Also even if we don't do that I think this is testable.

alexpott’s picture

Also \Drupal\datetime_range\Plugin\Field\FieldWidget\DateRangeWidgetBase::massageFormValues() seems to be doing something similar too.

mpdonadio’s picture

#51, re DateTimePlus(), it would be practically impossible to determine whether a supplied format contains time or not, since it (via \DateTime::createFromFormat) can work with format, which may even contain a portion time, and handle in a sane way.

In the field items we explicitly only with with the two storage formats we have defined.

#52, not sure which part you are thinking of. We use datetime_date_default_time() for date-only items (which set 12:00:00 UTC). These

$start_date->setTime(0, 0, 0);
$end_date->setTime(23, 59, 59);

are for the all-day variant of daterange, because we set the date to be midnight-to-midnight to represent something that lasts all day. We also have an issue to nuke datetime_date_default_time(), #2830094: Deprecate and remove usages of datetime_date_default_time().. I think if we consolidate this approach in the computed item in the value, we can get rid of nearly all of the the non-test uses. For the uses in the test, we would move the logic to a DateWhateverTestBase::massageTestDate() to handle this and other duplicated code.

I think the test approach in #40 would work, assuming we can fix to mess in #2832264: DateTimeItemTest is not valid.

When I was debugging this, I was pretty confident this was the best approach to address the immediate need of random failures during second rollovers.

alexpott’s picture

Here's a test that proves the above approach will work.

The last submitted patch, 54: 2829848-2-54.test-only.patch, failed testing.

mpdonadio’s picture

#54 is essentially what I wanted to add to DateTimeItemTest, just in a slightly different way. Reworking this to be DateRangeItemTest and renaming a few other things is probably good for the long term. I'll post some tweaks later.

mpdonadio’s picture

No substantial changes from #54, just a little cleanup to give this a proper home and a test class to build off of (eg, start to mirror what DateTimeItemTest should be doing).

Pending review, I think this is committable w/o the cleanup in #2832264: DateTimeItemTest is not valid and is sufficient to address the critical / random fail. We have other (non-critical) issues to handle the rest of comments raised in the issue. It also applies cleanly to 8.3.x.

alexpott’s picture

Status: Needs review » Reviewed & tested by the community

Nice changes to the test +1.

Given that I've only added test coverage to the issue I think I'm in a good position to RTBC this. I also think we can have a followup to remove some of \Drupal\datetime_range\Plugin\Field\FieldWidget\DateRangeWidgetBase::massageFormValues().

alexpott’s picture

  • catch committed a92f817 on 8.3.x
    Issue #2829848 by xjm, mpdonadio, alexpott: Random test failure in...

  • catch committed b455c23 on 8.2.x
    Issue #2829848 by xjm, mpdonadio, alexpott: Random test failure in...
catch’s picture

Status: Reviewed & tested by the community » Fixed

Opened the follow-up at #2846963: Clean up DateRangeWidgetBase::massageFormValues()..

Cleaned up the following two issues on commit.

  1. +++ b/core/modules/datetime/src/DateTimeComputed.php
    @@ -40,14 +41,28 @@ public function getValue($langcode = NULL) {
    +        // that the local date  portion is the same, across nearly all time
    

    extra space, fixed on commit.

  2. +++ b/core/modules/datetime/src/DateTimeComputed.php
    @@ -40,14 +41,28 @@ public function getValue($langcode = NULL) {
    +        // @todo Update comment and/or code when per the chosen solution in
    

    Stray 'when'.

Committed/pushed to 8.3.x and cherry-picked to 8.2.x, thanks!

  • catch committed a92f817 on 8.4.x
    Issue #2829848 by xjm, mpdonadio, alexpott: Random test failure in...

  • catch committed a92f817 on 8.4.x
    Issue #2829848 by xjm, mpdonadio, alexpott: Random test failure in...
xjm’s picture

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.