Found in dblog after cron:

Notice: Undefined offset: 1 in Drupal\linkchecker\LinkCheckerService->statusHandling() (line 209 of /var/www/html/drupal/web/modules/contrib/linkchecker/src/LinkCheckerService.php)

Comments

waverate created an issue.

waverate’s picture

Looks like it is coming from /src/LinkCheckerService.php:

    $statusCode = $response->getStatusCode();
    if ($statusCode == 200
      && !empty($response->getBody())
      && !empty($response->getHeader('Content-Type'))
      && !empty($response->getHeader('Link'))
      && preg_match('/=|\/|,/', $response->getHeader('Link')[1]) == FALSE
      && !in_array($response->getHeader('Link')[1], ['#top'])
      && in_array($response->getHeaderLine('Content-Type'), [
        'text/html',
        'application/xhtml+xml',
        'application/xml',
      ])
      && !preg_match('/(\s[^>]*(name|id)(\s+)?=(\s+)?["\'])(' . preg_quote(urldecode($response->getHeader('Link')[1]), '/') . ')(["\'][^>]*>)/i', $response->getBody())
    )

hass’s picture

Can you try to figure out what exactly is causing the issue or provide a repro case, please?

hass’s picture

Status: Active » Postponed (maintainer needs more info)

waverate’s picture

Hmm. Not going to be easy to come up with specific steps as these errors are generated during cron and currently cron is checking 20 links for status.

However:

a. The error is not occurring 20 times per cron so I feel there are some links that are fine.

b. Because this is failing in LinkCheckerService, am I correct in assuming it is only occurring while checking external links?

c. I can't tell whether the error is occurring on a 200 status code or on another status code.

I may need to set up a simpletest.me site and try a few combinations of links to see what type of status code generates this error.

hass’s picture

I guess it is one link, and this can be caused by remote servers. In D7 I implemented a lot of fallbacks, but we need to find out what bugs Guzzle may have here. Since one of the 20 links is causing this, we may find the issue by checking the links one by one.

waverate’s picture

Status: Postponed (maintainer needs more info) » Active

Okay. This took a while to find. It looks like it is occurring for external links that return a 200 status code. Here are a couple of example links and their responses from curl.

1. https://queenmobs.com/2018/11/marianne-micros-interview/

[server-1 ~]$ curl -I https://queenmobs.com/2018/11/marianne-micros-interview/
HTTP/1.1 200 OK
Date: Thu, 04 Jul 2019 00:34:30 GMT
Content-Type: text/html; charset=UTF-8
Connection: keep-alive
Set-Cookie: __cfduid=d4b6ac21fcde33bc8b4818f1883ea1ca61562200469; expires=Fri, 03-Jul-20 00:34:29 GMT; path=/; domain=.queenmobs.com; HttpOnly
X-Pingback: https://queenmobs.com/xmlrpc.php
Link: <https://queenmobs.com/wp-json/>; rel="https://api.w.org/", <https://queenmobs.com/?p=28596>; rel=shortlink
Cache-Control: max-age=86400
Expires: Fri, 05 Jul 2019 00:34:29 GMT
Vary: Accept-Encoding
Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
Server: cloudflare
CF-RAY: 4f0d03872c4fcaa8-YYZ

2. https://arboretumpress.com/

[server-1 ~]$ curl -I https://arboretumpress.com/
HTTP/1.1 200 OK
Server: nginx
Date: Thu, 04 Jul 2019 00:36:34 GMT
Content-Type: text/html; charset=UTF-8
Connection: keep-alive
Strict-Transport-Security: max-age=86400
Vary: Accept-Encoding
Vary: Cookie
X-hacker: If you're reading this, you should visit automattic.com/jobs and apply to join the fun, mention this header.
Link: <https://wp.me/P7IVZg-2>; rel=shortlink
X-ac: 2.yyz _dfw 
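Both responses above carry a single Link header line. A minimal sketch of why reading index 1 of that header could raise the notice, assuming the HTTP client returns one array element per header line (the `$linkHeader` array below is a hand-written stand-in, not output captured from Guzzle):

```php
<?php
// Stand-in for the array that a PSR-7 getHeader('Link') call is assumed to
// return for the second response above: one element per header line received.
$linkHeader = ['<https://wp.me/P7IVZg-2>; rel=shortlink'];

// Index 0 is the only key, so reading $linkHeader[1] directly would raise
// "Undefined offset: 1". Guarded reads avoid the notice.
$hasSecondValue = array_key_exists(1, $linkHeader);
$secondValue = $linkHeader[1] ?? NULL;

var_dump($hasSecondValue, $secondValue);
```

This only illustrates where the notice could come from; it does not say whether the Link header is the right value to inspect in the first place.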

waverate’s picture

Further to #7, the node looks like it is added to linkchecker_index correctly and without error, and the links look like they are added to linkchecker_link correctly and without error.

The error from the issue summary occurs during cron.

hass’s picture

Code has been removed from the check() function, and as a result the incorrect expression $response->getHeader('Link')[1] causes the notice. This code is definitely incorrect. We must not rely on the response object here. In check(), $response->uri was added solely to pass the fragment to the statusHandling() function, as we cannot get the values otherwise.

Missing:

      // Add 'uri' property to core response object for 'fragment' check and
      // consistency with HTTPRL object.
      $response->uri = $uri;

Wrong:

      && preg_match('/=|\/|,/', $response->getHeader('Link')[1]) == FALSE
      && !in_array($response->getHeader('Link')[1], ['#top'])
...
      && !preg_match('/(\s[^>]*(name|id)(\s+)?=(\s+)?["\'])(' . preg_quote(urldecode($response->getHeader('Link')[1]), '/') . ')(["\'][^>]*>)/i', $response->getBody())

Correct was:

if ($response->code == 200
    && !empty($response->data)
    && !empty($response->headers['content-type'])
    && !empty($response->uri['fragment'])
    && preg_match('/=|\/|,/', $response->uri['fragment']) == FALSE
    && !in_array($response->uri['fragment'], array('#top'))
    && in_array($response->headers['content-type'], array('text/html', 'application/xhtml+xml', 'application/xml'))
    && !preg_match('/(\s[^>]*(name|id)(\s+)?=(\s+)?["\'])(' . preg_quote(urldecode($response->uri['fragment']), '/') . ')(["\'][^>]*>)/i', $response->data)
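The fragment part of the check above can be sketched on its own, assuming the fragment is taken from the checked URL with parse_url() rather than from the Link response header. The function name `fragment_is_missing()` and the example URLs are hypothetical and are not part of the module; the status code and content-type conditions are left out.

```php
<?php
/**
 * Hypothetical helper: TRUE when the URL has a fragment and no element in
 * the HTML body carries a matching name or id attribute.
 */
function fragment_is_missing(string $url, string $body): bool {
  $fragment = parse_url($url, PHP_URL_FRAGMENT);
  // No fragment, or one containing "=", "/" or ",": skip the check, as the
  // quoted condition does.
  if (empty($fragment) || preg_match('/=|\/|,/', $fragment)) {
    return FALSE;
  }
  // Same pattern as the quoted code, fed with the URL fragment.
  $pattern = '/(\s[^>]*(name|id)(\s+)?=(\s+)?["\'])(' . preg_quote(urldecode($fragment), '/') . ')(["\'][^>]*>)/i';
  return !preg_match($pattern, $body);
}

$body = '<h2 id="intro">Intro</h2>';
var_dump(fragment_is_missing('https://example.com/page#intro', $body));
var_dump(fragment_is_missing('https://example.com/page#missing', $body));
```

Reading the fragment from the URL means the result does not depend on how many Link headers the remote server happens to send.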

waverate’s picture

Of interest:

a. the error does not occur when cron is run from drush (drush cron), and

b. the error occurs twice per link if cron is run from https://example.com/cron/[token].

eric.guerin@ucsf.edu’s picture

I am getting a similar offset error during the cron job.

Notice: Undefined offset: 1 in Drupal\linkchecker\LinkCheckerService->statusHandling() (line 222 of /mnt/www/html/ucsfitode1/docroot/modules/contrib/linkchecker/src/LinkCheckerService.php)
#0 /mnt/www/html/ucsfitode1/docroot/core/includes/bootstrap.inc(600): _drupal_error_handler_real(8, 'Undefined offse...', '/mnt/www/html/u...', 222, Array)
#1 /mnt/www/html/ucsfitode1/docroot/modules/contrib/linkchecker/src/LinkCheckerService.php(222): _drupal_error_handler(8, 'Undefined offse...', '/mnt/www/html/u...', 222, Array)
#2 /mnt/www/html/ucsfitode1/docroot/modules/contrib/linkchecker/src/LinkCheckerService.php(182): Drupal\linkchecker\LinkCheckerService->statusHandling(Object(GuzzleHttp\Psr7\Response), Object(Drupal\linkchecker\Entity\LinkCheckerLink))
#3 /mnt/www/html/ucsfitode1/vendor/guzzlehttp/promises/src/Promise.php(203): Drupal\linkchecker\LinkCheckerService->Drupal\linkchecker\{closure}(Object(GuzzleHttp\Psr7\Response))
#4 /mnt/www/html/ucsfitode1/vendor/guzzlehttp/promises/src/Promise.php(169): GuzzleHttp\Promise\Promise::callHandler(1, Object(GuzzleHttp\Psr7\Response), Array)
#5 /mnt/www/html/ucsfitode1/vendor/guzzlehttp/promises/src/FulfilledPromise.php(39): GuzzleHttp\Promise\Promise::GuzzleHttp\Promise\{closure}(Object(GuzzleHttp\Psr7\Response))
#6 /mnt/www/html/ucsfitode1/vendor/guzzlehttp/promises/src/TaskQueue.php(47): GuzzleHttp\Promise\FulfilledPromise::GuzzleHttp\Promise\{closure}()
#7 /mnt/www/html/ucsfitode1/vendor/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php(119): GuzzleHttp\Promise\TaskQueue->run()
#8 /mnt/www/html/ucsfitode1/vendor/guzzlehttp/guzzle/src/Handler/CurlMultiHandler.php(146): GuzzleHttp\Handler\CurlMultiHandler->tick()
#9 /mnt/www/html/ucsfitode1/vendor/guzzlehttp/promises/src/Promise.php(246): GuzzleHttp\Handler\CurlMultiHandler->execute(true)
#10 /mnt/www/html/ucsfitode1/vendor/guzzlehttp/promises/src/Promise.php(223): GuzzleHttp\Promise\Promise->invokeWaitFn()
#11 /mnt/www/html/ucsfitode1/vendor/guzzlehttp/promises/src/Promise.php(267): GuzzleHttp\Promise\Promise->waitIfPending()
#12 /mnt/www/html/ucsfitode1/vendor/guzzlehttp/promises/src/Promise.php(225): GuzzleHttp\Promise\Promise->invokeWaitList()
#13 /mnt/www/html/ucsfitode1/vendor/guzzlehttp/promises/src/Promise.php(62): GuzzleHttp\Promise\Promise->waitIfPending()
#14 /mnt/www/html/ucsfitode1/vendor/guzzlehttp/promises/src/EachPromise.php(101): GuzzleHttp\Promise\Promise->wait()
#15 /mnt/www/html/ucsfitode1/vendor/guzzlehttp/promises/src/Promise.php(246): GuzzleHttp\Promise\EachPromise->GuzzleHttp\Promise\{closure}(true)
#16 /mnt/www/html/ucsfitode1/vendor/guzzlehttp/promises/src/Promise.php(223): GuzzleHttp\Promise\Promise->invokeWaitFn()
#17 /mnt/www/html/ucsfitode1/vendor/guzzlehttp/promises/src/Promise.php(267): GuzzleHttp\Promise\Promise->waitIfPending()
#18 /mnt/www/html/ucsfitode1/vendor/guzzlehttp/promises/src/Promise.php(225): GuzzleHttp\Promise\Promise->invokeWaitList()
#19 /mnt/www/html/ucsfitode1/vendor/guzzlehttp/promises/src/Promise.php(62): GuzzleHttp\Promise\Promise->waitIfPending()
#20 /mnt/www/html/ucsfitode1/docroot/modules/contrib/linkchecker/src/Plugin/QueueWorker/LinkCheck.php(79): GuzzleHttp\Promise\Promise->wait()
#21 /mnt/www/html/ucsfitode1/docroot/core/lib/Drupal/Core/Cron.php(180): Drupal\linkchecker\Plugin\QueueWorker\LinkCheck->processItem(Array)
#22 /mnt/www/html/ucsfitode1/docroot/core/lib/Drupal/Core/Cron.php(145): Drupal\Core\Cron->processQueues()
#23 /mnt/www/html/ucsfitode1/docroot/core/lib/Drupal/Core/ProxyClass/Cron.php(75): Drupal\Core\Cron->run()
#24 /mnt/www/html/ucsfitode1/docroot/core/modules/system/src/Form/CronForm.php(166): Drupal\Core\ProxyClass\Cron->run()
... abbreviated

shubhangi1995’s picture

Assigned: Unassigned » shubhangi1995

ohorbatiuk’s picture

Assigned: shubhangi1995 » Unassigned
Status: Active » Needs review
Attachment (new): 1.83 KB

Status: Needs review » Needs work

The last submitted patch, 13: undefined-offset-error-3065045-13.patch, failed testing.

joel_osc’s picture

The patch in #13 seems to be working fine; the errors are gone. Thank you, @chmez.

baluertl’s picture

Attachment (new): 63.67 KB

I have now tested this on our project and can confirm that the patch from comment #13 resolves the issue of an unrealistic number being displayed on the progress bar on the /admin/config/content/linkchecker page:
Screenshot comparison

c-logemann’s picture

The patch no longer applies. It needs to be re-rolled or, better, solved in an issue fork.

eiriksm’s picture

Issue tags: +Needs tests

Also definitely needs tests for:

- Cases where the link part of the header is an array with one value (which causes this error)
- Cases where the link part refers to a fragment that is actually on the page
- Cases where the link part refers to a fragment that is not on the page

joseph.olstad’s picture

@C_Logemann, the patch does not need a reroll; patch #13 still applies against the HEAD of 8.x-1.x.

eiriksm’s picture

Status: Needs work » Needs review
Issue tags: -Needs tests
Attachments (new): 3.14 KB, 1.29 KB

Here is an updated patch with the tests I feel we need, including a test-only patch that should fail with the notice from the issue summary.

Also fixed the test failures in #13. That patch got rid of the notice, but the actual functionality for fragments was effectively removed. So that's not ideal :)

Status: Needs review » Needs work

The last submitted patch, 20: 3065045-test-only.patch, failed testing.

eiriksm’s picture

Status: Needs work » Needs review

That test failure was expected. Back to Needs review.

ex dj’s picture

#13 eliminates the error. However, "Date Checked" in the report is now 12/31/1969.

banoodle’s picture

Status: Needs review » Reviewed & tested by the community

Patch #20 works well for me (thanks!). I can also confirm that the Last Checked dates are accurate (it does not have the problem with patch #13 mentioned in comment #23).

  • eiriksm committed b54e359 on 8.x-1.x
    Issue #3065045 by eiriksm, chmez, waverate, hass, banoodle: Undefined...

eiriksm’s picture

Status: Reviewed & tested by the community » Fixed

Thanks everyone!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.