Problem

Oembed takes the first provider URL it finds and discards all others, however some providers provide multiple endpoints for different types of assets, or where assets are distributed across different hosts.

Solution

Loop over each endpoint and check for a matching URL.

Hello,

I've been studied the media core module, and I have found that it is not possible to support multiple websites with the same provider at the same time. I think it is really annoying that it supports only the centralized approach because the internet is supposed to be decentralized.

The name of the provider is fixed by the service that you use. I will take the example of my usecase to be easier to understand. So my usecase is the PeerTube, a decentralized video platform.

So oEmbed imposes us the provider name as we can see in those examples:

For Youtube:
https://www.youtube.com/oembed?url=https://www.youtube.com/watch?v=xHLes...
provider_name = Youtube

For PeerTube:
https://framatube.org/services/oembed?url=https%3A%2F%2Fframatube.org%2F...
provider_name = PeerTube
https://peertube.cpy.re/services/oembed?url=https%3A%2F%2Fpeertube.cpy.r...
provider_name = PeerTube

Consequently we have something like that in the ressource fetcher yml file pointed by the oembed_providers_url variable from media.settings

[
  0	=> {
    provider_name:	"PeerTube",
    provider_url:	"https://framatube.org",
    endpoint: [
      0 => {	
        url	"https://framatube.org/services/oembed",
        discovery	true
      }
    ]
  },
  1	=> {
    provider_name:	"PeerTube",
    provider_url:	"https://peertube.cpy.re",
    endpoint: [
      0 => {	
        url	"https://peertube.cpy.re/services/oembed",
        discovery	true
      }
    ]
  }
]

After that we arrive to the part that the media core module fails totally.

In Drupal\media\OEmbed\ProviderRepository

$keyed_providers = [];
    foreach ($providers as $provider) {
      try {
        $name = (string) $provider['provider_name'];
        $keyed_providers[$name] = new Provider($provider['provider_name'], $provider['provider_url'], $provider['endpoints']);
      }
      catch (ProviderException $e) {
        // Just skip all the invalid providers.
        // @todo Log the exception message to help with debugging.
      }
    }

If we read that code, it will iterate on every provider and fill the $keyed_providers variable. But of course, it will erase the data from the other websites with the same provider. After that, the core media module will be ok but only for the last website you have declared with the common provider.

I will try to work on that but this will probably be a big change on how the media module uses oEmbed.

For those who want to work on that, the core doesn't allow to override the oembed_providers_url setting yet. So set it manually or there is already an issue on drupal.org: #2999018 Expose oEmbed provider URL setting in the Media configuration form

I have a working module that supports only one website of Peertube because of this issue. If you want to debug this issue, it will give you a usecase to debug: PeerTube

Thank you to those who will help me to resolve this issue.

CommentFileSizeAuthor
#41 3008712-41.patch3.85 KBchr.fritsch
#37 interdiff-3008712-36-37.txt2.3 KBmsuthars
#37 drupal-oembed_multiple_endpoints_support-3008712-37.patch4.75 KBmsuthars
#36 drupal-oembed_multiple_endpoints_support-3008712-36.patch4.12 KBs.abbott
#34 drupal-oembed_multiple_endpoints_support-3008712-34.patch4.13 KBs.abbott
#31 drupal-oembed_multiple_endpoints_support-3008712-31.patch2.71 KBs.abbott
#29 drupal-oembed_multiple_endpoints_support-3008712-29.patch2.65 KBs.abbott
#26 drupal-oembed_multiple_endpoints_support_no_default-3008712-26.patch3.04 KBphjou
#26 drupal-oembed_multiple_endpoints_support-3008712-26.patch3.05 KBphjou
#23 drupal-oembed_multiple_endpoints_support-3008712-23.patch4.27 KBphjou
#23 interdiff.txt899 bytesphjou
#20 Screenshot_2019-11-03 https drupal-contrib lndo site.png23.42 KBphjou
#20 drupal-oembed_multiple_endpoints_support-3008712-20.patch1.67 KBphjou
#17 3008712-16.patch4.26 KBSam152
#6 drupal-oembed_has_a_centralized_approach-3008712-6.patch5.85 KBphjou
#5 drupal-oembed_has_a_centralized_approach-3008712-5.patch5.85 KBphjou
#4 drupal-oembed_has_a_centralized_approach-3008712-4.patch5.79 KBphjou
#3 drupal-oembed_has_a_centralized_approach-3008712-3.patch5.13 KBphjou
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

phjou created an issue. See original summary.

phjou’s picture

Issue summary: View changes
phjou’s picture

Ok I just did a patch that fix this issue. I think some tests will be broken since the provider URL was just a string before and it is now an array.
The provider attribute was not used so much in the code, so the patch is not so big for now.

Don't hesitate to tell me if something is broken with this patch.

phjou’s picture

phjou’s picture

phjou’s picture

Status: Active » Needs review
phjou’s picture

Version: 8.6.x-dev » 8.8.x-dev
phjou’s picture

Issue tags: +Seattle2019
Silicon.Valet’s picture

I'm helping to review this while at the core contribution workshop at Drupalcon Seattle

phenaproxima’s picture

Discussed this with @phjou at DrupalCon. I think, though I'm not certain, that the use case can be addressed by merging the endpoints from the two providers into a single provider, called PeerTube. The provider_url value is not really used by anything; it's just available to source plugins in case they need it. The endpoints are what actually define the availability and discoverability of oEmbed resources, and a single provider can define any number of those.

@phjou said he will try merging the endpoints together and will report back if it works.

phenaproxima’s picture

Title: oEmbed implementation has a centralized approach » oEmbed URL resolution does not take multiple endpoints into account
Category: Bug report » Feature request
Priority: Major » Normal
Status: Needs review » Needs work
Issue tags: +Needs issue summary update, +Needs tests

@phjou and I walked through this and I think I now understand what the problem is. We were able to get his use case to work, but we did need to make some small modifications to the oEmbed system first.

The problem is that, in multiple places, the oEmbed system assumes that the first endpoint of any given provider is the one that should be used to generate the final resource URL. In a decentralized set-up like PeerTube, this assumption falls apart entirely.

So, our course of action here is that anywhere we do something like reset($endpoints), we should be doing something more like this:

foreach ($endpoints as $endpoint) {
  if ($endpoint->matchUrl($url)) {
    // This is the endpoint we're looking for
  }
}

This is technically a feature request, not a bug, so I'm adjusting the issue as such. I will also adjust the issue summary to more clearly explain what's going on.

phenaproxima’s picture

Issue tags: +oembed

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Sam152’s picture

I am experiencing the same problem. Facebook provides different endpoints for different schemes. The scheme for embedding videos is second on the list and is thus never called.

The schemes themselves are also patterns that can be on the same domain name, so I don't think the current approach is sufficient.

Sam152’s picture

Status: Needs work » Needs review
Issue tags: -Needs tests

This should to do it. This is quite different to the previous patches, so if I've misunderstood the scope of this issue, I would be happy to file a new one instead.

Otherwise, let me know and I can update the issue summary with some concrete examples.

Sam152’s picture

Helps to upload the patch too 😅

Status: Needs review » Needs work

The last submitted patch, 17: 3008712-16.patch, failed testing. View results

phjou’s picture

Just noticed that the patch we made at drupalcon is not there. The actual patch I did is a wrong aproach.
I will do it tomorrow. But we will need to write a test.

phjou’s picture

Sorry, I didn't had the time yesterday, but there is the patch. Working great with multiple endpoints with the dev version of the peertube module and should work for any provider giving multiple endpoints. Just attached the image of the format that the contrib module has to create in order to work with that patch.

@phenaproxima That's basically what we modified at DrupalCon.

Now we just need to write a test for this.

Sam152’s picture

Hi @phjou, it looks like we both wrote very similar patches, with the same test failures. Any feedback on the test that has already been written? I think it makes sense to combine efforts at this stage.

phjou’s picture

Hi @Sam152 Yeah your patch is doing the same, but it looks better ;)
The question is that do we want to introduce a new function in the API? For me it would make sense since we are doing the same thing three times.

We just have to fix the two failing tests.

phjou’s picture

Just changed a little bit your function to always return a provider and maybe fix the test that has no provider.

phjou’s picture

Oh I probably broke more tests because of the call to that function that we have in UrlResolver.php

phenaproxima’s picture

Category: Feature request » Bug report
+++ b/core/modules/media/src/OEmbed/Provider.php
@@ -97,4 +97,23 @@ public function getEndpoints() {
+  /**
+   * For the given media item URL find an endpoint with schemes that match.
+   *
+   * @param string $url
+   *   The media URL used to lookup the matching endpoint.
+   *
+   * @return \Drupal\media\OEmbed\Endpoint
+   *   An endpoint for the URL.
+   */
+  public function getEndpointMatchingUrl($url) {
+    $endpoints = $this->getEndpoints();
+    foreach ($endpoints as $endpoint) {
+      if ($endpoint->matchUrl($url)) {
+        return $endpoint;
+      }
+    }
+    return reset($endpoints);
+  }

I don't know if we should add another method to Provider. I think I'd rather have this logic in UrlResolver, to be honest.

That said, I think we might actually be able to reclassify this issue as a bug, not a feature request. The fact is, the oEmbed specification specifically says that a provider might provide multiple endpoints, and there is nothing I'm aware of that says each endpoint couldn't match different URL patterns. However, our oEmbed implementation utterly disregards this and assumes that the first endpoint is always the right one. From UrlResolver::getResourceUrl():

    $provider = $this->getProviderByUrl($url);
    $endpoints = $provider->getEndpoints();
    $endpoint = reset($endpoints);
    $resource_url = $endpoint->buildResourceUrl($url);

What makes it so certain that the first endpoint is the correct one? This should be doing something similar to what UrlResolver::getProviderByUrl() is doing:

      foreach ($provider_info->getEndpoints() as $endpoint) {
        if ($endpoint->matchUrl($url)) {
          return $provider_info;
        }
      }

So this, IMHO, buts up incredibly close to "bug" territory and should be considered a bug fix for 8.9.x and 9.0.x. If we're lucky, maybe committers would backport it to 8.8.x, too.

phjou’s picture

Ok so let's keep patch #20 with the Unit test from #17
drupal-oembed_multiple_endpoints_support_no_default-3008712-26.patch

I also created almost the same patch but that is keeping the reset to get the first element because it seems that when we are doing the loop, sometimes there is no match. In that case it will keep the first element as it was already working. Let's see if change something for the broken tests.
drupal-oembed_multiple_endpoints_support_no_default-3008712-26.patch

And I agree that it looks more than a bug than a feature request since it doesn't respect the oembed documentation.

phenaproxima’s picture

  1. +++ b/core/modules/media/src/OEmbed/UrlResolver.php
    @@ -159,8 +159,12 @@ public function getResourceUrl($url, $max_width = NULL, $max_height = NULL) {
    +    foreach ($endpoints as $endpoint) {
    +      if ($endpoint->matchUrl($url)) {
    +        $resource_url = $endpoint->buildResourceUrl($url);
    +      }
    +    }
    

    This may accidentally change the behavior of getResourceUrl(). With this, the last matching endpoint will win, not the first. To keep things as backwards-compatible as possible. I think we should move this to a utility function of UrlResolver (maybe a new public static method?) which returns the result of buildResourceUrl() on the first matching endpoint, not the last.

  2. +++ b/core/modules/media/src/Plugin/Validation/Constraint/OEmbedResourceConstraintValidator.php
    @@ -110,6 +110,11 @@ public function validate($value, Constraint $constraint) {
    +      foreach ($endpoints as $endpoint) {
    +        if ($endpoint->matchUrl($url)) {
    +          $resource_url = $endpoint->buildResourceUrl($url);
    +        }
    +      }
    

    If we add that utility function, we can call it here to avoid repeating ourselves.

  3. +++ b/core/modules/media/tests/src/Unit/ProviderTest.php
    @@ -0,0 +1,40 @@
    +  public function testGetEndpointMatchingUrl() {
    +    $provider = new Provider('Sample provider', 'https://sample', [
    +      [
    +        'schemes' => [
    +          'https://foo/*',
    +          'https://bar/*',
    +        ],
    +        'url' => 'https://first-endpoint',
    +      ],
    +      [
    +        'schemes' => [
    +          'https://baz/*',
    +        ],
    +        'url' => 'https://second-endpoint',
    +      ],
    +    ]);
    +
    +    $this->assertNull($provider->getEndpointMatchingUrl('https://invalid-url'));
    +    $this->assertEquals('https://first-endpoint', $provider->getEndpointMatchingUrl('https://foo/1')->getUrl());
    +    $this->assertEquals('https://first-endpoint', $provider->getEndpointMatchingUrl('https://bar/2')->getUrl());
    +    $this->assertEquals('https://second-endpoint', $provider->getEndpointMatchingUrl('https://baz/3')->getUrl());
    

    I think this will fail now, because we no longer have a getMatchingEndpointUrl() method. But if we have a new utility method of UrlResolver, we can add a test of that to UrlResolverTest.

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

s.abbott’s picture

I believe this addresses the main issues from #27.

Chris Burge’s picture

Status: Needs work » Needs review
s.abbott’s picture

...or it would if the patch worked. Take 2, but with a less broken file this time!

Status: Needs review » Needs work
msuthars’s picture

Assigned: Unassigned » msuthars
s.abbott’s picture

(Hopefully) fixed up the failing test. Should actually pass now.
Also worth noting that I didn't add any tests for the change to OEmbedResourceConstraintValidator, but there weren't any tests for it in the first place, and I didn't feel comfortable writing a full new test suite.

s.abbott’s picture

s.abbott’s picture

Again with the failing patches. I even checked first that time!

P.S. Sorry for any accidental spamming. This is my first real contribution and I'm still getting used to the process.

msuthars’s picture

Add utility function in UrlResolver to avoid repeating code.

msuthars’s picture

Assigned: msuthars » Unassigned
Status: Needs work » Needs review
phenaproxima’s picture

phjou’s picture

I think that this issue #3168301: oEmbed validator should use the urlResolver to get the resource URL that has been merged in core already contains part of the current issue

chr.fritsch’s picture

Rerolled and made getEndpointMatchingUrl protected, since it's now only used in the UrlResolver.

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

edysmp’s picture

Status: Needs review » Reviewed & tested by the community

This is working as expected. Thanks!

  • catch committed 600da9f on 9.2.x
    Issue #3008712 by phjou, s.abbott, msuthars, Sam152, chr.fritsch,...

  • catch committed 348ae20 on 9.1.x
    Issue #3008712 by phjou, s.abbott, msuthars, Sam152, chr.fritsch,...
catch’s picture

Version: 9.2.x-dev » 9.1.x-dev
Issue summary: View changes
Status: Reviewed & tested by the community » Fixed
Issue tags: -Needs issue summary update

OK this makes plenty of sense (although the patch was easier to follow than the issue) - updated the issue summary myself in the end.

Committed 600da9f and pushed to 9.2.x. Also cherry-picked to 9.1.x. Thanks!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.