Problem
Oembed takes the first provider URL it finds and discards all others, however some providers provide multiple endpoints for different types of assets, or where assets are distributed across different hosts.
Solution
Loop over each endpoint and check for a matching URL.
Hello,
I've been studied the media core module, and I have found that it is not possible to support multiple websites with the same provider at the same time. I think it is really annoying that it supports only the centralized approach because the internet is supposed to be decentralized.
The name of the provider is fixed by the service that you use. I will take the example of my usecase to be easier to understand. So my usecase is the PeerTube, a decentralized video platform.
So oEmbed imposes us the provider name as we can see in those examples:
For Youtube:
https://www.youtube.com/oembed?url=https://www.youtube.com/watch?v=xHLes...
provider_name = Youtube
For PeerTube:
https://framatube.org/services/oembed?url=https%3A%2F%2Fframatube.org%2F...
provider_name = PeerTube
https://peertube.cpy.re/services/oembed?url=https%3A%2F%2Fpeertube.cpy.r...
provider_name = PeerTube
Consequently we have something like that in the ressource fetcher yml file pointed by the oembed_providers_url variable from media.settings
[
0 => {
provider_name: "PeerTube",
provider_url: "https://framatube.org",
endpoint: [
0 => {
url "https://framatube.org/services/oembed",
discovery true
}
]
},
1 => {
provider_name: "PeerTube",
provider_url: "https://peertube.cpy.re",
endpoint: [
0 => {
url "https://peertube.cpy.re/services/oembed",
discovery true
}
]
}
]
After that we arrive to the part that the media core module fails totally.
In Drupal\media\OEmbed\ProviderRepository
$keyed_providers = [];
foreach ($providers as $provider) {
try {
$name = (string) $provider['provider_name'];
$keyed_providers[$name] = new Provider($provider['provider_name'], $provider['provider_url'], $provider['endpoints']);
}
catch (ProviderException $e) {
// Just skip all the invalid providers.
// @todo Log the exception message to help with debugging.
}
}
If we read that code, it will iterate on every provider and fill the $keyed_providers variable. But of course, it will erase the data from the other websites with the same provider. After that, the core media module will be ok but only for the last website you have declared with the common provider.
I will try to work on that but this will probably be a big change on how the media module uses oEmbed.
For those who want to work on that, the core doesn't allow to override the oembed_providers_url setting yet. So set it manually or there is already an issue on drupal.org: #2999018 Expose oEmbed provider URL setting in the Media configuration form
I have a working module that supports only one website of Peertube because of this issue. If you want to debug this issue, it will give you a usecase to debug: PeerTube
Thank you to those who will help me to resolve this issue.
Comments
Comment #2
phjouComment #3
phjouOk I just did a patch that fix this issue. I think some tests will be broken since the provider URL was just a string before and it is now an array.
The provider attribute was not used so much in the code, so the patch is not so big for now.
Don't hesitate to tell me if something is broken with this patch.
Comment #4
phjouComment #5
phjouComment #6
phjouComment #7
phjouComment #8
phjouComment #9
phjouComment #10
Silicon.Valet CreditAttribution: Silicon.Valet at Red Hat commentedI'm helping to review this while at the core contribution workshop at Drupalcon Seattle
Comment #11
phenaproximaDiscussed this with @phjou at DrupalCon. I think, though I'm not certain, that the use case can be addressed by merging the endpoints from the two providers into a single provider, called PeerTube. The provider_url value is not really used by anything; it's just available to source plugins in case they need it. The endpoints are what actually define the availability and discoverability of oEmbed resources, and a single provider can define any number of those.
@phjou said he will try merging the endpoints together and will report back if it works.
Comment #12
phenaproxima@phjou and I walked through this and I think I now understand what the problem is. We were able to get his use case to work, but we did need to make some small modifications to the oEmbed system first.
The problem is that, in multiple places, the oEmbed system assumes that the first endpoint of any given provider is the one that should be used to generate the final resource URL. In a decentralized set-up like PeerTube, this assumption falls apart entirely.
So, our course of action here is that anywhere we do something like
reset($endpoints)
, we should be doing something more like this:This is technically a feature request, not a bug, so I'm adjusting the issue as such. I will also adjust the issue summary to more clearly explain what's going on.
Comment #13
phenaproximaComment #15
Sam152 CreditAttribution: Sam152 as a volunteer and at PreviousNext commentedI am experiencing the same problem. Facebook provides different endpoints for different schemes. The scheme for embedding videos is second on the list and is thus never called.
The schemes themselves are also patterns that can be on the same domain name, so I don't think the current approach is sufficient.
Comment #16
Sam152 CreditAttribution: Sam152 as a volunteer and at PreviousNext commentedThis should to do it. This is quite different to the previous patches, so if I've misunderstood the scope of this issue, I would be happy to file a new one instead.
Otherwise, let me know and I can update the issue summary with some concrete examples.
Comment #17
Sam152 CreditAttribution: Sam152 as a volunteer and at PreviousNext commentedHelps to upload the patch too 😅
Comment #19
phjouJust noticed that the patch we made at drupalcon is not there. The actual patch I did is a wrong aproach.
I will do it tomorrow. But we will need to write a test.
Comment #20
phjouSorry, I didn't had the time yesterday, but there is the patch. Working great with multiple endpoints with the dev version of the peertube module and should work for any provider giving multiple endpoints. Just attached the image of the format that the contrib module has to create in order to work with that patch.
@phenaproxima That's basically what we modified at DrupalCon.
Now we just need to write a test for this.
Comment #21
Sam152 CreditAttribution: Sam152 as a volunteer and at PreviousNext commentedHi @phjou, it looks like we both wrote very similar patches, with the same test failures. Any feedback on the test that has already been written? I think it makes sense to combine efforts at this stage.
Comment #22
phjouHi @Sam152 Yeah your patch is doing the same, but it looks better ;)
The question is that do we want to introduce a new function in the API? For me it would make sense since we are doing the same thing three times.
We just have to fix the two failing tests.
Comment #23
phjouJust changed a little bit your function to always return a provider and maybe fix the test that has no provider.
Comment #24
phjouOh I probably broke more tests because of the call to that function that we have in UrlResolver.php
Comment #25
phenaproximaI don't know if we should add another method to Provider. I think I'd rather have this logic in UrlResolver, to be honest.
That said, I think we might actually be able to reclassify this issue as a bug, not a feature request. The fact is, the oEmbed specification specifically says that a provider might provide multiple endpoints, and there is nothing I'm aware of that says each endpoint couldn't match different URL patterns. However, our oEmbed implementation utterly disregards this and assumes that the first endpoint is always the right one. From UrlResolver::getResourceUrl():
What makes it so certain that the first endpoint is the correct one? This should be doing something similar to what UrlResolver::getProviderByUrl() is doing:
So this, IMHO, buts up incredibly close to "bug" territory and should be considered a bug fix for 8.9.x and 9.0.x. If we're lucky, maybe committers would backport it to 8.8.x, too.
Comment #26
phjouOk so let's keep patch #20 with the Unit test from #17
drupal-oembed_multiple_endpoints_support_no_default-3008712-26.patch
I also created almost the same patch but that is keeping the reset to get the first element because it seems that when we are doing the loop, sometimes there is no match. In that case it will keep the first element as it was already working. Let's see if change something for the broken tests.
drupal-oembed_multiple_endpoints_support_no_default-3008712-26.patch
And I agree that it looks more than a bug than a feature request since it doesn't respect the oembed documentation.
Comment #27
phenaproximaThis may accidentally change the behavior of getResourceUrl(). With this, the last matching endpoint will win, not the first. To keep things as backwards-compatible as possible. I think we should move this to a utility function of UrlResolver (maybe a new public static method?) which returns the result of buildResourceUrl() on the first matching endpoint, not the last.
If we add that utility function, we can call it here to avoid repeating ourselves.
I think this will fail now, because we no longer have a getMatchingEndpointUrl() method. But if we have a new utility method of UrlResolver, we can add a test of that to UrlResolverTest.
Comment #29
s.abbott CreditAttribution: s.abbott commentedI believe this addresses the main issues from #27.
Comment #30
Chris Burge CreditAttribution: Chris Burge at University of Nebraska commentedComment #31
s.abbott CreditAttribution: s.abbott commented...or it would if the patch worked. Take 2, but with a less broken file this time!
Comment #33
msutharsComment #34
s.abbott CreditAttribution: s.abbott commented(Hopefully) fixed up the failing test. Should actually pass now.
Also worth noting that I didn't add any tests for the change to OEmbedResourceConstraintValidator, but there weren't any tests for it in the first place, and I didn't feel comfortable writing a full new test suite.
Comment #35
s.abbott CreditAttribution: s.abbott commentedComment #36
s.abbott CreditAttribution: s.abbott commentedAgain with the failing patches. I even checked first that time!
P.S. Sorry for any accidental spamming. This is my first real contribution and I'm still getting used to the process.
Comment #37
msutharsAdd utility function in UrlResolver to avoid repeating code.
Comment #38
msutharsComment #39
phenaproximaComment #40
phjouI think that this issue #3168301: oEmbed validator should use the urlResolver to get the resource URL that has been merged in core already contains part of the current issue
Comment #41
chr.fritschRerolled and made
getEndpointMatchingUrl
protected, since it's now only used in theUrlResolver
.Comment #43
edysmpThis is working as expected. Thanks!
Comment #46
catchOK this makes plenty of sense (although the patch was easier to follow than the issue) - updated the issue summary myself in the end.
Committed 600da9f and pushed to 9.2.x. Also cherry-picked to 9.1.x. Thanks!