Problem/Motivation

Steps to reproduce:

Add this to a custom module:

function mymodule_media_source_info_alter(&$sources) {
  $sources['audio_oembed'] = [
    'id' => 'audio_oembed',
    'label' => t('Remote audio'),
    'description' => t('Use audio from remote sources.'),
    'allowed_field_types' => ['string'],
    'default_thumbnail_filename' => 'no-thumbnail.png',
    'providers' => ['SoundCloud'],
    'class' => 'Drupal\media\Plugin\media\Source\OEmbed',
  ];
}

Set up a media type using the source.

When you try to save a new media item pointing to soundcloud, you get the following error:

"The provided URL does not represent a valid oEmbed resource."

This is because soundcloud specifies type float for their oembed version, and Symfony's xml parser returns an array for floats.

Proposed resolution

Patch attached which makes it work, but not sure what the best solution is here.

Remaining tasks

User interface changes

API changes

Data model changes

Comments

catch created an issue. See original summary.

catch’s picture

Status: Active » Needs review
StatusFileSize
new623 bytes
plach’s picture

Status: Needs review » Needs work
Issue tags: +Needs tests

Looks good and works well, but we need tests.

Version: 8.6.x-dev » 8.7.x-dev

Drupal 8.6.0-alpha1 will be released the week of July 16, 2018, which means new developments and disruptive changes should now be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

phenaproxima’s picture

Issue tags: +Media Initiative
phenaproxima’s picture

Status: Needs work » Needs review
Issue tags: -Needs tests
StatusFileSize
new593 bytes
new1.82 KB

It turns out that our existing tests were already perfectly poised to catch this bug. Here ya go!

catch’s picture

#6 looks good to me.

phenaproxima’s picture

Title: Core oembed support can't parse soundcloud XML » oEmbed system can't parse XML attributes

The last submitted patch, 6: 2982372-6-FAIL.patch, failed testing. View results

b_sharpe’s picture

Status: Needs review » Reviewed & tested by the community

Works great and tests fail pass as they should, RTBC!

larowlan’s picture

+++ b/core/modules/media/src/OEmbed/ResourceFetcher.php
@@ -194,4 +194,23 @@ protected function createResource(array $data, $url) {
+    // XML resources may have a 'type' attribute in the version tag. In this
+    // case, Symfony's XML parser will interpret the version tag as an array
+    // whose text content is contained in the '#' key.

can you point me to some docs about this?

phenaproxima’s picture

Unfortunately not; the behavior is undocumented as far as I can tell. Here's the offending line from Symfony's XmlEncoder::decode():


        foreach ($rootNode->attributes as $attrKey => $attr) {
            $data['@'.$attrKey] = $attr->nodeValue;
        }

// BEHOLD!
        $data['#'] = $rootNode->nodeValue;

        return $data;

The only condition this is hidden behind is that the text value of the node is returned directly if, and only if, there are no attributes. If there are attributes, this is what we get. Sucks that it's not documented, though!

alexpott’s picture

What about if type or some other attribute is added to the title element in the XML? Do we need a more generic thing to normalise $data?

alexpott’s picture

Status: Reviewed & tested by the community » Needs work

Step debugged through the Symfony code. The code that's responsible for this in our case is:

    /**
     * Parse the input DOMNode into an array or a string.
     *
     * @return array|string
     */
    private function parseXml(\DOMNode $node, array $context = array())
    {
        $data = $this->parseXmlAttributes($node, $context);

        $value = $this->parseXmlValue($node, $context);

        if (!\count($data)) {
            return $value;
        }

        if (!\is_array($value)) {
            $data['#'] = $value;

            return $data;
        }

As far as I can see there is no way around this. And if we continue to use XMLEncoder we're going to be fragile if any oembed provider adds an attribute to any element.

I think what we should do here is to commit a workaround here but not do it as we've done here. It should be an ugly hack that does not add a protected method so there is no API change at all and we should open a follow-up to harden this code or use a different way to get XML into an array.

alexpott’s picture

The hack could look something like this:

    if (strstr($format, 'text/xml') || strstr($format, 'application/xml')) {
      $encoder = new XmlEncoder();
      $data = $encoder->decode($content, 'xml');
      // @todo document...
      if (is_array($data['version'])) {
        $data['version'] = $data['version']['#'];
      }
    }

in \Drupal\media\OEmbed\ResourceFetcher::fetchResource()

Alternatively if we want to fix this reliably here we could do something like

    if (strstr($format, 'text/xml') || strstr($format, 'application/xml')) {
      // Convert XML to JSON for consistent array structures regardless of
      // XML attributes.
      $internalErrors = libxml_use_internal_errors(true);
      libxml_clear_errors();
      $content = simplexml_load_string($content, "SimpleXMLElement", LIBXML_NOCDATA);
      libxml_use_internal_errors($internalErrors);
      if ($error = libxml_get_last_error()) {
        libxml_clear_errors();
        throw new ResourceException($error->message, $url);
      }
      if ($content === FALSE) {
        throw new ResourceException('The fetched resource XML could not be parsed.', $url);
      }
      $data = Json::decode(json_encode($content));
    }

... well probably move that into a protected method in \Drupal\media\OEmbed\ResourceFetcher()

phenaproxima’s picture

I like the second approach better. This also explains why an older iteration of the oEmbed system did exactly that -- convert XML to JSON and then back again. I didn't know why it did that previously (there was no comment about it) so I removed it. Now it's bitten us in the butt, so I'll restore it in a protected method. Patch forthcoming today.

phenaproxima’s picture

Status: Needs work » Needs review
StatusFileSize
new3.13 KB

How's this look?

Status: Needs review » Needs work

The last submitted patch, 17: 2982372-17.patch, failed testing. View results

alexpott’s picture

  1. +++ b/core/modules/media/src/OEmbed/ResourceFetcher.php
    @@ -116,7 +114,7 @@ protected function createResource(array $data, $url) {
    -    if ($data['version'] !== '1.0') {
    +    if (!$this->verifyResourceVersion($data['version'])) {
    

    This will fail.

  2. +++ b/core/modules/media/src/OEmbed/ResourceFetcher.php
    @@ -194,4 +192,42 @@ protected function createResource(array $data, $url) {
    +    $data = json_encode($content);
    +    return Json::decode($data);
    

    This looks weird... either we do json_encode()/json_decode() or Json::encode()/Json::decode() but mising them feels awkward. Let's go with Json:: as that's what we're using for the json variant.

phenaproxima’s picture

Status: Needs work » Needs review
StatusFileSize
new3.11 KB
new972 bytes

Ugh, sorry. Didn't have enough of a caffeine jolt this morning. All fixed.

phenaproxima’s picture

StatusFileSize
new2.83 KB
new451 bytes

Hey look, I'm still a moron.

The last submitted patch, 20: 2982372-20.patch, failed testing. View results

catch’s picture

Status: Needs review » Reviewed & tested by the community

This looks a lot happier.

alexpott’s picture

Status: Reviewed & tested by the community » Fixed

Committed and pushed 6ca0cd27aa to 8.7.x and faa28e8e08 to 8.6.x. Thanks!

  • alexpott committed 6ca0cd2 on 8.7.x
    Issue #2982372 by phenaproxima, catch, alexpott: oEmbed system can't...

  • alexpott committed faa28e8 on 8.6.x
    Issue #2982372 by phenaproxima, catch, alexpott: oEmbed system can't...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.