At the moment, if the same prefix is defined twice by two different modules, the namespace is not a string but an array of namespaces, this is due to module_invoke_all() in drupal_get_rdf_namespaces(), which lead to such output in the head tag:

  xmlns:dc="Array"

- If 2 or more modules define the same namespaces for the same prefix, then there is no conflict, and the prefix should be assigned only one namespace

- If more than one namespace is assigned to the same prefix, there is a conflict and Drupal should do something about it. The contrib RDF module will have the right UI and will warn users in such case, but drupal_get_rdf_namespaces() should have a basic strategy in case this happens. 2 options:

  • drop the prefix altogether, which would prevent any inaccurate RDF data to be outputted. We simply cannot decide which prefix/namespace should take over the other one, and even if we knew, we would have to edit the module whose prefix is overriden and rename it to something else (again something for contrib). That means that we will need some kind of agreement across contrib to associate a prefix with its namespace, but I don't see a better solution here. (and I don't see contrib adopting hundreds of RDF namespaces any time soon anyways).
  • allow modules to override each other's mapping via the system table weight column, but IMO that's not a viable for the reason mentioned above. If a namespace is broken in a module, it should just be fixed in that module. It's the same reason why we don't have a hook for altering namespaces.

I prefer option 1. Note 1: some RDFa parsers and RDFa 1.1 is likely to have a set of popular prefixes hard-coded which would mitigate this solution, but modules which want to reuse prefixes like SIOC or FOAF for other purposes than what they are intended to are badly designed. Note 2: one might want to deal with multiple revisions of the same vocabulary, but I really think we should not deal with this complex case (a workaround is to use a different prefix for each version).

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

JeremyFrench’s picture

Is this a bug for 7 or a feature request? It is something which may need to be looked at in the future but as far as I can see it doesn't break anything right now.

scor’s picture

Title: rdf.module should deal with conflicting namespaces » drupal_get_rdf_namespaces() should deal with conflicting namespaces
Status: Active » Needs review
FileSize
5.46 KB

This is a bug: if 2 modules in contrib want to declare a prefix for a namespace, they will indeed work properly as long as they are not enabled on the same site. If they are installed on the same Drupal site, they will definitely run into the problem where the prefix they declared in hook_rdf_namespaces() will be outputted xmlns:bla="Array" which is not what we want.

This patch comes with tests and implements the logic detailed in #0.

This patch also fixes some minor typos in the function names of rdf.test.

Status: Needs review » Needs work
Issue tags: -RDF

The last submitted patch, 721082_rdf_ns_conflict_2.patch, failed testing.

scor’s picture

Status: Needs work » Needs review
Issue tags: +RDF

This locale test passes on my two dev machines.

#2: 721082_rdf_ns_conflict_2.patch queued for re-testing.

Anonymous’s picture

Status: Needs review » Needs work

As scor said, this is definitely a bug.

This solution does a good job of handling the namespace collision as long as neither namespace is redefined. However, once hook_rdf_namespaces is fixed, the bug would resurface, I believe.

For instance:

  • User installs module X (which has redefined foaf)
  • Drupal saves x_rdf_mapping to the database, including foaf:CrazyNewClass
  • Currently, all is fine because foaf is defined twice and the prefix has been dropped.

  • Someone (either user or module creator) fixes hook_rdf_namespaces and foaf is no longer defined twice
  • Because foaf:CrazyNewClass is defined in the mapping for the bundle, we have triples that assert typeof http://xmlns.com/foaf/0.1/CrazyNewClass.
  • All of the sudden, we have inaccurate statements.

We can fix this by running all of the curies through a function at the time of module installation to ensure that their prefix has only been defined once. If it has been defined more than once, or not defined at all, don't save the term in the mapping.

I have a first crack at this solution mostly finished, will post a patch shortly.

scor’s picture

We can fix this by running all of the curies through a function at the time of module installation to ensure that their prefix has only been defined once.

I think this would qualify as a separate issue for hardening. as far as I understand this would live outside hook_rdf_namespaces() drupal_get_rdf_namespaces(), right?

Anonymous’s picture

The way I implemented this, it does change the way hook_rdf_namespaces is invoked slightly. I have created a function in the RDF module that actually invokes the hook and merges the duplicates. drupal_get_rdf_namespaces then calls this function instead of invoking the hooks directly.

I did this because I need a list of the RDF namespaces as an array, so as to not repeat the functionality, I moved it to a separate function.

Anonymous’s picture

I am posting the patch here for discussion, we can move to a separate issue if need be.

We would need to change hook_rdf_mapping implementations slightly. First, the function needs to invoke any hook_rdf_namespaces that defines the namespace prefixes it wants to use. Then it needs to call rdf_get_curie for the terms, instead of setting the curie as a string value.

I included the revised hook_rdf_mapping for node as an example.

scor’s picture

I feel this is patch adds a lot of complexity given the stage we're at in the Drupal 7 development cycle. This patch would definitely deserve it own issue, and I'll provide a review there.

Anonymous’s picture

I have created #736210: RDF Mapping API mints fake terms when namespaces collide. I feel that this secondary bug is potentially harmful to the use of RDF and that it is deserving of an API change, but as scor points out, it is unlikely to happen very often.

scor’s picture

Status: Needs work » Needs review
FileSize
4.43 KB

rerolled patch from #2.

Anonymous’s picture

Even though we probably can't change the handling of curies at this point, I still think it might be good to do the invocation of the hook from a new function in rdf. We can then call that from drupal_get_rdf_namespaces. This approach allows us to reuse the functionality. We might want to come up with a better name for the function than the one I used, though.

145 critical left. Go review some!

+++ includes/common.inc	8 Mar 2010 16:48:17 -0000
@@ -256,8 +256,12 @@ function drupal_get_breadcrumb() {
+  foreach (rdf_get_rdf_namespaces_array() as $prefix => $uri) {
+    // Do not include prefixes that are defined for more than one URI.
+    // Otherwise we may make inaccurate statements in RDF.
+    if (!is_array($uri)) {
+      $xml_rdf_namespaces[] = 'xmlns:' . $prefix . '="' . $uri . '"';
+    }

The change in drupal_get_rdf_namespaces

+++ modules/rdf/rdf.module	8 Mar 2010 16:48:17 -0000
@@ -799,3 +799,57 @@ function theme_rdf_metadata($variables) 
+function rdf_get_rdf_namespaces_array() {
+  $ns = array();
+
+  foreach (module_invoke_all('rdf_namespaces') as $prefix => $uri) {
+    if (is_array($uri)) {
+      if (count(array_unique($uri)) == 1) {
+        // All namespaces declared for this prefix are the same, merge all
+        // duplicates.
+       $uri = $uri[0];
+      }
+    }
+    $ns[$prefix] = $uri;
+  }
+  return $ns;
+}

The new function.

Powered by Dreditor.

scor’s picture

Status: Needs review » Needs work

I agree. Again, this is some cruft left from the very first RDF patch landed when there was no rdf.module to handle the rdf logic. At that time, hook_rdf_namespaces() was invoked in common.inc and system.module was implementing this hook. Now that hook_rdf_namespaces() has been moved to rdf.module with #720610: Move the core RDF namespace definitions in the rdf module, there is no need to keep the invocation of hook_rdf_namespaces() in common.inc and it should be moved into rdf.module.

The other drawback of the existing drupal_get_rdf_namespaces() is that it's only able to return a serialized list of namespaces, only useful when outputing xml. That's not flexible if we want to include these namespaces in other formats like JSON, RSS, turtle, etc. rdf.module should return an array of namespaces which common.inc serializes for its own needs (in this case, include them in the HTML output).

Given that this patch adds some logic for dealing with conflicting namespaces, I really think this logic should live in rdf.module, just another argument for moving as much code as we can from common.inc to rdf.module.

Anonymous’s picture

It makes sense to move all the logic to RDF.

I think we need two different options for the array result. One needs to simply merge duplicate namespace definitions, but leave arrays for prefixes that have two or more unique namespaces assigned—like I used in rdf_get_curie. The other needs to totally flatten the array (and possibly provide warnings that namespaces need to be redefined).

We could do this either by passing the namespaces through two functions like I did in the patch, or by passing in a flag to one function.

scor’s picture

Status: Needs work » Needs review
FileSize
4.91 KB

I've moved the invocation of hook_rdf_namespaces into rdf.module. No API changes from drupal_get_rdf_namespaces()'s point of view since it still returns the exact same xmlns serialization, just that instead of doing all the logic, it calls the new rdf_get_rdf_namespaces() from rdf.module. The advantage is that modules willing to use the RDF namespaces in other formats can get an array or prefix, uri values from rdf_get_rdf_namespaces(). I'm not convinced we should overload rdf_get_rdf_namespaces() with an extra flag to get the raw (potentially conflicting) namespaces, as, really, any module willing take on that job can simply get it by calling module_invoke_all('rdf_namespaces').

Status: Needs review » Needs work

The last submitted patch, 721082_rdf_ns_conflict_15.patch, failed testing.

scor’s picture

Title: drupal_get_rdf_namespaces() should deal with conflicting namespaces » Move hook_rdf_namespaces() invocation into rdf.module and prevent conflicting namespaces
Status: Needs work » Needs review
FileSize
6.14 KB

Fixed rdf_get_rdf_namespaces() which had remaining cruft form when it was part of common.inc, and improved its documentation. This patch new comes with 2 separate tests, one for rdf_get_rdf_namespaces() and another for drupal_get_rdf_namespaces() which ensure the namespaces are parsed correctly in the XHTML document.

Anonymous’s picture

Status: Needs review » Reviewed & tested by the community

As scor says in #5:

No API changes from drupal_get_rdf_namespaces()'s point of view since it still returns the exact same xmlns serialization, just that instead of doing all the logic, it calls the new rdf_get_rdf_namespaces() from rdf.module.

There is just a bit of logic added to rdf_get_rdf_namespaces to make sure that we don't get xmlns:prefix="Array" in the namespace declarations.

Tests check for all of the different cases of namespace/prefix overlap.

RTBC.

scor’s picture

Title: Move hook_rdf_namespaces() invocation into rdf.module and prevent conflicting namespaces » Prevent conflicting namespaces and move hook_rdf_namespaces() invocation into rdf.module
FileSize
6.14 KB

reroll to remove offsets and better title. Besides fixing an annoying bug, this patch also adds a test to ensure the XML serialization of the RDF namespaces parses correctly, something we don't have yet in core.

Dries’s picture

Status: Reviewed & tested by the community » Needs review
+++ includes/common.inc
@@ -254,9 +254,8 @@ function drupal_get_breadcrumb() {
 function drupal_get_rdf_namespaces() {

Mmm, wouldn't that fail when rdf.module is not enabled?

+++ modules/rdf/rdf.module
@@ -95,6 +95,32 @@ function rdf_rdf_namespaces() {
+function rdf_get_rdf_namespaces() {

It feels like we have one 'rdf' too many in the function declaration. Why not rdf_get_namespaces()?

scor’s picture

@Dries: thanks for the review. fixed the rdf.module dependency in drupal_get_rdf_namespaces() and renamed rdf_get_rdf_namespaces() to rdf_get_namespaces().

Anonymous’s picture

Status: Needs review » Reviewed & tested by the community

Since the dependency was fixed, I think this can go back to RTBC.

webchick’s picture

Status: Reviewed & tested by the community » Fixed

Looks like all of Dries's concerns were addressed. scor and I also talked about this earlier, and it comes with lovely tests. :)

Committed to HEAD with a couple of PHPDoc additions.

Status: Fixed » Closed (fixed)
Issue tags: -RDF

Automatically closed -- issue fixed for 2 weeks with no activity.