Currently it is close, but not quite.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

c960657’s picture

Status: Active » Needs review
FileSize
10.96 KB

It takes quite a lot of code to get it right, but I think it is useful also outside the browser, so I put it in common.inc.

The code is based on the implementation I did in Net_URL2::resolve() for PEARs Net_URL2 package.

Perhaps we can combine _drupal_parse_url() and valid_url() at some point.

joshmiller’s picture

Status: Needs review » Needs work
+++ includes/browser.inc	6 Sep 2009 12:41:12 -0000
@@ -1065,30 +1065,19 @@ class BrowserPage {
    * @return
    *   An absolute path.

This function no longer returns a string. It returns an array. This is an API breaker... which is banned post code freeze. May need to return a string to keep API intact.

+++ includes/common.inc	6 Sep 2009 12:41:13 -0000
@@ -1371,18 +1371,180 @@ function valid_url($url, $absolute = FAL
+ * Contrary to PHP's parse_url(), this function supports relative URLs.

This word isn't right, perhaps "Relative URLs are supported in addition to PHP's parse_url() typical usage."

+++ includes/common.inc	6 Sep 2009 12:41:13 -0000
@@ -1371,18 +1371,180 @@ function valid_url($url, $absolute = FAL
+    // - "schema" and "authority" may not be empty strings.
+    // - "path" is always present, possibly as an empty string.
+    // - "query" may be an empty string (e.g. "http://example.com?") or NULL
+    //   (e.g. "http://example.com"). The same applies to "fragment".

This should be moved up into the comment head above the function.

+++ includes/common.inc	6 Sep 2009 12:41:13 -0000
@@ -1371,18 +1371,180 @@ function valid_url($url, $absolute = FAL
+  else {
+    return FALSE;
+  }

just "return FALSE;" -- the 'else' is implied and wastes space.

+++ includes/common.inc	6 Sep 2009 12:41:13 -0000
@@ -1371,18 +1371,180 @@ function valid_url($url, $absolute = FAL
+/**
+ * Remove dots as described in RFC 3986, section 5.2.4.
+ *
+ * For instance, "/foo/../bar/baz" is converted to "/bar/baz".
+ */

@return?
@params?

+++ includes/common.inc	6 Sep 2009 12:41:13 -0000
@@ -1371,18 +1371,180 @@ function valid_url($url, $absolute = FAL
+    if (substr($path, 0, 2) == './') {

Since we are dealing with user supplied input, all "==" should be "===" otherwise, we open ourselves up to some weird PHP false positives.

+++ includes/common.inc	6 Sep 2009 12:41:13 -0000
@@ -1371,18 +1371,180 @@ function valid_url($url, $absolute = FAL
+      // Step 2.A

I don't understand the "Step 2.A" etc... BTW you have two "Step 2.A"s

+++ includes/common.inc	6 Sep 2009 12:41:13 -0000
@@ -1371,18 +1371,180 @@ function valid_url($url, $absolute = FAL
+      if ($relative_parts['path'] == '') {

needs "==="

+++ includes/common.inc	6 Sep 2009 12:41:13 -0000
@@ -1371,18 +1371,180 @@ function valid_url($url, $absolute = FAL
+        if (substr($relative_parts['path'], 0, 1) == '/') {

needs "==="

+++ includes/common.inc	6 Sep 2009 12:41:13 -0000
@@ -1371,18 +1371,180 @@ function valid_url($url, $absolute = FAL
+          if ($base_parts['authority'] && $base_parts['path'] == '') {

needs "==="

I'm on crack. Are you, too?

hass’s picture

Interesting code...

c960657’s picture

Status: Needs work » Needs review
FileSize
15.49 KB

I upgraded _drupal_parse_url() to a public function, drupal_parse_url(), and made it return all keys returned by parse_url(), i.e. also "user", "pass", "host", "port" (note that "user" and "pass" are not mentioned in RFC 3986). I also added some tests and elaborated the Doxygen comment.

This function no longer returns a string. It returns an array.

Huh? getAbsoluteUrl() still returns a string AFAICT, so no API break here.

Since we are dealing with user supplied input, all "==" should be "===" otherwise, we open ourselves up to some weird PHP false positives.

The $base_parts and $relative_parts arrays are both returned by drupal_parse_url(), so the format is well known. In particular, drupal_parse_url() guarantees that $parts['path'] is always a string. Of course, using === doesn't hurt, but I think we usually just use ==.

I don't understand the "Step 2.A" etc... BTW you have two "Step 2.A"s

The implementation closely follows the recipe in the RFC, section 5.2.4. Step 2.A is split into two if-clauses and implements the following:

  A.  If the input buffer begins with a prefix of "../" or "./",
      then remove that prefix from the input buffer; otherwise,

People wanting to mess with this code really should have the RFC handy, so I haven't copied the actual text from there into the comments.

c960657’s picture

FileSize
15.44 KB

Removed the "authority" and "userinfo" parts from drupal_parse_url() to make it more like parse_url().

Status: Needs review » Needs work

The last submitted patch failed testing.

boombatower’s picture

Project: Drupal core » Browser
Version: 7.x-dev » 7.x-1.x-dev
Component: browser system » Code

Move to Browser module for further development.