Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
We have a wonderful browser in SimpleTest, so why shouldn't people be able to use it in other places?
Here's a patch which makes it possible. The patch is kind of big because it replaces every call to $this->drupalGet to $this->browser->get and $this->drupalPost to $this->browser->post
Comment | File | Size | Author |
---|---|---|---|
#96 | 340283-browser.patch | 38.47 KB | boombatower |
#92 | 340283-browser.patch | 38.64 KB | boombatower |
#89 | 340283-browser.patch | 38.52 KB | boombatower |
#88 | 340283-browser.patch | 38.6 KB | boombatower |
#70 | 340283-browser.patch | 37.11 KB | boombatower |
Comments
Comment #1
keith.smith CreditAttribution: keith.smith commentedComment #3
dmitrig01 CreditAttribution: dmitrig01 commentedshould apply
Comment #4
dmitrig01 CreditAttribution: dmitrig01 commentedoops, forgot browser.inc
Comment #5
Crell CreditAttribution: Crell commentedSubscribing. No time to really review right now. Glancing at the code quickly my first reaction is that most of the properties of DrupalBrowser should be protected, not public, and have proper accessors.
+1 on the concept, though. Potentially couldn't this even replace drupal_http_request?
Comment #6
boombatower CreditAttribution: boombatower commented+1 on concept, postponing #330504: Create auto-install script until completed.
I'll review thoroughly later.
Comment #7
boombatower CreditAttribution: boombatower commented#3 & #4 seem to indicate you would like it reviewed.
Comment #8
boombatower CreditAttribution: boombatower commented/cvs/drupal/drupal/ would appear to be an issue, not to mention do patches add to CHANGELOG.txt? I thought webchick or Dries did that, maybe I'm mistaken.
Comment #10
catchPatches do add to CHANGELOG.txt - although often at the last minute when requested. This seems like enough of a new feature it'd warrant a mention though.
Comment #11
Damien Tournoud CreditAttribution: Damien Tournoud commentedI like the idea of having a better HTTP client / browser available, but this one relies on cURL, so I don't think it can fly as-is.
Comment #12
boombatower CreditAttribution: boombatower commented@catch: thanks for the clarification.
@Damien: I agree having a generic HTTP client would be nice (as I have uses for it outside of SimpleTest), but what do you mean by it relies on cURL so that is no go?
Comment #13
Damien Tournoud CreditAttribution: Damien Tournoud commented@boombatower: I mean that Drupal 7 currently doesn't require the cURL extension to be available. Do we really want to make that a requirement? Because replacing drupal_http_request() by something cleaner, faster and more powerful would be really nice.
Comment #14
catchcURL isn't available on all systems, and I don't think we should up requirements for Drupal as a whole just for the browser.
To me this means having the browser abstracted with the same requirements ought to be fine - maybe even aggregator module could require it with a pluggable browser in contrib, but not core core stuff.
Comment #15
boombatower CreditAttribution: boombatower commentedI really like the idea personally and cURL doesn't seem like an out of the question library does?
Comment #16
boombatower CreditAttribution: boombatower commentedAnother idea given in IRC would be to make a core browser.module, but I personal think it should replace the drupal_http_request functionality.
Note: also requires simplexml.
Comment #17
boombatower CreditAttribution: boombatower commentedModules that currently use drupal_http_request() in core:
They all have their own parsing code and such which could be much simplified/reduced with DrupalBrowser and would keep all the code in one place.
Summary (requirements): curl, simplexml for PHP.
Question: Should core use DrupalBrowser?
Definitely refactor the browser so that it is not only usable by SimpleTest, but by other developers. Perhaps browser.module. Allow it to function with drupal_http_request so that the parsing/traversing capabilities can be used.
Comment #18
Dave ReidI had a plan for a #64866: Pluggable architecture for drupal_http_request(). This issue and that one could probably work together. The abstract browser could be used as one of the 'pluggable' methods for drupal_http_request.
Comment #19
boombatower CreditAttribution: boombatower commentedSo looks like we need pluggable backends for drupal_http_request: one for curl and one with fsockets (then a setting to use one or the other depending on the host). The DrupalBrowser will always use curl and will do more, but core will only use it to parse with simplexml stuff.
Comment #20
chx CreditAttribution: chx commentedThis is another of those pluggable frameworks we struggle with. If this is an interface (and I think it is) then the methods we need are get and post, we need a property which shows whether this browser is capable of keeping a session (curl can as it knows about cookies). We should ditch the socket based implementation and instead use the HTTP stream wrapper. Basic iimplementation is
file_get_contents($url)
and the meta data is going to be in$GLOBALS['http_response_header']
which we can copy into our object. curl implementation, in simpletest directory, uses curl as we have it.Comment #21
chx CreditAttribution: chx commentedOh and parsing is totally different from this browser.
Comment #22
boombatower CreditAttribution: boombatower commentedAlright, I'll start working on this.
Comment #23
boombatower CreditAttribution: boombatower commentedOne thing we need to make sure of is that the browser works fine with url's other than drupal ones, currently it only does partially.
Comment #24
boombatower CreditAttribution: boombatower commentedLocal code is coming along, but I'll hold of with patch for a bit.
Comment #25
boombatower CreditAttribution: boombatower commentedHere is a very early patch, just to get an idea of my direction and see what other think.
DamZ noted that I should look at: http://pear.php.net/package/HTTP_Request2. Any opinions: should it be abstracted as much as http_request2 lib or more like what I have?
Comment #26
boombatower CreditAttribution: boombatower commentedMore functional browser (minimal POST supported), API refactoring.
Direction looked at by chx.
Comment #27
boombatower CreditAttribution: boombatower commentedNeeds:
1) A bit more documentation and filling out.
2) Support for uploads
3) Socket backend
4) Setting in admin page
5) Some of the items in the TODO list, included in patch
This will due a post request correct now. The browser is designed to correctly interpret urls and actions for non-drupal paths as well (even interprets base HTML element).
Bed time, 4:17 am here.
Comment #28
boombatower CreditAttribution: boombatower commentedThis took a bunch of tinkering to get POST to work with wrapper as I tried to use http_build_query() to encode fields, but later found out when I tried to replace the code in the curl backend that it failed there to. Issues was it uses & by default and I need an actual &. Just changed to
http_build_query($fields, NULL, '&')
It would be possible to support cookies in stream backend, but we can worry about that later. All other items still apply.
Comment #29
boombatower CreditAttribution: boombatower commentedSupports proper loading of wrappers and multiple (different) wrapper browsers can be open at the same time.
Not sure where to put admin setting, but code runs off
browser_wrapper
currently.Turned the wrapper backend info() method into a hook and implemented in system module.
Comment #30
boombatower CreditAttribution: boombatower commentedThis patch starts the integration with DrupalWebTestCase, which will prove that code works and tell me what else I need to add.
The included, temporary, browser.test proves that it works with the basis DrupalWebTestCase.
I have started cleaning things out, very slowly, so don't review the DrupalWebTestCase changes for code style and such, just bugs. The block test currently gets 24 passes, 18 fails, and 0 exceptions. It does several posts before the db craps and I get:
Of course being SimpleTest it found a number of bugs in my code during the implementation. :)
Comment #31
boombatower CreditAttribution: boombatower commentedThis still needs a lot of work, but I'd like to see how many tests fail with this in place so far.
Comment #33
boombatower CreditAttribution: boombatower commentedIn order to keep the patch size down I have remove the SimpleTest integration and will just focus on browser and tests. Once is at a good point then we can commit and then work on a followup patch to replace SimpleTest browser.
Comment #35
boombatower CreditAttribution: boombatower commentedInteresting reasons for why it was passing locally...anyhow. The reason it failed on test box is that the browser was accessing the host drupal site and not the prefixed database since it never set the prefix in the test (as DrupalWebTestCase does that).
So added and walah.
Comment #37
boombatower CreditAttribution: boombatower commentedI get the feeling something related to curl craps.
Removed curl form test to see.
Comment #39
boombatower CreditAttribution: boombatower commentedI expect this to still fail on test client (I need to investigate that), but it passes locally. Added a TON of documentation.
I still need to add a bit more documentation, clean up some of the API, and provide hooks, but it is very far along.
Comment #41
boombatower CreditAttribution: boombatower commentedThis is much cleaner then the browser available through DrupalWebTestCase. It still needs some additional functionality before it can replace it, but I would like to get some reviews so we can move this closer to getting in.
My ideal plan would be to commit this very close to as it is now. Make a few more clean up patches and get it up to speed with DrupalWebTestCase, then work on integrating with DrupalWebTestCase and replacing the current code.
Then I want to either replace drupal_http_request or make it a wrapper that calls this browser.
After that we can start using the parsing/printing capabilities to clean up areas of core per #17.
Comment #42
boombatower CreditAttribution: boombatower commentedAdditional documentation, proper browser implementation for generic request($method) [and docs], support for HTTP header redirect and meta refresh tag [and tests].
Comment #43
boombatower CreditAttribution: boombatower commentedOutstanding todos:
* Hooks for get, post events. (optional)
* HTTP authentication (easy followup patch)
* Deal with Drupal-Assertion callback stuff. (decide on api, followup)
* Error handling. (decide how it should work)
* Deal a from containing one button without a name, even no value? (followup patch)
* Page support proxy behavior. (cool, followup patch)
Error handling within the Browser class is the main thing left.
Comment #44
boombatower CreditAttribution: boombatower commentedError handling should be fine.
Cleaned up documentation and code a bit more. This is looking very clean for an initial patch. I don't want to add any more so that the patch stays at a reasonable size.
All tests pass locally, I'll need to investigate what testing client doesn't have setup for this. Any additional tests will be obvious when merged with SimpleTest since all the tests in core will use the browser.
Comment #45
boombatower CreditAttribution: boombatower commentedCurious what bot reports.
Comment #47
boombatower CreditAttribution: boombatower commentedSpent some time debugging the issue and figure out it happens when clean_urls are disabled. Not an issue with test client configuration.
Code was not being triggered...and had a bug.
to
and browser_test.module was flawed
to
Comment #49
boombatower CreditAttribution: boombatower commentedI was messin with the server before to figure out issue and had a left over.
New patch to ensure dry run.
Comment #51
boombatower CreditAttribution: boombatower commentedSomehow I completely lost all my recent changes....several hours worth.....arrrrrrrrrrrrrg. Apparently I never rolled the patch.
Had added a bunch of documentation....cleaned a number of things.............very very annoying.
Comment #52
boombatower CreditAttribution: boombatower commentedI don't think this copy will every be as perfect as the other...but it will have to suffice.
Comment #53
Dries CreditAttribution: Dries commentedIs this meant to deprecate drupal_http_request()? That is, should the xml-rpc backend, the aggregator module, etc all leverage this new browser system? If we're going through the effort of abstracting things out, it is good to know what we hope to accomplish with it (in core).
Comment #54
tstoecklerfrom #41 by boombatower:
Comment #55
Dries CreditAttribution: Dries commentedI missed that. That's good news. It took several years to get drupal_http_request() to work properly -- lots of edge cases with headers, timeout handling, redirect loops, etc. Fortunately, we have some tests for it. Either way, the sooner we can do that mapping, the more confident I'd be in the re-usability and robustness of this new "browser" component.
Comment #56
boombatower CreditAttribution: boombatower commentedAlso from #17:
Some systems do XML parsing and writing which could be handled via the browser's BrowserPage component that uses SimpleXML. Thus writing XML is as easy as $element->asXML();
AS explained through the comments...I would like to get the browser in without replacing anything so that the patch stays small (get big fast). Then I will go through and start replacing all areas that either do XML related operators, drupal_http_request(), and SimpleTest's DrupalWebTestCase itself. There are a couple of additions needed to replace DrupalWebTestCase, but this code is very close and should provide a much cleaner base.
Comment #57
Dries CreditAttribution: Dries commentedOK with me, I'm sure we can make the mapping work.
Comment #59
boombatower CreditAttribution: boombatower commentedTest slave crapped.
Comment #60
Owen Barton CreditAttribution: Owen Barton commentedThis looks good to me. Is there anything in particular we need reviewed, other than basic code style and layout? It would be great if chx could review and check that his architecture thoughts have been sufficiently covered.
So the follow up patches would be to:
(a) Replace the simpletest browser and appropriate XML calls with calls to this
(b) switch drupal_http_request to use this as the backend (a simple wrapper for simple use cases), and test the various modules that use this function
(c) anything else?
I was also thinking that there are probably some useful libraries of http test cases that we could draw from, to make sure we have all the obscure edge cases covered. For example, I assume perhaps the simpletest project has some self tests we could refactor to point at the new browser. Also the wget, curl and lftp projects would likely have some nice tests we could use for inspiration.
It would also be great to make sure the hooks are capable of allowing a contrib to provide transparent optional caching of http requests (assuming the browser isn't doing this already).
Comment #62
boombatower CreditAttribution: boombatower commentedIt doesn't currently provide caching hooks, but we can add that in followup patch.
My plan in terms of tests was to use the SimpleTest tests and write more as we integrate it into SimpleTest itself. The basic tests included prove that it functions and most of the features work.
Comment #63
boombatower CreditAttribution: boombatower commentedUpdated for system.module not applying.
Also, in terms of review...yea just basic code review and it would be nice to get an "ok" from chx.
Comment #64
chx CreditAttribution: chx commentedThere are a number of things I dislike, mostly the separation of http wrapper and browser. There should be a browser class that the various implementations can extend. Keep some methods abstract as necessary. Then have a factory (which i called get in cached and system queue they might need renaming) which returns a browser object. Use the preferred-class-stored-in-variable way as seen in cache or system queue. And then you can use DBTNG or cache style one line wrappers around the code if it gets too ugly.
Comment #65
boombatower CreditAttribution: boombatower commentedWrapper architecture
I am not sure I agree about the http wrapper and browser architecture both logically and in terms of implementation.
The browser is the highest level, it controls everything and runs the show. As such it doesn't make sense that it should be extended by lower level functionality.
Now when I go to use the browser I end up with a variety of classes that all extend Browser. That doesn't seem to make sense logically.
Even to look at DBTNG:
Mysql is a type of connection and thus extends a DatabaseConnection (generic). In the same way:
It is an interface since no methods or handling are required.
HttpWrapper is not a type of browser and thus shouldn't extend it.
Browser as class
There are two reasons the browser is a class: 1) it stores a lot of context that would require a bunch of ugly statics or globals if written as functions, 2) it followed the DrupalWebTestCase standard which was easy to use and clean.
After thinking about it the reasons for making the browser a set of functions vs a class is somewhat 50/50 it could be done each way...although it would be ugly if done as functions due to all the context that would need to be retrieved at the start of each function. Since both implementations of the browser may be in use at the same time, core using default, contrib forcing other/custom, then the functions would need to also have to specify the type of backend to use in each call...very ugly vs as an object where you just use the factory to get the type you want and use the object reference over and over without having to specify type over and over.
vs
Also since the BrowserPage class only makes sense as a class it seems to fit that the browser is a well for the reasons above and consistency.
Not to rush this, but this patch is holding up a few others...and if to be fully implemented before code freeze...needs to get moving.
Comment #66
boombatower CreditAttribution: boombatower commentedAfter a discussion on skype earlier today with chx we decided on an architecture which requires a bit of refactoring.
Since it is extremely unlikely that anyway will ever want a wrapper backend other than curl or PHP stream we decided to remove the abstraction layer and simply merge it into the base browser code.
The browser will be turned into a set of functions with static variables instead of a class and will not longer accept a wrapper backend parameter. Instead the browser will default to curl if available, otherwise it will use the PHP stream wrapper.
The BrowserPage class will remain since it is a good candidate for a class. It is build on SimpleXML which is an object oriented architecture and since we would like to be able to have multiple pages loaded into memory we need an easy way to keep track of state information with the convenience methods it provides.
For the majority of use cases, a simple GET request, the following code will be required.
I have included a very early version of the refactoring with a function browser_get() and basic architecture.
Comment #67
boombatower CreditAttribution: boombatower commentedWoops.
Comment #68
Dries CreditAttribution: Dries commented- Documentation at the top of the file still mentions 'class'. It needs to be updated.
- _browser_header_parse_all doesn't seem to be used.
- browser_header_name() wasn't a self-explanatory name.
- In browser_get(), $state isn't used.
- Why is browser_header_name() not a private function?
- Why is browser_init() not a private function?
- This doesn't look right:
- I don't understand browser_state_set(). It's not self-explanatory.
Comment #69
Dries CreditAttribution: Dries commentedOther than the small code comments, I wanted to add that the design is a bit odd. The use of drupal_static() felt abusive and really begs for a simple class-based design.
Comment #70
boombatower CreditAttribution: boombatower commentedThis is a good example of the new power the browser provides. The one used by DrupalWebTestCase cannot work on sites other than the local drupal install (at least not well, one reason why it only works if site is in root or up one..and clean urls are on.) due to its weak base URL handling (relies on url()). The new browser interprets the base URL according to spec.
The api.drupal.org site provides a great example of the browser working through its paces.
With some debug code added we can see what the browser did to get the end result page (same as you see in your browser).
1) The first request is the initial request to view form definition (as you would hit home page). (GET)
2) The form submission with default values and the search parameter. (POST)
3) The server redirects to a search URL. (GET)
4) After finding the function it redirects to final location. (GET)
I'll update the remaining documentation after dinner.
Comment #71
Damien Tournoud CreditAttribution: Damien Tournoud commentedI agree with Dries on the need for a "simple class-based design" here.
This is nothing more then an HTTP client, so the "Browser" name sounds far-fetched.
API-wise, here is what I would love to see:
Comment #72
boombatower CreditAttribution: boombatower commentedAs for #71...that is exactly what it did before.
Considering all the form handling stuff I'd say its a bit more than an HTTP client.
Either way...arg on the change in architecture again.
To clarify #67 was for chx to look at and was not done. The main documentation block has, still, not yet been updated. (dinner)
Comment #73
Crell CreditAttribution: Crell commentedI'm with Dries and Damien here. Whether or not it's technically pluggable or swappable, this use case begs to be a robust class/object. I don't know why there was a consideration of it not being one. It's a tight coupling between complex functionality and relevant stateful data. That's an object.
Comment #74
boombatower CreditAttribution: boombatower commentedI think that removing the plugabble architecture is fine...and if we decide to go back to class (I am fine with) then won't be complete loss since I already refactored the code to remove the interface.
Comment #75
chx CreditAttribution: chx commentedI do not want to hold this issue further. We are fighting on every corner whenever OOP gets involved. We are using drupal_static instead of writing a set/get pair of function. In my view, OOP is evil -- while sometimes necessary there should be a reason to use it. There are two reasons AFAIK, one is building on an already OOP system like PDO or SimpleXML the other is pluggability where an interface, a base class and subsequent classes help. The browser itself has neither. The page is SimpleXML based. Also, using this is WAY simpler than OOP could be. But do what you want. I am tired of fighting this for two years now. I know that sun was complaining not even knowing what a method or an interface is but sure go ahead and undo Drupal. You guys, who were raised on Java, learned Java, work with Java will never ever understand that people do not get this OOP crap. Having DBTNG is bad enough. Let's use OOP when necessary and never otherwise.
But again: I am so out of this issue. I find it funny that since December noone poke in this issue but once I tried to make something more Drupalish, immediately there is the usual for-OOP, loud crowd and the against-OOP much huger crowd is nowhere and I am so tired of trying to speak for them. But do whatever you want, just get this in.
Comment #76
kbahey CreditAttribution: kbahey commented@Damien & Crell
I agree that it is an HTTP Client, and that drupal_http_request() can benefit from that.
But I am trying to see what does OOPifying this buys us in real concrete terms (other than being OOP)? Why not just a keyed array as arguments, and get it over with?
Comment #77
chx CreditAttribution: chx commentedWell, let's pit codes against each other:
vs DamZ:
we can stop here. It's absolutely a matter of taste whether you pass around $browser before or after the function name:
$browser = browser_init(); browser_post($browser, ...)
vs $browser = new browser; $browser->post(...). I have merely removed the need for the $browser variable to be held outside of the browser routines and moved it into a static -- once again, in the past we have used a get/set function pair to do the same, we do not need any more.The OOP bloat, however, is shining immediately in Damien's example
$headers = $http_client->getHeaders();
this is absolutely typical that even for the simplest cases you want to add a getter. Don't. Keep it simple.So to sum up: being OOP or not being OOP makes little difference API-wise however OOP is alien to many people and also it leads to bloat. Always. I do not want to point out other issues where I needed to fight hard to get OOP-induced bloat thrown out but it happens.
Comment #78
Crell CreditAttribution: Crell commentedchx, I'm not going to get into another debate with you about the finer points of OO vs. procedural. There's plenty of time for that in Paris where I have two sessions submitted on the subject. :-) You know that I have also gone on record repeatedly opposing OOP where it has no value or is harmful, so I hope you don't think I am pushing OO everywhere just for the sake of "being Java-ish".
I am only going to respond to one of your points for now: That there is some huge majority of PHPers in the world that are incapable of ever understanding "evil OOP" because it's just so hard for them. To be blunt, I call bull.
There are, in fact, hundreds of thousands of developers in the world well-versed in Classic-OOP (C++, Java, PHP, C#, etc., which is different than the Javascript or Perl or Ruby approaches). The professional and semi-professional PHP world has long since moved to mostly OO, even farther than I think is healthy for PHP code. There is a huge sea of developers who get shivers when they *don't* have classes everywhere, and to them Drupal's hooks and drupal_static() and such are completely incomprehensible. Perhaps if that great majority of OO-fearful developers doesn't speak up more, it's because there aren't as many of them as you imagine?
How about giving Drupal developers a little credit for being able to handle a method call? We all learned the Drupal-specific weirdness that is FAPI and the crazy indirection of hooks. A language construct used by hundreds of thousands of people around the world is not our mortal enemy. It's just a tool, like anything else.
Comment #79
chx CreditAttribution: chx commentedWell be it as I said. Do OOP just do not bloat it. But then you guys will review what's forthcoming 'cos I won't.
Comment #80
Owen Barton CreditAttribution: Owen Barton commentedI think the OOP/no-OOP discussion is not a big deal in this case. Developers just want to be able to submit http requests in an easy to use fashion. I honestly think most developers are happy with a "function params/access array", or with a "create object/access methods/parameters" pattern. If they don't know the former (perhaps they never used any randomly downloaded php lib before using Drupal?) they will pick it up pretty quick as soon as they do any DB work.
The situation is different, and I agree needs more care/thought when we are expecting developers to extend classes or implement interfaces themselves to be able to use a system. So far we have not come up with any use cases here though, and so the question is simply - which approach leads to a simpler and more elegant internal structure/code for the browser.
Comment #81
Dries CreditAttribution: Dries commentedDecision: let's go with a _simple_ OO based design.
Comment #82
Damien Tournoud CreditAttribution: Damien Tournoud commentedLet's redo this. I'm not sure why I put getters in the example above. Probably to map with Simpletest drupalGetContent(), but it isn't really useful here.
Comment #83
boombatower CreditAttribution: boombatower commentedI'll go ahead and map the function based code (with the merged backends) _back_ to _simple_ OOP style code. I will not make any further changes.
I have lots of things I want to add, and areas of core to clean up with this. Those are for followup patches just as any additional thoughts to changing the browser right now.
I still don't want to call this just an http_client. We have functions like clickLink() etc that will be added back...doesn't seem like just an http client. Not to mention that is actually understands what forms are in the sense that you can "click" a button instead of just submitting it as a post var.
I dono...thin line there.
Comment #84
Crell CreditAttribution: Crell commentedboombatower: Actually that sounds like a perfect case for subclassing. The base class is just an http client that makes requests and holds the response data for you to do stuff with. A subclass of that provides browser emulation with clickLink(), form handling, and so forth. That way, people just doing JSON or ReST requests can use the simple object without the extra overhead of the fancier web-browser-in-a-box.
Comment #85
boombatower CreditAttribution: boombatower commented@Crell: please look at #63...and see if that is more what you are looking for. I can return it to that, but remove curl/stream separation..or keep it.
Comment #87
boombatower CreditAttribution: boombatower commentedI am going to start from #63 and remove the wrapper interface stuff. No one has responded in a while...so please "forever hold your piece".
Comment #88
boombatower CreditAttribution: boombatower commentedNothing new, essentially turned the #63 OO implementation into the #70 backendless implementation. Based on the comment I am not sure anyone look at that code...since it is exactly what was being requested. I just merged the backend interfaces into one class since chx and I agreed that there really is not other use case.
There are still things that can be added and a few patches that have gotten into the SimpleTest browser that are not in this one, but those are great followups. This browser provides the essentials and does it well. Cleaned up version of DrupalWebTestCase browser, although I know it can be cleaned up further.
I would like to get some reviews and then gets this in if possible so I can start on followups.
Comment #89
boombatower CreditAttribution: boombatower commentedQuick, code clean.
EDIT: This is holding up a number of other patches so if we could prioritize this that would be great.
Comment #90
c960657 CreditAttribution: c960657 commentedI suggest renaming the array key to
headers
for consistancy.The code allows $url to be false - the documentation should reflect that.
This method not only does a POST request but also does some forms handling. A raw post() method is required if Browser is to replace drupal_http_request(). I suggest renaming the method to submitForm(). It could possibly support <form method="GET"> too.
The name of the submit button is not necessarily sufficient for identifying which form on the page to submit. It is possible to have multiple forms containing submit buttons with the same name, or even two submit buttons in the same form with the same name.
I suggest mentioning in the documentation for getAbsoluteUrl() that it only has limited support for resolving relative URLs. I would be nice with a general Drupal function for doing this, possibly based on the Net_URL2 PEAR package (see Net_URL2::resolve()).
getDefaultFieldValue() and processField() misses a @return Doxygen comment. Would it be cleaner if these functions always returned an array instead of treating forms that generate only one value specially (i.e. a
<input name="foo" value="bar">
would returnarray('foo' => 'bar')
)?Also, would it be cleaner to get rid of the $html_value and $type parameters for getDefaultFieldValue() and processField() and instead let them figure this out themselves?
Perhaps getDefaultFieldValue() and processField() could be combined into one function - i.e. like this:
post() does not support if several fields have the same name (because the post variables are stored in an array with the name as key). If PHP receives a request, e.g.
foo=1&foo=2
, only the last value appear in $_REQUEST. But here it is the first value that is preserved AFAICT.Some systems actually use multiple fields with the same name, e.g. Bugzilla (though only in GET requests, I think), but perhaps we can live with this limitation?
The wording of $input (a SimpleXMLElement representing an <input> or <select> element), $fields (representing key/value pairs used for filling out the form), and $post (key/value pairs to include in the request) is not very clear. I suggest e.g. $elements, $suppliedVariables, $postVariables, respectively (perhaps not the best suggestion - but the idea is to make it clear that the two latter are similar, and to avoid the use of the word “input”, because <input> elements are not the only form elements).
This does not support <button> elements.
I assume this should be
$this->asText($element)
.I assume this should be
$this->getSelectElements($select)
.The method name should probably include “option”.
Where is the getAllOptions() method defined?
Why strip the query string? Please add a reference to the RFC describing what is going on in this method.
What is the purpose of this? HTTP header names are case-insensitve. In the current Simpletest browser this is handled by lower-casing the name. See also #303838: drupal_http_request - case sensitive HTTP header field names.
AFAIK the empty string is a valid header value.
Perhaps this should be called getResponseHeaders(), so they are not confused with request headers.
idividual
The typo "peform" occurs several times.
Do what?
It would be convenient with a setRequestHeader($name, $value) method so you do not have to set all headers in one method call, possibly overwriting a previous call to setUserAgent().
Perhaps the two implementations (cURL and PHP streams) should be implemented using the pattern described here: http://www.garfieldtech.com/blog/pluggable-systems-howto
Comment #91
chx CreditAttribution: chx commentedThanks for the review. I fixed a missing < / > in the review. Perhaps the two implementations (cURL and PHP streams) <= not. We have ditched that because there is no sensible other implementation.
Comment #92
boombatower CreditAttribution: boombatower commentedI made some changes you suggested...but the majority are known and are followup patches....this is holding up too much as is already 300000X better then what we have.
That is a PHP attribute name, not mine.
I am aware of the fact that we want to separate....follow up patch, as for replacing drupal_http_request()....it only does GET.
getDefaultFieldValue() and processField() misses a @return Doxygen comment. Would it be cleaner if these functions always returned an array instead of treating forms that generate only one value specially (i.e. a would return array('foo' => 'bar'))?
Howso? If I do $arr['foo'] = 'bar'; then $arr['foo'] = 'bar2'; ...bar2 wins. Also that is a horrible edge case...I don't see the use...you want to specify the last value and don't care about rest.
Limitation of the previous system, things like this are good followups.
Those are left overs from all the conversion of stuff.
Followup patch...already 30x better then before.
Just to ensure they are all using one standard to they are consistent and comparisons work well. Don't effect the functionality.
Was not aware...curious could you provide an example (I have changed the code)...it look something like?
Seems odd
Make the function actually work...placeholder.
If you look the initial patches we abstracted to be pluggable...we decided against that since we don't think it makes sense for anything other than PHP wrappers and curl.
Comment #93
boombatower CreditAttribution: boombatower commentedBased on review and my cleanup...any other changes..unless major bugs need to be done in followup.
Comment #94
Dries CreditAttribution: Dries commented@boombatower, I think I'm happy to commit this patch but it is somewhat difficult to see the delta between what c960657 suggested and what you changed. Could you create follow-up issues for the things that are left to be fixed, and link them from a comment in this post? Like that, we have a better overview, and a better way to track progress. Thanks! Once the follow-up issues are created and linked from, I'll know better whether to commit this patch or not.
I'll do another review of this patch shortly.
Thanks for your hard work!
Comment #95
c960657 CreditAttribution: c960657 commentedYes, but if you post to a form that has the following two input fields, in a regular browser the latter wins, but in this browser the former is posted.
This pattern is actually useful: If the former is a type=hidden field and the second is a checkbox with the second name, you can on the server side detect whether the checkbox was not checked or whether it was not included in the form at all.
That's good enough for me. I just want to note that some frameworks (e.g. PEAR) provide a number of different HTTP libraries, including the http PECL module.
But even if we decide on only having these two, it might make the code clearer if the code for the two different implementations were split into a base class and two subclasses, instead of being mixed in one class using if's. Well - just a suggestion.
In these lines, $options['header'] change type from an array to a string. This is confusing. I suggest using two different variables for the two different representations.
Huh? drupal_http_request() also supports POST, PUT etc.
Anyway, if you know that you are going to rename post() to submitForm(), it may be easier to do it now rather than in a followup.
I think using lower-case is much better - it is simpler and easier to explain. The suggested implementation does not use the case used in the RFC, e.g. ETag or WWW-Authenticate, so this is just as “inaccurate” as the plain lower-case version. The header names are not visible to the end user, so I don't think its worth trying to make them look nice.
The suggested implementation doesn't properly change e.g.
LOCATION
intoLocation
- it only ever changes the first letter AFAICT.E.g.
TE:
, see http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.39The still do.
What do you think about my suggestion about combining getDefaultFieldValue() and processField() into one function?
Comment #96
boombatower CreditAttribution: boombatower commentedWhich is how it was in the earlier ...patches and it was decided to go this route.
That still seems like an edge case and was not supported by the previous version.
This is still following the pattern of DWTC...with some cleaned up code a number of additions. So we need to keep looking at it like that...once this is in...feel free to pump out issues to add/correct functionality.
The purpose of this issue is to "abstract" it and move it out of DWTC..and a bit of cleanup on the side. This is not intended to be a perfect browser as its predecessor was not.
Changed to all lowercase...
My thought it to rework the form handling in a separate issue. I would really like to get this in so all the other things waiting on thsi can get started..then we can continue to clean as well.
Also changed the header handling to allow blank ones...very odd.
Once this gets in...i'll go through and make a number of issues and link to them from here.
Comment #97
webchickDries said he wanted to re-review this, leaving this issue to him.
Comment #98
Dries CreditAttribution: Dries commentedCommitted to CVS HEAD. Marking as 'code needs work' until the follow-up issues are created. Thanks boombatower.
Comment #99
boombatower CreditAttribution: boombatower commentedThe following are the items that I parsed from the comments and my own plans. I hope to get the majority of them in before freeze. Although that is an aggressive goal.
Comment #100
boombatower CreditAttribution: boombatower commentedMoving to new component.
Comment #102
c960657 CreditAttribution: c960657 commentedWhat other things are you referring to?
So far, no code in core uses the new browser. While trying to convert drupal_http_request() to a wrapper around the browser (#553280: Integrate browser.inc and drupal_http_request()), I noticed that it lacked several methods in order to provide the same functionality as drupal_http_request(). I think this would also be the case when replacing the Simpletest browser in #553278: Replace DrupalWebTestCase browser with browser.inc. In order to fix these, modifying/extending the browser API is necessary.
There is just one week until the API freeze on October 15, but it seems like there is still a lot to do. If we are not able to get at least #553278: Replace DrupalWebTestCase browser with browser.inc committed in time for the freeze (not that fixing that bug itself would constitute an API change, but it will give us an understanding of the necessary changes to the browser API), would it be better to remove browser.inc for D7?
Don't get me wrong — I like the idea about a separate browser class and I appreciate your work on this, but I'd rather not see D7 ship with three different HTTP implementations (drupal_http_request(), DWTC and browser.inc) because there is no time to consolidate the code.
Comment #103
c960657 CreditAttribution: c960657 commentedI have opened a separate issue for the discussion of my last comment:
#600032: Back out browser.inc