what does butler return when something is not there? [#1261156]

Comment	File	Size	Author
#49	1263682-context-variables-overriding.patch	10.52 KB	Crell

Comment #1

fgm

French

Paris, France

commented 31 August 2011 at 06:39

This should probably not throw any error, but return

- NULL, meaning unknown.
- FALSE, meaning no such datum

Log in or register to post comments

Comment #2

Crell commented 4 September 2011 at 03:52

Log in or register to post comments

Comment #3

Anonymous (not verified) commented 8 September 2011 at 07:52

If i haven't completely missed the boat. Value or no value, and then the case where there is no variable. So,

- VALUE, return the value youre requesting.
- NULL, return null if the variable is empty.
- FALSE, if there is no such variable.

Log in or register to post comments

Comment #4

fgm

French

Paris, France

commented 8 September 2011 at 08:52

A recurring problem with such choices is a need to differentiate between a value of NULL or FALSE and either of these being returned as a status answer. Unless we want to wrap the actual returned value in an object or array separating the value from the status of the value access (and we probably don't want this), there will always be such a value collision problem.

A possible solution, if we consider such requests for missing data to be fairly rare, would be to throw an exception, but this is a rather costly option if access to undefined values is expected to be non-exceptional.

Log in or register to post comments

Comment #5

jherencia commented 9 September 2011 at 17:37

#3 I think it might be possible to need this king of statement:

$low_res = $butler['http:get:low_res'];

so FALSE could be a valid return.

I see minor possibilities of the need of NULL as a valid return, in that case we should create a class Butler::Error and check if the return value is of that type.

I think we can assume NULL as value not found.

Log in or register to post comments

Comment #6

Stalski commented 8 September 2011 at 09:38

My vote goes to handler which are not able to return anything at all. I'd rather ask "canHaveValue" or "hasValue" than having to check on a real NULL or a FALSE ... I think the return values should be specified and are consistent at all time.

I have build an application with drupal this year with lots of mathematical functions in it. I was glad to have a full testable function that returned what was expected. In some cases, the handler was locked and the handler is not in the possibility of returning a value. In this case the demanding code knows.

I think an object or array like a JSON response (success= & real answer=) is error prone and not easy to use.

Log in or register to post comments

Comment #7

Crell commented 8 September 2011 at 23:47

My concern with a hasValue() method is that it could get called a lot along the critical path. I'm worried about ballooning our critical path method calls.

Perhaps if hasValue() is called only if getValue() returns false? So if getValue() returns !null, it's assumed that handler has The Answer(tm). If it returns NULL, then the context object checks hasValue(). If that returns true, it means "yes, the value really is NULL". If false, it means "Uh, I dunno", and the context object calls up the stack.

Would that work and be adequately performant?

Log in or register to post comments

Comment #8

Stalski commented 9 September 2011 at 07:47

I think it's a great idea!
It would certainly work, is a lot faster than what I suggested. If it is performant enough, future iterations will tell I persume.

Log in or register to post comments

Comment #9

fgm

French

Paris, France

commented 9 September 2011 at 08:28

If we go back to PHP fundamentals, NULL can be both a valid answer, or meaning the answer is unknown, throwing no error/warning/notice in earlier versions. Maybe we should go back to these roots and design so that the meaning collision on NULL is acknowledged and context values are built accordingly.

Or use exceptions: these have a lower cost than testing everytime if they are not triggered.

Log in or register to post comments

Comment #10

Stalski commented 9 September 2011 at 09:15

exceptions are not that ideal: performance issues and they are not designed for such things. The next closest thing is always returning an object / NullObject

Log in or register to post comments

Comment #11

Stalski commented 10 September 2011 at 09:29

A case could be: suppose we want to get the organic group context $context['og']

1. The og module already had set the og context.
- Here we always return the real value (primitive value or a ContextValue implementation)
1.1 this page has no og context to provide what so ever
- We return a NULL or FALSE, depending on a convention
1.2 this page did return a context value for OG
- fine

2. The og module did not come to set the og context, yet another (sub)module is asking for it already
- We should have the possibility to act accordingly. While the context object is not locked yet, we should be able to set the og context (if we are a module that can do that).
- In this case, if we get a ContextValueUnknown or something, we could use that information differently than if we would have gotten a FALSE or NULL, which could have been the purpose in the example above.
- In these cases, it's up to the people calling the context to understand that they will get a primitive or ContextValue. Which subclass can tell us what the context key setter intended.

Log in or register to post comments

Comment #12

Crell commented 10 September 2011 at 19:46

In this setup the OG module would *never* "set" the current OG. It would set the context handler that would derive the current OG on-demand. So the "it will exist later but doesn't yet" case is impossible. (That's one of the key reasons to move to this locked-lazy model for context information.)

Log in or register to post comments

Comment #13

Stalski commented 10 September 2011 at 19:57

Yeah, that's true but that was not really my point. it would be the og module providing the handler, no?

Nevertheless, the problem will occur when fetching something that is not fetchable vs something that returns a real NULL.
What do you think about the proposition of a class that indicates the value is not there?

Log in or register to post comments

Comment #14

fgm

French

Paris, France

commented 10 September 2011 at 20:39

This OG case in interesting, though: suppose a module wants to /optionally/ use OG context info, and it accesses, say drupal_get_context()['og'].

If OG is present, and there is a OG context, it will return a meaningful value on the first request, which will be locked after that.
If OG is present, and there is no OG context, it will probably return a meaningful value like FALSE, and so on
But if OG is not present ? What will the butler do ?

In the current Drupal development model, we can just use module_exists() or function_exists() before a call (or use module_invoke, which will always succeed and return NULL in case there is no implementation), or isset()/empy() a global, or peoperty_exists() a property on an object. But in the new butler mediation model, because the implementation is lazy, we can not check /a priori/ without causing an initial triggering of the needed code and there has to be a way to obtain a result or lack thereof.

Or we could maybe refine the context implementation so that something similar to these pre-flight checks can be performed without triggering the code defining the context. In some circumstances, the list (in whatever form) of possibly available context elements made available by a module could be defined on module install and stored in some context registry. Then, checking for existence of the property would not need to trigger the code required to obtain the value, and could be inferred from the data in the context registry: the code actually providing the context value and causing the lock would only be triggered when the data is actually requested.

On the other hand, on an abstract level, this remains a if foo() do result = bar() else result = null endif. We might be better served by something like try result = bar() catch (fooException) result = null endtry, as long as the exception case remains a statistical exception.

Or, of course, just being able to go like result = bar(), the PHP way. Which means giving strong semantics to on value, probably NULL, with the already discussed consequences on context definers: NULL would have to have that same meaning at their level to avoid collisions.

Log in or register to post comments

Comment #15

Crell commented 10 September 2011 at 23:43

Since we're using ArrayAccess, there is an offsetExists() method we can use.

http://us.php.net/manual/en/class.arrayaccess.php

That means that we can make isset($c['foo']) return TRUE if there is either a value there or a registered handler (but not actually invoking the handler), and false if there's nothing there. We probably want to make that traverse the stack, too, since looking up the value would do so.

Log in or register to post comments

Comment #16

pounard

French

commented 11 September 2011 at 23:44

+1 for the offsetExists() implementation, a return fixed return value such as NULL or FALSE will always cause ambigous use cases.

Log in or register to post comments

Comment #17

Crell commented 13 September 2011 at 01:49

Status:

Active

» Needs review

So, thinking aloud then...

offsetExists() on the context object pseudo-code:

While (trimming the context key by colons)
  If a context value is set on this key, return true.
  If a context handler is set on this key, return true.
If there is a parent context return $parent->offsetExists();
return False.

For offsetGet(), then, we do the following:

While (trimming the context key by colons)
  If a context value is set on this key, return it.
  If a context handler is set on this key, execute it, save the value, and return the value.
If there is a parent context return $parent->offsetGet();
return NULL.

To differentiate between NULL and "no value", you'd then do:

// If you know that NULL has some important meaning.
if (isset($c['foo:bar']) {
  $var = $c['foo:bar']; // Even if this is NULL, we know that it's a meaningful NULL.
}

// If you know that NULL means "nothing", you can just do this:
$var = $c['foo:bar']; // if $var is NULL, then we know that there's nothing there.

The catch is that means potentially iterating and climbing the context tree twice. However, if we remove the iterate-and-climb logic from offsetGet() then that method will return invalid data when called externally, since one would expect it to mean "if I ask for this value will I get something meaningful?"

Of course, I now realize that, duh, we're talking about different objects. :-)

So when offsetGet() calls a context handler, it does this:

$value = $handler->getValue();
if ($value != NULL && $handler->hasValue() == TRUE) {
  $this->values[$key] = $value;
  return;
}

Or something. My concern is whether this setup results in way too many function calls internally.

This of course dove-tails with #1263682: Getting context variables and overriding.

Thoughts?

Log in or register to post comments

Comment #18

Crell commented 13 September 2011 at 01:50

Alternatively, should offsetExists() call hasValue() on the handler?

Log in or register to post comments

Comment #19

pounard

French

commented 13 September 2011 at 07:24

I do not agree with #18 if you want to dissociate handler (handler is the value somehow) and context key being present: context having the key and handling not having a value can be valid in term of business stuff to do: the context handler being present means that he overrides the potential parent value on the same key, so an non valued handler means the key is present and have a 'null' handler (this is a valid override I guess).

EDIT: The algorithms described on #17 is a typical chain of responsability pattern: let's call a cat a cat and avoid killing kitten.

public function offsetExists($someKey) {
  // You can't do isset() there because null will return false then.
  // Alternatively with an ArrayObject implementation, the first test would have been:
  // if (parent::offsetExists($someKey)) {
  if (array_key_exists($someKey, $this->someStore)) {
    return TRUE;
  } else if (isset($this->parent)) {
    return $this->parent->offsetExists($someKey);
  } else {
    return false;
  }
}

public function offsetGet($someKey) {
  // Same comment as upper
  // if (parent::offsetExists($someKey)) {
  if (array_key_exists($someKey, $this->someStore)) {
    if ($this->someStore[$someKey] instanceof SomeHandlerInterface) {
      return $this->someStore[$someKey]->getValue();
    } else {
      $this->someStore[$someKey];
    }
  } else if (isset($this->parent)) {
    return $this->parent->offsetGet($someKey);
  } else {
    return null;
  }
}

Re-EDIT: These may be not performant, but let's hope the context chain will never get really deep: by proceeding using cloning this won't happen often. PHP 5.3 has a really fast function calls execution compared to 5.2 so that might worth the shot to test those kinds of algorithm.

Log in or register to post comments

Comment #20

Stalski commented 13 September 2011 at 08:25

Just some opinions on what's been said above.

While (trimming the context key by colons)

This is indeed the part where we need to be "super smart". The dangerous part lies in the fact that the getter can "set" values before returning. This issue is in fact very much related to #1263682: Getting context variables and overriding. The approach with $handler->hasValue() will clean up the offsetGet method in that issue and that's what we need.

// If you know that NULL has some important meaning

This is something we should deal with. The "knowing" part can be very clear to developers in some cases, but not that much in others. All context keys should have a very well documented usage and I don't know if that's ideal as contract to the world.
I think Crell is very good at such things as I always find my way in the current database facade. :)

I do agree with pounard on #18. The dissociation is something we really need. isset cannot call hasValue as hasValue is something on the handler. What I suggested before, was to have the ContextValue
Also in, even though it's pseudo code

$value = $handler->getValue();
if ($value != NULL && $handler->hasValue() == TRUE) {

it seems odd to get a value and asking if it has a value later on. In fact in my opinion described in #11, the $handler getting the value should be a ContextValue, and I think we should call hasValue on that object (or know it doesn't be checking what superclass of ContextValue it is (E.g. at that time, the handler could not find a value, so it sets its value to a NullContextValue). This way we can act on that as well to determine what to do. The advantage here is that it is the handlers responsibility and not the offsetGet itself.

I very much agree on the fact that we are dealing with chain of responsibility "behavior", although we don't create a chain in the constructor.

I am not sure in which direction to go since there are pro's and cons on each approach. I do think we need to think of the responsibilities of each object. Imo the ContextValue implementations can get a bit smarter on handling their internal hasvalue, canhavevalue, isderivedvalue, ... . What I want to say, is that currently we are letting offsetGet and now offsetExists doing to much work and they know too much.

Log in or register to post comments

Comment #21

pounard

French

commented 13 September 2011 at 08:39

I tend to agree on most of #20, except maybe about that:

What I want to say, is that currently we are letting offsetGet and now offsetExists doing to much work and they know too much.

These are basically the two primary accessors, so they have to do much work, in fact manipulating contextes is all about getting data, so I don't see a better place to put the business stuff in the context implementation. But where I might be wrong is that you could speak about the actual implementation I yet didn't look at.

Log in or register to post comments

Comment #22

Stalski commented 13 September 2011 at 08:51

Well, yes, when reading it again:"doing too much work" is wrong expressed. I did mean the second part about too much responsibilities. More specific, the handler should probably only consider returning a value (and not a derived one). The contextValue object could then do the job of hasValue vs isValue.
This could finally result in problems with primitives versus ContextValueInterface. Would it be so much worse in performance to have primitives exist as PrimitiveContextValue where has value is always the value and has value always true (even on empty strings, NULL, false, ...) ... or am I going a bit far off topic here?

Log in or register to post comments

Comment #23

fgm

French

Paris, France

commented 13 September 2011 at 09:27

Side note: should we wish to support PHP 5.0x, instanceof would trigger autoloading "SomeHandlerInterface ", but this is no longer the case.

But I think we require 5.3.x or later for D8 anyway.

Log in or register to post comments

Comment #24

Stalski commented 13 September 2011 at 09:34

I am currently switching from 5.3 to 5.2. There are indeed some difference on that matter.

Log in or register to post comments

Comment #25

pounard

French

commented 13 September 2011 at 09:39

I think PHP 5.3 is not an option anymore with an interface based heavy abstracted OOP design, it's so much faster with a lot of things.

Log in or register to post comments

Comment #26

fgm

French

Paris, France

commented 13 September 2011 at 09:56

@Stalski: wouldn't that be from 5.2 to 5.3 instead ?
@pounard: woudln't that be "PHP lower than 5.3" instead ?

Log in or register to post comments

Comment #27

Stalski commented 13 September 2011 at 10:38

@fgm I meant continuously skipping between 5.2 and 5.3 (most of the sites dont work on 5.2 but I skip to 5.3 for wscii and some others).

Log in or register to post comments

Comment #28

pounard

French

commented 13 September 2011 at 12:16

@#26 I said that PHP 5.3 is faster than 5.2 if it clarifies things.
EDIT: I totally understand #27, I'm often switching my PHP version from 5.2 to 5.3 and from 5.3 to 5.2 again depending on the projects I'm working on. Using VM can help, but with my Linux/Gentoo I just have a single command line to type to switch (in fact, PHP 5.2, 5.3 and 5.4 are all compiled on my box I can switch whenever I want).

Log in or register to post comments

Comment #29

Stalski commented 13 September 2011 at 15:47

LOL, exactly the same on my fedora. Our own fake Mamp ;)

Log in or register to post comments

Comment #30

pounard

French

commented 13 September 2011 at 16:01

Mine is not fake, it's handled by the distribution itself, PHP is *slotted* yay \o/
EDIT: Sorry off topic.

That said (in order to get back to the previous discussion) I wouldn't be afraid of number of function calls, but being afraid is one thing, measuring remains necessary.

Log in or register to post comments

Comment #31

Crell commented 13 September 2011 at 16:11

Drupal 8 requires PHP 5.3 anyway already, so that question is moot.

Log in or register to post comments

Comment #32

pounard

French

commented 13 September 2011 at 18:58

Please review http://drupalcode.org/sandbox/Crell/1260830.git/commit/fbf5358 or another method for fun http://drupalcode.org/sandbox/Crell/1260830.git/commit/ee7a91f

Log in or register to post comments

Comment #33

Crell commented 14 September 2011 at 04:44

Minutes from today's meeting, which led to the branches in #32, are here: http://groups.drupal.org/node/175544

Log in or register to post comments

Comment #34

Crell commented 14 September 2011 at 04:46

Marked #1263682: Getting context variables and overriding as a duplicate.

Log in or register to post comments

Comment #35

pounard

French

commented 14 September 2011 at 07:13

One other note, I made multiple commit, so the links are not accurate, please look at the head of branches. I spotted a bug the offsetGet() method returns the object instead of NULL in case of not set, which I will fix soon enough.

Please also tell me if the singleton usage for the null object seems weird to you, I'd remove it if people do not approve, I used a singleton because it's a neutral object so a single instance can be shared, but it can confuse people that writes handler. It can be fixed by removing the singleton pattern or by doing a better API documentation for handlers getValue() method.

I was a lot verbose in comments, that's on purpose, just give me feedback on comments, they are meant to be cleaned-up after that.

EDIT:

In the end, we concluded that NULL should mean #3 (I don't have a value, keep looking). For #4, we will introduce a new empty object specifically for the #4 return type. The context object, when traversing, will read an object of that type as "return null now, don't keep looking". NULL will mean "keep looking, unless you run out of keys all the way down the stack, and then return NULL".

I just noticed I implemented this the opposite way: handlers use the magic object to tell "I don't have somethinig for you here, keep looking" and the legacy NULL value as "my value is NULL, so stop here". It works too, I can fix it if you want to stick to the original decision, but this makes sense to me because else the magic object has two different meanings inside the context.inc file (internals to context: "no value here" and internal to handlers: "my value is null") which makes it inconsistent.

Re-EDIT:
Tests are missing, we do not test NULL return values anywhere (the edge case where value does not exists, particularly) so I didn't fix the existing handlers (the fake one in tests) to use a correct getValue() implementation: I will do this too.

Log in or register to post comments

Comment #36

fgm

French

Paris, France

commented 14 September 2011 at 09:21

For the sake of completeness, there is also another pattern to avoid the result ambiguity problem, and this is a very common way in traditional C programming, arguably less so on PHP even though it is present too. A short example with dlsym():

typedef void (*simple_demo_function)(void);

int main(void) {
 const char *error;
 void *module;
 simple_demo_function demo_function;

 /* Load dynamically loaded library */
 module = dlopen("libhello.so", RTLD_LAZY);
 if (!module) {
   fprintf(stderr, "Couldn't open libhello.so: %s\n",
           dlerror());
   exit(1);
 }

 /* Get symbol */
 dlerror();
 demo_function = dlsym(module, "hello");
 if ((error = dlerror())) {
   fprintf(stderr, "Couldn't find hello: %s\n", error);
   exit(1);
 }

Which could look like:

$handler->getError(); // reset the error information for this thread
$value = $handler->getValue();
if ($error = $handler->getError()) {
  // do some error handling
}
else {
  // use $value;
}

Log in or register to post comments

Comment #37

pounard

French

commented 14 September 2011 at 10:21

I'd say it depends on general error handling policy:
If it's exception based policy, there many ways to deal with it. Generally I tend to let business related exceptions pass up to the top: all errors become fatal for the internals: only the final UI component will deal with it, let it pass in debug mode for developers or hide itself in production mode: errors are easier to spot this way.
If you want a low level error handling I'd say it's more robust but it adds a lot of *almost business* error handling logic inside a generic piece of code, plus it hides the errors from the developer by making them silent, so hard to find. If a handler is unable to load a node and throw an exception for exemple, this handler probably should never had been set from the begining (the context was set with a wrong nid).
That said error handling is never easy: http://blogs.msdn.com/b/oldnewthing/archive/2005/01/14/352949.aspx

Log in or register to post comments

Comment #38

Crell commented 14 September 2011 at 18:31

I suspect the "nothing here, stop asking" case will be less common, so I'd rather use the object there since it's slightly more costly. Conversely, an empty object is not *that* expensive so I'm fine with the better DX of just returning a new object each time.

Log in or register to post comments

Comment #39

pounard

French

commented 14 September 2011 at 20:01

As you wish, the last thing is we need to do is decide.

Log in or register to post comments

Comment #40

Crell commented 21 September 2011 at 05:44

pounard: It doesn't look like you've updated the branches with the comments from #38. Can you do that so I can review and hopefully commit?

Also, it's not clear what the difference is between the branches. What magic do you mean? I need some metadata here to be able to close this off. :-)

Log in or register to post comments

Comment #41

pounard

French

commented 21 September 2011 at 10:09

Yes, did not updated the branch since #38 I'm checking again my code right now: it uses the null object for "nothing here, stop asking" (that means no value to return (not even null or false) but overrides parenting and force no value to exists here) which is equivalent to the null object class name: ContextOffsetDoesNotExists.

After having some thoughts on the subject, I'd say I'd prefer to see a hasValue($args) function on the ContextHandlerInterface interface instead in order to keep consistent with the offsetExists() function of the context itself: this would totally hides the null object existence for handlers implementors, what do you think?

The second branch was a test without using a specific object for "no value", it indeed works for the context object's internals but it leaves the handlers' return problem wide open, so this one can be ignored: but if we implement the hasValue() method: we could definitely get rid of the null object.

Log in or register to post comments

Comment #42

Anonymous (not verified) commented 22 September 2011 at 23:17

ok, i may be smited for saying this, but: why not just use a method on an object rather than php magic?

why is $context a magic array that resolves key-lookups to method calls, rather than a bog standard php object, with a $context->get('foo:bar') method?

i really don't get how a magic $context array is more discoverable and maintainable than an object, with methods that you can find based on, oh, i don't know, what the calling code looks like.

Log in or register to post comments

Comment #43

Crell commented 22 September 2011 at 23:28

beejebus: That's off topic for this thread. Let's stick to the issue and get it resolved ASAP.

That said, it's mostly just syntactic sugar. ArrayAccess is just syntactic sugar, but nice syntactic sugar.

Log in or register to post comments

Comment #44

Anonymous (not verified) commented 22 September 2011 at 23:53

ok, lets pretend i didn't raise this here.

where should i take this up? IMO, 'syntactic sugar' is simply not a good enough reason to introduce this level of non-obviousness to the API.

Log in or register to post comments

Comment #45

pounard

French

commented 23 September 2011 at 08:17

I totally get beejeebus on this one, indeed that's off-topic but the question is legitimate. I usually don't like magic myself. Like Crell said, it's just syntactic sugar (and yes once again I join beejeebus on this one don't think it's enough to buy). But in the end, why not: ArrayAccess makes obvious that you will use the object as an indexed bunch of data, and that's exactly what do the methods under the hood, so in this very case, I don't think it really hides the obviousness.

Log in or register to post comments

Comment #46

chx commented 23 September 2011 at 09:38

The existence of this issue should be a red flag against magic. Just sayin'.

Log in or register to post comments

Comment #47

pounard

French

commented 23 September 2011 at 11:48

Not doing the full synthesis, but I agree with beejeebus on one point: the ArrayAccess interface is confusing; At least because in our case, we will use isset() to tell "no value at all" while the actual isset() keyword in PHP means in all other cases "no non-null value here", this is an inconsistency with the language construct itself. For a lot of other reasons I'd say we could safely use the context object with two methods: hasValue('some:thing') and getValue('some:thing').

Log in or register to post comments

Comment #48

Crell commented 23 September 2011 at 18:01

The ArrayAccess question has moved to #1283542: Don't use ArrayAccess. Let's keep it out of this thread. Any reviews of #41? Pounard, can you post patches when you commit new code as well to make it easier for people to review?

Log in or register to post comments

Comment #49

Crell commented 25 September 2011 at 20:30

Status	File	Size
new	1263682-context-variables-overriding.patch	10.52 KB

pounard and I just spent a few hours circling this issue in IRC and chewing through code. Much progress was made and I've merged the resulting code into the wscci branch. The final diff I have attached here for reference, although it was merged directly in Git.

Thanks, everyone!

Log in or register to post comments

Comment #50

Crell commented 25 September 2011 at 20:31

Status:

Needs review

» Fixed

Log in or register to post comments

Comment #51

9 October 2011 at 20:31

Status:

Fixed

» Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Log in or register to post comments

what does butler return when something is not there?

Comments