Instead of using static variables, we can use something like this patch does. If people like this one then we can refactor the rest of core in subsequent patches, this is just bootstrap.inc and whatever calls those functions, This is better than global variables because reset always resets to whatever the original function specifed as default. Also, for testing purposes we could include a reset of all static variables in one fell swoop. I would not to like to refactor evertything in one patch.

Files: 
CommentFileSizeAuthor
#163 temporary_variables_node_taxonomy.patch2.49 KBDavid_Rothstein
#153 static-reset-254491-152.patch34.31 KBpwolanin
Passed: 10688 passes, 0 fails, 0 exceptions View
#152 static-reset-254491-151.patch31.47 KBcatch
Passed: 10688 passes, 0 fails, 0 exceptions View
#141 static-reset-254491-140.patch32.15 KBpwolanin
Failed: 10686 passes, 2 fails, 0 exceptions View
#132 variable_set_temporary.patch1.6 KBDavid_Rothstein
Passed: 10687 passes, 0 fails, 0 exceptions View
#112 static-reset-254491-111.patch13.53 KBpwolanin
Passed: 10687 passes, 0 fails, 0 exceptions View
#110 static-reset-254491-109.patch13.45 KBpwolanin
Failed: Failed to install HEAD. View
#108 static-reset-254491-108.patch15.83 KBpwolanin
Failed: Failed to install HEAD. View
#98 static_reset.patch13.13 KBbeejeebus
Passed: 10687 passes, 0 fails, 0 exceptions View
#92 static_reset.patch13.13 KBbeejeebus
Failed: 10437 passes, 6 fails, 0 exceptions View
#91 static_reset.patch10.47 KBbeejeebus
Failed: 10442 passes, 3 fails, 0 exceptions View
#80 static_reset.patch9.26 KBchx
Unable to apply patch static_reset_5.patch View
#79 static_reset.patch9.25 KBchx
Unable to apply patch static_reset_4.patch View
#78 static_reset.patch9.24 KBchx
Failed: Failed to install HEAD. View
#76 static_reset.patch9.14 KBchx
Failed: Failed to install HEAD. View
#75 static_reset.patch9.13 KBchx
Failed: Failed to install HEAD. View
#74 static_reset.patch9.52 KBchx
Failed: Failed to apply patch. View
#72 static_reset.patch9.5 KBchx
Failed: Failed to apply patch. View
#55 statics-254491-55.patch3.04 KBpwolanin
Failed: Failed to install HEAD. View
#53 statics-254491-53a.patch3.6 KBpwolanin
Failed: Failed to install HEAD. View
#52 statics.patch2.5 KBCrell
Failed: Failed to install HEAD. View
#48 254491_static_caching.patch11.52 KBksenzee
Failed: Failed to apply patch. View
#25 cache_redux.patch6.18 KBchx
Failed: Failed to apply patch. View
drupal_static.patch12.11 KBchx
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal_static.patch. View

Comments

beejeebus’s picture

nice! will review this shortly...

Damien Tournoud’s picture

I do like the idea, it will solve lots of problems with static variable (especially the strange function signatures in some cases).

But I'm not convinced by the "default" argument, that basically duplicate the variable in all cases (including when the default is not useful at all). This looks like a waste of memory.

chx’s picture

Damien, I do not get your problem, why would it duplicate...? The default stays unchanged and the data itself gets changed by the caller because it's returned by reference. Also, the memory implications are rather small, the default is typically like 0 or '' or array().

moshe weitzman’s picture

interesting. folks interested in this might also be interested in context module. It provides a static cache api as well ... Look especially at context.module itself, and not at its helper modules. Context has been useful for me on a couple of projects already.

Damien Tournoud’s picture

@chx: I was worried that some function could have a large $default. If its not the case (just something simple like array() or NULL), I'm ok with it, but it should be documented.

chx’s picture

context does than this very simple facility and it does not have a reset. It's a whole different thing, really. Mine is a search-replace drop in:

function function_name() {
  static $foo = bar;
}

becomes

function function_name() {
  $foo = &drupal_static('function_name', bar);
}

and then without getting function_nameinvolved, it can be reset by drupal_static_reset('function_name') nice and easy.

R.Muilwijk’s picture

I'd like this to be added very much. I am currently working on refactoring Taxonomy to use one centralized place to save the term data. I would figure more modules are duplicating data. With this function a lot of static's will be checked or they can be merged to one place.

It would also allow you to break for example into 'roles' which are normally cached in a static allowing you to add roles on the fly much easier. It could however lead to unexpected behaviour because non-core modules will be able to change data they shouldn't, this should be checked for each static.

bdragon’s picture

(subscribing, possibly testing tomorrow)

RobLoach’s picture

Subscribe.

bjaspan’s picture

Subscribe.

BioALIEN’s picture

Adding to my watch list.

dmitrig01’s picture

function function_name() {
  $foo = &drupal_static('function_name', bar);
}

Ok I have two comments on this: one, why don't we use debug_backtrace? Two, what if we have multiple static variables func?

chx’s picture

One, debug_backtrace is not too fast, two, I was just explaining what the code does, if you would have looked at the code you'd have seen $phase_index = &drupal_static('drupal_bootstrap_phase_index', 0); so it's not mandatory that you use the function name as the static name, it's just a possibility.

bjaspan’s picture

I strongly support this idea but the API is not quite right yet, though I am not yet sure I know what the right API is.

A major requirement we need to declare for Drupal coding standards is that using static caching as a "performance enhancement" in any way that makes it possible for any caller of any public API function to tell that static caching is enabled is a bug as severe as "the API function does not work at all." This does not apply to functions that intentionally build up internal state like drupal_set_header(), but consider:

* $node = node_load($nid)
* modify $node
* node_save($node)
* $new_node = node_load($nid)

Currently, the second node_load() returns the original $node. That's severely wrong; it should return the modified node.

The current drupal_static() API makes doing things right easier but not quite easy enough. Suppose that node load caches a loaded node with:

$nodes = &drupal_static('node_load', array());  // replaces current static $nodes = array();
$nodes[$nid] = $node;

then node_save() can call

$nodes =& drupal_static('node_load');
unset($nodes[$nid]);

to ensure that a later node_load() does the right thing. There are a few problems here: (1) It requires two lines of code for what should be one line. (2) It exposes all of $nodes to damage. (3) It requires duplicating the default value for the 'nodes' key ("array()") in two places or, as in my (buggy) example above, omits it from the second case and therefore leads to subtle bugs when someone calls node_save() before node_load().

If I could write unset(drupal_static('node_load')[$nid]) I'd be happier but PHP does not allow that syntax and anyway it still has problem (3).

I think the simplest solution is to support, in addition to the &drupal_static() function, explicit drupal_static_set(). drupal_static_get(), drupal_static_clear() functions. Then, node_load could do drupal_static_set('node-'.$nid, $node) and node_save could do drupal_static_clear('node-'.$nid); . However, when I suggested this to Dries, he preferred using the cache_get/set/clear() functions with a special table name of 'memory' (which I'd suggest we refer to as CACHE_TABLE_MEMORY). This is a good idea from the developer's point of view because it means there is one "caching API" instead of two. We can make cache_get return a reference (only useful in the 'memory' case) but we can't really add the "default" semantics you have for drupal_static() which makes the almost search/replace update possible.

While I'm thinking about it, another benefit of central static cache control is that we can limit the amount of memory used by the static cache (perhaps with FIFO or even an LRU GC if we get ambitious, though that is probably overkill). Not something we should try to do in this patch.

That's a lot of rambling without a specific suggestion. Sorry 'bout that.

chx’s picture

Barry, I already have drupal_static_reset, adding a clear is very very easy, like

function &drupal_static($name, $default = NULL, $reset = FALSE, $unset = FALSE, $key = '') {
  elseif ($reset) {
    $cache[$name]['data'] = $cache[$name]['default'];
  }
  elseif ($unset && is_array($cache[$name]['data']_) {
    unset($cache[$name]['data'][$key]);
  }
function drupal_static_clear($name, $key) {
  drupal_static($name, NULL, FALSE, TRUE, $key);
}

so that's quite solvable.

About using cache_set, I am unsure, I will study. Thanks for the review.

cwgordon7’s picture

This could also get rid of lots of global variables that exist simply because they want to be cross-function statics. So, overall, +1. However, a bit of thought needs to go into naming conventions, as this is going to be very prone to duplicates. I propose: 'origin_function__variable_name'. This way it will be very hard to create conflicting names, you must really be trying hard in order to break it.

Another fun idea: a drupal_static_set function:

<?php
function drupal_static_set($name, $value) {
  $temp = &drupal_static($name);
  $temp = $value;
}
?>

This would allow for modules to perhaps hook into core a bit more by being able to alter core's statics... which can be a very good thing.

Also agreed, a drupal_static_clear() function would be awesome too.

Dries’s picture

However, when I suggested this to Dries, he preferred using the cache_get/set/clear() functions with a special table name of 'memory' (which I'd suggest we refer to as CACHE_TABLE_MEMORY). This is a good idea from the developer's point of view because it means there is one "caching API" instead of two. We can make cache_get return a reference (only useful in the 'memory' case) but we can't really add the "default" semantics you have for drupal_static() which makes the almost search/replace update possible.

A lot of static variable caches could be cached longer, i.e. across multiple page requests. The caching in node_load() and node_save(), per Barry's example, could potentially be cached for an extended period of time. We don't necessarily want to cache this in the database, but in presence of memcached, we might want to cache it a bit longer. In absence of memcached, we could use a static variable.

Think of the static variable as a poorman's solution in a more versatile cache API. Using a single cache API will make it easier to do things like this. I don't quite know what the API would look like, and how what semantic sugar is required to communicate cache lifetimes, but it is well-worth thinking about this more.

Because Drupal is so flexible, it's also difficult to asses how feasible this all is without breaking any modules. I guess it requires some more thought and discussion. It might be nothing but a crazy idea.

R.Muilwijk’s picture

Yesterday I followed your chat in #drupal-dev with Crell and bjaspan. Generally agreed was:
a. Caching should be a simple 'get data' and 'set data', which can be done with or without OOP
b. The backend for a cache would be specified at 'table' level. Though I think this name should be renamed because table is outdated
and only for databases.
c. Different drivers could be included and users could specify in their settings file or in a cache info hook what table to use what driver. Drivers
that can be shipped with core could be: 'db', 'fsystem', 'static variable' and maybe even 'memcached' or 'apc'.

What are the benefits? One general Cache API which can be tweaked for each website. People could use EARLY_PAGE_CACHE with fsystem caching for cache_pages, use database caching for views and use memcached caching with expiration 5 minutes for node_load.

The static cache variable would just be another driver for the cache api because it is all just getting data and setting data. I really like the idea and it could lead to some nice performance gains.

Crell’s picture

#18 is a pretty good summary. I am going to try and draw up some code ideas shortly and post them, probably on my blog since they'll be longer, but we'll see.

chx’s picture

This was not agreed these were merely Crell's ideas and too big a change for now.

R.Muilwijk’s picture

Hmmz, I thought you liked the idea too, sorry for that. What is the reason it is too big of a change, you'd rather do it in stages?

chx’s picture

I will introduce a DRUPAL_STATIC_CACHE and replace static with cache_get/set into that table. cache_get will get a default option too. Meanwhile, I will cahnge the cache_get to return only $cache->data and have a cache_get_found call to determine whether cache_get found the data in the storage and a cache_get_headers to grab the headers. This makes it possible to replace the static with a simple cache_get because it no longer returns a complex data structure though we need a cache_set at the end of the function.

This has absolutely nothing to do with drivers and such. Once this is in, we can ponder on that.

Crell’s picture

chx: Put me on record as saying that a separate cache_found() function without an object to handle protected data members is about as nasty an API as you could ask for there. It makes no sense whatsoever.

moshe weitzman’s picture

FYI, we have a module today that allows you to specify different cache strategies for different parts of your site. See http://drupal.org/project/cacherouter

chx’s picture

Status: Needs review » Needs work
FileSize
6.18 KB
Failed: Failed to apply patch. View

Here is much unfinished work. I am debating w/ myself, should I use arrays or the kind of cache ids that drupal_get_filename now uses? If the latter, how we are going to reset? I was thinking on arrays as cid, actually, but not yet there.

chx’s picture

First questions first: all this stuff just to avoid globals? Are they really really so bad? Why not just change those statics -- we do not to change them all, not at all -- that needs a reset to a global?

catch’s picture

Category: task » bug
Priority: Normal » Critical

Since fixing this issue is a breaker on running all tests at once (see http://drupal.org/node/260360 ) I'm changing it to a critical bug.

chx’s picture

Oh sure it is critical -- Barry mentioned it blocks fields in core too.

chx’s picture

Please handle a new generalized caching API in a new issue.

Edit: using cache* for this issue is ok, adding necessary facilities for static is also ok but to totally revamp cache* -- only if we think that's the absolute best and I am not convinced at all.

chx’s picture

Here is a summary so far: the problem is that we need to be able to reset static in function A from function B. Solutions:

  1. My original idea was to introduce a separate API for this.
  2. If we go with cache* for this, then how do we want to store, say, the nodes static cached in node_load ? Do we want to have a cid of node_load_$nid?
  3. Or we want to code cid-as-an-array and have $cid = array('node_load' => $nid) or something similar as cid?
  4. Or just store node_load as an array and then $nodes = cache_get('node_load'); unset($nodes[$nid_to_reset]); cache_set....
  5. Or just use globals

The inner workings of cache_get is hardly relevant to this issue.

chx’s picture

Assigned: chx » Unassigned

I can't talk to anyone without having cache('static')->set('value', $data); thrown at me. That's a different issue and I am out of this.

chx’s picture

Status: Needs work » Closed (won't fix)
pwolanin’s picture

having a global seems the best solution if we care about performance (i.e. avoiding zillions of extra function calls). We then just rely on a clear convention rather than an API per-se. For example:

$GLOBALS['cache']['node_load'][$nid] = clone $node;

pwolanin’s picture

I guess you could throw an extra dummy function call into every function that currently has a static to gauge the difference?

chx’s picture

I won't fixed this because boombatower and David Tournoud are already on the problems with simpletest in two issues ( http://drupal.org/node/260882 and http://drupal.org/node/261258 ) and the rest can be fixed individually. If we do not want to convert all statics into this new api -- and it would be a big work -- then why bother at all.

litwol’s picture

I think that unfortunately all of you misunderstood the correct application of static caching.

Static caching should not be meant to live across multiple page views. it should not be meant to be shared by more than one site user. Thus static cache should not be implemented with persistent caching mechanisms (db, memcached and what not).

what i think static cache should be used for is reducing redundant query and computation execution from many times down to one time.

1)
Q) why cant we cache this for more than 1 page request?
A) Some data is just too dynamic and cannot be cached, one such case is access permissions. functions return different data based on user access. do not eliminate these function calls, eliminate their repetitiveness.

2)
Q) where can we see benefit from this?
A) have you ever seen devel query log? it is not uncommon to see repetitive calls like user_load(UID) or taxonomy_get_vocabularies(VID) when UID and VID do not change. this is very common when these calls are spread around multiple modules or functions when these functions need similar data. so if we have a function that fetches us data in a simple manner, without any complex computation, just data fetching, then why can we not static cache it? now consider nested cases that appear inside hook_nodeapi a single static cache implementation will reduce function and query calls by large order.

the other day i went around our code base and static cached a lot of such functions, i managed to reduce query calls down from ~1000 to ~650 thus reducing page execution time down from 4 seconds to 300ms, ponder on that.

litwol’s picture

Status: Closed (won't fix) » Needs work

i am setting this to patch (code needs work) because unfortunately our status messages do not include 'architecture (design needs more thought)'

sun’s picture

subscribing

earnie’s picture

subscribing.

Susurrus’s picture

Title: Static cache facility » Standardize static caching
Category: bug » task
Priority: Critical » Normal
Status: Needs work » Needs review

Changing a few aspects of this issue to hopefully get some more eyes on it.

Also, it would be great if we could get more eyes on this or someone to push this through to RTBC because right now there are many more uses for static caching that we're starting to see, like #281596: Performance: static caching of user objects in user_load(), respecting conditions and #91786: user_load() static caching.

Dries spoke on this to keep it limited to static caching across a single page load. It seems like there were discussions of pushing caching further, but I agree with others in that this is out of scope for this issue.

Should there just be a standard coding convention on using a $GLOBALS array instead. Seems like it's the most efficient and easiest way to solve this issue. I'm not sure why abstraction through more function calls would actually help this situation, but for some reason people want to add that.

smk-ka’s picture

Add #223075: Adaptive path caching to Susurrus' list.

RobLoach’s picture

Wait, so use global variables instead of some drupal_static function? This could somehow be indirectly related to #113435: Data API: first steps.

chx’s picture

Note that extending the cache system for multiple drives is WAY more than this patch wants to do. Can we get back to #1, i.e. the original patch?

pwolanin’s picture

Status: Needs review » Needs work

I'm looking at the original patch (at the top of the issue). fails to apply (fails on install.inc, bootstrap.inc, install.php). I think this is basically the right direction to go. We need a simple API, and don't need to confuse this with cache_set/get. The key piece this original patch is missing is a way to reset all the statics at the same time, which would be very useful for testing, etc.

so, in response to chx in #30, I'd argue that #1 is the right general approach, with several possible implementations.

webchick’s picture

Subscribing. Getting sick of committing these patches that add a $reset variable to $foo function, so would be nice to have this stuff centralized.

Owen Barton’s picture

@Dries - what you describe sounds very similar to my proposed caching enhancement (with a dated patch) at http://drupal.org/node/152901 - this implements a similar static cache "collector" and can save and preload the caches based on language, path and (optionally) per-user, and also add items to the cache later on that weren't cached originally.

The static cache element of that patch was one of the parts of that patch I was not done with, so I am happy we are looking at this here. The cache_collect function in my patch lets you provide some additional metadata that describes the cachability of that data, and the code should automatically manage saving it all via cache_set (if appropriate) at the end of the page load.

ksenzee’s picture

Status: Needs work » Needs review

FWIW, here's a reroll of #1 that applies to the new unstable tag.

ksenzee’s picture

FileSize
11.52 KB
Failed: Failed to apply patch. View
pwolanin’s picture

Status: Needs review » Needs work

this is still missing a facility for resetting all the statics in one call.

Also, I think the logic here is wrong:

if (!isset($cache[$name]['default'])) {

should be:

if (!isset($cache[$name]['data'])) {

since I might want to have no default value (NULL) and I think the code as-is would overwrite the previously stored data on every function call.

pwolanin’s picture

Also, I'm now recalling a proposal Crell had for an "environment" for each page load. Let's see how this could fit in with that, or perhaps be the start of it.

drewish’s picture

subscribing.

Crell’s picture

Status: Needs work » Needs review
FileSize
2.5 KB
Failed: Failed to install HEAD. View

A while ago I was talking to pwolanin about using a formal singleton object here instead of functions. Here's an implementation of what that would look like, with just one static call converted. (I didn't want to bother with the rest of 'em yet.)

pwolanin’s picture

FileSize
3.6 KB
Failed: Failed to install HEAD. View

I see no reason why we need to store or restore default values. The default value will always get used if needed.

Also, there is a major problem with both the last patches- the var names have to be globally unique, which is a real PITA. We could append or prefix each with the function name, but why not just require that as a separate parameter to the method, which then gives us an extra dimention for the reset.

Patch w/ a few example conversions attached.

Crell’s picture

Thanks to pwolanin for fixing totally braindead stupid mistakes in my patch. :-)

That said, I'm not sure about the context layer. It seems over-engineered to me. If we need namespacing, modules can just prefix the variable just as they do their own functions, or constants. And really, if you're not node_load() why do you have a static cache called 'nodes'? :-)

pwolanin’s picture

FileSize
3.04 KB
Failed: Failed to install HEAD. View

like so?

chx’s picture

One, though i know i started this now i wonder whether doing an extra function calll in some hyper - frequently called functions will slow down things. There is a reason why variable_get is static cached in url, you know. That said, you now add two function calls -- the more the merrier? What's wrong with &static::value('foo') if you so insist on OOP?

Crell’s picture

Going all-static-class with it would probably work, too. I can't see us needing to subclass in this case, which is usually where static binding gets really annoying in PHP < 5.3. That's one of the reasons I generally try to avoid it by default. (The other is that it becomes class-as-namespace, which is icky.)

If you really really needed the best of both worlds (no repeated function request but still global access), I think the following code would in fact work:

function foo() {
  static $things;

  if (empty($things)) {
    $things =& Statics::value('things');
  }

  // ..
}
chx’s picture

can we have &statics::values->data['things'] in there ? that would make it still possible to reset one, reset global and yet no function call.

edit: or just statics::values['things'] ? Is that a valid syntax?

Crell’s picture

chx: If there's no function call at all, then there's no decent way to set a default, is there?

chx’s picture

Well, if it's just one fucntion call then I can live with it, yes....

Status: Needs review » Needs work

The last submitted patch failed testing.

lilou’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch failed testing.

quicksketch’s picture

This patch sort of came up simple and straightforward but then became rather lost in the drastic proposals. If possible, I think we should go all the way back up the early proposals. Barry's/Dries's suggestions in #14/#17 sound by far the best to me. The syntax would be simple and fit in with our existing API (the proposed OO syntax makes me feel like I'm sticking a fork in my eye).

Regarding reusing cache_get/set, Barry mentions that variables could be passed by reference such as:

$my_variable = &cache_get('my_variable', 'memory');

Which of course is only necessary if you don't just use cache_set() to update the variable like any other cache.

$my_variable = cache_get('my_variable', 'memory');
// ... changes to $my_variable.
cache_set($my_variable, 'my_variable', 'memory');

Clearing statically variables we would use the same convention as the current cache_clear_all:

cache_clear_all('*', 'memory', TRUE);

This approach has the distinct advantage that our statically cached variables get passed to the swappable cache layer just like all other variables, so you could pass this information off to memcache (or other caching mechanisms) to save memory in your PHP executions (since memcache would take the load instead).

chx’s picture

$my_variable = &cache_get('my_variable', 'memory')->data;

that's what you want, 'cos cache_get returns an object, but then if the cached item does not exist then you have problems as you tried FALSE->data which wont fly. Of course we can change the return structure to include a valid/invalid or somesuch flag but then you need several lines of code instead of one. So this is why my original proposal included passing in a default. Now, most cache_get/set calls do not have or need a default, what they need is to see whether the object exists or not. Indeed, if we add a default to cache_get then we need to change every single cache call out there. Are we very sure this is the approach to take? I doubt. I do not think static caching is the same as any other caching exactly because it's not across requests.

slantview’s picture

Sorry to jump in here so late, but I have had quite a few ideas about this as of late. Especially because of http://drupal.org/node/251792 and since I wrote Cache Router. There is a lot of confusion because of the name of what really happens inside Cache Router, and apparently nobody has read the code ;)

Before you see the memcache class below and immediately think "MEMCACHE HAS NOTHING TO DO WITH IT YOU EFFIN R-TARD", Read the whole post please. It's about abstracting the storage and offering a generic class for getting and setting.

In Cache Router I do the following:

The class CacheRouter is the main global $cache object. Inside it I store a map to each Cache object which is an abstract class that is extended for each bin you define.

So for example, you would have

<?php

$cache = new CacheRouter();
$cache->get('myvariable', 'mybin');

?>

$cache has a protected variable named $map. $map is an array of Cache objects for each defined bin. So when I am doing the above ->get() what really happens is this:

<?php

$cache->map['mybin']->get('myvariable');

?>

now $cache->map['mybin'] is actually a cacheMemcache object or a cacheAPC object or a cacheFile object, etc, and each class defines it's own methods for get'ing, set'ing, flush'ing etc.

cache_get and cache_set are now really just convience functions because they are part of the Drupal API. Although you could technically do the following, but I would not recommend it:

<?php

global $cache;

$data = &$cache->get('myid', 'mybin')->data;

?>

anywhere in Drupal and get away with it, because if you look in cacherouter.inc you'll see the following:

<?php

function cache_get($cid, $table = 'cache') {
  global $cache, $user;
  if (!isset($cache)) {
    $cache = new CacheRouter();
  }
  
  $cache_object = $cache->get($cid, $table);  
  if ($cache_object && isset($user->cache) && $user->cache > $cache_object->created) {
    return 0;
  }
  
  return $cache_object;
}

function cache_set($cid, $value, $table = 'cache', $expire = CACHE_PERMANENT, $headers = NULL) {
  global $cache;
  if (!isset($cache)) {
    $cache = new CacheRouter();
  }
  return $cache->set($cid, $value, $expire, $headers, $table);
}

?>

So additionally and the reason all this is relevant to this conversation is the following code inside my parent class "Cache" and child class "cacheMemcache" which extends Cache:

<?php

class Cache {

  var $content = array();

  //...<snip>...

  function get($key) {
    if (isset($this->content[$key]) && $this->static) {
      return $this->content[$key];
    }
  }
  
  function set($key, $value) {
    if ($this->static) {
      $this->content[$key] = $value;
    }
  }
}

// And now memcache class (could be db, memcache, apc, etc)

class memcacheCache extends Cache {
  var $settings = array();
  var $memcache;

  //...<snip>...

  function get($key) {
    // Attempt to pull from static cache.
    $cache = parent::get($this->key($key));
    if (isset($cache)) {
      return $cache;
    }
    
    // Get from memcache
    $cache = $this->memcache->get($this->key($key));
    
    // Update static cache 
    parent::set($this->key($key), $cache);
    
    return $cache;
  }
}

?>

I hope this is all starting to make sense. My generic parent class "Cache" only holds the info in a content array. My child class "memcacheCache" holds the data in a persistent storage engine. Potentially you could create a child class called "Void" that does nothing and now you have a bin that you can only store stuff for the duration of the page load in a static array.

The static storage is an option you turn on/off in configuration currently. So for bins like "cache_page" you don't need it, cause it's only going to take up memory, but in "cache" you might actually want it on.

In closing, with some small changes to Cache Router, we are already doing this, and with cache_get / cache_set no less.

I really believe we could take this idea a bit further if you wanted to add it to core and instead of calling it Cache Router, we just called it Storage. Now what Storage can do is be a base class for the following following items that could all use the same functionality (Sessions, Variables, Locks, Cache, Static). The functions would all be the same for the base class: get/set/add/delete/flush The code could be written to be PHP5 specific and use abstract class Storage, class Cache extends Storage, class Memcache extends Cache, etc. And now we have reusable code that can be persistent or not and we don't have to write 5 similar pieces of code to do the same thing effectively.

I hope this makes sense as it is 5am here and I could probably be a little more clear.

Steve

slantview’s picture

I wanted to be a little more clear on how this is even remotely useful / relevant. Here is one function refactored to use this methodology, without completely redoing the API like i had proposed in the previous comment.

<?php

// This is in settings.php
$conf['cacherouter'] = array(
  'static' => array(
    'engine' => 'void',
    'static' => TRUE,
  ),
);

// This is in form.inc
function drupal_retrieve_form($form_id, &$form_state) {
  $forms = &cache_get('drupal_forms', 'static')->data;
  
  // We save two copies of the incoming arguments: one for modules to use
  // when mapping form ids to constructor functions, and another to pass to
  // the constructor function itself. We shift out the first argument -- the
  // $form_id itself -- from the list to pass into the constructor function,
  // since it's already known.
  $args = func_get_args();
  $saved_args = $args;
  array_shift($args);
  if (isset($form_state)) {
    array_shift($args);
  }

  // We first check to see if there's a function named after the $form_id.
  // If there is, we simply pass the arguments on to it to get the form.
  if (!function_exists($form_id)) {
    // In cases where many form_ids need to share a central constructor function,
    // such as the node editing form, modules can implement hook_forms(). It
    // maps one or more form_ids to the correct constructor functions.
    //
    // We cache the results of that hook to save time, but that only works
    // for modules that know all their form_ids in advance. (A module that
    // adds a small 'rate this comment' form to each comment in a list
    // would need a unique form_id for each one, for example.)
    //
    // So, we call the hook if $forms isn't yet populated, OR if it doesn't
    // yet have an entry for the requested form_id.
    if (!isset($forms) || !isset($forms[$form_id])) {
      $forms = module_invoke_all('forms', $form_id, $args);
      cache_set('drupal_forms', $forms, 'static');
    }
    $form_definition = $forms[$form_id];
    if (isset($form_definition['callback arguments'])) {
      $args = array_merge($form_definition['callback arguments'], $args);
    }
    if (isset($form_definition['callback'])) {
      $callback = $form_definition['callback'];
    }
  }

  array_unshift($args, NULL);
  $args[0] = &$form_state;

  // If $callback was returned by a hook_forms() implementation, call it.
  // Otherwise, call the function named after the form id.
  $form = call_user_func_array(isset($callback) ? $callback : $form_id, $args);

  // We store the original function arguments, rather than the final $arg
  // value, so that form_alter functions can see what was originally
  // passed to drupal_retrieve_form(). This allows the contents of #parameters
  // to be saved and passed in at a later date to recreate the form.
  $form['#parameters'] = $saved_args;
  return $form;
}
?>

I hope this makes a little more sense. I am basically restating what Nate said, but saying that "this is here now with cache router" with some minor code tweaks to cache router.

chx’s picture

Instead of World and Peace in two comments care to read #65 and answer me: what happens when cache_get fails? Currently we return FALSE and thus make it impossible to uniformly use cache_get()-> because cache_get does not necessarily return an object. For statics, we would much prefer if cache_get would always return an object and thus cache_get()->value would always be possible but this requires adding a default to cache_get which is not what the non-static stuff needs seemingly...

slantview’s picture

good point chx. after some sleep i realized that perhaps i am overcomplicating things for this. I am going to open a new issue for discussing a more abstract topic of cache / static / OOP implementations of such.

Edit: the mentioned two comments were also unpublished to lesser the clutter of this poor issue. chx.

Senpai’s picture

This issue is now soley about defiling, ahem, I mean defining a good method of cache_getting and cache_setting static variables sitewide. Any further discussion about a Cache API should happen at the newly aggregated discussion I've attempted to piece together from the previous 69 comments in this thread.

The Cache API discussion is now at #368915: Let's create a variable caching API. Carry on.

boombatower’s picture

Found this after creating http://drupal.org/node/348455#comment-1260247.

Can't quite tell from comments if this will allow for storing variables in memory. (haven't read entire thread)

chx’s picture

Status: Needs work » Needs review
FileSize
9.5 KB
Failed: Failed to apply patch. View

With cache_get, we are storing the results of costly calculations and we do not know at all what the result will be. So we need to make sure that we can answer the question "is this thing stored in the cache?" for cache_get regardless of the value it stores, be it NULL, empty string or whatever. Now, for statics are much less generic, we have a good idea what they are storing, they practically never store empty things and the question we need to answer "is there something non empty" and indeed mostly we simply use if ($foo) for the static vars. These two separate characteristic make unpractical a unified cache and static facility. This patch adds a central reset facility, converting statics to this function is a matter of search-replace.

Status: Needs review » Needs work

The last submitted patch failed testing.

chx’s picture

Status: Needs work » Needs review
FileSize
9.52 KB
Failed: Failed to apply patch. View

Revoked conversion of get_t and commented why I did so.

chx’s picture

FileSize
9.13 KB
Failed: Failed to install HEAD. View

The bot does not like me.

chx’s picture

FileSize
9.14 KB
Failed: Failed to install HEAD. View

Heh, patch early , patch often :)

Status: Needs review » Needs work

The last submitted patch failed testing.

chx’s picture

Status: Needs work » Needs review
FileSize
9.24 KB
Failed: Failed to install HEAD. View

D'oh!

chx’s picture

FileSize
9.25 KB
Unable to apply patch static_reset_4.patch View

Another try at pleasing the bot.

chx’s picture

FileSize
9.26 KB
Unable to apply patch static_reset_5.patch View

Sniff. Damien insists on posting this one too. But, last one, probably.

Damien Tournoud’s picture

Status: Needs review » Reviewed & tested by the community

Ok, the two latest patches were cleared by the testing bot.

I like this patch, and I believe we really need it. Static caching is completely different then other forms of caching: it doesn't persist between requests. Other forms of caching, that cross the request boundaries, should be treated separately.

#358663: Implement a locking framework and make caching pluggable is an umbrella issue for the rework of the caching subsystem for D7. We could implement in that mechanism that allows developers to indicate that some cached entries could be stored in fast memory cache (APC or memcache), even if it makes cached entries inconsistent between web heads. This will requires some thinking, but it's definitely doable.

Let's just not mix static caching (very cleanly improved in this patch) and this other form of (potentially in-memory) cross-request caching.

RobLoach’s picture

Every time drupal_static() is called, we'll have to recreate the value for the $default_value parameter. There isn't a way around this, right?

  $phases = &drupal_static('drupal_bootstrap_phases', array(
    DRUPAL_BOOTSTRAP_CONFIGURATION,
    DRUPAL_BOOTSTRAP_EARLY_PAGE_CACHE,
    DRUPAL_BOOTSTRAP_DATABASE,
    DRUPAL_BOOTSTRAP_ACCESS,
    DRUPAL_BOOTSTRAP_SESSION,
    DRUPAL_BOOTSTRAP_VARIABLES,
    DRUPAL_BOOTSTRAP_LATE_PAGE_CACHE,
    DRUPAL_BOOTSTRAP_LANGUAGE,
    DRUPAL_BOOTSTRAP_PATH,
    DRUPAL_BOOTSTRAP_FULL,
  ));

The default array is recreated every time this function is called. Before, since it was a static variable, the default array was only created once. I guess we could use the following, but it's pretty gross....

  static $default_phases = array(
    DRUPAL_BOOTSTRAP_CONFIGURATION,
    DRUPAL_BOOTSTRAP_EARLY_PAGE_CACHE,
    DRUPAL_BOOTSTRAP_DATABASE,
    DRUPAL_BOOTSTRAP_ACCESS,
    DRUPAL_BOOTSTRAP_SESSION,
    DRUPAL_BOOTSTRAP_VARIABLES,
    DRUPAL_BOOTSTRAP_LATE_PAGE_CACHE,
    DRUPAL_BOOTSTRAP_LANGUAGE,
    DRUPAL_BOOTSTRAP_PATH,
    DRUPAL_BOOTSTRAP_FULL,
  );
  $phases = &drupal_static('drupal_bootstrap_phases', $default_phases);
chx’s picture

Status: Reviewed & tested by the community » Needs review

Constructing that array is quite fast... and yeah we could do that if we really want to. I certainly dont. Microoptimization, seems to me esp given the number of calls to drupal_bootstrap on a typical page...

chx’s picture

Status: Needs review » Reviewed & tested by the community

Ouch. Did not want to reset, no.

boombatower’s picture

beejeebus’s picture

do we want to get rid of the reset param to ip_address?

that's the only question i have about this patch.

i've worked up a patch that converts the registry to use this system, was pretty straightforward and got rid of those nasty REGISTRY_* constants, but i'll wait until this lands to post that.

chx’s picture

Yeah, ip_adress will need to remove that reset. But, I am absolutely for getting the facility in and then we can, in parallel convert to our hearts content...

bjaspan’s picture

I am a huge fan of solving this problem. My fear is that the ability to wipe all statics easily will make it easier to module developers to write poor APIs (like node_load/node_save) that require external callers to wipe the static cache instead of writing good APIs that correctly maintain their own static cache. Of course, this patch makes writing good APIs much easier, too, but human nature being what it is, people will use any excuse to do things poorly.

Ah well. We can just flog anyone who does it wrong. :-)

boombatower’s picture

We really really need this for SimpleTest to work properly without continuing to add more manual static clearing, not to mention that statics that have no way to clear.

bjaspan’s picture

re #89: No, see, that's exactly my point. We do NOT need this patch to make it easier to SimpleTest to work without more static clearing. We need this patch so SimpleTest does not NEED to do static clearing, because all APIs will be able to keep their statics up to date on their own.

Okay, perhaps SimpleTest is the one exception to this rule, because SimpleTest actually blows away the database out from under modules within a page request, and that is beyond what any module/API can be expected to deal with; it is an abnornal case.

Anyway, I still support the patch. I'm just saying that I'm afraid it will be mis-used to make the problem of poor API design worse (though easier to work around) instead of better.

beejeebus’s picture

FileSize
10.47 KB
Failed: 10442 passes, 3 fails, 0 exceptions View

here is an updated patch with the reset gone from ip_address as per #87.

beejeebus’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
13.13 KB
Failed: 10437 passes, 6 fails, 0 exceptions View

and here's a non-broken version, setting back to CNR to get the test-bot to look at it.

Status: Needs review » Needs work

The last submitted patch failed testing.

boombatower’s picture

#90: I am fully aware that it will completely remove all the static clearing code and as we both seem to agree that is a major benefit.

I am not really sure I understand your concern, but it sounds like we are all for the patch.

bjaspan’s picture

Here's another way to explain my concern:

Bug reporter: "node_load() is totally broken, because if I call node_load(1), change it, node_save() it, and then node_load() it again, I get the original node, not changed node."

Response to bug report: "Well, this was a big problem in D6. But now anyone that wants to call node_load(), node_save(), and node_load() again can just use the Static Cache API to clear all static caches, including node's. Therefore, this flaw in the node API isn't important, so I'm marking the issue 'wontfix'."

Bug reporter: "AAARRRRGGGGHHH!"

Damien Tournoud’s picture

@Barry: better answer:

"Now that we have static cache API, node_save() can now clear the static cache of node_load(), and all those kitten are dancing happily under the moon light."

bjaspan’s picture

@Damien: Yes! Absolutely. That is the outcome we all want, of course. This patch makes it easy to make APIs that use static caches work correctly. Hurray! That's why I support the patch.

All I'm saying is that the static cache API can also be used for evil, and I suspect it will be. We'll just have to agree to flog people that do so until they stop.

beejeebus’s picture

Status: Needs work » Needs review
FileSize
13.13 KB
Passed: 10687 passes, 0 fails, 0 exceptions View

re-uploading to get another run from the bot.

according to the failed tests, we have a problem with cron, but i can't reproduce that running the tests locally :-(

Damien Tournoud’s picture

Status: Needs review » Reviewed & tested by the community

I still believe we need this. Patch still looks great, tests still pass (except when the bot misbehaved ;p), so I'm backing this again.

catch’s picture

We should remember to benchmark before/after when core's been converted.

moshe weitzman’s picture

FWIW, I back this too. This is blocking a lot of testing bugs. Lets commit, or at least state what the delay is about.

boombatower’s picture

Exactly, everytime I get one of the related testing bugs I point to this....

webchick’s picture

I haven't had a chance to dig in on this patch yet, but I spoke to Dries about this earlier in the week. His main concern about it was why we're introducing some special mechanism that new developers will need to learn, which at its core is really only a global variable.

unset($GLOBALS['statics']['ip_address']);

is something that's innately understood by every PHP developer. It's also less overhead, since you're affecting the variable directly.

drupal_static_reset('conf_path');

Is introducing yet another "Drupalism" to the mix.

I thought that the strength of moving to this system was the ability to reset all static caches to default values in setUp() or what not, but this isn't actually implemented in this patch. chx mentioned this is because it is hinging on the rest of core being converted.

Could someone post a summary here of why introducing another Drupalism has advantages over using built-in PHP mechanisms?

moshe weitzman’s picture

Once again, webchick has to speak for Dries. Sigh.

The reason not to use globals is the same as it always was. With globals, you have no central place to watch for changes to a variable's value. It could change anywhere. Thats extremely chaotic. There may be more reasons.

Dries is right that every PHP developer knows about globals. They also know to avoid them.

Thanks for the update, webchick.

chx’s picture

The problem with globals is that there is no central "reset to default" facility. Here we have it. Also, we have a good number of statics which needs just a local "reset to default" facility, currently this functionality is re-implemented over and over and over and over again. Finally, something that came to me during writing the admin activity log -- we can have shared statics, we haev a number of get-set function pairs that we could simplify, I guess. And yet, they are much more controlled than simple globals....

chx’s picture

More: currently bootstrap.inc is converted just because that was open in the editor. And when all files are done then the setUp in drupal_web_test_case can reset all static caches and we will be so very happy.

pwolanin’s picture

Echoing Moshe - there are potential issues with name spaces for globals, as well as unintended side effects (I was bit by this recently).

However, per my comments above (e.g. #53) I still feel that the drupal_static() function needs an additional parameter. I think Crell is mistaken in thinking that more than one module or even more than one function in one module would not want a local static called 'nodes'. I suggested in #53 a "$context" such as the name of the calling function, since the is guaranteed to be globally unique. Yes, smart developers will consistently append such a unique identifier, but I think this would be cleaner if we enforced it with a separated parameter. My suggestion in #53 what the first parameter:

+   * @param $context
+   *  A globally unique string, typically the name of the function
+   *  calling the method. Use, for example, __FUNCTION__ or for a
+   *  singleton object __CLASS__ . '::' . __METHOD__.
pwolanin’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
15.83 KB
Failed: Failed to install HEAD. View

Ok discussed with chx in IRC - since most functions only have one such variable, we really can live with just the one parameter. However, using the magic __FUNCTION__ constant is useful in core, since any core/contrib developer copying it over to another function will automatically get the right behavior.

Also, I reiterated to chx my point from above that we don't need to store the default values - I think chx now agrees, so re-rolling without this extra storage overhead.

Status: Needs review » Needs work

The last submitted patch failed testing.

pwolanin’s picture

Status: Needs work » Needs review
FileSize
13.45 KB
Failed: Failed to install HEAD. View

Huh installed fine for me locally - anyone else?

Here's a re-roll cleaning up the code comments a little more.

Status: Needs review » Needs work

The last submitted patch failed testing.

pwolanin’s picture

FileSize
13.53 KB
Passed: 10687 passes, 0 fails, 0 exceptions View

Hmm, I'm getting lost of exceptions when I run the tests locally: "Only variable references should be returned by reference"

This patch fixes that aspect, at least.

There is a fatal error in core currently - needs this patch to fix a prior commit: http://drupal.org/node/353883#comment-1387768

pwolanin’s picture

Status: Needs work » Needs review
chx’s picture

Status: Needs review » Reviewed & tested by the community

It's now fairly the same functionally as was RTBC'd before.

adrian’s picture

I feel this is one of the most important patches I have seen in a long time, from a perspective of API sanity.
The benefit I see to this, is it provides us with a formalized , generalized api to write get/set functions.

We do this in hundreds of places, and this is different in each of those cases :

Moya:HEAD adrian$ grep -r static * | wc
     240    2013   55152

We end up having strange inconsistencies such as http://api.drupal.org/api/function/module_list/6 , which has a very strange and constantly changing function signature because with every new release we find that we need to be able to access different parts of the code.

If this were implemented using the static cache, it would simplify a lot of the install and update code we need to write (ie: not having to have some magic set of constantly changing parameters to bypass the database code just to enable a fixed set of modules).

The fact of the matter is, that unless we decide to switch ALL the instances of static variable we use to globals, just simply stating that globals have the same effect has no bearing on this patch. Globals provide no oversight of what is happening, they have no way of enforcing namespace, they have no way of providing useful debugging information.

This is all pie in the sky, but it could also be an interesting venue for additional high performance improvements. ie: it could be used as a mechanism for implementing such things as storing the different registries and other things we have as memcached objects, avoiding hits to the database.

robertDouglass’s picture

Subscribe. Static caching in Drupal as-is sacrifices goats to volcanoes. A general API is very welcome.

drewish’s picture

being able to reset all the static caches scattered about the menu system would make my life infinitely easier. it'd let me delete a bunch of code copied and pasted from menu.inc where i only change one line of code.

boombatower’s picture

We now have a number of usecases and benefits in addition to the original reason and making the testing framework be able to TRUELY and SCALABLY clear all static caches. Lets get this in.

pwolanin’s picture

came across this: http://drupal.org/node/160263 suggesting we might need to assign the static to NULL, rather than using unset() - not sure if that's an issue for one array element versus the whole variable?

chx’s picture

It's only an issue for a whole variable.

webchick’s picture

Status: Reviewed & tested by the community » Needs work
Issue tags: -Needs issue summary update

I think this will need a re-roll now that #351797: Multiple load vocabularies was committed.

I'll discuss this again with Dries on Wednesday. Thanks for the follow-ups, folks!

catch’s picture

Status: Needs work » Reviewed & tested by the community

This patch doesn't do a full core conversion so shouldn't be affected by vocabulary load.

Dries’s picture

I think I'm OK with this patch because of the massive flush capabilities. Yet, from an API-level I'm not 100% satisfied. I'd prefer to make the API consistent with variable_get/_set/_reset. That is to say, I think the proposed API feels very 'cramped' and I'd prefer to have a getter and a setter functions.

It's almost as if we need to rename variable_get() to settings_get(), and re-use variable_get() for static variables. Or something like persistent_variable_get() vs static_variable_get(). I don't know for sure -- just thinking up load about how we can clean this up and how we can make this more intuitive.

chx’s picture

Pardon me but are you going to stall this issue with every sort of _get/_set you find already in core? First you mentioned cache_set/get which muddied the issue enough to took it ten months to recover. Second, now we have variable_get. Are we going to work our way through xmlrpc_server_get/set too?

Because this function needs no setter. What's the point? Once again, the API as it stands currently is a simple search-replace for most functions, trivial to use and allows everything we want. Forcing a set to every exit point of all functions that use this API achieves nothing.

robertDouglass’s picture

@Dries, I would like to respectfully disagree with your API naming comment. I wrote a version of this patch half a year ago and came up with something almost identical except that I didn't think to use __FUNCTION__ and instead got the function name from debug_backtrace (which Gábor rightfully pointed out is yucky). I think the API here is absolutely correct. Once you've seen how the drupal_static function works it makes perfect sense that drupal_static_reset is the other part of it. The only possible renaming I could imagine would be drupal_static_flush... but that change isn't worth delaying the patch. This has long been recognized as a critical problem in Drupal, and this is EXACTLY the solution that I've hoped for. Nice work everyone!

robertDouglass’s picture

I think it would be dangerous to confuse these functions with cache_set and cache_get. It's actually nice that the functions don't have the name "cache" in them. Static "caching" is much more a problem of managing scope in a procedural world. If we were programming Java, having a variable retain scope for multiple invocations of a function would be a matter of having it be a member of the same class. The fact that we've been calling this static caching all along is a misnomer. Please don't rename the functions to be confused with cache_get and cache_set.

Dries’s picture

Status: Reviewed & tested by the community » Needs work

chx, your comment in #124 adds no technical argument. It's a rant.

Personally, I think $foo = &drupal_static(__FUNCTION__); is lot less readable than using a static variable or having _get() and _set() functions. I think this is a step backwards from a DX point of view. The goal is to have clean and readable code, not to have the shortest possible code.

Elsewhere in Drupal, we name things by convention instead of using __FUNCTION__ (eg. function names). For core, I prefer to explicitly name things instead of using __FUNCTION__.

I think we have an opportunity to clean up some of our caching layers/APIs (eg. variable_get/variable_set) and create a predictable and human-readable API for different caching schemes.

Dries’s picture

Just to be clear, I'm not advocating that we extend variable_get() and variable_set() to handle both temporary (static) and persistent (database) storage.

I'd prefer an API that is consistent with variable_get()/variable_set(). For example, we could have: permanent_variable_set() vs static_variable_set(). Admittedly, these are 2 ugly names but you get the point. The combination of all these functions gives developers a nice palette and a good overview of different storage options that are consistent, predictable and easy to use without having to be a PHP-god.

chx’s picture

Status: Needs work » Closed (won't fix)

Nothing to see here, move on, move on...

moshe weitzman’s picture

@Dries - you can only ask so much of a development team. We have been working and re-working and debating on this issue for years (page context). After 122 replies, you ask everyone to re-think. Thats at least fewer than the 258 replies when the advanced_help patch got blown apart #299050-258: Help System - Master Patch. Each of these patches takes man-years to get to RTBC. Heartbreaker issues like these have driven many talented developers to work in Contrib or build custom sites. When is the last time we saw a patch committed from Miccolis, Adrian, Hahn, walkah, Arto? An RTBC after 100 followups should be a pretty good indicator that we are putting forth our very best proposal.

pwolanin’s picture

To follow up as to why get/set is problematic:

The goal was to keep us as close to the existing custom of using static variable as possible, while making D7 more testable and making sure that any static could be reset.

The approach in the patch make the conversion of core easy - any kidn of get/set approach will require detailed analysis of all of core for each function where we are converting a static, instead of a simple 1-for-1 replacement.

I agree that it *looks* a bit ugly, but that does not make it hard to use

David_Rothstein’s picture

FileSize
1.6 KB
Passed: 10687 passes, 0 fails, 0 exceptions View

Although it would certainly be time-consuming to convert all of core to use a get/set mechanism, I doubt it would require that much detailed analysis, at least for most functions. Would it?

Attached patch shows a simple implementation of Dries's basic idea and I think there's no doubt it makes the code much more readable. As far as I can tell, this variable_set_temporary() function is all you would really need, although maybe there's some benefit to splitting out variable_get and other functions too -- I haven't thought about it too deeply. With this approach, I think SimpleTest could just call variable_init() in order to clear all the static caches at once...

Damien Tournoud’s picture

@David: please give up.

catch’s picture

Status: Closed (won't fix) » Needs review

variable_set_temporary() doesn't allow for any built in namespacing of variables - and old upgraded sites, not to mention custom modules, have all kinds of random variable names lying around.

So mixing the global $conf array with our current statics could potentially lead to all kinds of unpredictable behaviour which could take hours of tracking down on a site by site basis.

Additionally, variable_set_temporary() seems like it'd be more naturally used for things like $GLOBALS['conf']['cache'] = CACHE_DISABLED; (needed to fix a critical D6 page caching bug and also used in the cache exclude module for just one example of setting a variable just for a page request) - so now we're using the same API function for both 'static caching' and nifty manipulation of variables stored in the database during a request, not sure those should be mixed.

The way I learned PHP was by copying and pasting - I don't need to understand $foo = &drupal_static(__FUNCTION__) to be able to use it.

chx’s picture

You need an explicit set call if you do not want your values to get lost when the request end. We want our values to be lost. That's the fundamental reason behind a no-set. Moshe -- agreed, you can add hunmonk there.

sun’s picture

Status: Needs review » Closed (won't fix)

The reason why this issue has not really been re-opened yet is that there is a sense of frustration throughout Drupal (core) developers who came up with all sorts of arguments for the way this implementation works. But their arguments stand against one opinion to rather implement it differently (and if may add this, based on wild assumptions about a better DX, which were already taken into account in previous counter-arguments against this single opinion - as far as I can see).

If there was another permanent co-maintainer for Drupal core, "we" had a chance to get another trustworthy opinion on this issue. But there is no. Besides webchick who certainly does an awesome job on many other issues, but does not want to interfere with such low-level issues (that's at least how I understand her own statements).

Technically speaking, we are facing a n:1 relationship. That's not good, because it holds up progress and evolution. However, given the replies of both n and 1, and not having a strong opinion on this patch at all, I would say that we are in serious trouble here. I certainly know of other issues/situations where this 1 made all the n feel dumb and led to a better implementation. But particularly in this case, I am very unsure whether the same paradigm applies. The reason being that I have not seen any issue in the past where so many well-known developers argue for one solution.

I don't really want to start with rather unrelated arguments, but given the current situation, I'm seriously considering that some "UX" patches were committed recently without proper concept and zero thoughts regarding DX issues at all. Just like in this issue, all UX-people were in favor of those changes. Even in cases where someone had serious concerns, which were not taken into account at all. As a neutral observer on these issues, I would say that there seem to be very different measures.

To be (more than) honest, this debate made me tinker the entire evening about the healthiness of Drupal core development (even forgetting about major improvements elsewhere that truly deserve a red mark in Drupal's timeline).
Maybe it is because I am passionate about Drupal, maybe not, but I can feel the frustration and overkill here without being involved in the development at all. That is what makes me wary.

That said, I agree with Karoly that this issue won't fix. There is only 1 who could revert the status. Let's keep it that way.

moshe weitzman’s picture

@sun - now you have escalated the issue too far. We don't need a second permanent co-maintainer. We are expressing our frustration on this issue and Dries is surely listening.

And I can't fathom why you would continue a sentence that begins with "I don't really want to start with rather unrelated arguments". We are one dev team, and statements like yours get parrotted over and over until people actually believe them. There is no good to come from whining about how "foo" get in easier than "bar".

drumm’s picture

I discussed this a bit in IRC. Summary is that I'd like to see an implementation with the suggestions from #96-97 fully implemented. I see these new API functions as something to remove the $reset antipattern, and the current implementation examples just reshuffle the syntax. Hopefully a full example will make it more obvious that this API makes the intended use easy, or makes improvements obvious.

sun’s picture

1) If this community is healthy, then there is no "too far". As in any other relationship. I mostly listened to all others (including you) thus far and solely tried to express the collective sense in one follow-up. - I explicitly stated that I do not have strong feelings about this patch.

2) I surely hope that Dries is listening. But that does not undo the fact that there is just one opinion and one person listening.

3) Although it's off-topic for this issue, the symptoms are identical; not even similar. I am purposively not providing example issues. If the unaddressed concerns of those issues would have been taken into account like Dries' concerns in this issue, those patches would not have been committed. But they were; and I am neither talking about hypothetical issues nor about hypothetical concerns.

Sufficient reasons to escalate. At least, I do not think that a community of like-minded people can work out this way.

chx’s picture

@drumm and where will you find a fool who touches this issue now?

pwolanin’s picture

FileSize
32.15 KB
Failed: 10686 passes, 2 fails, 0 exceptions View

This additional conversion of the taxonomy module + test only took about 1 hour (i.e. I started about when drumm posted). catch suggested it as a reasonable place to demonstrate this in action in terms of reset-on-save.

chx’s picture

Status: Closed (won't fix) » Needs review

pwolanin, wow. Your patience and tenacity humbles me. Hats off.

Status: Needs review » Needs work

The last submitted patch failed testing.

Dries’s picture

Status: Needs work » Needs review

We had an n:1 relationship for 8 years, and so far it has worked. Drupal is not a democracy -- it never was. Drupal developers come and go for various reasons -- one of them may be frustration. That is OK -- I'm not here to retain people, I'm here to deliver the best possible Drupal and to do what is best for the project at large.

I review and commit hundreds of patches, and occasionally I disagree with an approach taken. Frequently, I commit patches that I disagree with.

If we are going to escalate disagreements and revert to wining, I can't do my job. My job is to define how long we're in listening/discussion/debating mode. Sometimes I'm quick to commit a patch, sometimes I'm not. You'll have to accept and respect that.

Instead of wining, we should all continue to focus on technical arguments. As far as I'm concerned, we were still actively debating the pros and cons of this patch. David's patch in #132 is a new approach, and I'm asking us to properly evaluate it. That hasn't happened yet -- in fact, his patch is being shot down out of frustration. I like #132 and it accomplishes the same goals outlined in this issue. Plus, it makes for more readable code. Other people like drumm are also trying to offer input and advice.

The goal is to make Drupal as good as we possibly can, so let's continue to compare the different approaches. Once we have compared approaches, I will make a decision.

webchick’s picture

Sorry, today was nuts but I'm caught up on this issue now.

Having read the discussion and the replies, the indisputable fact is that the vast majority of core developers are in favour of having a static caching mechanism to get rid of the $reset anti-pattern. On this we can all agree. Dries has also stated that he's ok with this move, so I don't see a problem here.

The patches up until today represent the best known method and syntax that was arrived at given the people involved in the patch, which is only a small portion of our total core development team (several of those who chimed in later indeed stated that they were not really following this issue closely prior to the controversy). All that Dries is asking for here is for more developers to throw their hats into the ring and try and come up with some alternative approaches, to make sure that we've reached what is ultimately the best solution going forward. What else would you expect a core committer to do? It's infinitely more difficult to change an existing, pervasive API than to implement it correctly the first time, and battling "complexity creep" is something we need to do for the long-term viability of the project. I back this decision up 200%.

Now how people go from "Dries thinks we should take another look at the API to make sure it's as good as it can be" to the kind of damaging, hurtful, and off-the-wall things people are saying here I have no idea. IMO, this sudden explosion of ire and hurt feelings is completely misplaced and extremely damaging both to the project, and to the community. You guys all seriously need to cool off and get some perspective on this issue, and I honestly think you owe Dries an apology.

Note that I also recognize that I hold blame here, too. I do tend to back off "low-level" patches since it takes me longer to come up to speed on them than it does more obvious bug fixes, usability improvements, etc. since I'm a self-taught PHP hacker with a smattering of community college, not a PhD in computer science and engineering with 10 years of experience with Drupal. :) So yeah, if there are issues where I feel that Dries is a better person to review, I'll normally ask him to do so, which in this instance led to almost a "good cop" / "bad cop" situation which is no good for anyone. I've also been using IRC as a primary communication mechanism more than issue queues, and have probably not been strict enough on some of the patches that have gone in lately (though I do try to be as much of a pain in the ass as possible! ;)). I'll try and work on these things going forward.

David_Rothstein’s picture

Well, @catch in #134 already reviewed my patch from #132 and raised a legitimate issue with it. Although it seems like it would be very rare, it is certainly true that if there ever were an inadvertent conflict between the database and a static variable, all hell would break loose -- so it may be that this approach is a little too "cute". Nonetheless, as stated above, it certainly wouldn't be hard to add variable_get_temporary() and variable_del_temporary() functions (probably call them something different) and have them all operate on a global $static variable instead, rather than global $conf. This is more along the lines with what Dries was suggesting anyway, as well as what @bjaspan suggested way back in #14.....

At that point, I'm not sure there's much of a technical difference between the two approaches at all. It's really just a matter of what kind of code style Drupal is going to have.

chx’s picture

David, the problem is the explicit set call. You are adding a totally unnecessary function call -- the patch we try to get in demostrates it can be done in one call. Separating it into two calls is an invitation for bugs and is another speed penalty.

Although. The set will only be called when the static variable is, as yet, uninitialized, isnt it? That's not that much compared to the fact that we add one function for every static usage... so maybe we can run with that. An API that in my view is unnecessarily complicated but at least exists is better than an API that's tho elegant but does not get in.

neclimdul’s picture

To put a couple things catch I think hinted at in his review, we're using a global so there is always the posibility of collisions(namepacing like he mentioned). Add to that the fact that we already have some voodoo around that particular global and we're asking for trouble.

This may be something that could easily be fixed but personally I think the distinct _get/_set calls are a mistake in this case. We're already asking developers to take a probable performance hit by adding a function call to every one of their requests. Static variables are used in these cases to ease load so smart developers are going to be mindful of this fact. On call versus possibly multiple is going to be easier to convince.

I think the variable based name is also uncomfortable. It hits that it is connected and should be used in ways similar to the long standing variable system in some way when it really isn't at all.

So based on those points I'm currently in favor of the drupal_static based patch.

catch’s picture

I'm disappointed to see my review of David's patch described as "shooting it down out of frustration". Apart from mixing statics with the variables table, pretty much the same approach (albeit with cache_set() in place of variable_set()) was put forward, reviewed and rejected much earlier in the issue, and there's no mention of that prior discussion, or why the original arguments made there don't stand, in Dries' feedback, which is where I think the frustration on this issue lies.

However, I took a look at how that approach might apply to our common uses of statics after chx's mention about the function call possibly only being required when the static wasn't initialized, to see how accurate that'd be in practice.

With a get/set approach that's similar to either variable or cache (regardless of naming and name spacing) - it's going to be considerably more limited than what we currently do with statics in core.

We have a lot of functions which either store an array of keyed items in a static or which use a static to track the progress of recursive function calls - most _load() for the former, and things like taxonomy_get_tree() for the latter. In the best case it'd be an extra function call per item/recursion.

More critical than that though, variable set doesn't let you add and remove items to an array.

Two examples - node_load_multiple() and taxonomy_get_tree().

node_load_multiple() stores an array of fully loaded nodes, keyed by nid, and at various stages accesses the full array of nodes, merges newly loaded nodes with the static array, for example:

How's a getter and setter going to let us do things like:

 $node_cache += $queried_nodes;

I could do

foreach ($queried_nodes as $node) {
  variable_set_temporary('node_cache_' . $nid, $node);
}

But that would make things like

$nodes = $node_cache;

impossible earlier on.

We could add a 'static ID' to the get/setter - make it look like cache_get/set, something which again, was already discussed and rejected, i.e.:

foreach ($queried_nodes as $node) {
  variable_set_temporary('node_cache', $nid, $node);
}

But then in taxonomy_get_tree() we have things like

      static $children, $parents, $terms;
      ...
      $children[$vid][$term->parent][] = $term->tid;
      $parents[$vid][$term->tid][] = $term->parent;
      $terms[$vid][$term->tid] = $term;

I don't see how get/set in any form is going to allow us to do multi-level arrays like that, or at least not in a concise way.

The advantage of drupal_static() is that once the static variable is set up, it operates exactly the same as if you'd done static $var - so you're free to manipulate that variable within the scope of your own function in whatever way you need. If we're unable to get the same flexibility with a get/set - then functions like node_load() and taxonomy_get_tree() won't be able to use it. Which means we either manipulate $GLOBALS directly or use static $var again for all but the most basic cases. If we can't use the same mechanism across core consistently, then it's not standardization.

robertDouglass’s picture

After reading all of the arguments I have to say I'm still in favor of the current proposed implementation which has the following advantages:

1. It gives you a reference to the statically cached variable so that you can operate on it directly and not need to re-save it.
2. Any function which is about to call another function can guarantee that its statics are cleared
3. I can't accidentally do anything to data without knowing it.
4. There is no namespace collision.

The UGLY parts of it are 100% due to PHP being a ghetto language. We can't change that. If PHP had a __CALLER__ magic constant we'd be able to hide the ugliness. My implementation used debug_backtrace to get the caller information, but that would kill us. So __FUNCTION__ is a perfectly beautiful way to do it. Except that it makes your eyes bleed.

Committing a cleaned up version of #141 would make my life as a developer much easier. None of the other suggestions come as close to a nice technical solution.

The two things that I can think of that could possibly happen that would make this code nicer are: PHP comes up with a __CALLER__ type of magic constant, or we rewrite all this to use private class variables where we do our caching.

earnie’s picture

we rewrite all this to use private class variables where we do our caching.

Do you mean something like http://drupalbin.com/8667

catch’s picture

FileSize
31.47 KB
Passed: 10688 passes, 0 fails, 0 exceptions View

This one just fixes the two failures in forum.module. Note the entirety of taxonomy and forum tests now only have two manual resets of the static due to the way simpletest works - within Drupal itself all statics are completely internal to the API now.

Drumm, is this a sufficient example to show where the cleanup is?

pwolanin’s picture

FileSize
34.31 KB
Passed: 10688 passes, 0 fails, 0 exceptions View

drupal_static_reset('taxonomy_vocabulary_load') in forum.test is a no-op and not needed since the call to taxonomy_vocabulary_save() now causes the reset anyhow.

Just removed that one line.

Dries’s picture

Status: Needs review » Needs work

OK, I re-read the entire issue, and in combination with catch's reply in #149, I decided to go ahead and commit this patch.

I've marked it as 'code needs work' until the documentation in the handbook has been extended.

To those who had the patience and persistence: thanks!

robertDouglass’s picture

Status: Needs work » Needs review

@earnie yes, I meant something like that. But I don't like that solution as much as #153 because we'd have to rewrite all of Drupal.

pwolanin’s picture

Status: Needs review » Needs work

for upgrade docs and follow-up issues

beejeebus’s picture

should i post a port of the registry to the static caching system here or create a new issue?

sun’s picture

Let me apologize for making an already emotional issue even more emotional.

catch’s picture

@justin - new issue please, but link it from here :)

pwolanin’s picture

Per discussion w/ catch - if you have a variable name more complex than just __FUNCTION__, use a ':' to separate the suffix (this will avoid colliding with any other valid function name). e.g. :

&drupal_static(__FUNCTION__ . ':second_var');
boombatower’s picture

or wouldn't the following work?

list($var1, $var2) = &drupal_static(__FUNCTION__);
David_Rothstein’s picture

Hmmm... this might be beating a dead horse, but in response to @catch in #149, I do not see at all what is special about functions like node_load_multiple() and taxonomy_get_tree(). The attached patch shows how I would convert these two functions to a get/set mechanism. They are no more complicated than anything else.

In each case, the set function only gets called when there is new, already expensive data that needs to be cached, so just like @chx said in #147 the performance impact should be almost unmeasurable. The recursion in taxonomy_get_tree() does not come into play, because the set call only happens the first time.

Regarding arrays, I also do not see why get/set would introduce any particular problem. If you store an array, then in order to clear the cache for one element of the array (as opposed to clearing the entire array, which is trivial), you would call the "get" function to bring the whole array into scope, followed by unset(), followed by the "set" function. In the other approach, you would need to do almost the exact same thing, right?: Call drupal_static() to bring the whole array into scope, followed by unset(). So it's three lines of code rather than two, but otherwise no difference -- again, just like @chx said in #147.

So I think @chx's comment in #147 is still a pretty good summary: The get/set approach requires an extra line of code that strictly speaking is not necessary. On the other hand, (I think) it makes the code more readable, forces developers to be explicit about when they do/don't want to store values for later use, and is consistent with Drupal's other APIs.

David_Rothstein’s picture

@boombatower in #162: That sounded interesting, but when I tried it, it didn't seem to work. Something like the following looks like it does work, though:

  $variables = &drupal_static(__FUNCTION__);
  $var1 = &$variables['var1'];
  $var2 = &$variables['var2'];
catch’s picture

Hmm, David's right about the arrays, I guess you can do what you like before you set it back. Still extra work compared to what went in though.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.