Splitted #1596472: Replace hard coded static cache of entities with cache backends.
Create a cache chain implementation (CacheChain
) in the patch, implementing CacheBackendInterface
, which can naturally replace any other cache backend in various subsystems.
The goal of this cache chain is typically the entity problem: being able to cache objects in both the remote persistant cache storage, and statically in the code.
This implementation uses the chain of command pattern for writing over all the backends, this means than any modification will be propagated to all backends. On the opposite, for read operations, it implements a chain or responsability pattern, which means that the first backend that return a value wins, this is built for reading operation performance.
It has a specific mode which can be toggle, and is on per default: when a cache entry is fetched from a later backend, all previous backends will have the value propagated (set) over them.
The implementation here is complete, quite fast, and fully unit tested with almost every limit test case possible. It works quite nicely.
The goal, in a near future is to use this into entity controllers. Another goal is once #1637478: Add a PHP array cache backend gets finalized and commited, use it into the tests for making those faster.
Side note: I have to admit my goal is to backport this to Drupal 7 if the patch is accepted. I think @catch would agree, as the entity_cache module could benefit from this, and be eaiser to maintain. I think that such backport would also greatly benefit to other modules such as Entity API and the Commerce module suite.
Comment | File | Size | Author |
---|---|---|---|
#63 | 1651206-63.patch | 20.16 KB | fubhy |
#56 | 1651206-56-cache_chain.patch | 21.4 KB | fubhy |
#53 | 1651206-53-cache_chain.patch | 21.36 KB | fubhy |
#48 | 1651206-48-cache_chain.patch | 22.39 KB | pounard |
#44 | 1651206-44-cache_chain.patch | 22.39 KB | pounard |
Comments
Comment #1
pounardComment #2
pounardA sample usage, imagine that my MySpecificEntityController uses a specific cache backend:
With this scenario, all entities will get cached naturally using the internal cache backend. But the problem here is I want to use a static cache? Let's assume for the example purposes that DrupalDefaultEntityController doesn't handle a static cache internally.
I just need to write this (in some initialization code somewhere, the place this happens is another issue matter):
And you're done! You controller is transparently doing both static and persistant cache. Time for review!
Comment #3
fubhy CreditAttribution: fubhy commentedWhat about a CacheChainInterface interface that extends CacheBackenInterface with a custom implementation for the constructor that, instead of having a $bin as an argument would accept an array of CacheBackendInterface objects?
Our cache chain for entity caching (or, more generic: "property container / property item caching") should also be aware of PropertyContainers / PropertyItems and therefore accept a list of property items that should be used as tags. So for me it would make sense to have a constructor with 2 arguments: 1) Array of Cache backends and 2) Array of properties that should be used as tags.
There is a whitespace after the closing bracket for the foreach loop.
Whitespace after the opening brackets.
White space after the PHPDocBlock
Good job on this first implementation. It is more or less exactly how I imagined it to look like except for the missing Property-aware implementation with support for tagging and the stuff that I mentioned about the contsructor.
Comment #4
fubhy CreditAttribution: fubhy commentedOops... Didn't mean to change the status... Others should take a look too :)
Comment #5
pounardIf we do that, the CacheChainInterface can't extend the CacheBackendInterface, which means that code using it will not be able to use it. The easier and clean way to achieve this would be to remove the constructor from the CacheBackendInterface, and replace it with a setter injector for the bin.
Alternatively we could implement something like this, but this would require an awful lot of code changes in actual core (although I'm not reluctent to it, but this would be a new cleanup issue):
I don't think this belong to the CacheInterface, messing up with properties is a business matter and should be dealt into the controller, not into the CacheChain. For it to be efficient and easy to maintain, the chain needs to remain generic and agnostic of any business matters.
Having a proxy implementation on top of it for this may be a good idea, but this proxy implementation should live into the entity namespace, not into the cache namespace, and should be entirely dealt with transparently into the entity controller instance and never leak outside.
The cache chain purpose, in being generic, is that we can replace it at runtime. This means that guy who write the entity controller may put a defautl cache implementation, static, persistant, or chain, but the site admin can safely replace it with anything else without worrying about business details. If we start implementing such business logic inside this component, this wouldn't be possible anymore and would be a serious regression from a performance tuning standpoint.
Thanks you for you review, I think that you actually did put the finger into business painful matter --the entity properties, conditions, and all-- but this should be dealt internally into the entity controller, while the CacheInterface being used must be replaceable (for site admins to be able to performance tune their site depending on what they profiled in the real world) without replacing the property logic, which means it must be decoupled from it. This must be it because it's a safeguard against cache bloating that could be provoked by wrong original developer assumption about its code usage, or be caused because of edge case hackish controller usage: we cannot protect ourselves against that, that's why it needs to be injectable and replaceable.
EDIT: Once again, thanks a lot, I will fix the whitespace problems.
Comment #6
pounardThis needs a follow-up issue for the missing tags property on cache objects documentation/specification.
Comment #7
pounardTime to wake up this issue!
Comment #8
fubhy CreditAttribution: fubhy commentedYes. Let's meet asap and discuss the architecture. I already had a small discussion about it with catch yesterday. So... When and where? Let's arrange a meeting on twitter.
Comment #9
pounardThe first patch I made still passes tests (I guess), is clean and nice, I think that only the documentation might be improved.
There is an exception thought: the problem of non specified cache tags in the $cached entry. I will attempt a backward compatible patch @home tonight if I have some time to modify cache entries in order to make them classed objects with correct accessors, such as getTags() getData() etc... keeping the $data attribute public for already existing code.
Does this sound like a good idea?
Comment #10
catchWe should definitely look at this - it's going to need to be its own issue and we might want to remove the bc layer before D8 is released (or possibly not, but it's worth discussing).
Comment #11
pounardI would very much remove this kind of BC layer everywhere for D8, it makes things hard to keep too many compatibility layers because contrib dudes will continue to use it and never switch to right method even if we keep it and mark it as deprecated. Plus, with the cache chain etc.. new helpers, I think that most contrib will have to rewrite their cache handling if they want a clean code anyway.
Comment #12
pounardBut, in the end, keeping it wouldn't mind either. Anyway, I proposed to keep BC just to implement a proof of concept to start with.
Comment #13
catchI opened #1748022: Make cache()->get() return a classed CacheItem object.
Comment #14
pounardThanks. Is that possible to continue this patch until commit before the other one or is it a blocker?
Comment #15
fubhy CreditAttribution: fubhy commentedOne
I think we need to discuss this. Propagation is not really optional in something that I would call a Cache Chain. After all, propagation is one of the key concepts of a cache chain and one of the two things that actually turn this cache backend into an actual chain.
Comment #16
pounardI have to disagree here, this is an optional feature and it holds a use case: we might use it as an horizontal chain of responsability cache, using multiple specialized backends that will each carry their own cached entries, case in which duplicating cache entries in multiple backends must not be done.
Comment #17
fubhy CreditAttribution: fubhy commentedCan you give us an example of a use case in which this would make sense? I am just worried that we are going to push code for something that is never going to happen. I've discussed this with @catch and basically we are both wondering what the use-case for this is. After all we believe that a one of the two essential concepts of a CacheChain is the propagation of stuff to higher level CacheBackends.
Comment #18
fubhy CreditAttribution: fubhy commentedWe are also missing the forwarded $tags array here ($backend->invalidateTags($tags))
Comment #19
pounard@#18 Nice catch! Needs to be both fixed and tested.
Comment #20
pounardHere is a new version just fixing #18 and adding a new test. I sadly cannot run tests anymore on my dev box (I can't figure out why) so it's a blind test.
Comment #22
podarok+ trailing whitespaces
Comment #23
catchYeah this really needs a use-case to justify it, and since we don't have one with core I'd much rather leave that to a contrib backend to implement if someone turns out to need it.
Comment #24
catchThis ought to help with #1187726: Add caching for configuration / rework config object loading (Was: Memory usage and i/o from config objects), bumping priority to major.
Comment #25
pounardAlright, the patch is usable as-is, we need to fix the tests and this should be comitable. I will try tonight.
EDIT: Only blocker will be the cache tags, but I think we can commit the patch before getting the tags usable.
Re-EDIT: Doing it right now.
Comment #26
pounardHere is a new patch which tests should pass. Note that:
Comment #27
fubhy CreditAttribution: fubhy commentedWhitespaces.
Whitespace here too.
-----
I am still not entirely convinced that we should commit the propagation toggle at all (even if it's already covered in the tests and helps there in some way). I would like to see a usecase first. Sorry for insisting on this one so much :).
Also, the docblocks in general are not that nice yet and could use quite some improvement. I don't know, however, if we might want to postpone that (considering that catch marked this issue as major and it blocks that other issue over at #1187726: Add caching for configuration / rework config object loading (Was: Memory usage and i/o from config objects)).
Comment #28
pounardFor the get propagation feature, I agree, but right now it helps testing some edge cases I'd like to keep unit tested. I will try later to see how I can arrange that, but right now, removing it would mean removing tests, we can live with that until we proceed to a second pass or follow-up.
I agree to documentation improvement, any suggestions will be greatly welcome.
EDIT:
I'm not posting any further whitespace fixing patches until the tests actually passed on test bots.Oh! It passed when I was writing this.Comment #29
fubhy CreditAttribution: fubhy commentedOkay. That works for me too... I am not opposed to leaving it as it is for now and unblocking that other issue first. The tests are green now. If you want I can work on the documentation now... I have two hours of time right now.
Comment #30
pounardWhitespace fixes.
Ok, sure, one thing thought, I'd like to keep all information which stands in the CacheChain top class PHPdoc (pattern names, writes being slower than read, etc...) I think it's both educational and informational, and describe quite well what exactly the class does. Aside of that, my english surely isn't perfect so any fixes are welcome :)
Comment #31
fubhy CreditAttribution: fubhy commentedYes, I will keep that of course. I like that too :). Ok, I will give it a go now. Brb :)
Comment #33
pounardOh cr*p I used the wrong patch to fix whitespaces! Here is the right one (sorry). @fubhy please check you are using the right one for doc patch.
Comment #34
fubhy CreditAttribution: fubhy commentedStill working on this. I found some other issues that need to be addressed too... Sorry. Will post the patch asap but not later than tomorrow morning.
Comment #35
fubhy CreditAttribution: fubhy commentedOki, so this is what I did:
Comment #36
fubhy CreditAttribution: fubhy commentedWhitespace fix.
Comment #37
pounardRandom notes:
0 < $index
instead of$index > 0
(in both get and getMultiple().checkCacheExistsNowhere
doesn't make sense IMO, each test is a unit test and I don't think it worth factorizing here.Nice catch for the get bug, you indeed did a nice fix, but I don't know why you totally messed up tests while they were working right.
Comment #38
fubhy CreditAttribution: fubhy commentedSure, we can do that... I don't really care which way we do it. I just thought that it makes more sense to iterate over the backends in reverse (last backend up to first backend).
The generic base test class for cache backends has a test for invalidating tags.
That one is a really simple method comparable to checkCacheExists in the base class. It makes the tests much more readable and shorter.
The generic test backend exposes that method. All other backend tests work the same way too (Memory backend, Database backend, etc. all use that method for setting up the cache backend object).
We have to do that or something similar if we want to use the generic base test class. We are testing with multiple bins sometimes (e.g. in the invalidateTags test) and if we don't do that we end up with the same objects for the backends in the cache chain. We don't want that. So either we put them into properties on the cache chain object (I understand that that sucks somehow) or we put them in an Array keyed by the $bin or something like that.
Comment #39
pounardWe are reusing the
$cache->tags
property, which is undocumented. I don't care weither or not other tests are using it, as it is non documented and non specified, we need to keep this todo until it has been desambiguated.Tests are not meant to be short, they are meant to be complete and fully readable in one pass, without having to navigate throught methods. It's a scenario, not optimized code.
This is not a backend test, other backends have been made for being extendable for contrib, this one is meant to test a specific implementation. Anyway, if we provide this method, we must use it in order to set a test class property and not for doing magic.
That test don't need to be generalized, it is based upon the fact that all contrib test backends will pass the generic cache backend tests.
Either way if we want to generalized it, we can still create a
createFirstBackend()
method, acreateSecondBackend()
method and acreateThirdBackend()
method for the subclass to override. This would be safer, less magic, and more encapsuled.Comment #41
fubhy CreditAttribution: fubhy commentedBy that definition we would not be able to use the $cache->data property either. It's undocumented as well ;). The tags property is just as documented/undocumented as any of the other properties on a cached item.
We still need to provide tests for the basic functionality that is expected for a cache backend. That's what the generic base test class does. And that's why we should make use of it. Even though this is a really abstract implementation for a backend it still is a backend. And so we should test it as a backend too.
Comment #42
catchI'd expect this to be extended or re-implemented in contrib to provide verification of cache items in 'local' cache storage. i.e. if I chain an APC and Database backend together, then I might want to check a timestamp vs. the APC and database storage to determine whether the stuff in the APC storage is valid or not (or there are other ways to handle this, but still there are use cases for extending it).
Comment #43
pounardOh right, I understand what you did there.
I'm not sure this is necessary, but[EDIT: Necessary or not,] it is a good idea. We should split it into two class instead.Comment #44
pounard@catch I'm not sure if this makes sense, as long as every backend actually implements fully the cache backend interface, this is supposed to work.
But, addressing this idea, I splitted up the test into two classes, fisrt one is the good old one, with some of extra tests of fubhy added back (one commented out because I was too lazy to adapt it) in which I added the 4 backends instanciation as overridable methods, and the second one an implementation of the generic backend test.
EDIT: Of course I kept all fubhy's fixes in the CacheChain class.
Comment #46
pounard#44: 1651206-44-cache_chain.patch queued for re-testing.
Comment #48
pounardRe test...
Comment #50
pounardWTF TESTBOT!
Comment #51
catch@pounard the problem with 'local storage' is that even if you successfully clear stuff on one server, it doesn't clear on another. We absolutely don't need to fix that here, it's just a reason that this might get extended in contrib.
Comment #52
pounardI think that this local storage problem is a recuring one, and that is wider than just the cache chain issue: if we correctly encapsulate the responsabilities, this should never impact the chain itself.
Comment #53
fubhy CreditAttribution: fubhy commentedLet's see if the testbot likes this...
Note: I renamed the CacheChain class to ChainBackend so it is in-line with the existing backend namings... Namingconventions, yai! Consequently I also renamed the test classes of course.
The tags test is back in as well.
Comment #54
fubhy CreditAttribution: fubhy commentedI just noticed that this docblock is utterly wrong :). Apart from that I don't think that I have anything to add to this. For me it is ready to go now.
Comment #55
pounardWith the docblock fix and if tests passes it will be RTBC for me.
Comment #56
fubhy CreditAttribution: fubhy commentedOk, this is it I think...
Renamed once more to: BackendChain - Makes more sense per agreement of catch, pounard and me.
Comment #57
fubhy CreditAttribution: fubhy commentedConsidering #55 this is RTBC now :)
Comment #58
pounardRTBC for me too.
Comment #59
pounardOups... Waking up this. I accidently removed the tag some posts upper.
Comment #60
webchickProbably makes sense to have catch look at this.
Comment #61
catchOverall this looks great. I'm fine with not handling the local caching issue here, that's probably better left for the APC issue.
I mainly found comment/style issues but there's quite a few of them, and some dead code:
The summary should be one line I think. Not sure if that's true for classes but it probably makes sense here anyway.
There's a few typos 'responsability', 'cackend' (although I like that one) and some of the description here is very clunky. If I have time later I could try to take a look at it but it could probably use looking over by someone not familiar with the patch for clarity.
Please reverse the yoda conditions. If we want to start using this (I personally can't stand them though), then it needs a separate issue rather than being snuck in.
Also we don't abbreviate variables, so $return vs. $ret.
This is going to conflict a bit with the invalid vs. deleted patch but I think it'll be fine to adapt once that gets in.
This has been removed by another patch already.
Typo 'voluntarely'. Also we don't need comments like "this is important, do not change!", or at least it should be rephrased to "this is essential to avoid false positives".
Should be "all backends", "of them" isn't very clear.
Comment #62
fubhy CreditAttribution: fubhy commentedAgree. One line summary + native speaker looking over it.
I think this was copy&pasted from the other backends. They use abbrevations in the exact same spot too ($ret). :*(
I will do a re-roll later and will also try to find a native speaker who is unfamiliar with the patch to improve the documentation.
Comment #63
fubhy CreditAttribution: fubhy commentedTried to address @catch's concerns. We will still need someone who didn't work on the patch to review our documentation (especially for the interface).
Comment #64
pounardThanks fubhy, if this passes tests, still RTBC IMO.
@catch I now you dislike yoda conditions, but believe me, when they're hidden in a far far away loop deep inside a cache chain, I do really like them, IMO they make it more readable (and from what I can remember, it's not illegal by coding standard to use them). About the $ret abbr. I do not understand the rant against, it's probably the only abbr. I'd keep because it's really really obvious of what it is when reading the code, and make some ugly 100+ chars line more readable:
$this->backends[$i]->set($cid, $return->data, $return->expire, $return->tags);
this hurts! Drupal and 200+ char lines is something that make my eyes bleed on a per hour basis.Comment #65
Stalski CreditAttribution: Stalski commentedGood work. RTBC for me too
Comment #66
catchTrying to help with docs a bit:
If that looks OK I can either fix on commit (or a new patch would work). Apart from that no further complaints.
Comment #67
fubhy CreditAttribution: fubhy commentedThat looks good! Let's go with that :).
Comment #68
catchFixed up the docs a bit more with fubhy in irc. Committed/pushed to 8.x.
Comment #69
pounardGreat! Thanks.
Comment #70.0
(not verified) CreditAttribution: commentedAdded considerations about D7.
Comment #71
Wim LeersFollow-up issue: #2541432: Follow-up for #2231595: Document why ChainedFastBackend cannot use BackendChain.