Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
MySQL doesn't perform well enough for this kind of thing
Proposed resolution
Move all database interaction to a service.
Remaining tasks
None
User interface changes
None
API changes
- All database interactions to be moved to
NodeStatisticsDatabaseStorage
update statistics_get()
is deprecated, use\Drupal::service('statistics.storage.node')->fetchView($id)
instead.
Beta phase evaluation
Issue category | Task |
---|---|
Issue priority | Normal |
Unfrozen changes | Unfrozen because it helps with migration for node counters. |
Prioritized changes | The main goal of this issue is performance. Previously Statistics has had a bad name because of the number of database reads and writes. This patch allows statistics to move to Mongo, Redis, Couch or another fast data store. This patch also provides a clean way to access the storage which can be used by migration. |
Disruption | None |
Comment | File | Size | Author |
---|---|---|---|
#199 | interdiff.txt | 1.92 KB | timmillwood |
#199 | improve_statistics-1446932-199.patch | 20.8 KB | timmillwood |
#181 | improve_statistics-1446932-181.patch | 20.79 KB | mallezie |
#180 | interdiff.txt | 8.65 KB | mallezie |
#180 | improve_statistics-1446932-180.patch | 20.82 KB | mallezie |
Comments
Comment #1
mikeytown2 CreditAttribution: mikeytown2 commentedjStats does something similar to your proposal (doesn't use a cache table though)
http://drupal.org/project/jstats
Comment #2
timmillwoodRight, so jstats writes to one table each page view, then writes to another table on cron.
I think it would be better to use Drupal core cache or queue system then it can hook into another external caching or queuing system such as memcache or redis more easily.
Comment #3
jcisio CreditAttribution: jcisio commentedThe trick is to replace "N updates" with "N inserts + 1 update" which is known much cheaper in MySQL. It may be not correct/applicable for another backend.
Comment #4
timmillwoodHere's a quick patch for this which creates a cache table for the node_counter and all reads are cached.
TODO: add some way of clearing the cache. Open for offers on when to clear it. Cron run? max-age header? something else?
Comment #6
jcisio CreditAttribution: jcisio commented@timmillwood I don't see why a
SELECT * FROM node_counter WHERE nid = :nid
is not as good as
SELECT * FROM cache_node_counter WHERE cid = :cid
The first one is even better in some db system, as nid is an integer. The second is only better with another cache backend other than the core default one.
Comment #7
timmillwoodThe advantage of using another cache backend is a big one.
This is a small step in the performance of the statistics module. More to come.
The test is failing because it's reading from cached data, and the first read from the database is when there is no data in there, which gets cached.
Comment #8
timmillwoodRe-thinking this, going to write the node_counter updates to a Drupal Queue, then to the node_counter table on cron. (patch coming soon)
Comment #9
timmillwoodThis patch seems to work, but I am still looking through issues in the test.
I just wanted to get something up here for now to get my general ideas across.
This patch writes node views to a queue, then each cron run these are written to the node_counter table. The real advantage of this comes when using an alternative backend for the queue.
Comment #10
timmillwoodComment #12
rjbrown99 CreditAttribution: rjbrown99 commentedI'm starting to play around with a variant of #4 with Drupal 6. It's working well, and my take is to set the cache expiration via a variable, then only cache for the current time + that variable. I set it to 15 minutes or 900 seconds by default. So I'm not clearing at all via cron, just allowing the cache to expire naturally.
I'd move to another stats module, but I have a bunch of stuff tied to core statistics so it's easier for me to just improve it than rethink everything else.
Here is what my D6 version looks like for reference (not that it will ever have a chance of being added.) One note to anyone else that tries to use it, my version of Drupal 6 has a backport of REQUEST_TIME which is not in D6 by default. If you try to use this you need to replace that with time().
I'll probably play with drupal_queue backport next with the other patches.
Comment #13
timmillwoodThis patch is basically the same as the last just with an updated test so it should pass.
Comment #15
timmillwood#13: statistics-queue-1446932-13.patch queued for re-testing.
Comment #16
timmillwoodWow! Not sure why but it seemed to fail every related test, worked fine locally!
Comment #18
timmillwoodAfter a local pull the test is now failing, but the patch still works. There must've been a change which is causing the test to fail.
Comment #19
tstoecklerIn the tests, you should use
instead of calling the URL on your own.
(Btw: cron.php was nuked recently, so that will be the cause of some of the failures.)
Comment #20
timmillwoodDon't think the tests are 100% there, replaced calling the cron URL to use $this->cronRun();
Comment #22
tstoecklerSome (mostly) minor clean-up stuff:
Leftover call to old cron.php
Trailing whitespace.
This seems like it could actually be a perfomance hit, even if it's only a small one. Should this be benchmarked for people with the default queueing backend?
This is not needed now that we use
$this->drupalCron()
Comment #23
timmillwoodThe full bootstrap will be a performance hit, but once dependency injection and other symfony things are sorted out, I will move it over to use that, and lower the bootstrap.
$base_url is needed for statistics.php call.
Comment #24
timmillwoodFinally got it to pass the tests locally, fingers crossed all looks ok.
Comment #25
msonnabaum CreditAttribution: msonnabaum commentedI dont see the point of this. We're just trading one INSERT for another. This assumes that you have a high performance (non-mysql) queue backend and that's why it's a better option. I think a better approach would be to have the stats backend be swappable.
Comment #26
timmillwoodHere's the first patch for this, but returns the following error as a response from statistics.php
Fatal error: Interface 'Drupal\Core\Statistics\StatisticsBackendInterface' not found in /Users/tim/Sites/statsback/core/lib/Drupal/Core/Statistics/DatabaseStatisticsBackend.php on line 6
Comment #27
timmillwoodThis one is now working.
Comment #28
RobLoachWhat if we were to move it all into the Statistics module itself? Also don't think we need StatisticsBackend wrapper since you can just re-register "statistics" with a different class if you wanted.
Comment #29
RobLoachHad the wrong namespace in the interface.
Comment #31
RobLoachHuh?
Comment #32
timmillwoodThis patch is not working for me. It causes statistics.php to return:
Symfony\Component\DependencyInjection\Exception\InvalidArgumentException: The service definition "statistics" does not exist. in Symfony\Component\DependencyInjection\ContainerBuilder->getDefinition() (line 690 of /Users/tim/Sites/statsback/core/vendor/Symfony/Component/DependencyInjection/ContainerBuilder.php).
Comment #33
timmillwoodI was able to resolve this by clearing cache.
Comment #34
timmillwoodThis patch added variable get to statistics_init to make changing the backend class a bit easier.
It also adds a get method which is uses in statistics_get
Comment #35
timmillwoodplease review
Comment #36
timmillwoodMight as well add delete too ;)
Comment #38
timmillwoodThis passes all the tests on my local machine.
Comment #39
timmillwoodAdding reset method in to set the day counter in node_counter to 0.
Comment #41
timmillwoodNot sure why this is failing testing.
Comment #42
timmillwoodComment #43
timmillwoodComment #45
timmillwood#31: 1446932.patch queued for re-testing.
Comment #48
alansaviolobo CreditAttribution: alansaviolobo commentedComment #49
timmillwoodBringing this issue back to life, planning to move everything to a service as described in https://www.drupal.org/node/2306083 and https://github.com/palantirnet/hugs/blob/master/src/HugTracker.php
Comment #50
timmillwoodFirst step in regards to making this work
Comment #54
dawehnerSome interface could we certainly useful, otherwise, well it would be totally pointless :)
Comment #55
timmillwoodAdding an interface to the initial service.
Comment #57
timmillwoodAdding all database queries from statistics.module to the service, apart from the one in the statistics_title_list function as this is a little more complex because it joins other database tables which other backends wouldn't be able to do.
Comment #59
rpayanmRerolling...
Comment #61
timmillwoodSwitching statistics_title_list function to call the statisticsTitleList storage service method too, I think this is now all DB queries into the service.
Comment #62
timmillwoodOops, empty patch.
Hope this one works!
Comment #65
timmillwoodComment #68
timmillwoodComment #69
timmillwoodRemoving the "statistics_title_list" function and calling the storage service directly from the block plugin.
Comment #70
dawehnerWhat about naming the class to make it clear that its for database ... what about using StatisticsDatabaseStorage
Do you mind indenting this :)
A little bit of docs about what is stored in the stats would be great
I still don't get why we don't make some more useful interface, given that we have a completely new API anyway.
This sounds like a unnecessary API removal.
Comment #71
timmillwoodWhat were you thinking? Maybe better as a new issue? Already thinking of making. A new patch to make it entity counter rather than node counter.
Statistics_title_list function made no sense as it was just calling statisticsTitleList method, now just calling the method directly from the block.
All other comments sound fine, will update patch.
Thanks for reviewing!!!!
Comment #72
timmillwoodUpdating re: #70
Comment #73
timmillwoodStatus: Needs review
Comment #75
timmillwoodOops, looks like I missed the new file from the patch when renaming.
Comment #76
dawehnerWell, I think we should avoid API breakage for things we introduce ....
Comment #77
timmillwoodSo you think we should keep all the function in statistics.module and just call the new service not to break the API?
Comment #78
Crell CreditAttribution: Crell at Palantir.net commentedAt this stage in the cycle we probably have to keep the old functions and mark @deprecated. But be aggressive in what we mark @deprecated now that there is a service to use instead.
This isn't a static method, so inject the service properly. \Drupal:: inside a class is a code smell, always.
The whitespacing here is AFU. :-) Please standardize. (Same in other methods, too. Looks like default indentation from an IDE that wasn't cleaned up.)
Is this really a check that's needed? If so, sounds more like an assertion type check.
Except there's a 3rd value here that's also returned.
Conventionally we're using $query, not $q, for query objects. Single-character variable names are almost never useful.
Period at the end of the line. (Same for other docblocks.)
Also, this description is wrong. recordHit() updates data. The text here suggests its just a reporting tool, which is not at all the case. Don't just duplicate the @return doc unless that's really *all* that happens.
Ibid.
As above, this method *modifies data*. The summary line should say why you'd call it, not replicate @return.
Don't return a DB object. That's breaking encapsulation. Return just the data you're asking for.
Looking at it again... is this even a relevant method for the interface? Seems like an internal implementation detail of the DB implementation (and thus should be only a protected method).
If the state system is always updated at the same time as the service, should it not be internalized into the service?
if ($storage->needsReset()) {
$storage->resetDayCount();
}
That needsReset is pulling from the state API under the hood is not this hook's business.
Comment is now redundant.
Same here. Let's hide the state API usage behind the service as well. Then ask how much should be shared between possible implementations of that service.
Seems this function should now be marked @deprecated? It's a one liner now, or should be. (return Drupal::service()->fetchViews());
Comment #79
timmillwoodThanks @Crell!
Implemented suggestions from #78.
Comment #80
Crell CreditAttribution: Crell at Palantir.net commentedDreditor says there's a stray whitespace on line 24.
@deprecated markers should indicate what to use instead.
This is now calling a method that is not on the interface, which means implicitly coupling it to a single implementation. You don't want that. :-)
Stepping back, what's the use of this function/method? How can it be described in a non-DB-dependent way?
Comment #81
timmillwood3. This is used in the block and the output passed into node_title_list. The query uses more than just the statistics' node_counter table.
Open for suggestions.
Comment #82
timmillwoodFixed 1&2
Added
statisticsTitleList
back to the interface. Developers overriding the interface will need to make sure it returns a database result object for node_title_list. I've added a @todo for me to rework this so that the minimum is needed from the service and a database result object for node_title_list is created elsewhere.Comment #83
Crell CreditAttribution: Crell at Palantir.net commented#82: Here's the problem. Say you write a MongoDB implementation of the storage class. Now statisticsTitleList returns... what? Having it return an SQL-related object makes no sense at all, and is probably not even feasible. That's not going to fly.
Tim: What is the data that a caller needs to get back from that method, and how would they use it? Specifically, can you describe it in a way that does not involve the word "result" or "database"?
Comment #84
timmillwoodHuge rework of the Statistics popular block to make it less dependant on the database.
Comment #85
dawehnerIs there a reason to not directly use $node->urlInfo()
Nitpick: Missing space after
'node_list'],
Needs some docs, if possible
If you want you can also use $request->server->get('REQUEST_TIME')
Comment #86
timmillwoodImplemented the suggestions from #85, apart from 4 because
$request->server->get('REQUEST_TIME')
looked to add a bigger overhead than just usingREQUEST_TIME
.Would be awesome to see this in before the next Beta next week.
Comment #88
timmillwoodThe test was looking for
\Drupal::l($node->label(), $node->urlInfo('canonical', ['language' => NULL]))
and now we're using node entity objects for the block links this wasn't working. Updated the block links to\Drupal::l($nodes[$count]->getTitle(), $nodes[$count]->urlInfo('canonical')
and the test to the same (removing the['language' => NULL]
option.Comment #89
timmillwoodComment #90
Crell CreditAttribution: Crell at Palantir.net commentedGetting super close! :-)
Space needed before {
Drupal::l() should be considered deprecated (even though not officially). The preferred way to pass around links is to use the Url object, which can safely be put into a render array and handles its own cache tag information.
And even baring that, \Drupal:: calls inside a class are a no-no. :-) Inject the generator service yourself.
Nitpick: Both implies 2 values, not 3. Also, Drupal uses an Oxford comma like all civilized people.
Typo on "strign".
Also, the "column to order by" is not descriptive. How do I know what the columns are? Column is an SQL-ism.
Instead, say the counter to order by, then list the valid options and what they mean. (That in the SQL implementation they happen to map to a column in a table is a coincidence.) I'd go a step further and then stick an assert() statement in the implementation to verify that $order is one of those allowed values. :-) (I blame AkiTendo.)
The docblock should explain what reset is and why it would be needed.
Comment #91
timmillwood@Crell - do you have any examples for 2.?
Comment #92
timmillwoodImplemented everything from #90
Comment #93
Crell CreditAttribution: Crell at Palantir.net commentedForgot to remove the commented out line. :-)
I've no idea of this is correct or not. :-)
Regarding returning false from a method: http://www.garfieldtech.com/blog/empty-return-values
If we're allowed to use assertions, that would be ideal. If not, throw \InvalidArgumentException.
Needs linebreak between first line and description. Plus the first line should be properly structured. Suggested:
"Returns true if it's been more than 24 hours since daily totals have been reset."
Comment #94
timmillwoodSuggestions from #93 (apart from 2. which ideally needs someone like Wim to review).
Comment #95
timmillwood@Crell is fussy when it comes to comments ;)
Comment #96
Crell CreditAttribution: Crell at Palantir.net commentedHey, it's Drupal's coding standard, not mine. :-) Looks good to me, modulo verifying the cache context info. RTBC aside from that. Nice work, Tim!
Comment #97
Fabianx CreditAttribution: Fabianx for Drupal Association commentedMissing doxygen.
Should this be per user.role as in the role is different or is this permissions based?
In that case should use user.permissions.
--
If we add the node list tag here, we could remove the tags from above as that one will override the other ones.
Do we really need the node list tags here?
Overall +1 to add cacheable metadata though :).
Comment #98
timmillwoodTaking on suggestions from #97
Thanks Fabian!
Comment #99
Crell CreditAttribution: Crell at Palantir.net commentedAaaaand scene. :-)
Comment #100
timmillwoodThanks Crell!
Fingers crossed for Wednesday's beta!
Comment #101
Fabianx CreditAttribution: Fabianx for Drupal Association commentednit nit nit nit - superfluous space at end of line, can be fixed at commit time.
RTBC + 1
Comment #102
timmillwoodUnassigning from me.
Comment #103
alexpottThe format for a deprecation is like this:
Needs to be indented and have space from end of line removed.
Constructs
Inject the request stack to get the request time. That makes this more unit testable. You could a have a protected method to get the request time from the current request.
Missing full stop at end of first line and needs a blank line after the first one.
Needs indenting. Plus IDs not ids.
The patch needs a general search for id/ids in documentation and replacing with ID/IDs.
Clean is an odd method name for deleting a specific node - how about deleteNode() and accept a node entity instead of just the nid - maybe an alternate storage would want to key by uuid. Hmmm I see all the other methods are accepting an nid. So leave it at that but this method should have a better name.
The 24 hours sounds like an implementation detail and not something that should be on the interface.
Comment #104
alexpottAfter discussing with @xjm and given #103.1 and the fact I can't see how this meets the criteria I'm going to postpone now. Since actioning the other feedback in #103 is taking resources away from fixing what we have to fix to get D8 out of the door.
I'm sorry @timmillwood I wish we'd done the beta evaluation earlier.
Comment #105
timmillwoodI will be working on the issues in #103 so switching this back to 8.0.x for future testing.
I'd still love to see this in 8.0.x as it's one the biggest improvement in the Statistics module ever.
Added a Beta phase evaluation
Comment #106
Wim LeersFound additional code style problems on top of those reported by Alex. And more severe problems, but nothing truly severe :)
Shouldn't this also be doing
getTranslationFromContext()
, to ensure you get the right translation?Suggests missing test coverage.
We use the short array notation for new code.
Should not be a space between
$items
and[]
.This looks correct; but I'd still like to see at least one integration test for it.
You can use
assertCacheTags()
andassertCacheContexts()
; they make it easy to check this :)(This is what you pinged me about for a review on Twitter — sorry for the delay!)
Strange indentation here.
Comment #107
timmillwoodwon't fix:
#103.9 - The aim of this function is to remove a node's view count if the node is deleted, so clean seemed right. I was also trying to steer away from using CRUD terms and node (because planning to make this work with all entities as soon as this patch is committed).
#103.10 - The reset is for the daily view count, so reseting at any other timeframe than 24 hrs would be odd.
Todo:
#103.5
#106.4
Comment #108
Wim LeersView statistics for all entity types? :)
Comment #109
timmillwoodYes! awesome, right?!
(might need to add an option to choose which entity types you want counted and which not)
#2532334: Count all content entities views in the Statistics module
Comment #110
timmillwoodAdding #103.5 and #106.4 to the patch in #107.
Comment #111
hussainwebFixing a trailing whitespace. I also wanted to see if we could reconsider adding this in 8.0.x. It will make migration much more simpler. I marked #2500521: Upgrade path for Statistics 7.x and 6.x as postponed on this issue. @alexpott, thoughts? I will work on a beta evaluation if you think it makes sense.
Comment #112
hussainwebI noticed that there was already a beta evaluation and updated it. I am going out on a limb and marking this as unfrozen as it helps migration (which is unfrozen). If that is not correct, please feel free to remove the unfrozen clause.
Comment #113
timmillwoodThanks for the support @hussainweb, I too would love to see this in 8.0.x, got loads more planned for Statistics in 8.1.x
Comment #114
Wim LeersThis comment is wrong, because it doesn't generate the render array for the entire block, it is merely a helper method for
build()
, which is the method that would fit this description.The
\n
between these shouldn't be there.$counts
feels like a strange variable name here.First off, it's
'contexts'
, and it needs to be an array.Second: where does the permission checking happen? I.e. I don't think we actually want this cache context? Perhaps node access grants are being checked somehow (I don't see where though?), and that's why? We have the
user.node_grants:view
cache contexts for that though. See https://www.drupal.org/developing/api/8/cache/contexts.Nit: s/module//
Indentation errors in these and other docblocks.
Don't method chain calls belong on new lines?
Please use
Cache::mergeTags()
instead ofarray_merge()
.Comment #115
timmillwoodUpdating with suggestions from #14, apart from #14.8 because
Cache::mergeTags()
only accepts 2 params, witharray_merge()
I can pass 4 in.Comment #116
Wim LeersThis is not yet updated.
RE: #14.8: In other tests, we then just call
Cache::mergeTags()
several times. Yes, annoying, yes, ugly, but that's because we chose to micro-optimize the performance ofCache::mergeTags()
for the most common case.Comment #117
alexpottHow exactly does it do this?
Comment #118
hussainweb@alexpott: Right now, there is no clean reusable way to write to node_counter table. The statistics module just directly runs queries or calls db_merge() to write. I started with doing the same in the migration destination handler and then realized that a storage service would be better. I then found this issue which does the same thing. See #2500521-4: Upgrade path for Statistics 7.x and 6.x for my comment there and the partial patch I have uploaded.
Comment #119
timmillwoodUpdating comment from #116
Adding annoying and ugly
Cache::mergeTags()
to test from #114.8.Comment #120
timmillwoodneeds reroll because of #2543554: Clean up Help & Statistics blocks
I'll take a look tomorrow (GMT), unless anyone else takes it before then.
Comment #121
hussainwebRerolled.
Comment #122
alexpott@hussainweb well we have a working d6 migration without this.
Comment #123
timmillwoodLooks like these are left over from the merge.
Comment #124
Wim LeersAs promised in #2543554: Clean up Help & Statistics blocks — which broke this — helping to resolve conflicts.
(No credit please.)
Comment #125
hussainweb@alexpott: The D6 migrations do not migrate the statistics themselves (which are stored in node_counter table).
I did a test migration from D6 to D8 just now and confirmed that the settings for Statistics module is migrated (which is d6_statistics_settings.yml), but not the data inside node_counter table itself. In fact, I did a grep on whole migrate code base and didn't find anything that would migrate data in node_counter table.
Comment #126
timmillwood@hussainweb - Because
recordHit($nid)
only records one count for the node id I'm not sure it'll help the migration, however we could add to the service to make it work better for migrations.Comment #128
phenaproximaThis blocks the migration path for Statistics 7.x to 8.x: #2500521: Upgrade path for Statistics 7.x and 6.x
Comment #131
timmillwoodThe issue looks to be related to the caching tests.
Comment #132
timmillwoodComment #133
timmillwoodPatch doesn't apply anymore ;(
Comment #134
kostyashupenkoRe-rolled https://www.drupal.org/node/1446932#comment-10193353
Comment #136
timmillwood@kostyashupenko++
Thanks so much for re-rolling this.
So what, it doesn't pass the tests, at least it now applies. This has been on my todo list for over 6 months!
I will try to find the time over the next week to get this passing so we can get it into 8.1.x
Comment #137
Wim LeersHere's a quick review for you. Mostly nits. Because mostly looking good already :)
Since https://www.drupal.org/node/2549139, we shouldn't inject the entity manager anymore.
So is this
int[]
?We can typehint:
array $nids
Better yet would be to use
\Drupal\Core\Render\RendererInterface::addCacheableDependency()
, then any entities that have max-age set will also have that work, and less manual code here.80 cols.
Is it allowed to return other things besides
totalcount
,daycount
andtimestamp
?If not, then shouldn't we introduce a value object rather than returning such arrays? That'd enforce alternative implementations to not return additional things.
Eh… wow :P
What about:
Why not just
statistics.storage
?Comment #139
timmillwoodFixed many of the suggestions in #137 but also wanted to retest against 8.2.x
Comment #141
timmillwoodFixing tests.
Comment #142
Wim Leers#139 addressed everything in #137 except points 4 and 6.
#141: interdiff is missing.
Comment #143
mparker17Here's an interdiff between 139 and 141.
Comment #144
timmillwoodImplementing points 4 & 6 from #137.
Comment #145
andypostnit
Comment #146
timmillwood@andypost - Thanks, sure that can be fixed on commit ;)
Although I'll provide a new patch ASAP.
Comment #147
dawehnerIts still kinda weird for me to introduce an interface which is not entity generic. Could we not provide this interface, but throw an exception for other entity types at least for now?
Comment #148
timmillwood@dawehner - It's always been my plan to make this thing entity generic, but baby steps, nodes first, then open it up. Although I think you make a good point about making at least the interface generic now.
Comment #149
timmillwoodI have addressed the item from #145.
I've also updated many elements such as replacing "node" with "entity", and "nid" with "id", but I think we are a little limited in how generic we can make this in the scope of this issue.
I would like to propose a follow up issue to replace the node_counter table with a statistics table, and make this fully generic. Then on each page load find a way to know which entities have been rendered and count the view of those.
Comment #150
timmillwoodComment #153
timmillwoodFixing broken test.
Comment #154
Manuel Garcia CreditAttribution: Manuel Garcia as a volunteer commentedCan we do this now since #2408013: Adding Assertions to Drupal - Test Tools. is now fixed?
Comment #155
dawehnerWhat about getting rid of the needsreset method and handle that interally inside $storage? So what just having a
reset() method which does both?
Comment #156
timmillwoodThis should address items from #154 and #155.
Comment #157
dawehnerI think what you would use is something like
assert('in_array(\'total_count\'...')
This allows the code to not run during runtime then.
Comment #158
timmillwood#157 resolved.
Comment #159
timmillwoodAs this issue in now approaching 4.5 years old it'd be great to get further reviews / RTBC.
Comment #160
dawehnerYou could load the entities by getting the entity storage directly.
You could use
CacheableMetadata::createFromObject($node)->applyTo($item);
and get rid of the renderedJust from the interface it is not obvious that this also changes state, but I guess this is fine? Still, its own DB update per get query. If we would separate it, we would potentially have less update queries.
nitpick: this isn't needed anymore.
I love how this interface is not aware of the entity type!
Some documentation what this object is could be helpful. I'm wondering whether we can come up with a slightly better name, maybe like StatisticsViewsResult, similar to AccessResult?
Is it okay to have this kind of "api change" on statistics.php? People might still pass along the nid
How would it not be instanceof the interface? Isn't it guaranteed to do so?
Note: the patch contains a lot of unnecessary changes like this one. You know, it just makes it a bit harder to review :)
Note: We create two request objects here, is this really what we want? The second one should not be needed actually
Should we have some fallback built in?
Comment #161
Wim Leers#160.2: disagreed, this is the best practice. It'd make sense maybe if it's in a critical path, but this is only called when rendering anyway, so the renderer service will definitely already be initialized.
Comment #162
BerdirReviewed separately from @dawehner, looks like we found many similar things.
Would it make sense to make this multiple or have a multiple version of this?
Especially for the views field, this could be useful if you don't have do 50 queries on a list with 50 nodes.
also, static caching might make sense, again to avoid multiple queries when showing day and totalcount in the same view.
I don't feel like setting the state belongs into this method, it's just an API, it should assume for what it is used (some search specific scale)
no longer needed.
This is all node specific, talking about entity here seems strange to me.
does this really need a return value?
I think this needs more description for what is expected from the implementation. you can't just always delete the data, it is your responsibility to check and only do it once per day. (or do nothing, if your backend does it automatically)
I think we no longer use interfaces for value objects?
Isn't this an API change? Doesn't seem necessary or related?
shouldn't use a fully qualified namespace here.
I don't think we can just remove a function like that?
clean() doesn't seem very self-explaining to me, sounds more like a general purpose garbage collection method that we have elsewhere.
We have fetchViews() (and recordHit, why not view?), so why not deleteViews($id)?
all those changes look unrelated?
this is a bit of a regression performance wise... bunch of additional services
Comment #163
BerdirWe have a specific list of nodes now in this method, so we no longer need the list cache tag *here*. We do however still need it in the block in general.
Comment #164
malleziePatch still applies, so starting to work on feedback.
Comment #160
Comment #162
#163
Not sure what to do here. Any pointers are welcome.
This last one, and the $nid -> $id change which could be a BC break are only two not fixed here according to me. Let's hope to finish this one before 8.2 closes. If i read this correctly we can change a lot of things afterwards when we get this one in. Tentatively moving to major therefore.
Comment #166
mallezieFix some tests.
Comment #167
timmillwoodI guess for #163 we just remove the cache tag?
Also $nid -> $id, I think it's safe to keep using $nid for now.
Comment #169
mallezieThanks cache tag and nid->id changes removed.
And a hunch for the last test fails. Not sure however, let's see.
Comment #171
mallezieAnd missed on nid -> id conversion.
This should pass. However the performance concern from Berdir (#162 point 13 is still open here). Not sure how much this is an issue?
Comment #173
mallezieLast one for today.
Fixes another test, and merges some docs. I can't figure out where those last fails are coming from, if someone could help here, that would be highly appreciated.
Comment #175
timmillwood@mallezie - I had a quick look and it seems as though for
StatisticsAdminTest
thenode_counter
table isn't getting setup, I see\Drupal\statistics\StatisticsDatabaseStorage::recordView
throwing an exception.No idea why, there doesn't seem to be any notable difference between this test and the other test.
Comment #176
BerdirRe #164:
1. Doing it later is an API addition and will be more complicated than just doing it now. But I don't have a strong opinion, it's not a regression to now I think.
4. I don't see how the interface can handle it with a single argument? Either you talk about node and just have an ID, or you support all entities (maybe throw an exception on anything but node) and have two arguments. You can't change it later.
Sorry, #163 was confusing. This is a bit tricky. Technically, you do need the list cache tag, just like node_title_list. So the code there was pretty much correct. merging in the id cache tags would be "more correct". Problem is, that means *every* time a node is created or saved or deleted, this block needs to be rebuilt. That's quite slow. Especially since we care more about changes in hits than new or updated nodes.
So what I think would be the best combination is only add the cache tags for the specific nodes and set the same max age as we do in statistics_node_links_alter(). That would have the following effect:
* The block is refreshed once per hour, reflecting hit changes
* The block is refreshed when the nodes that are currently shown are updated (so we get a title change for example) or deleted
* The block is *not* updated when new nodes are created or nodes other than the ones shown are updated.
However, this is not really related to this issue, so I'm fine with keeping the current behavior (list cache tag + ids) and deal with max age in a follow-up. (We definitely should create that follow-up though).
Comment #177
mallezieThanks for the reviews.
this patch fixes the test failures, and adds a fetch multiple views method. (Caching could be done in a followup if i'm not mistaken). And reverted the node_list caching.
Todo:
-Discuss the entity type genericness, since this is not done here. Thanks berdir fro pointing that out, this indeed needs to be done.
-Create followup for caching without node_list things
-Create followup for Static caching for fetch view method
I'll get on those next. (Just testing first).
Comment #178
mallezieYes. Green!
Comment #179
mallezieCould we rename StatisticsDatabaseStorage to NodeStatisticsDatabaseStorage? And get away for the entity agnostic approach that way?
I like the fact that the interface is not bound to an entity type. We can then just create a new backend for new (all entities for example) statistics and implement with an entity_count table for example.
Comment #180
mallezieRenamed to NodeStatisticsDatabaseStorage.
Created cache improvement issue: #2776223: Adjust caching for StatisticsPopularBlock
Created static caching issue: #2776225: Use static caching for fetching View
Comment #181
mallezieLeft a debug statement in.
Comment #182
BerdirI don't really get how exactly you plan to do this. The interface is still not using an entity type, which means using it for a generic implementation would be impossible. It's not just the implementation, also the interface that is node specific. If you want to rename that too, that would work. But then we have to be clear that there is no easy way to support other entity types.
The alternative approach would be to add the entity type argument to all relevant methods in the interface and throw an exception if it's not node. That would be much easier to replace with a more generic implementation. Also possibly by a contrib module instead of core.
I'm not saying that either option is better, but we have to decide between the two, what you have right now is kind of like a middle thing that I don't see how it would work exactly :)
Comment #183
dawehnerWell, you could have in the future a factory in the container which returns the instance you actually want to use. For now it would be great to have 'node' somehow as part of the service name, to indicate what this is actually about.
Comment #184
mallezieSo if i read this correctly we shoul rename the interface StatisticsStorageInterface to NodeStatisticsStorageInterface? (And adjust the entity specific comments?)
Comment #185
timmillwoodI wonder if we can update the interface to be generic, accepting either
entity_type_id
, andentity_id
, or justentity
. The service could stay node specific for now, but just use the generic interface?Comment #186
BerdirYes, one of those two options is what I was suggesting :)
But @dawehner has a point, that would work as well. Generic interface, node specific implementation (current patch), then we should rename the current service to statistics.storage.node. And later, we could add a factory as statistics.storage_factory or so, where you could do this: \Drupal::service('statistics.storage_factory')->getStorage('comment')->... and the generic implementation would get the entity type injected in the constructor.
So those are our options:
1) Node specific interface and implemention. later: both a new interface and implementation.
2) generic Interface with $entity_type_id and $id methods, currently only a node specific implementation, throw exception for other types. later: provide a generic implementation.
3) node specific implementation, generic interface without entity type argument. later: a factory that creates storages for an entity type.
And I guess we should rename the service to storage.node no matter what option we pick.
I like 1) the least, as it doesn't really offer a way forward. both 2 and 3 sounds fine to me. So now we have to decide ;)
hm, one feature we seem to lose here is the ability to read this data from the replicate db server if configured. To keep this, we'd have to inject two different connections, which has the downside that it would open the connection to the replica already. not sure if it's worth it.
nitpick: patch gets even a bit smaller if we do not change $nid to $id here.
Comment #187
timmillwood1) yes, not sure it's worth it either.
2) Good point!
Comment #188
mallezieI really like dawehners option the most. So option 3.
- Renamed the service here
- Made the patch smaller.
I think this is approaching readyness.
Comment #190
mallezieDid pass ;-)
Comment #191
timmillwoodI guess the final thing before RTBC is making the interface entity type generic.
Comment #192
BerdirNo, we don't have to do that with 3. The storage would be similar to entity storage, you request a storage for a given type from the storage factory. We just need add a factory and an implementation that receives the entity type id in the constructor.
Comment #193
timmillwoodok, think we're ready to go then!
Comment #194
alexpottThis is the only place that writes to the node_counnter table - it'd might be nice if the schema was here too. We've done this for quite a few other tables.
See \Drupal\Core\Cache\DatabaseBackend::schemaDefinition()
. In this instance we could just call it in the existing hook_schema in the install but then everything is kept together. Don't put the method on the interface. At some point we'll probably introduce an interface for this so we can discover services that have schema backends.I don't think we can move to on demand creation because of the views integration.
I'm not 100% certain about this - requires more discussion.
This change (removing the use of node_title_list()) is completely unrelated to the change afaics. It should be done in a separate issue.
Comment #195
Berdir@alexpott:
1. Fine with me as long as we manually create it for now.
2.
The change is necessary because node_title_list() does not receive ids, it receives a query result object. We can no longer use that.
Comment #196
timmillwoodSwitching $result for $nids to make it a bit clearer that why we're not using node_title_list().
Moving the schema from .install to the service.
Comment #198
timmillwoodLooks like we're too early to call
\Drupal::service('statistics.storage.node')->schemaDefinition()
in hook_schema.Comment #199
timmillwoodBased off patch in #188.
Switching $result for $nids to make it a bit clearer that why we're not using node_title_list().
Comment #200
mallezieHmm that's a strange failure.
First googling points to https://bugs.php.net/bug.php?id=50688
Lets do a retest first.
Comment #201
alexpott@mallezie that's not this issues fault. That's #2762549: Drupal\field\Tests\Update\FieldUpdateTest, Drupal\views\Tests\Update\EntityViewsDataUpdateTest and Drupal\comment\Tests\CommentFieldsTest fail on 8.1.x.
So re #194 - both points I made turned out to not be good. Sorry everyone. Unfortunately this missed the 8.2.0 beta deadline. Hopefully it'll be an early commit to 8.3.x.
@Berdir are you happy with the current direction? I was wondering why no rtbc in #192.
Comment #202
dawehnerFor some "obvious" reasons I don't disagree with the latest approach :)
It allows us to be flexible in the future, while being pragmatic now.
follow up: this could use
$node->toLink()->toRenderable()
Comment #203
BerdirYes, I am, RTBC++
Comment #204
mallezieSetting to 8.3 and created #2778167: Build link in Statistics block with toRenderable() as a follow-up, thanks for pointing that out dawehner.
Comment #205
alexpottCommitted 22ab6e5 and pushed to 8.3.x. Thanks!Coding standards fixed on commit.
Comment #206
alexpottDidn't push - I was just checking for a change record and I realised it didn't have one. Sorry. We need one becuase this is a change - and since we need that we should apply the fixes in #205 to the patch.
Comment #207
alexpottAlso the issue summary should reflect the final agreed solution. If the issue summary was up-to-date the CR would be easy.
Comment #208
timmillwoodComment #209
timmillwoodAdding change record: https://www.drupal.org/node/2778245
Comment #210
alexpottComment #211
alexpottOkay we have a change record and the issue summary has been updated.
Comment #213
alexpottCommitted ec59a51 and pushed to 8.3.x. Thanks!
Same fixes as in #205 applied on commit.
Comment #215
jibranUploaded the patch for 8.2.x so that I can use it with 8.2.x. Please ignore this comment.
Comment #216
jibranPlease ignore the comment.
Comment #217
webchickComment #218
webchickLooks like this might have introduced a critical: #2867493: Error: Call to a member function getTotalCount() on boolean in statistics_get()