Keep flag_counts current by manual in/decreasing of count. [#719616]

Comment	File	Size	Author
#66	flag_increment_last_updated-d6.patch	1.51 KB	quicksketch
#66	flag_increment_last_updated-d7.patch	900 bytes	quicksketch
#55	flag_increments_fix-d7.patch	975 bytes	quicksketch
#49	flag_increments_fix-d7.patch	1.01 KB	quicksketch
#44	flag_increments-d6.patch	4.32 KB	quicksketch
#44	flag_increments-d7.patch	4.59 KB	quicksketch
#43	flag_node_inccount.inc_.txt	2.48 KB	ralf.strobel
#42	flag_node_inccount.inc_.txt	2.48 KB	ralf.strobel
#41	flag_node_inccount.inc_.txt	2.66 KB	ralf.strobel
#40	flag_node_inccount.inc_.txt	2.8 KB	ralf.strobel
#39	flag_node_inccount.inc_.txt	2.72 KB	ralf.strobel
#31	719616-flag-count_2.patch	4.21 KB	dawehner
#28	719616-flag-count_2.patch	4.01 KB	andrewlevine
#25	719616-flag-count_1.patch	3.78 KB	andrewlevine
#17	719616-flag-count.patch	4.28 KB	dawehner
#11	719616-flag-count.patch	4.87 KB	dawehner
#5	719616-flag-anonymous_cleanup.patch	3.85 KB	dawehner

Comment #1

he/him

commented 20 February 2010 at 03:41

Hm, fascinating. I think you are correct. The Flag _update_count() method is responsible for updating counts and sure enough it updates the counts based on the current number of votes in the flag_content table. So yes, if anonymous user flags expire and are removed, the count will decrement.

I suppose the only way to fix this problem would be to estimate counts by incrementing by one on flag and decrementing by one on unflag. It's certainly possible, though this definitely lead to some discrepancies between doing a COUNT(*) on the flag_content table and what you'd get from the flag_counts table.

Log in or register to post comments

Comment #2

andrewlevine commented 22 February 2010 at 17:02

Assigned:

Unassigned

» andrewlevine

We could increment and decrement, but as you said that would inevitably lead to discrepancies when server problems occurred. Less so if we could use transactions...Here are 3 other possible solutions:

1. Don't wipe out entries from flag_content with sessions at all (or give the option not to). I believe that the reason that flag_session_api_cleanup existed at all was because there was worry that unneeded flags, possibly caused by search engines, would hurt performance of the table. Maybe, but I'm not sure it's the module developer's place to make that performance/data tradeoff. I could see [content_id/timestamp] (which is wiped out in flag_session_api_cleanup) being potentially useful data even without a user to associate it with. Certainly the total count is useful.
2. If we need to wipe out the data from flag_content, we should at least store the count. We can add an anon_count column to flag_counts. Then, in flag_session_api_cleanup before running the deletes, we get counts for each flag and update flag_counts with that number. Then when we run _update_count() we do the select count(*) from flag_content + anon_count.
3. Same situation as #2 except instead of updating anon_count in flag_session_api_cleanup, we increment/decrement anon_count in the same way you suggested.

Both 2 and 3 might create inaccurate anon counts, but it's better than deleting them all like now :) I don't think #1 is a bad choice either, and it's much simpler.

What do you think? I'll write the patch for whichever solution you prefer, as we need it here at Sony.

Log in or register to post comments

Comment #3

quicksketch

he/him

commented 23 February 2010 at 01:47

My feeling is that the only valuable information we get from anonymous flags once the user has left the site is the count. The user's anonymous flagging information is only important for the duration of that user's visit, such as if they are building a schedule of flagged events or a set of flagged products. Once their session expires, knowing their information is no longer particularly useful, since it will be unable to re-associate itself to the session if the user returns to the site.

Doing things like storing "anonymous count" separate from "authenticate count" presents a number of technical challenges, not to mention requiring updates to our views integrations and adding some very confusing configuration options. In Views, the administrator would have to choose whether they wanted anonymous, authenticated, or the total number of counts when doing sorts or filters. In our Actions integration, we'd need to set up different rules for anonymous vs. authenticate votes, etc. etc. It all seems like a lot of trouble.

So my inclination is that we keep things relatively simple. Basically I think we should move to simple increment/decrement system for the flag_counts table and accept that the flag_counts table won't always be identical to COUNT(*) of the flag_content table. Anonymous flag_content should still be cleaned up as they are currently, according the settings of the Session API module.

Log in or register to post comments

Comment #4

andrewlevine commented 12 April 2010 at 12:30

Assigned:

andrewlevine

» Unassigned

Hey, sorry, I actually don't have time to get around to this at the moment. I'm marking it unassigned, but will reassign it to myself if I can get back to this before someone else does.

Log in or register to post comments

Comment #5

dawehner

German

commented 28 May 2010 at 16:01

Status:

Active

» Needs review

Status	File	Size
new	719616-flag-anonymous_cleanup.patch	3.85 KB

I vote for 1) I have the usecase that we store additional stuff per flag with the flag_note module.
The usecase is to let mark comments as bad, even for anonymous users.

So this should be configurable per flag and sure per default it should be active.

So here is a initial patch. Its more a design-review than a code review status.

TODO:

* Write a test
* Figure out a better warning message for users which want enable this feature.

Log in or register to post comments

Comment #6

dawehner

German

commented 28 May 2010 at 19:55

Status:

Needs review

» Needs work

But we should maintain the flag count(for cleanup-sessions, too).
This might be configurable, too.

Log in or register to post comments

Comment #7

quicksketch

he/him

commented 28 May 2010 at 20:06

I don't like the idea of putting this decision on administrators. Even as the author of the module, using this option gives me the feeling of unknown consequences. If you wanted to approach the problem this way (not cleaning up expired sessions), you could set Session API to never expire sessions and then Flag would never clean up its related information. Of course this would be site-wide rather than flag-specific, but I really don't like the idea exposing administrators to such a confusing option.

Log in or register to post comments

Comment #8

dawehner

German

commented 29 May 2010 at 07:45

You are right, this might confuse a lot of users.

Additional your workaround helps.

Would it be possible to expose this option to developers: Use the change in session_api_cleanup but don't build an ui. It would be easy
to build an extra module which extends the ui to expose this if a site needs this.

Log in or register to post comments

Comment #9

quicksketch

he/him

commented 29 May 2010 at 11:00

Are you looking for a flag-specific cleanup or a site-wide option? I'd be fine with a simple variable_get() call to exclude flags from clean-up.

All told though, I'm still more inclined to have a -/+1 increment system for the flag_counts table (rather than depending on COUNT() SQL). Ultimately I think this would be a better solution so we're not keeping tons of data in the flag_content table for no reason. If a Session API session has been removed, we don't have *anything* of value in the flag_content table, not even an IP address (though I suppose you still have the flagged-time). If you were building a high-traffic site with a lot of anonymous flagging going on you would want either: A) to keep all session information including the IP address of people that flagged (for completeness) or B) not keep the flagged time or the IP address for performance reasons (keeping the table size smaller). Right now we can already accomplish A but not B.

With this option of a hidden variable, it's sort of a half-fix, where the IP Addresses are discarded (keeping the Session API table trim), but the Flag time is kept forever (meaning the flag_content table could grow to millions of rows). If you've got a DB that can handle millions of rows in the flag_content table, you could probably manage to keep the Session API information also. Considering people are likely to flag multiple pieces of content, the Session API table would be smaller in size than the flag_content table, meaning flag_content would likely be your performance bottleneck.

Log in or register to post comments

Comment #10

dawehner

German

commented 29 May 2010 at 11:34

I can describe the usecase: We use flags for dirrent kind of stuff.

- One is, i like this thing, than the cleanup + flag_counts table-solution works fine
- The second one is: let the users flag bad comments, and write a note why this comment is bad. The note is stored with flag_note. Flag_note alters the schema to add the note-column to the flag_contents table. Once the moderators looked at the flag entry, we remove the item from the flag_content table.

So a site-wide variable is sadly not the best solution. Sure we could disable it, and cleanup flag_content manually which we need to cleanup. So i will not complain about a site-wide-option.

We care about performance, too.

I think, too, that there should be the increment system. I will do this patch.

Log in or register to post comments

Comment #11

dawehner

German

commented 29 May 2010 at 12:19

Status	File	Size
new	719616-flag-count.patch	4.87 KB

I did quite some progress.

I don't set it to NR, because it needs more manual testing. It would be cool if you could comment on the code already.

Log in or register to post comments

Comment #12

quicksketch

he/him

commented 29 May 2010 at 19:50

Nice, looking great so far. It looks like this patch currently maintains the exact current behavior. I think this approach is much more comprehensive than the original approach in #5, though I still don't think it would be a good idea to expose turning off clean-up as an administrative option. I think we could consider using your previous suggestion of making it a flag property though (like $flag->session_cleanup), only this property would have to be added through either hook_flag_default_flags() or hook_flag_alter(). So the option would be there, but just not in the UI.

Log in or register to post comments

Comment #13

dawehner

German

commented 29 May 2010 at 23:00

Its cool that you might accept a patch which is not exposed as UI. I might consider a extra custom module to make an UI, but this does not matter here.

+  // Update the count of each flag.
+  $result = db_query("SELECT count, fid, content_id FROM {flag_content} WHERE sid IN (' . implode(',', $outdated_sids) . ')
+    GROUP BY fid, content_id");
+  $counts = array();
+  while ($row = db_fetch_object($result)) {
+    $counts[$row->fid][$row->content_id] += $row->count;
+  }
+  foreach ($counts as $fid => $ids) { 
+    foreach ($ids as $content_id => $count) {
+      $flag = flag_get_flag(NULL, $fid);
+      $flag->_decrease_count($content_id, $count);
+    }
+  }

Do we really have to do it here? Isn't it enough to just delete the the stuff from flag_content?

Log in or register to post comments

Comment #14

andrewlevine commented 30 May 2010 at 03:59

@dereine
One suggestion. The patch in #11 will be prone to race conditions on high traffic sites because the increment and decrement functions do a SELECT and then an UPDATE based on that select. It is possible with concurrent requests that the SELECT part would happen in 2 concurrent requests before the UPDATE happened in either, causing incorrect counts.

Instead, we should do something like UPDATE {flag_counts} SET count = count + %d .... so the database will manage the race conditions.

Log in or register to post comments

Comment #15

andrewlevine commented 30 May 2010 at 04:17

@dereine / @quicksketch.
I also don't completely understand why in the patch we are decreasing the counts of flags when stale sessions are being removed. We want to keep track of those counts even though the user never logged in...right?

If I'm not totally off here, we should just be able to remove the changes to flag_session_api_cleanup in this patch.

Log in or register to post comments

Comment #16

dawehner

German

commented 30 May 2010 at 07:41

@adrewlevine

You are right see #13

Log in or register to post comments

Comment #17

dawehner

German

commented 30 May 2010 at 08:15

Status	File	Size
new	719616-flag-count.patch	4.28 KB

Here is a new patch with the things from #14 and #15.

Log in or register to post comments

Comment #18

andrewlevine commented 30 May 2010 at 15:19

Status:

Needs work

» Needs review

This patch looks good to me. I will test it as soon as I can.

It doesn't solve your flag_note/session_api problem. Shall we open another issue to work on that?

Log in or register to post comments

Comment #19

dawehner

German

commented 30 May 2010 at 16:29

Title:

Unwanted deletion of anonymous (stale session) flags?

» Keep flag_counts current by manual in/decreasing of count.

We have a patch already, see above :) Here is the new issue.

Renaming this issue .

#813342: Make cleaning of flag_content configurable

Log in or register to post comments

Comment #20

andrewlevine commented 30 May 2010 at 17:11

Status:

Needs review

» Needs work

Changing this to needs work since we don't have an actual patch. But I think we have the right idea of what the right way to do this is now: Make a per-flag API option (not available in the UI) that determines whether or not the flags are wiped based on expiring session_api sessions.

Log in or register to post comments

Comment #21

andrewlevine commented 30 May 2010 at 17:11

Status:

Needs work

» Needs review

Whoops! i got my issues mixed up...sorry.

Log in or register to post comments

Comment #22

andrewlevine commented 30 May 2010 at 19:12

Status:

Needs review

» Reviewed & tested by the community

Tested this in a very simple scenario with one flag type with both anonymous and authenticated users. Clicked flag and unflag multiple times in two different sessions and the numbers added up correctly in the counts table. I think this one is ready to go.

Log in or register to post comments

Comment #23

andrewlevine commented 30 May 2010 at 19:19

Also tested with two flags on the same node type. No problems

Log in or register to post comments

Comment #24

quicksketch

he/him

commented 30 May 2010 at 19:23

Status:

Reviewed & tested by the community

» Needs work

There's something a bit fishy about our decrement count function:

  function _decrease_count($content_id, $number = 1) {
    db_query("UPDATE {flag_counts} SET count = count - %d WHERE fid = %d AND content_id = %d", $number, $this->fid, $content_id);
    if (!db_affected_rows()) {
      db_query("INSERT INTO {flag_counts} (fid, content_type, content_id, count) VALUES (%d, '%s', %d, %d)", $this->fid, $this->content_type, $content_id, $number);
     }

Why should we be inserting a row if no rows were affected? This means that if flag's count was already 0 (no row in the database), calling _decrease_count() would set the flag count to 1.

Log in or register to post comments

Comment #25

andrewlevine commented 30 May 2010 at 20:36

Status:

Needs work

» Needs review

Status	File	Size
new	719616-flag-count_1.patch	3.78 KB

@quicksketch, you are right. I have removed this line because it makes no sense. I actually also removed the lines that delete rows with a count of 0, because running this DELETE query on every single unlike is a big price to pay just to delete flags that have been liked and then unliked to 0.

If we want, we can run that delete query in a cron hook.

Log in or register to post comments

Comment #26

andrewlevine commented 30 May 2010 at 20:41

Just want to make one more comment...there is a race condition here:

  function _increase_count($content_id, $number = 1) {
    db_query("UPDATE {flag_counts} SET count = count + %d WHERE fid = %d AND content_id = %d", $number, $this->fid, $content_id);
    if (!db_affected_rows()) {
      db_query("INSERT INTO {flag_counts} (fid, content_type, content_id, count) VALUES (%d, '%s', %d, %d)", $this->fid, $this->content_type, $content_id, $number);
    }
  }

If two concurrent likes run the update and no rows are affected, then the two inserts will run and one will fail with a duplicate key error. IMHO this tradeoff is OK because the race conditions will be rare, so the patch is still committable but we may want to make an issue about the race conditions so we can solve it with database-dependent queries later.

Log in or register to post comments

Comment #27

quicksketch

he/him

commented 30 May 2010 at 21:33

I think the immediate cleanup of empty flag counts is rather important. For the sake of data consistency, a piece of content that has no flag counts should be identical to a piece of content that has never been flagged at all. Otherwise we'll get into views and queries that are returning unexpected things because the count could be NULL or 0, whereas currently a piece of content that has a count of 0 will always be NULL.

Log in or register to post comments

Comment #28

andrewlevine commented 30 May 2010 at 22:46

Status	File	Size
new	719616-flag-count_2.patch	4.01 KB

I'm not entirely sure I agree because you're not really going to have consistency without transactions (I believe there is already an extra filter in the views counts handler filtering out 0 because of this) and it's a significant performance difference, but that's a discussion for another issue :)

I have put the DELETE statement back in with this patch

Log in or register to post comments

Comment #29

dawehner

German

commented 31 May 2010 at 05:46

+    if (db_affected_rows()) {
+      // Delete rows with count 0, to save some space in the database.
+      db_query("DELETE FROM {flag_counts} WHERE fid = %d and content_id = %d AND count = %d", $this->fid, $content_id, 0);
+     }

I'm not sure why we have to check for affected rows here. I think the decreasing is only executed, when there is something flagged. So db_affected_rows will always be true.

Log in or register to post comments

Comment #30

andrewlevine commented 31 May 2010 at 15:14

I put that in only to keep the logic the same as the original patch. I suppose it's good to prevent the DELETE query from happening after a failed UPDATE query, but it's unnecessary really.

Let's take it out.

Log in or register to post comments

Comment #31

dawehner

German

commented 31 May 2010 at 17:13

Status	File	Size
new	719616-flag-count_2.patch	4.21 KB

Here is the new patch.

Log in or register to post comments

Comment #32

mgriego commented 29 November 2010 at 23:28

I've looked at this issue as well as the follow up issue #813342: Make cleaning of flag_content configurable. I have to say that, for our purposes, both would do the trick, but I like this option better as it allows for cleaning up old/stale data. Of course, we only really care about the counts for the flags in question, so the anonymously-created data becomes completely extraneous once the originating session is gone. There is one thing I would recommend, though. Like others, I am concerned about consistency of data, and the last posted patch still has me concerned about race conditions on a sufficiently-busy site. As such, I think it would be prudent to wrap the query calls in _increase_count() and _decrease_count() with calls to the core locking mechanism (lock_acquire()). Doing that would provide transactional consistency of the flag counts during simultaneous flag/unflag operations.

I can only really see one possible use case where one might want to keep the actual data around (a la issue #813342: Make cleaning of flag_content configurable), and that is when one is interested in time data. Once the originating session has expired, the only data that I can see that would possibly matter is the timestamp on an individual row in case the site builder, for instance, wanted to be able to provide an analysis of how often a flag was flagged during a particular time period. Other than that, the actual row-level data becomes completely unnecessary, and the patches here (with the addition I mentioned) handle the remaining use cases.

If interested, I can put together an updated patch that uses the locking mechanism, if it's available, to provide some data integrity.

Log in or register to post comments

Comment #33

mooffie commented 8 March 2011 at 16:28

I haven't thoroughly read this discussion, but there's one more thing to consider:

Some day we want to add a last_flagged (and possibly first_flagged) timestamp column to the statistics table. By "statistics table" I mean {flag_content} (It won't hurt to rename this table!). Currently, the use of SQL aggregate functions makes the implementation of this easy (We could simply replace "SELECT COUNT(*)" by "SELECT COUNT(*), MAX(timestamp), MIN(timestamp)", in the _update_count() method), but the proposed patch might complicate matters.

Log in or register to post comments

Comment #34

garbo commented 1 April 2011 at 17:29

Hi, I want to make a case to have this as an option available for site-developers. I use the flags module as a voting mechanism (thumbs up if a user likes a piece of content). These votes should never be reset (or at least not before the end of the contest).

And also in the case flags is being used a a Digg-like system I can imagine that flag counts should never be reset.

At least it should be clearly communicated to the developer that the flagcount will be reset when anonymous sessions expire. It came as a shock to me.

BTW, I manualy applied the patch from #32 to flag 6.x-2.x-dev and it works as desired for me.

Log in or register to post comments

Comment #35

aschiwi commented 27 May 2011 at 10:33

#31 fixes a huge problem we were having with anonymous flagging. Thanks!! Counts are now correct. Before, counts on some nodes were not increasing correctly.

Log in or register to post comments

Comment #36

ralf.strobel commented 9 July 2011 at 21:12

I would like to voice my support for an in-/decrementing solution (at least as an option) for two reasons:

1) Slight discrepancies between the count table and the content table are neglectable for larger, Digg-like, sites. They can even be desired. We are, for instance, working on a system that will initialize the count of a voting flag with a value generated from node context.

2) As of now, every flag/unflag triggers a count query, which is quite expensive.

Log in or register to post comments

Comment #37

ralf.strobel commented 9 July 2011 at 21:31

The same problem is still present in the latest 7.x development version, by the way. So you might want to change the version of this issue and eventually backport the solution.

Log in or register to post comments

Comment #38

quicksketch

he/him

commented 9 July 2011 at 22:46

The 2.x versions of Flag are maintained in-sync, so it doesn't matter which Drupal version the patch is targeted at, it will be applied to both branches at the same time.

Log in or register to post comments

Comment #39

ralf.strobel commented 10 July 2011 at 21:34

Status	File	Size
new	flag_node_inccount.inc_.txt	2.72 KB

What I did for now to solve the problem for us (7.x), was to write a child class of flag_node (attachment) that overrides the necessary functions. We can use that as a base for our own flag classes.

This way I didn't have to patch the actual module code, but I still don't feel very good about it, since overriding low level functions is not quite best practice either. So I'm hoping this behavior will become the default eventually.

By the way, once the object oriented code of flag.inc becomes php5 compliant, you should make functions like _flag(), _unflag() and _update_count() protected, not private as the documentation currently labels them. You'll never know who wants to override them. Take me as an example.

Log in or register to post comments

Comment #40

ralf.strobel commented 17 July 2011 at 07:47

Status	File	Size
new	flag_node_inccount.inc_.txt	2.8 KB

Minor improvement to _decrease_count(). The new version is safe to use with numers lower than the current db value.

Log in or register to post comments

Comment #41

ralf.strobel commented 21 July 2011 at 06:29

Status	File	Size
new	flag_node_inccount.inc_.txt	2.66 KB

Another little improvement. _increase_count() now uses the more favorable db_merge().

Log in or register to post comments

Comment #42

ralf.strobel commented 27 July 2011 at 21:19

Status	File	Size
new	flag_node_inccount.inc_.txt	2.48 KB

I've found a better method to override the default node class. It's kind of a hack but still works better than declaring a new fake entity type like I did before. Instructions in the updated class file.

Log in or register to post comments

Comment #43

ralf.strobel commented 27 July 2011 at 21:29

Status	File	Size
new	flag_node_inccount.inc_.txt	2.48 KB

Arrg... found another error. You know, this would be easier if you could edit file attachments along with the post content. :)

Log in or register to post comments

Comment #44

quicksketch

he/him

commented 26 September 2011 at 04:32

Status:

Needs review

» Fixed

Status	File	Size
new	flag_increments-d7.patch	4.59 KB
new	flag_increments-d6.patch	4.32 KB

I've committed this patch which makes the following adjustments/merging:

- I've used consistent verbiage across the documentation/phpDoc between the D6 (andrewlevine/dereine) and D7 (ralf.strobel) patches.
- I *removed* the change from COUNT(*) to SUM(count) in the D6 patch, because in my testing with a few thousand rows, COUNT(*) on the flag_content table for a particular FID is still faster (about 3x) than SUM(count). My results are by no means conclusive, but I think that change should probably be separate anyway.
- I added adjustments to the hook_user_login() code. Otherwise anonymous users that logged in as authenticated users would still have their flagging count as both an authenticated and anonymous user, making it possible to artificially inflate a vote by repeated flagging, logging in and out, then flagging again.

Other than that, I started with the patch for D6 and then used the exact logic from ralf.strobel's .inc file for D7, so we should have introduced a minimal amount of untested code. Of course I tested it myself also, but it could use more testing before the next release. Right now is a great time for testing though, as so many people have been looking at pushing out the 2.0 final.

Log in or register to post comments

Comment #45

ralf.strobel commented 26 September 2011 at 05:25

There was an important reason why I did the delete query first in _decrease_count() and then only run the update query conditionally:

If $number is bigger than the current count and you run the update first, you will get an out of bounds error from MySQL, because the result is negative!

Log in or register to post comments

Comment #46

quicksketch

he/him

commented 26 September 2011 at 05:45

Status:

Fixed

» Needs work

There was an important reason why I did the delete query first in _decrease_count() and then only run the update query conditionally:

Ah, good call @ralf.strobel. Sorry I forgot I made that adjustment to the code also. The trouble I have with the db_delete() query is that there is no defined return value for db_delete() returning FALSE or NULL if the query did not delete any rows. That lack of requirement (see http://api.drupal.org/db_delete) made me nervous about depending on it for logic like this.

Unfortunately I couldn't find anything like db_rows_affected() for D7, so I went with the update and then delete approach. You're right it's flawed though. Can you think of another fix?

Log in or register to post comments

Comment #47

quicksketch

he/him

commented 26 September 2011 at 05:56

Hm, well db_delete() doesn't return TRUE or FALSE but the return of the execute() method does (a nice summary at http://drupal.stackexchange.com/questions/10643/how-can-i-set-a-message-...). Looks like your approach is okay. Drupal 7 did the same thing (though for db_update) when they killed db_affected_rows in #509122: RIP db_affected_rows().

Log in or register to post comments

Comment #48

ralf.strobel commented 26 September 2011 at 06:03

I didn't realize the return value was not always defined across different databases.

You can just leave out the if-condition and my code will still work. If rows with condition('count',$number,'<=') get deleted first, then the following update will find nothing wrong to update.

Also I noticed you're not using a transaction in _flag(). I'm not saying it's necessary, but it might be good practice, since you're updating different tables that have to stay consistent. It might even speed up the combined delete/update query because SQL only has to walk the index once.

Log in or register to post comments

Comment #49

quicksketch

he/him

commented 26 September 2011 at 06:05

Status:

Needs work

» Needs review

Status	File	Size
new	flag_increments_fix-d7.patch	1.01 KB

How does this look for a fix? It should be the same logic as your original patch.

Log in or register to post comments

Comment #50

ralf.strobel commented 26 September 2011 at 06:07

Status:

Needs review

» Needs work

Well, the reference of DeleteQuery::execute() says. "The return value is dependant on the database connection."

I have not worked with anything except MySQL, where my approach definitely works. But I cannot tell you what the return is in Postgre or SQlite.

Log in or register to post comments

Comment #51

quicksketch

he/him

commented 26 September 2011 at 06:07

Status:

Needs work

» Needs review

Also I noticed you're not using a transaction in _flag(). I'm not saying it's necessary, but it might be good practice, since you're updating different tables that have to stay consistent. It might even speed up the combined delete/update query because SQL only has to walk the index once.

Yes! I saw you'd included db_transaction() in your patches but I didn't quite understand what the full impact was. Can you open a new issue for that one, with an explanation? I'm sorry I didn't realized I'd cut so much out of your initial patch.

Log in or register to post comments

Comment #52

quicksketch

he/him

commented 26 September 2011 at 06:11

Well, the reference of DeleteQuery::execute() says. "The return value is dependant on the database connection."

Hmm you're right. Not sure what to do here... it looks like core doesn't actually define specific return values in any of the defined database engines (MySQL, pgSQL, or sqlLite). I've confirmed your solution works in MySQL, but if we can find a guaranteed solution that would be optimal...

Log in or register to post comments

Comment #53

quicksketch

he/him

commented 26 September 2011 at 06:15

You can just leave out the if-condition and my code will still work. If rows with condition('count',$number,'<=') get deleted first, then the following update will find nothing wrong to update.

That sounds like a safe alternative. With any non-global flag, decrementing the count probably isn't going to delete the row anyway. With a global flag, it's probably not a huge deal to do the extra query. Does that sound like a good tradeoff to you?

Log in or register to post comments

Comment #54

ralf.strobel commented 26 September 2011 at 06:18

Regarding your last patch:
You weren't using the $arguments param in expression() - because my code didn't - sorry.
Other than that, with DeleteQuery::execute() possibly not having a reliable result, I would just go for this:

  function _decrease_count($content_id, $number = 1) {
    db_delete('flag_counts')
    ->condition('fid', $this->fid)
    ->condition('content_id', $content_id)
    ->condition('count',$number,'<=')
    ->execute();
   
    db_update('flag_counts')
    ->expression('count', 'count - :inc', array(':inc' => $number))
    ->condition('fid', $this->fid)
    ->condition('content_id', $content_id)
    ->execute();
  }

Log in or register to post comments

Comment #55

quicksketch

he/him

commented 26 September 2011 at 06:22

Status:

Needs review

» Fixed

Status	File	Size
new	flag_increments_fix-d7.patch	975 bytes

You weren't using the $arguments param in expression() - because my code didn't - sorry.

Yep, I caught it the first time but didn't after I wanted to ensure exact code structure the second time. Nice catch on your part. :)

Other than that, with DeleteQuery::execute() possibly not having a reliable result, I would just go for this:

Sounds great. I've committed the attached patch to the 7.x branch. I adjusted code formatting slightly to match coding standards.

Log in or register to post comments

Comment #56

ralf.strobel commented 26 September 2011 at 06:23

Status:

Fixed

» Needs review

Regarding the transaction: I don't think this deserves another issue. Just look at how transactions work in D7 and my line should be perfectly clear to you: http://drupal.org/node/355875

Basically what my line does is make sure that all queries executed until the end of that function (including subfunction calls) are passed to the database and executed in one batch. If there is an error with one query, then nothing gets executed at all, so that all tables stay consistent. Plus, like I said, it also speeds up multiple queries affecting the same rows because indexes will only have to be looked up once and the results only are written once.

Log in or register to post comments

Comment #57

ralf.strobel commented 26 September 2011 at 06:27

Looks like it should work now.

Just two minor improvements on readability:
- $deleted is no longer required
- Your comment right above that doesn't really reflect the exact purpose of the delete query any more. You should definitely mention the possible out of bounds error before someone else wonders and changes it back.

Log in or register to post comments

Comment #58

quicksketch

he/him

commented 26 September 2011 at 06:31

Regarding the transaction: I don't think this deserves another issue. Just look at how transactions work in D7 and my line should be perfectly clear to you: http://drupal.org/node/355875

Hm, in http://drupal.org/node/355875, $transaction->rollback(); is used upon exceptions. But I suppose it's not needed in your patch because we don't do any try/catch constructs to begin with?

Log in or register to post comments

Comment #59

ralf.strobel commented 26 September 2011 at 09:50

It's a good point that I might not have paid enough attention to, since I use transactions primarily for a possible gain in performance (after all you are making the database's job easier) and not so much for integrity protection and exception handling.

From how transactions work when using direct SQL, I assumed that if one query within the transaction fails, all will fail, and nothing gets committed. You just still get an error message if you don't catch it. But I'm not sure about this at all. I'll post a comment on the reference page I linked earlier.

In the mean time, I don't think my code is wrong. If we don't expect there could be any exceptions, then we can probably just use the transaction without any exception handling, simply for better performance.

Log in or register to post comments

Comment #60

ralf.strobel commented 26 September 2011 at 09:56

Ok, I have move this forth and back in my head for a while and I was wrong...

It still absolutely makes sense to use a transaction in _flag() and _unflag(), because we'll be making updates to several tables that always need to be consistent.

But we SHOULD use the try-catch-rollback construction. In fact you should always do that when using a transaction and I'm going to change all my code where I use them without. The reason is that through a transaction you're saying "all of these changes need to be committed together and if ANYTHING goes wrong, rather write nothing at all". And the only way to ensure that "anything", not just queries but all the code in between, is by using a try-catch.

Log in or register to post comments

Comment #61

quicksketch

he/him

commented 26 September 2011 at 15:14

Status:

Needs review

» Fixed

Okay, well at this point it's taken some work to just figure out what needs to be done to get transactions working, so let's make it a separate issue after all. :)

I've already made a few commits for this particular issue, so new issues to track further changes would be preferable.

Log in or register to post comments

Comment #62

ralf.strobel commented 26 September 2011 at 15:39

Done...
#1291170: Use transaction for flagging in flag()

Log in or register to post comments

Comment #63

ralf.strobel commented 28 September 2011 at 05:56

Status:

Fixed

» Active

Ok, another real bug relating this issue...

_increase_count() and _decrease_count() do not update the new "last_updated" table column.

I think you had it in before, but we forgot about it during the last few fixes.

Log in or register to post comments

Comment #64

ralf.strobel commented 2 October 2011 at 04:35

Here's what it should look like...

function _increase_count($content_id, $number = 1) {
    db_merge('flag_counts')
      ->key(array(
        'fid' => $this->fid,
        'content_id' => $content_id,
      ))
      ->fields(array(
        'content_type' => $this->content_type,
        'count' => $number,
        'last_updated' => REQUEST_TIME,
      ))
      ->updateFields(array(
        'last_updated' => REQUEST_TIME,
      ))
      ->expression('count', 'count + :inc', array(':inc' => $number))
      ->execute();
  }

  function _decrease_count($content_id, $number = 1) {
    // Delete rows that would result in count <= 0, for data consistency and space-saving.
    $deleted = db_delete('flag_counts')
      ->condition('fid', $this->fid)
      ->condition('content_id', $content_id)
      ->condition('count', $number, '<=')
      ->execute();

    // Update the count with the new value otherwise.
    db_update('flag_counts')
      ->expression('count', 'count + :inc', array(':inc' => $number))
      ->fields(array(
        'last_updated' => REQUEST_TIME,
      ))
      ->condition('fid', $this->fid)
      ->condition('content_id', $content_id)
      ->execute();
  }

Log in or register to post comments

Comment #65

quicksketch

he/him

commented 2 October 2011 at 06:01

Oh man, we just can't get this right.

In your code (and in Git), we have this incredible error in _decrease_count():

    db_update('flag_counts')
      ->expression('count', 'count + :inc', array(':inc' => $number))
      ->fields(array(
        'last_updated' => REQUEST_TIME,
      ))

Spot it? How about that we're ADDING the count on decrement()? Oh man... unbelievable...

Log in or register to post comments

Comment #66

quicksketch

he/him

commented 2 October 2011 at 06:19

Status:

Active

» Fixed

Status	File	Size
new	flag_increment_last_updated-d7.patch	900 bytes
new	flag_increment_last_updated-d6.patch	1.51 KB

Okay, another day another patch. I've given this one a lot more testing and inspecting the database to make sure everything works as expected. Let's file new issues for further bugs, this one has gotten messy.

Log in or register to post comments

Comment #67

ralf.strobel commented 2 October 2011 at 07:21

Ouch, yeah... hadn't seen that one. The fact that we delete rows with count 0 masks the increment issue for most scenarios.

Log in or register to post comments

Comment #68

16 October 2011 at 07:21

Status:

Fixed

» Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Log in or register to post comments

Keep flag_counts current by manual in/decreasing of count.

Comments