Closed (fixed)
Project:
Drupal.org site moderators
Component:
Content moderation
Priority:
Normal
Category:
Feature request
Assigned:
Unassigned
Reporter:
Created:
1 Dec 2009 at 20:12 UTC
Updated:
16 Dec 2009 at 21:20 UTC
I've noticed over the past several months that there are still quite a few test book pages being entered on drupal.org, many without any text at all.
It may be of benefit to set a minimum character limit on book pages in an attempt to reduce this practice by test users.
Thoughts?
Comments
Comment #1
silverwing commented
+1
I've deleted many, many of these myself and I would be all for making it harder to post test pages.
Comment #2
greggles
We could do 20 words as a minimum number?
However, I wanted to see what number might be the right size and found a lot of nodes with low word counts (my test for "word count" isn't great, but it mostly works).
mysql> SELECT concat('http://drupal.org/node/', n.nid), LENGTH(body) - LENGTH(REPLACE(body, ' ', '')) FROM node_revisions nr inner join node n on n.nid = nr.nid AND n.vid = nr.vid where type = 'book' order by LENGTH(body) - LENGTH(REPLACE(body, ' ', '')) asc limit 50;
+------------------------------------------+-----------------------------------------------+
| concat('http://drupal.org/node/', n.nid) | LENGTH(body) - LENGTH(REPLACE(body, ' ', '')) |
+------------------------------------------+-----------------------------------------------+
| http://drupal.org/node/402290 | 0 |
| http://drupal.org/node/417694 | 0 |
| http://drupal.org/node/206130 | 0 |
| http://drupal.org/node/31595 | 0 |
| http://drupal.org/node/396570 | 0 |
| http://drupal.org/node/384880 | 0 |
| http://drupal.org/node/557370 | 0 |
| http://drupal.org/node/46641 | 0 |
| http://drupal.org/node/443536 | 0 |
| http://drupal.org/node/380194 | 0 |
| http://drupal.org/node/480892 | 0 |
| http://drupal.org/node/553824 | 0 |
| http://drupal.org/node/386314 | 0 |
| http://drupal.org/node/394118 | 0 |
| http://drupal.org/node/591710 | 0 |
| http://drupal.org/node/385144 | 0 |
| http://drupal.org/node/390568 | 0 |
| http://drupal.org/node/99612 | 0 |
| http://drupal.org/node/394126 | 0 |
| http://drupal.org/node/379716 | 0 |
| http://drupal.org/node/401780 | 0 |
| http://drupal.org/node/400818 | 0 |
| http://drupal.org/node/410500 | 0 |
| http://drupal.org/node/421616 | 0 |
| http://drupal.org/node/46651 | 0 |
| http://drupal.org/node/628292 | 0 |
| http://drupal.org/node/393506 | 0 |
| http://drupal.org/node/394450 | 0 |
| http://drupal.org/node/417188 | 0 |
| http://drupal.org/node/416698 | 0 |
| http://drupal.org/node/400856 | 0 |
| http://drupal.org/node/451238 | 0 |
| http://drupal.org/node/46635 | 0 |
| http://drupal.org/node/389228 | 0 |
| http://drupal.org/node/411734 | 0 |
| http://drupal.org/node/506068 | 0 |
| http://drupal.org/node/111022 | 0 |
| http://drupal.org/node/557322 | 1 |
| http://drupal.org/node/44895 | 1 |
| http://drupal.org/node/415078 | 1 |
| http://drupal.org/node/23192 | 1 |
| http://drupal.org/node/227210 | 1 |
| http://drupal.org/node/573150 | 1 |
| http://drupal.org/node/22288 | 1 |
| http://drupal.org/node/262 | 1 |
| http://drupal.org/node/448456 | 1 |
| http://drupal.org/node/630552 | 1 |
| http://drupal.org/node/405796 | 1 |
| http://drupal.org/node/88197 | 2 |
| http://drupal.org/node/420786 | 2 |
+------------------------------------------+-----------------------------------------------+
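A note on the "word count" column above: the query counts space characters, not words, so a body with a single word scores 0, the same as an empty body. The sketch below illustrates the difference in Python. The function names are illustrative only and are not part of Drupal or drupalorg.module.

```python
def space_count(body: str) -> int:
    """Mirror LENGTH(body) - LENGTH(REPLACE(body, ' ', '')): count space characters."""
    return len(body) - len(body.replace(" ", ""))


def word_count(body: str) -> int:
    """Count whitespace-separated tokens, a closer approximation of words."""
    return len(body.split())


# Sample bodies (made up for illustration, not taken from drupal.org nodes).
samples = ["", "test", "This is a test", "two  spaces"]
for body in samples:
    print(repr(body), space_count(body), word_count(body))
```

Under this reading, the rows showing 0 in the table may be either empty pages or one-word pages, which is why the character-count query in the next comment gives a slightly different list.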
Comment #3
greggles
Another query based on character count:
mysql> SELECT concat('http://drupal.org/node/', n.nid), LENGTH(body) FROM node_revisions nr inner join node n on n.nid = nr.nid AND n.vid = nr.vid where type = 'book' order by LENGTH(body) asc limit 50;
+------------------------------------------+--------------+
| concat('http://drupal.org/node/', n.nid) | LENGTH(body) |
+------------------------------------------+--------------+
| http://drupal.org/node/506068 | 0 |
| http://drupal.org/node/111022 | 0 |
| http://drupal.org/node/553824 | 0 |
| http://drupal.org/node/386314 | 0 |
| http://drupal.org/node/394118 | 0 |
| http://drupal.org/node/416698 | 0 |
| http://drupal.org/node/379716 | 0 |
| http://drupal.org/node/401780 | 0 |
| http://drupal.org/node/394126 | 0 |
| http://drupal.org/node/385144 | 0 |
| http://drupal.org/node/46635 | 0 |
| http://drupal.org/node/591710 | 0 |
| http://drupal.org/node/557370 | 0 |
| http://drupal.org/node/443536 | 0 |
| http://drupal.org/node/46641 | 0 |
| http://drupal.org/node/396570 | 0 |
| http://drupal.org/node/400818 | 0 |
| http://drupal.org/node/410500 | 0 |
| http://drupal.org/node/390568 | 0 |
| http://drupal.org/node/411734 | 0 |
| http://drupal.org/node/400856 | 0 |
| http://drupal.org/node/480892 | 0 |
| http://drupal.org/node/628292 | 0 |
| http://drupal.org/node/380194 | 0 |
| http://drupal.org/node/394450 | 0 |
| http://drupal.org/node/417188 | 0 |
| http://drupal.org/node/46651 | 0 |
| http://drupal.org/node/421616 | 0 |
| http://drupal.org/node/417694 | 0 |
| http://drupal.org/node/448456 | 1 |
| http://drupal.org/node/22288 | 1 |
| http://drupal.org/node/227210 | 1 |
| http://drupal.org/node/23192 | 1 |
| http://drupal.org/node/7176 | 3 |
| http://drupal.org/node/393506 | 3 |
| http://drupal.org/node/451238 | 4 |
| http://drupal.org/node/384880 | 5 |
| http://drupal.org/node/557322 | 6 |
| http://drupal.org/node/262 | 11 |
| http://drupal.org/node/630552 | 12 |
| http://drupal.org/node/415078 | 13 |
| http://drupal.org/node/69725 | 14 |
| http://drupal.org/node/639994 | 17 |
| http://drupal.org/node/573150 | 17 |
| http://drupal.org/node/575276 | 18 |
| http://drupal.org/node/489662 | 19 |
| http://drupal.org/node/299562 | 20 |
| http://drupal.org/node/299563 | 20 |
| http://drupal.org/node/299564 | 20 |
| http://drupal.org/node/299565 | 20 |
+------------------------------------------+--------------+
The reason I did word count initially is that a minimum word count is a feature of Drupal core that we could enable now, while a minimum character count would require altering drupalorg.module.
Comment #4
vm commented
My apologies, I actually meant word count and not character count, as I knew there was a core feature that would cover this issue.
I suppose what has to be asked is: is a document that only has 20 to 30 words a document worth adding?
Old documentation wouldn't be affected by this change, only new documentation, correct?
As a measuring stick, take this comment as an example, which has over 30 words. Could useful documentation be created in under 30 words?
Comment #5
sepeck commented
Choices are 0, 1, 10, 25, 50, 75, 100, 125, 150, 175 and 200.
I set it to 10 for right now. That will prevent blank pages and 'This is a test' messages.
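As a rough illustration of what a minimum of 10 words rejects, here is a minimal Python sketch. It assumes words are counted by splitting on whitespace. The names `MIN_WORDS` and `meets_minimum` are hypothetical, and this is not Drupal core's validation code.

```python
MIN_WORDS = 10  # the value chosen in this comment


def meets_minimum(body: str, min_words: int = MIN_WORDS) -> bool:
    """Return True when the body has at least min_words whitespace-separated words."""
    return len(body.split()) >= min_words


print(meets_minimum(""))                # blank page
print(meets_minimum("This is a test"))  # four words
print(meets_minimum("This page links to a screencast that explains the install steps"))
```

The first two calls return False and the last returns True, since that sentence has eleven words.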
Comment #6
sepeck commented
Changing status. Let me know if it needs to be removed or go higher.
Comment #7
WorldFallz commented
I clean these up a lot too. This is a great idea! The shortest legit pages I know of are the ones that merely link to a screencast somewhere (i.e. http://drupal.org/node/289310); even those have more than 20 words. It's really hard to think of a valid page that would only have 20 words.
Comment #8
WorldFallz commented
Sorry... looks like we crossposted.
Comment #9
avpaderno
The problem is that now it is not possible to delete a book page if the body text doesn't contain at least 10 words. I was deleting all those test pages, but I had to enter 10 words of two characters each (I entered random text, really).
Would it not be better to change the minimum number of words after deleting the test pages?
Comment #10
vm commented
I just took a stroll through greggles' list of 0 word count nodes. Now that there is a word limit, one must enter the required number of words before a node can be deleted.
It may be worth running a query to delete them?
Comment #11
avpaderno
Comment #12
vm commented
Heh, and now I crossposted over Kiam.
Comment #13
vm commented
Running a query may not be the best idea, as it looks like a few of the pages could be landing pages which only offer links to child pages.
Sepeck, if you can switch this back to 0, Kiam and I can clean up greggles' list and then we can switch it back.
Kiam, you start at the top and I'll start at the bottom?
Comment #14
avpaderno
It seems that this report is the most active, at the moment. :-)
Comment #15
avpaderno
I have already started from the top; thank God you didn't propose vice versa. :-)
Ok, then. Let the dance start!
Comment #16
vm commented
I've just copied ten words from my comment and am pasting them into the body to get rid of the pages.
Sepeck, never mind switching the limit.
Comment #17
vm commented
Kiam, I think we're overlapping, which must mean we got all of those which should have been deleted.
I did not remove pages which had no documentation but were blank parent pages linking to child pages.
Comment #18
avpaderno
The list given by greggles has been scanned. The remaining book pages are the ones with child pages.
Comment #19
vm commented
^5 Kiam
Thanks greggles and sepeck. Let's see how the word limit works out over the next few days. Will reopen if needed.
Comment #20
webernet commented
FYI: #361106: Reduce the minimum word limit?
Comment #21
greggles
Maybe the long-term solution is better spam-fighting tools.
Comment #22
avpaderno
Is there any reason why Mollom is used on g.d.o, and not on d.o?
Comment #23
greggles
Re #22: killes really dislikes black boxes.
To expand on my comment in #21, we could create a view or a page on d.o that shows book nodes in a table sortable by the number of characters and the date posted.
Comment #24
avpaderno
The Views solution is good. Rather than being limited to book pages, it could be made for all content types, with an exposed filter that narrows the list by the content type of the nodes.
Actually, it could be good to have a more generic view that makes it easy to catch spam posts. I am not sure how this could be achieved, but we could start with a view whose filters have default values that would catch some spam messages (i.e. the content length is not higher than X characters, the post has been submitted in the last 10 minutes, etc.).
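The filtering idea can be sketched outside Views to show which posts such defaults would surface. This is a minimal Python sketch, not a Views configuration. The `Post` class, the `suspect_posts` function, the 20-character default and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Post:
    nid: int            # hypothetical node id
    content_type: str   # e.g. "book"
    body: str
    created: datetime


def suspect_posts(
    posts: List[Post],
    now: datetime,
    max_chars: int = 20,                       # stands in for "X characters"
    window: timedelta = timedelta(minutes=10),
    content_type: Optional[str] = None,        # plays the role of the exposed filter
) -> List[Post]:
    """Return recent, very short posts, shortest first, then oldest first."""
    cutoff = now - window
    hits = [
        p for p in posts
        if len(p.body) <= max_chars
        and p.created >= cutoff
        and (content_type is None or p.content_type == content_type)
    ]
    return sorted(hits, key=lambda p: (len(p.body), p.created))


# Made-up sample data.
now = datetime(2009, 12, 1, 20, 12)
posts = [
    Post(1, "book", "", now - timedelta(minutes=2)),
    Post(2, "book", "This is a test", now - timedelta(minutes=5)),
    Post(3, "book", "x" * 200, now - timedelta(minutes=1)),
    Post(4, "forum", "hi", now - timedelta(minutes=3)),
    Post(5, "book", "", now - timedelta(days=2)),
]
print([p.nid for p in suspect_posts(posts, now)])
print([p.nid for p in suspect_posts(posts, now, content_type="book")])
```

Sorting by body length and then by date matches the sortable table suggested in #23. A real view would leave the sort order to the moderator.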
Comment #25
vm commented
I'm not at all opposed to a view that maintainers can pull up, with a word count or something similar, to help keep up with the test pages being posted.
After reading the discussion in #361106: Reduce the minimum word limit?, I understand the arguments both Lee and Webchick make about feeling that they have to use more nodes than necessary, especially when, by design, they want a blank parent menu item with child menu items.
Finding a middle of the road for all involved would certainly be the prudent thing to do.
Comment #26
WorldFallz commented
See #421676: Implement view for orphan book pages for a related effort aimed at orphan book pages.
Comment #27
vm commented
Orphaned pages help too, but some of these 0-word test pages were thrown in arbitrarily as child pages.
Comment #28
silverwing commented
Certainly not perfect :) Someone just posted a test page that read:
Comment #29
vm commented
Yeah, I guess test pages will be an issue regardless of the word limit set. Users can just paste Lorem Ipsum to get around any setting. Not sure how to handle this going forward, but at least, with the aid of greggles' list, we were able to get the pages that had been missed in the past.
Comment #30
sepeck commented
Immediately block that user. There is a warning message regarding test posts on book pages.