Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
The IP address blocking mechanism at admin/config/people/ip-blocking needs to be converted to use the new configuration system.
Tasks
- Create a default config file and add it to the system module.
- Convert the admin UI in system.admin.inc to read to/write from the appropriate config.
- Convert any code that currently accesses the {blocked_ips} table to use the config system.
- Write an upgrade path from D7 -> D8 to migrate existing data to the new system and drop the {blocked_ips} table.
This would be a good task for someone wanting to get up to speed on how the new config system works. Feel free to ping me on IRC if you need help.
Comment | File | Size | Author |
---|---|---|---|
#44 | 1479466-44.patch | 12.97 KB | amateescu |
#41 | 1479466-41.patch | 12.95 KB | gdd |
#34 | ip-blocking-1479466-34.patch | 13.16 KB | marcingy |
#31 | ip-blocking-1479466-31.patch | 12.93 KB | naxoc |
#28 | 1479466-26-ip_address.patch | 12.96 KB | marcingy |
Comments
Comment #1
kika CreditAttribution: kika commentedShould it be part of #1161486: Move IP blocking into its own module ?
Comment #2
gddComment #3
naxoc CreditAttribution: naxoc commentedI'll give this a try while here at Drupalcon
Comment #4
naxoc CreditAttribution: naxoc commentedOK, here is the patch. I updated the tests and made all the places that were using the blocked_ips table use the config file.
I still haven't written an upgrade path. I'll do that as soon as possible.
Comment #5
Anonymous (not verified) CreditAttribution: Anonymous commentednice work!
a couple of things:
- we need a comment somewhere explaining why we prepend the ip keys with 'ip_'
- could probably use blocked_ip_load($hash) in a few places to simplify the code a bit
Comment #6
chx CreditAttribution: chx commented+ $config->clear('blocked_ips.ip_' . $form_state['values']['blocked_ip']);
thanks
Comment #7
gddTagging
Comment #8
naxoc CreditAttribution: naxoc commentedHere is a new patch that uses bloked_ips_load much more so the code is less cluttered.
It uses concatenation instead of the {} as noted in #6.
There is now an upgrade path.
I added a comment about the prefixes in blokced_ips_load, but I really couldn't find a good place to put it. It can probably be moved to a better place but I am not sure where.
Comment #10
Anonymous (not verified) CreditAttribution: Anonymous commentedthis looks ready to go to me. if tests pass, i think it's RTBC.
Comment #11
naxoc CreditAttribution: naxoc commentedWoops. The update function broke when there was no data. New patch for the test bot.
Comment #12
aspilicious CreditAttribution: aspilicious commentedNormally there should be a one line summary.
Retrieves
Nice work btw
Comment #13
naxoc CreditAttribution: naxoc commentedThanks :) Here is a new patch with the comments corrected.
Comment #14
Anonymous (not verified) CreditAttribution: Anonymous commentedalllrighty, this looks good to go.
Comment #15
chx CreditAttribution: chx commentedSorry for not reviewiing this more throughly earlier but md5 has been banned from core (I disagree with that choice but that ship has sailed). Also, there's no comment on why are we hashing in the first place.
Comment #16
Anonymous (not verified) CreditAttribution: Anonymous commentedyep, i think chx is right, we prolly don't need the hashing.
as long as the ip address value is always treated as an xml value, it should be fine.
Comment #17
Anonymous (not verified) CreditAttribution: Anonymous commentedoh blurgh. so i just went back to look at fixing this, and the reason we md5() is because we use the ip address value in the element name in the xml.
sooo, i think we just need to swap out '.' to '_'.
Comment #18
Anonymous (not verified) CreditAttribution: Anonymous commentedalrighty, here's a patch that converts '.' to '_' in ip addresses when using the value in an xml element name.
Comment #20
Tor Arne Thune CreditAttribution: Tor Arne Thune commented$hash should probably be changed to something else, or does the term still apply when just replacing a character with another?
Also here.
Comment #21
Anonymous (not verified) CreditAttribution: Anonymous commentedupdated patch addresses the comments in #20 and should fix the test fails in #18.
not sure we need to ship an empty blocked ips list - we can just check the return value.
Comment #22
Anonymous (not verified) CreditAttribution: Anonymous commentedahem
Comment #23
wamilton CreditAttribution: wamilton commentedgrep "str_replace('\.', '_'" $(git ls-files -m)
gives us:
Escaping and prefixing and undoing those things on read/write is bound to rear its head for everyone, so establishing the convention of encapsulating those as helpers seems like a good idea, not to mention just not maintaining this in several places should the needs for writing to the config xml change.
Comment #24
marcingy CreditAttribution: marcingy commentedReroll adding a helper function so as manipulation of ip adresses are centralised.
Comment #26
marcingy CreditAttribution: marcingy commentedKeeping up with d8 commits
Comment #27
marcingy CreditAttribution: marcingy commentedComment #28
marcingy CreditAttribution: marcingy commentedResubmitting for the bot
Comment #29
gddI'm happy with this now, thanks for all the work everyone.
Comment #30
Lars Toomre CreditAttribution: Lars Toomre commentedThis new comment does not make sense as written. Perhaps there is a 'with' missing before 'its value prefixed'? As is the first sentence needs to be reworked.
Comment #31
naxoc CreditAttribution: naxoc commentedHere is a reroll with a "with" in the comment.
Comment #32
gddand we're back
Comment #33
chx CreditAttribution: chx commentedSorry but system_convert_config_ip_address is not cutting it. config()->convertIp(), anyone?
Comment #34
marcingy CreditAttribution: marcingy commentedNew version of patch moving helper function into the object rather than adding it to system module.
Comment #35
gddI don't understand why we want this as part of the config object. It is almost certainly only going to be used for this one use case which is tied to system module, so it should reside there. If we are insistent on pulling it out of system module, I think it makes more sense to just make a helper function config_convert_ip() in config.inc.
Comment #36
Lars Toomre CreditAttribution: Lars Toomre commentedHow are subsequent blocked IP addresses going to be written out to the configuration xml files (regardless of source)? Presumably this helper function will be needed each time that event occurs, no? This thinking leads me to believe it should be part of the config subsystem rather the system module.
Comment #37
Anonymous (not verified) CreditAttribution: Anonymous commentedi'm with heyrocker on this one. as a general rule, it's on the caller of the config system to make the names for values valid.
this issue does make it clear that our file format is impinging on the use of our API in really annoying ways, but lets fix that for real elsewhere. i think #31 is RTBC.
Comment #38
gddIf it turns out that we need ip address munging in other places than for blocked ip addresses, I'm happy to move it into the config system. However since for now it only has this one use case, it should remain with the caller.
Comment #39
kika CreditAttribution: kika commentedNote that #1161486: Move IP blocking into its own module is still on horizon.
Comment #40
amateescu CreditAttribution: amateescu commentedAfter a quick chat with @beejeebus on IRC (thanks for taking the time to prove me wrong :D), I also agree now that the approach from #31 is a good temporary solution, until we explore the idea of changing the file format.
However, there are two tiny issues that prevent from rtbc-ing the patch from #31:
Documentation doesn't match the code.
Missing data types.
Comment #41
gddRerolled to above comments.
Comment #42
amateescu CreditAttribution: amateescu commentedI don't see a reason for the patch from #41 to fail the testbot, so RTBC :)
Comment #43
Lars Toomre CreditAttribution: Lars Toomre commented$ip and string are reversed. It should be '@param string $ip'. Also the @return immediately below has no type hinting. I think it should be 'string|false'.
Comment #44
amateescu CreditAttribution: amateescu commentedSigh, that's what you get for not reading the whole patch :)
Comment #45
chx CreditAttribution: chx commentedLet's go with it but I so hope there will be a proper followup getting rid of system_convert_config_ip_address(). That's so ugly.
Comment #46
Anonymous (not verified) CreditAttribution: Anonymous commentedyep, this is ready to fly.
@chx - https://drupal.org/node/1470824#comment-5781670
Comment #47
catchAre we sure it's OK to switch from a querying one IP address in SQL to storing them in a big array? This runs on cached pages so this could be a fairly large hit if a site has lots of IP addresses blocked in their database. Also do any of the contrib IP blocking modules like bad behaviour use this, that'd be more likely to lead to a long list of IP addresses than entering them through the UI.
@chx, I'd love for #1161486: Move IP blocking into its own module then subsequently to contrib to be the followup.
Comment #48
gddI just looked at Bad Behavior and it doesn't use this, I'm not sure what other contrib projects there are that would also potentially use the table. The only other one I found was the httpbl module (http://drupal.org/project/httpbl).
This will undoubtedly be a regression of some sort. Aside from anything else, we will have to already do one query to get the list out of the active store, even if we cache it. Then we have to pull a value out of the resultant array. How bad it will be I'm not sure and I guess we should find out, but I'm not sure what our options are otherwise (other than just leaving it in the table, which seems to be beside the point.)
Comment #49
Damien Tournoud CreditAttribution: Damien Tournoud commentedIf we consider that the list of potentially blocked IP addresses is unbounded, the list doesn't pass one of the litmus test for configuration. If we consider it is a small list of blocked IP *patterns* (or ranges), it passes the test for configuration.
We should re-introduce the ability to block/allow ranges of IP addresses (we had that in D6, right?) if we want to convert this system to configuration. Hash-based lookup of configuration keys are a complete no-go for me.
Comment #50
gddThat makes sense, and it follows the kind of thinking we have in #1553358: Figure out a new implementation for url aliases - pathauto patterns are config, 1:1 URL aliases are not. However then the question is what do we do with this? It does make sense to be able to block IP ranges, however we will still have to support the unbounded list of single IP addresses, and people will want to move this data between servers. In the case of URL aliases, we are talking about moving the 1:1 aliases into fields on content and it becomes a content staging problem. I don't know what we do with the IP addresses because we don't want it to be either system.
Comment #51
catchI've opened #1570102: [Policy] Deprecate Ban module. We need to worry about this issue for path aliases, but I think it's OK to just punt on this altogether for core.
Comment #52
webchickLooking at http://drupal.org/admin/user/rules I can tell you it's a mix of patterns like http://33drugs.% and raw IP values. About 50/50, at a glance. As a user, it's very natural to fill things in that way through the same textbox, and not have to think about whether you're doing a pattern or a "real" value, just as you wouldn't consider "node/%" and "node/8" to be two different data types. So storing part of these values in one place and another part in another place seems like a bad idea to me unless it's very behind the scenes and not noticeable by users.
So basically, I think the safe assumption for this information is "unbounded list," which probably means a table. I don't really understand what that means in terms of configuration, though. Does that mean we'd have to convert these entries to entities..? Egad. :\
Comment #53
catchhttp://33drugs.%
This was removed from D7 in #228594: UMN Usability: split access rules into an optional module, core only supports individual IP addresses now.
Comment #54
cosmicdreams CreditAttribution: cosmicdreams commentedtagging
Comment #55
catchSorry removing the tag, there's nothing novice about this issue anymore.
Comment #56
gddNote that regardless of whether or not this functionality is dumped from core, the config system will still have to worry about the problem. How will we store unbounded lists of values? If it is determined that the config system is not appropriate for this data, then what is the alternative? As webchick points out in #1570102: [Policy] Deprecate Ban module, drupal.org uses this feature extensively. We still need to support a performant alternative. In response to #52, I don't think anyone is going to argue that these should become entities, that is not going to help anything. I don't have a lot of other thoughts though. I've been trying to think through possibilities of being able to use the config system but have data stored elsewhere, but I haven't gotten anywhere with that train of thought yet. It would involve having config data whose canonical storage is somewhere other than the active store, and this is going to introduce a lot of extra complexity.
Comment #57
catchYes the problem will need to be resolved either way.
We already have #1175054: Add a storage (API) for persistent non-configuration state open for 'system state', which shows there is plenty of things that fits into neither the configuration system nor the entity system, so I think it's not ever going to be accurate to say that things are configuration vs. entities and nothing else ever. However that's supposed to be a simple key/value store so unlikely to be suited to storing this stuff either. Well you could potentially do it with lots of individual entries if you don't need pattern matching, but it'd be a bit ugly.
While I wouldn't want to actually convert these to entities, they sort-of-are conceptually what entities represent - a blocked IP address represents a 'thing', in the same way that a user account does. And we already store IP addresses along with comments, so it's not impossible you'd want to centralize that IP address information somewhere, then do things like reference it to users, comments, and have a property or field as to whether the IP address is blocked or not. Slightly playing devils advocate but since webchick brought it up :P
Comment #58
gddI was talking with merlin yesterday about a different issue that would have a solution which would also work here. We are currently planning on architecting the configuration system so that it holds multiple implementations of a standard storage interface. The active store in sql and the files on the file system would be the two default implementations, but people could write others, for instance active store in Redis or whatever. Another thing you could is essentially write your own implementation of what the active store is. For instance, for this situation, we could write a SingleTableStorage to manage storing the data across multiple rows instead of a BLOB in one row. When we need it, this storage interface would be injected into the config object to be used as the active store. We would also need a way for these implementations to write their own import/export routines and make them available (through a hook or registry.) Merlin was talking about this as a way to continue to store and query Views across multiple tables rather than interacting with a gigantic array. I'm sure there could be other use cases.
Thoughts?
Comment #59
catchDoesn't this remove pluggability then?
Comment #60
gddYes, for everything not in the default active store, but then again those things wouldn't really map to that kind of alternate storage anyways and we are obviously uncovering places for which this default storage model isn't really appropriate.
Comment #61
Anonymous (not verified) CreditAttribution: Anonymous commentedi don't get the problem with unbounded values, at least in terms of our config implementation options. (should they ever go in config? don't know, but not a problem i want to address here.)
i get that we're forced to think about file scope because that's been baked into the public API, so it seems obvious to worry about loading a large gob of data in with a single config('foo.bar') call. but we have a layer of indirection in the returned config object, so the active store doesn't have to store things 'per file' at all. i've got a half baked backend that does away with the 'store blobs per file', instead storing each value with a single key, so lazy loading is baked in. it jars a bit with the public API, but is really just some simple wrappers around key - value storage ideas.
any code that cares about how values in the active store are persisted is doing it wrong. a single blob of XML vs a single blob of serialized php vs many rows across n tables - none of this is should be visible from the public API.
Comment #62
Crell CreditAttribution: Crell commentedbeejeebus: By unbounded values, we mean things that could have hundreds or thousands of records. A CMI file with hundreds or thousands of lines in it is a non-starter.
Comment #63
Anonymous (not verified) CreditAttribution: Anonymous commentedok, i guess i misunderstood. i thought we were talking about the active store, not files, seeing as reading from files is completely irrelevant to the concerns in #47. sorry for the noise.
Comment #64
gddNo beejeebus is right, this has absolutely nothing to do with the files, which are simply a transport mechanism. There's no reason why we should care if the files have 1,000 entries.
Also note that this code I suggested doesn't care about how the data is stored, and it wouldn't affect the public API at all (except in that you have to inject one storage mechanism over another.)
I find the idea of making the active store map more to a k/v store than a big blob intriguing, although it will make assembling and exporting the files a bigger pain in the ass. I think the more important question is, does that answer all our problems? Or is it a fact of life that some kinds of storage are appropriate for some situations, while others are important for other situations. For instance, merlin wants to be able to store views across multiple normalized tables instead of a k/v store, but still use CMI. I'm starting to think that there is no one-size-fits-all active store implementation.
Comment #65
Anonymous (not verified) CreditAttribution: Anonymous commentedyeah, key-value storage totally adds some complexity to building up the files. it's not too bad though, because all you need to do is get all the values for a given file back into $data, then the existing encode and write-to-file stuff Just Works.
don't know if it answers all of our problems, but its not intended to. it's an implementation that will live in contrib and use the nice pluggable interface that we'll ship D8 config stuff with :-)
Comment #66
gddI had a new idea for this which I think will work a lot better without significant changes to the system
1) Unbounded lists are stored in a keyed array in CMI
2) When configuration is imported, this keyed array is loaded as-is into the active store.
3) Whatever module is handling this (either its own module or system.module) responds to this config load by reading this array and writing it out to a dedicated database table that puts each address in its own row.
4) This table is used for lookups at runtime.
5) When new addresses are added they are saved to this table and into the array in the active store.
6) When config is exported we just write this array out to YAML.
If it becomes a performance issue, I could see the values only being written to the active store at export time, but I'd prefer they writethrough as changes are made.
So what we are doing here is providing an additional storage mechanism for speedy lookups, but still using the config system for moving this configuration between systems.
I think this is a good option and could serve as a model for other systems that want to work with similar data.
Thoughts?
Comment #67
catchThe first limit that's likely to be run into would be max_allowed_packet, that's probably quite a lot of IP addresses so we can see how it goes.
It's not going to be enough to handle path aliases at all since there can be millions of those but if we're calling those content that's OK I think.
Comment #68
Crell CreditAttribution: Crell commentedQuestion: Is there a reason that blocked IPs should be treated as config, not as DB-stored content? (Or, rather, non-CMI information. IP blocking should be a service and an event, and therefore pluggable.)
Comment #69
sunTBH, I'd suggest to simply punt on this for now. :)
If we'll introduce IP blocking patterns (using wildcards), then that would definitely be a simple config object, but AFAIK we don't support that feature currently. (The entire IP blocking functionality is relatively poor; we should investigate whether it would make sense to replace it with Symfony's Firewall component.)
But otherwise, I do not see a problem with retaining the blocked_ips table; especially as this entire functionality is very special and acts at a very early bootstrap level. We can see whether the introduction of generic key/value stores might help with this.
Comment #70
andypostIP blocking is executed in early boostrap level so only config is avaliable. Currently only hook_boot allows to hook-in to early_page_cache so better to remove the ban module #1161486-45: Move IP blocking into its own module to contrib in #1570102: [Policy] Deprecate Ban module
Suppose this issue should be postponed till one of issues are landed or kernel would allow to use ban with page cache
Comment #71
sun#1161486: Move IP blocking into its own module landed.
I'd suggest to call this issue won't fix, for the following reasons:
Contrary to that:
Let's cover the 80% use-case and let's ensure it works.
And for the sake of sanity, let's please shift focus onto the actual problems with that functionality:
#1559780: Remove ban_ip_action() from core. It blocks the currently logged-in user. (and potentially, your entire team)
Comment #72
gddOK, works for me. There are still some variables around this functionality that will actually need to be converted to config/state though (assuming the whole thing doesn't get refactored or removed.)
Comment #72.0
gddAdded a task to convert existing code