The IP address blocking mechanism at admin/config/people/ip-blocking needs to be converted to use the new configuration system.

Tasks

  • Create a default config file and add it to the system module.
  • Convert the admin UI in system.admin.inc to read to/write from the appropriate config.
  • Convert any code that currently accesses the {blocked_ips} table to use the config system.
  • Write an upgrade path from D7 -> D8 to migrate existing data to the new system and drop the {blocked_ips} table.

This would be a good task for someone wanting to get up to speed on how the new config system works. Feel free to ping me on IRC if you need help.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

kika’s picture

gdd’s picture

Issue tags: +Configuration system
naxoc’s picture

Assigned: Unassigned » naxoc

I'll give this a try while here at Drupalcon

naxoc’s picture

Status: Active » Needs review
FileSize
11.71 KB

OK, here is the patch. I updated the tests and made all the places that were using the blocked_ips table use the config file.

I still haven't written an upgrade path. I'll do that as soon as possible.

Anonymous’s picture

Status: Needs review » Needs work

nice work!

a couple of things:

- we need a comment somewhere explaining why we prepend the ip keys with 'ip_'
- could probably use blocked_ip_load($hash) in a few places to simplify the code a bit

chx’s picture

+ $config->clear('blocked_ips.ip_' . $form_state['values']['blocked_ip']);

thanks

gdd’s picture

Issue tags: +Config novice

Tagging

naxoc’s picture

Status: Needs work » Needs review
FileSize
12.56 KB

Here is a new patch that uses bloked_ips_load much more so the code is less cluttered.

It uses concatenation instead of the {} as noted in #6.

There is now an upgrade path.

I added a comment about the prefixes in blokced_ips_load, but I really couldn't find a good place to put it. It can probably be moved to a better place but I am not sure where.

Status: Needs review » Needs work

The last submitted patch, ip-blocking-1479466-8.patch, failed testing.

Anonymous’s picture

Status: Needs work » Needs review

this looks ready to go to me. if tests pass, i think it's RTBC.

naxoc’s picture

Woops. The update function broke when there was no data. New patch for the test bot.

aspilicious’s picture

+++ b/core/modules/system/system.installundefined
@@ -1761,6 +1761,22 @@ function system_update_8004() {
+ * Migrates the entries from blocked_ips into a configuration file and deletes
+ * the blocked_ips table.
+ */

Normally there should be a one line summary.

+++ b/core/modules/system/system.moduleundefined
@@ -1705,16 +1705,24 @@ function system_stream_wrappers() {
+ * Retrieve a blocked IP address.

Retrieves

Nice work btw

naxoc’s picture

FileSize
12.56 KB

Thanks :) Here is a new patch with the comments corrected.

Anonymous’s picture

Status: Needs review » Reviewed & tested by the community

alllrighty, this looks good to go.

chx’s picture

Status: Reviewed & tested by the community » Needs work

Sorry for not reviewiing this more throughly earlier but md5 has been banned from core (I disagree with that choice but that ship has sailed). Also, there's no comment on why are we hashing in the first place.

Anonymous’s picture

yep, i think chx is right, we prolly don't need the hashing.

as long as the ip address value is always treated as an xml value, it should be fine.

Anonymous’s picture

oh blurgh. so i just went back to look at fixing this, and the reason we md5() is because we use the ip address value in the element name in the xml.

sooo, i think we just need to swap out '.' to '_'.

Anonymous’s picture

Status: Needs work » Needs review
FileSize
12.43 KB

alrighty, here's a patch that converts '.' to '_' in ip addresses when using the value in an xml element name.

Status: Needs review » Needs work

The last submitted patch, 1479466-18-ip_address.patch, failed testing.

Tor Arne Thune’s picture

+++ b/core/modules/system/system.admin.incundefined
@@ -1359,12 +1359,15 @@ function system_modules_uninstall_submit($form, &$form_state) {
+  $blocked_ips = config('system.blocked_ips')->get('blocked_ips');
+  if (is_array($blocked_ips)) {
+    foreach ($blocked_ips as $hash => $ip) {
+      $rows[] = array(
+        $ip,
+        // Use a substring of the key - it is prefixed with 'ip_'.
+        l(t('delete'), 'admin/config/people/ip-blocking/delete/' . substr($hash, 3)),
+      );
+    }

$hash should probably be changed to something else, or does the term still apply when just replacing a character with another?

+++ b/core/modules/system/system.moduleundefined
@@ -1705,16 +1705,26 @@ function system_stream_wrappers() {
+ * An IP address is stored in the configuration file with a hash of its value
+ * prefixed with 'ip_' as key. The prefix is necessary because XML tags cannot
+ * start with numbers and hashes sometimes do start with a number.

Also here.

Anonymous’s picture

FileSize
12.5 KB

updated patch addresses the comments in #20 and should fix the test fails in #18.

not sure we need to ship an empty blocked ips list - we can just check the return value.

Anonymous’s picture

Status: Needs work » Needs review

ahem

wamilton’s picture

grep "str_replace('\.', '_'" $(git ls-files -m)

gives us:

core/modules/system/system.admin.inc:  $config->set('blocked_ips.ip_' . str_replace('.', '_', $ip), $ip);
core/modules/system/system.admin.inc:  $config->clear('blocked_ips.ip_' . str_replace('.', '_', $form_state['values']['blocked_ip']));
core/modules/system/system.install:      $config->set('blocked_ips.ip_' . str_replace('.', '_', $row->ip), $row->ip);
core/modules/system/system.module:  $escaped_ip = str_replace('.', '_', $ip);

Escaping and prefixing and undoing those things on read/write is bound to rear its head for everyone, so establishing the convention of encapsulating those as helpers seems like a good idea, not to mention just not maintaining this in several places should the needs for writing to the config xml change.

marcingy’s picture

FileSize
12.96 KB

Reroll adding a helper function so as manipulation of ip adresses are centralised.

Status: Needs review » Needs work

The last submitted patch, 1479466-24-ip_address.patch, failed testing.

marcingy’s picture

FileSize
12.96 KB

Keeping up with d8 commits

marcingy’s picture

Status: Needs work » Needs review
marcingy’s picture

FileSize
12.96 KB

Resubmitting for the bot

gdd’s picture

Status: Needs review » Reviewed & tested by the community

I'm happy with this now, thanks for all the work everyone.

Lars Toomre’s picture

Status: Reviewed & tested by the community » Needs work
+++ b/core/modules/system/system.moduleundefined
@@ -1705,16 +1705,27 @@ function system_stream_wrappers() {
+ * An IP address is stored in the configuration file its value prefixed with
+ * 'ip_' as key, and '.' replaced with '_'. The prefix is necessary because

This new comment does not make sense as written. Perhaps there is a 'with' missing before 'its value prefixed'? As is the first sentence needs to be reworked.

naxoc’s picture

Status: Needs work » Needs review
FileSize
12.93 KB

Here is a reroll with a "with" in the comment.

gdd’s picture

Status: Needs review » Reviewed & tested by the community

and we're back

chx’s picture

Status: Reviewed & tested by the community » Needs work

Sorry but system_convert_config_ip_address is not cutting it. config()->convertIp(), anyone?

marcingy’s picture

Status: Needs work » Needs review
FileSize
13.16 KB

New version of patch moving helper function into the object rather than adding it to system module.

gdd’s picture

I don't understand why we want this as part of the config object. It is almost certainly only going to be used for this one use case which is tied to system module, so it should reside there. If we are insistent on pulling it out of system module, I think it makes more sense to just make a helper function config_convert_ip() in config.inc.

Lars Toomre’s picture

How are subsequent blocked IP addresses going to be written out to the configuration xml files (regardless of source)? Presumably this helper function will be needed each time that event occurs, no? This thinking leads me to believe it should be part of the config subsystem rather the system module.

Anonymous’s picture

i'm with heyrocker on this one. as a general rule, it's on the caller of the config system to make the names for values valid.

this issue does make it clear that our file format is impinging on the use of our API in really annoying ways, but lets fix that for real elsewhere. i think #31 is RTBC.

gdd’s picture

If it turns out that we need ip address munging in other places than for blocked ip addresses, I'm happy to move it into the config system. However since for now it only has this one use case, it should remain with the caller.

kika’s picture

Note that #1161486: Move IP blocking into its own module is still on horizon.

amateescu’s picture

Status: Needs review » Needs work

After a quick chat with @beejeebus on IRC (thanks for taking the time to prove me wrong :D), I also agree now that the approach from #31 is a good temporary solution, until we explore the idea of changing the file format.

However, there are two tiny issues that prevent from rtbc-ing the patch from #31:

+++ b/core/modules/system/system.module
@@ -4020,3 +4031,16 @@ function system_admin_paths() {
+ * Helper function to convert _ to . to allow ips to be stored in XML.
...
+ * @return
+ *   A string with underscores converted to periods.

Documentation doesn't match the code.

+++ b/core/modules/system/system.module
@@ -4020,3 +4031,16 @@ function system_admin_paths() {
+ * @param $ip
+ *   String ip address that needs to be converted.
+ *
+ * @return
+ *   A string with underscores converted to periods.

Missing data types.

gdd’s picture

Status: Needs work » Needs review
FileSize
12.95 KB

Rerolled to above comments.

amateescu’s picture

Status: Needs review » Reviewed & tested by the community

I don't see a reason for the patch from #41 to fail the testbot, so RTBC :)

Lars Toomre’s picture

+ * @param $ip string

$ip and string are reversed. It should be '@param string $ip'. Also the @return immediately below has no type hinting. I think it should be 'string|false'.

amateescu’s picture

FileSize
12.97 KB

Sigh, that's what you get for not reading the whole patch :)

chx’s picture

Let's go with it but I so hope there will be a proper followup getting rid of system_convert_config_ip_address(). That's so ugly.

Anonymous’s picture

yep, this is ready to fly.

@chx - https://drupal.org/node/1470824#comment-5781670

catch’s picture

Status: Reviewed & tested by the community » Needs review

Are we sure it's OK to switch from a querying one IP address in SQL to storing them in a big array? This runs on cached pages so this could be a fairly large hit if a site has lots of IP addresses blocked in their database. Also do any of the contrib IP blocking modules like bad behaviour use this, that'd be more likely to lead to a long list of IP addresses than entering them through the UI.

@chx, I'd love for #1161486: Move IP blocking into its own module then subsequently to contrib to be the followup.

gdd’s picture

I just looked at Bad Behavior and it doesn't use this, I'm not sure what other contrib projects there are that would also potentially use the table. The only other one I found was the httpbl module (http://drupal.org/project/httpbl).

This will undoubtedly be a regression of some sort. Aside from anything else, we will have to already do one query to get the list out of the active store, even if we cache it. Then we have to pull a value out of the resultant array. How bad it will be I'm not sure and I guess we should find out, but I'm not sure what our options are otherwise (other than just leaving it in the table, which seems to be beside the point.)

Damien Tournoud’s picture

If we consider that the list of potentially blocked IP addresses is unbounded, the list doesn't pass one of the litmus test for configuration. If we consider it is a small list of blocked IP *patterns* (or ranges), it passes the test for configuration.

We should re-introduce the ability to block/allow ranges of IP addresses (we had that in D6, right?) if we want to convert this system to configuration. Hash-based lookup of configuration keys are a complete no-go for me.

gdd’s picture

Status: Needs review » Needs work

That makes sense, and it follows the kind of thinking we have in #1553358: Figure out a new implementation for url aliases - pathauto patterns are config, 1:1 URL aliases are not. However then the question is what do we do with this? It does make sense to be able to block IP ranges, however we will still have to support the unbounded list of single IP addresses, and people will want to move this data between servers. In the case of URL aliases, we are talking about moving the 1:1 aliases into fields on content and it becomes a content staging problem. I don't know what we do with the IP addresses because we don't want it to be either system.

catch’s picture

Issue tags: -Config novice

I've opened #1570102: [Policy] Deprecate Ban module. We need to worry about this issue for path aliases, but I think it's OK to just punt on this altogether for core.

webchick’s picture

Looking at http://drupal.org/admin/user/rules I can tell you it's a mix of patterns like http://33drugs.% and raw IP values. About 50/50, at a glance. As a user, it's very natural to fill things in that way through the same textbox, and not have to think about whether you're doing a pattern or a "real" value, just as you wouldn't consider "node/%" and "node/8" to be two different data types. So storing part of these values in one place and another part in another place seems like a bad idea to me unless it's very behind the scenes and not noticeable by users.

So basically, I think the safe assumption for this information is "unbounded list," which probably means a table. I don't really understand what that means in terms of configuration, though. Does that mean we'd have to convert these entries to entities..? Egad. :\

catch’s picture

http://33drugs.%
This was removed from D7 in #228594: UMN Usability: split access rules into an optional module, core only supports individual IP addresses now.

cosmicdreams’s picture

Issue tags: +Config novice

tagging

catch’s picture

Issue tags: -Config novice

Sorry removing the tag, there's nothing novice about this issue anymore.

gdd’s picture

Note that regardless of whether or not this functionality is dumped from core, the config system will still have to worry about the problem. How will we store unbounded lists of values? If it is determined that the config system is not appropriate for this data, then what is the alternative? As webchick points out in #1570102: [Policy] Deprecate Ban module, drupal.org uses this feature extensively. We still need to support a performant alternative. In response to #52, I don't think anyone is going to argue that these should become entities, that is not going to help anything. I don't have a lot of other thoughts though. I've been trying to think through possibilities of being able to use the config system but have data stored elsewhere, but I haven't gotten anywhere with that train of thought yet. It would involve having config data whose canonical storage is somewhere other than the active store, and this is going to introduce a lot of extra complexity.

catch’s picture

Yes the problem will need to be resolved either way.

We already have #1175054: Add a storage (API) for persistent non-configuration state open for 'system state', which shows there is plenty of things that fits into neither the configuration system nor the entity system, so I think it's not ever going to be accurate to say that things are configuration vs. entities and nothing else ever. However that's supposed to be a simple key/value store so unlikely to be suited to storing this stuff either. Well you could potentially do it with lots of individual entries if you don't need pattern matching, but it'd be a bit ugly.

While I wouldn't want to actually convert these to entities, they sort-of-are conceptually what entities represent - a blocked IP address represents a 'thing', in the same way that a user account does. And we already store IP addresses along with comments, so it's not impossible you'd want to centralize that IP address information somewhere, then do things like reference it to users, comments, and have a property or field as to whether the IP address is blocked or not. Slightly playing devils advocate but since webchick brought it up :P

gdd’s picture

I was talking with merlin yesterday about a different issue that would have a solution which would also work here. We are currently planning on architecting the configuration system so that it holds multiple implementations of a standard storage interface. The active store in sql and the files on the file system would be the two default implementations, but people could write others, for instance active store in Redis or whatever. Another thing you could is essentially write your own implementation of what the active store is. For instance, for this situation, we could write a SingleTableStorage to manage storing the data across multiple rows instead of a BLOB in one row. When we need it, this storage interface would be injected into the config object to be used as the active store. We would also need a way for these implementations to write their own import/export routines and make them available (through a hook or registry.) Merlin was talking about this as a way to continue to store and query Views across multiple tables rather than interacting with a gigantic array. I'm sure there could be other use cases.

Thoughts?

catch’s picture

Another thing you could is essentially write your own implementation of what the active store is. For instance, for this situation, we could write a SingleTableStorage to manage storing the data across multiple rows instead of a BLOB in one row. When we need it, this storage interface would be injected into the config object to be used as the active store.

Doesn't this remove pluggability then?

gdd’s picture

Yes, for everything not in the default active store, but then again those things wouldn't really map to that kind of alternate storage anyways and we are obviously uncovering places for which this default storage model isn't really appropriate.

Anonymous’s picture

i don't get the problem with unbounded values, at least in terms of our config implementation options. (should they ever go in config? don't know, but not a problem i want to address here.)

i get that we're forced to think about file scope because that's been baked into the public API, so it seems obvious to worry about loading a large gob of data in with a single config('foo.bar') call. but we have a layer of indirection in the returned config object, so the active store doesn't have to store things 'per file' at all. i've got a half baked backend that does away with the 'store blobs per file', instead storing each value with a single key, so lazy loading is baked in. it jars a bit with the public API, but is really just some simple wrappers around key - value storage ideas.

any code that cares about how values in the active store are persisted is doing it wrong. a single blob of XML vs a single blob of serialized php vs many rows across n tables - none of this is should be visible from the public API.

Crell’s picture

beejeebus: By unbounded values, we mean things that could have hundreds or thousands of records. A CMI file with hundreds or thousands of lines in it is a non-starter.

Anonymous’s picture

ok, i guess i misunderstood. i thought we were talking about the active store, not files, seeing as reading from files is completely irrelevant to the concerns in #47. sorry for the noise.

gdd’s picture

No beejeebus is right, this has absolutely nothing to do with the files, which are simply a transport mechanism. There's no reason why we should care if the files have 1,000 entries.

Also note that this code I suggested doesn't care about how the data is stored, and it wouldn't affect the public API at all (except in that you have to inject one storage mechanism over another.)

I find the idea of making the active store map more to a k/v store than a big blob intriguing, although it will make assembling and exporting the files a bigger pain in the ass. I think the more important question is, does that answer all our problems? Or is it a fact of life that some kinds of storage are appropriate for some situations, while others are important for other situations. For instance, merlin wants to be able to store views across multiple normalized tables instead of a k/v store, but still use CMI. I'm starting to think that there is no one-size-fits-all active store implementation.

Anonymous’s picture

yeah, key-value storage totally adds some complexity to building up the files. it's not too bad though, because all you need to do is get all the values for a given file back into $data, then the existing encode and write-to-file stuff Just Works.

don't know if it answers all of our problems, but its not intended to. it's an implementation that will live in contrib and use the nice pluggable interface that we'll ship D8 config stuff with :-)

gdd’s picture

I had a new idea for this which I think will work a lot better without significant changes to the system

1) Unbounded lists are stored in a keyed array in CMI
2) When configuration is imported, this keyed array is loaded as-is into the active store.
3) Whatever module is handling this (either its own module or system.module) responds to this config load by reading this array and writing it out to a dedicated database table that puts each address in its own row.
4) This table is used for lookups at runtime.
5) When new addresses are added they are saved to this table and into the array in the active store.
6) When config is exported we just write this array out to YAML.

If it becomes a performance issue, I could see the values only being written to the active store at export time, but I'd prefer they writethrough as changes are made.

So what we are doing here is providing an additional storage mechanism for speedy lookups, but still using the config system for moving this configuration between systems.

I think this is a good option and could serve as a model for other systems that want to work with similar data.

Thoughts?

catch’s picture

The first limit that's likely to be run into would be max_allowed_packet, that's probably quite a lot of IP addresses so we can see how it goes.

It's not going to be enough to handle path aliases at all since there can be millions of those but if we're calling those content that's OK I think.

Crell’s picture

Question: Is there a reason that blocked IPs should be treated as config, not as DB-stored content? (Or, rather, non-CMI information. IP blocking should be a service and an event, and therefore pluggable.)

sun’s picture

TBH, I'd suggest to simply punt on this for now. :)

If we'll introduce IP blocking patterns (using wildcards), then that would definitely be a simple config object, but AFAIK we don't support that feature currently. (The entire IP blocking functionality is relatively poor; we should investigate whether it would make sense to replace it with Symfony's Firewall component.)

But otherwise, I do not see a problem with retaining the blocked_ips table; especially as this entire functionality is very special and acts at a very early bootstrap level. We can see whether the introduction of generic key/value stores might help with this.

andypost’s picture

Status: Needs work » Postponed

IP blocking is executed in early boostrap level so only config is avaliable. Currently only hook_boot allows to hook-in to early_page_cache so better to remove the ban module #1161486-45: Move IP blocking into its own module to contrib in #1570102: [Policy] Deprecate Ban module

Suppose this issue should be postponed till one of issues are landed or kernel would allow to use ban with page cache

sun’s picture

#1161486: Move IP blocking into its own module landed.

I'd suggest to call this issue won't fix, for the following reasons:

  • Not everything is config.
  • Not everything is state.
  • Not everything is a content/config entity.
  • There's hardly a case for staging the records of banned IPs between servers.
  • We're over-standardizing and we're over-engineering.

Contrary to that:

  1. There's room for turning ban_boot() into an event subscriber, let's do that.
  2. We can and probably should re-introduce sane support for IP address wildcards (not funky/strange access domain rules) as well as IPv6, let's do that.
  3. There might even be room for making the storage pluggable for high-performance sites — but then again, the entire feature has loads of comments all over the place, which say that high-performance sites won't use it in the first place. Ban is a module now, which isn't enabled by default and which you don't need to enable, so who fucking cares? Let's completely remove that high-performance stuff entirely.

Let's cover the 80% use-case and let's ensure it works.

And for the sake of sanity, let's please shift focus onto the actual problems with that functionality:
#1559780: Remove ban_ip_action() from core. It blocks the currently logged-in user. (and potentially, your entire team)

gdd’s picture

Status: Postponed » Closed (won't fix)

OK, works for me. There are still some variables around this functionality that will actually need to be converted to config/state though (assuming the whole thing doesn't get refactored or removed.)

gdd’s picture

Issue summary: View changes

Added a task to convert existing code