I am using 'mysql Ver 14.12 Distrib 5.0.70, for pc-linux-gnu (x86_64) using readline 5.2'.

Trying to create a foreign key does not seem to be working. The code executes with no errors and the table is constructed but the foreign key does not exist. Replacing a valid lookup table name with garbage also does not throw an error and the table is still constructed leaving me to believe that it is simply ignoring this altogether.

From the MySQL docs (http://dev.mysql.com/doc/refman/5.0/en/ansi-diff-foreign-keys.html):
For storage engines other than InnoDB, MySQL Server parses the FOREIGN KEY syntax in CREATE TABLE statements, but does not use or store it. In the future, the implementation will be extended to store this information in the table specification file so that it may be retrieved by mysqldump and ODBC. At a later stage, foreign key constraints will be implemented for MyISAM tables as well.

My tables are set to be InnoDB by default and not MyISAM so the foreign key constraint should not be ignored.

  $schema['lookup_table'] = array(
    'description' => 'This is the table that will contain the lookup data.',
    'fields' => array(
      'id' => array(
        'description' => 'Primary key for lookup data.',
        'type' => 'varchar',
        'length' => 4,
        'not null' => TRUE,
        'default' => ''),
      'name' => array(
        'description' => 'Name of lookup item.',
        'type' => 'varchar',
        'length' => 60,
        'not null' => TRUE,
        'default' => ''),
    ),
    'primary key' => array('id'),
  );
  $schema['main_table'] = array(
    'description' => 'This is the table that will reference the lookup table.',
    'fields' => array(
      'id' => array(
        'description' => 'Primary key for main table.',
        'type' => 'varchar',
        'length' => 9,
        'not null' => TRUE,
        'default' => ''),
      'lookup_id' => array(
        'description' => 'ID of lookup item.',
        'type' => 'varchar',
        'length' => 4,
        'not null' => TRUE,
        'default' => ''),
    ),
    'foreign keys' => array(
      'lookup_table_foreign_key' => array(
        'table' => 'lookup_table',
        'columns' => array('lookup_id' => 'id'),
      ),
    ),
    'primary key' => array('id'),
  );

I've tried to reverse the field names in the columns entry as I could not tell from the documentation (http://drupal.org/node/146939) which is to come first but that did not help either.

Comments

Frando’s picture

Title:foreign_keys not working in hook_schema» Actually use foreign keys in core
Version:7.0-alpha6» 8.x-dev
Category:bug» feature

Drupal core currently does not create foreign keys. The information in hook_schema is not used anywhere in Drupal core. It's there so that contrib can take advantage of it.
For Drupal to actually enforce referential integrity, a lot of APIs would have to be adjusted and several parts of core presumably would have to change considerably.

See #111011: Add foreign keys to core for details.

So this is either closed (by design) or a feature request against Drupal 8. As there is none so far, I decided for the latter.

colan’s picture

Issue tags:-PostgreSQL

Marked #1011802: Really add foreign keys to core… as a duplicate of this issue.

Liam Morland’s picture

Are there plans to move this forward for D8? I am interested in working on it.

colan’s picture

@Liam: This is open for you to work on; nobody else is. Feel free to assign it to yourself, and start working on a patch. We'd all appreciate it. ;)

Liam Morland’s picture

Thanks. There are two ways to approach this: We can either create foreign key constraints, but not rely on them, or we can create them and take advantage of features like ON CASCADE to simplify the PHP code. In the second case, it would mean dropping support for database engines which do not support foreign key constraints. Of the currently-supported databases, that means dropping support for SQLite prior to 3.6.19 and MyISAM (Drupal already uses InnoDB as its default on MySQL). Has this policy decision been made?

colan’s picture

@Liam: Crell would probably be the best person to ask.

Crell’s picture

The problem with foreign keys is that their existence forces a particular application-level architecture. However, Drupal has been moving in the opposite direction: Less reliance on SQL-specific implementation details, not more. There are sites that run on non-SQL backends, like Examiner.com. The CMI initiative for Drupal 8 is working to move nearly all configuration out of the database and into on-disk files. If anything, Drupal is becoming less SQL-centric, not more.

So given that, I don't know that leveraging foreign keys buys us anything. In fact, I can see it backfiring horribly.

colan’s picture

It seems as though referential integrity will have to be handled at the application layer.

Liam Morland’s picture

OK, so we won't be relying in foreign key constraints, but is it OK to add them? If Drupal is managing referential integrity, then unless there is a bug in Drupal, it would make no difference to Drupal. Having foreign key constraints would catch bugs since it would cause integrity-violating operations to fail instead of silently corrupting the database. That sounds like a good thing to me.

Crell’s picture

Except Drupal already has many operations that, for a split second, violate referential integrity. Our node delete process, for instance, deletes the base record first and fields second. We now wrap that in a transaction for safety. Still, that would mean that referential integrity is broken for that period. Changing that would be a huge redesign of the code flow, and we already have a lot of work to do to that code. Having Drupal start throwing database errors for foreign key constraints would only make the process harder, not easier.

Some would no doubt argue that "catching bugs" would include highlighting precisely that situation. :-) Under some circumstances I would be one of them. However, at this point in Drupal's evolution I don't think we can reliably declare that to be a critical fatal-causing bug, which precludes setting up the database to report it as such.

Liam Morland’s picture

If the foreign key constraints are set to DEFERRABLE INITIALLY DEFERRED then the constraints are only checked when the transaction commits, just to support the situation you describe. If node delete is already in a transaction, it will work just fine so no code flow changes will be needed.

Crell’s picture

Our transaction usage is still rather lackluster, to be honest. :-)

I'm still very worried what knock-on effects adding FK constraints will have. The potential for inadvertenly breaking a lot of things, and making development harder or more dependent on SQL, is high.

We would need empiracle proof that it's safe to do before we could consider it. And I still don't know what the benefit would be when we're trying to eliminate SQL dependencies, not add them.

Liam Morland’s picture

Perhaps this is a good opportunity to fix up the transactions. I wouldn't want a power failure to leave my Drupal database half-way through a node delete, with some records gone and some still there. Transactions give me peace of mind.

Unless we use things like ON DELETE CASCADE, this would not add any SQL dependency. It would just mean that the database would check that Drupal is doing what it should be doing anyway. A constraint would only fail when there is a bug in Drupal.

There are social reasons for using constraints too. There was much shaking of heads in disbelief here at work when we found out that Drupal doesn't use FK constraints. In many developer communities, the use of FK constraints would make Drupal look far more credible.

On the other hand, if the answer is no, a patch will not be accepted on this, then this issue should be closed-wontfix.

Crell’s picture

That's not a decision I feel comfortable making unilaterally, so I've tweeted this thread to get more than 8 pairs of eyes on it. :-)

rgristroph’s picture

I like the idea of using foreign key constraints as a way of checking that Drupal is behaving correctly, as @Liam suggests. I also agree with @Crell about the general thrust of Drupal's movement away from SQL.

If the patch that implemented this was free, I'd like to use it in a test environment, just to uncover potential problems. But it won't be free, and maintaining it without it being in core will be a pain, and without the larger community vetting, it will likely be viewed with suspicion when it does uncover problems.

I also have no idea how using foreign keys might effect performance on large, heavily loaded DBs.

I like it as a tool and if it could be a feature turned on and off without cost when it's off that would be wonderful, perhaps as yet another DB backend. I'll help test and try to help maintain it somewhere if it is implemented, maybe I can even help implement it. I'm not sure that adds up to saying that you should invest the time to do it - does anyone have an idea of how much effort this would be ?

Dave Reid’s picture

Side note, that providing information on foreign keys even thought they are not used in actual SQL can still be useful. For example, the Domain module is able to provide an upgrade path for converting domain IDs from integers to machine names based on tables that properly list foreign keys on the {domain} table.

pjcdawkins’s picture

(as an interested onlooker) Would I be right to suggest that, if transactions were used properly throughout core/contrib, this should be quite a straightforward drop-in update to the Schema API? Should this issue be more generally about DB integrity? The general move away from SQL isn't really an argument against good SQL practices, especially since the vast majority of Drupal sites will be using SQL for a long time to come.

Sylvain Lecoy’s picture

This was posponed in drupal 6 to drupal 7. Now being postponed to drupal 8. Wake up we are in 2012. How drupal can be taken seriously by companies if its data structure don't even use foreign keys ?

If we are really embracing the vision of Symphony2 as we claim, we need to make use of best practises. We must, as a framework, teach new comers. The thing which is really bad about declaring a 'foreign key' key in the schema API, is that new dev thinks they will create a data structure which ensure data integrity.

Its only after encountering a situation where they: checked the DB structure, checked their .install files and schema, double-checked the DB structure, googled why drupal does not implements foreign key, and try to find this post, read upon 30 comments, that they will know that its by design (where the API seems actually implements foreign keys). Its exactly for this kind of thing that experienced dev move away from drupal, because they can't trust the software.

I'm not sure that Foreign Key can be a dropin functionnality, but I'm pretty sure that it is possible to:

1 - Actually implements FK for SQL databases.
2 - Refactor code to make use of this constraint (e.g. delete node field before actual node instead of doing the contrary).
3 - Test it against no-SQL databases, and SQL databases.

I would love to help for this kind of stuff, however, even if I'm developing with drupal for almost 2 years now, I know that my knowledge is limited. There is always drupal WTF (like this one) to be discovered which will greatly reduce my ability to contribute. Of course my knowledge of the software will increase, but if even core-contributors says they will not fix this kind of stuff by design, my motivations to keep on learning drupal may be slowed. My vision is to empower best-practises, this one is clearly violated, and should not be accepted in any serious software development project.

Liam Morland’s picture

StatusFileSize
new776 bytes
FAILED: [[SimpleTest]]: [MySQL] Drupal installation failed.
[ View ]

I made a start at implementing FK. The attached patch causes them to be created when D8 is installed. Of course, you cannot create a FK which references a table that has not yet been created. As a result, with this patch, the install process fails. The order of table creation needs to be changed so that only existing tables are referenced. Or it might work to create all the tables in a single transaction. Any idea how to go about this?

Liam Morland’s picture

I have made more progress on this problem and have successfully installed D8 with most of the existing foreign key constraints. I had to comment out some constraints which are violated by actions taken during the install process. I have filed these bugs about these. The first issue is really simple and should be uncontroversial to commit.

#476304: Mismatch of schema definition of uid, nid
#1703208: Shortcut module creates shortcut that violates constraint on Drupal install
#1703222: node_install() creates node_access record that violates constraint

Sylvain Lecoy’s picture

Liam, can you work in a sandbox or a github ? Somewhere I could help you ?

@Crell #12 This issue speaks about data integrity benefits #1199946: Disabled modules are broken beyond repair so the "disable" functionality needs to be removed.

Here is a mail from Dries expressing his love to see some data integrity, I don't know if its still relevant however: http://lists.drupal.org/pipermail/development/2007-January/021908.html.

Liam Morland’s picture

Next week when I am back from vacation I can post the patch. For now, the first thing that needs to get fixed is #476304: Mismatch of schema definition of uid, nid.

After that, there are a few hook_schema implementations in core modules that need to have the order that the tables are listed re-arranged so that all references to other tables refer to tables that appear earlier in the hook implementation.

Liam Morland’s picture

StatusFileSize
new3.83 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch core_fk.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

OK, if you want to give it a try, first install the latest patches in these issues in this order:
#1741462: Reorder schema definitions
#476304: Mismatch of schema definition of uid, nid

Then install the attached patch. That should give you an install with many of the foreign key constrains in place. Some of the FKs are commented out because they don't work for one reason or another, documented in the comments. I have only got the install working. Using the site after installation may or may not yet work.

Liam Morland’s picture

StatusFileSize
new4.54 KB
FAILED: [[SimpleTest]]: [MySQL] Repository checkout: failed to checkout from [git://git.drupal.org/project/drupal.git].
[ View ]

Use this new version of the patch. I just installed it successfully with this one.

Liam Morland’s picture

Issue tags:+PostgreSQL
StatusFileSize
new5.96 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch core_fk_1.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

With this version, you can install and enable all the core modules. It also creates foreign keys on PostgreSQL.

Liam Morland’s picture

StatusFileSize
new5.25 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch core_fk_2.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

A FK-related patch has been committed, resolving #1743198: Mismatch between foreign key and primary key. With this patch in, one more FK constraint can be turned on.

If you want to try it, install the latest patches in these issues in this order:
#1741462: Reorder schema definitions
#476304: Mismatch of schema definition of uid, nid

Then install this patch:

Liam Morland’s picture

Issue tags:+PostgreSQL
StatusFileSize
new4.94 KB
FAILED: [[SimpleTest]]: [MySQL] Repository checkout: failed to checkout from [git://git.drupal.org/project/drupal.git].
[ View ]

#476304: Mismatch of schema definition of uid, nid has been committed. I have updated the patch in #1741462: Reorder schema definitions. The patch attached to this comment is updated to work with these changes. My D8 test environment is not working right now, so I haven't actually tested it, but the changes needed to let it apply were minor.

Liam Morland’s picture

StatusFileSize
new4.94 KB
FAILED: [[SimpleTest]]: [MySQL] 13,682 pass(es), 396 fail(s), and 298 exception(s).
[ View ]

#1741462: Reorder schema definitions has now been fixed. Now all you have to do to try this out is to apply the attached patch. Reroll.

Sylvain Lecoy’s picture

Status:Active» Needs review

What the testbot says ?

Status:Needs review» Needs work

The last submitted patch, core_fk.patch, failed testing.

Liam Morland’s picture

Unfortunately, MySQL does not support deferred foreign key constraint checking. If it did, like Postgres, then foreign key constraints would be checked at the end of a transaction, so as long as everything is valid at the end, the transaction will be fine. As it is, operations, like a node delete, need to happen in exactly the right order. This means fixing it for MySQL could require major reording of code. We also need to add transactions where they are not used but should be.

One possibility would be to not create foreign key constraints in MySQL, but create them in Postgres. It would be much easier to get it working there. However, if the testbot does not test in Postgres with foreign key constraints, things will break when people commit patches which don't follow the foreign key constraints. Is it feasible to have the testbot test in Postgres as well as MySQL?

Sylvain Lecoy’s picture

For PG: @see #697220: Run tests on all supported database servers/engines

For MySQL, this is what this issue is about, re-factoring the code to use this constraint. A simple example is to delete a node only after hook invocation (vs. before as it is now). Consider using ON CASCADE or so.

It is now the real work begin :p

Liam Morland’s picture

It is a much simpler refactor if we can use transactions with deferred constraint checking.

From earlier comments, it is clear that ON CASCADE won't be used because some Drupal sites are run on backends that don't and won't support cascade, so we still have to do all the updates and deletes explicitly.

Damien Tournoud’s picture

Status:Needs work» Closed (won't fix)

Without deferred foreign key checking, there is no way to support cyclical foreign keys, as it is the case with {node}.vid -> {node_revision}.vid -> {node_revision}.nid -> {node}.nid. So this is just a won't fix.

Sylvain Lecoy’s picture

Category:feature» bug
Status:Closed (won't fix)» Needs work

If we cannot do it properly, at least we need to not fake we support it, unless you completely wants to loose the trust of developers in the software? Enough blood shed for now.

I think we either need to go all the way with this, either need a documentation update for saying why Drupal can't use foreign key, remove the 'foreign key' attribute from schema and move it somewhere to a metadata hook and stop pretending that we provide this service at all.

A last point you highlight, and that's why it is turned in a bug report, the cyclical dependency. A node can exist without a revision, while a revision cannot without a node. So this is just a design problem to fix.

Crell’s picture

Given that Drupal's Entity API is by design back-end agnostic, building our logic on the SQL-specific cascade logic is a no-go. At least until entities become so abstracted that you're never, ever touching SQL, even in a hook, and you're treating it as "Mongo DB implemented in MySQL", we don't have sufficient abstraction to not have cascade reliance break things.

Sylvain Lecoy’s picture

Let's face it, Drupal is designed with relational databases in mind. But at least if we are not using fk, then we should be honnest with the developper and informs him. On the other hand, the test infrastructure does not support other db system than MySQL.

The field system is build on relational assumptions, the entity system as well. How can you (Crell) says it is not the case ? I don't get the point, maybe there is something beyond my knowledge.

Let's forget the ON CASCADE thingy, it was an example but I believe the way to process is just to change the way (mainly the order) things are deleted.

Crell’s picture

Drupal WAS designed with a relational database in mind. A few zillion hours of work have gone into Drupal 7 and Drupal 8 to change that assumption. The window in which it made sense to bind our architecture to SQL has passed. If everything in process for Drupal 8 goes through, the only place that SQL storage will still be relevant will be, sometimes, Views. That's it.

If we want Field->Entity FKs, then that needs to be internalized to the entity storage implementation for the database and the API on top of it sufficiently robust that you don't notice the difference if you're just slinging entities around. That sufficiently robust API has to happen first, before we can do the former.

Liam Morland’s picture

This issue isn't about redesigning APIs or architectures. It's just about implementing the relationships that are already supposed to be there. If you declare that a foreign key relationship exists, then violate that relationship, that's a bug, either in the declaring or the violating. As well, if you have a table of fields that are associated with an entity, there had better be an entity for every field; if not, that's data corruption which could lead to very weird behavior and bugs that are hard to reproduce and fix. Foreign key constraints guarantee this won't happen.

I think it has been settled that we are not using cascades. If the database supports transactions with deferred constraint checking, then getting this to work is just a matter of adding transactions and clearing up places were the declared foreign key constraints are not being followed.

My suggestion is this:
1. Resolve #697220: Run tests on all supported database servers/engines so that the testbot tests on Postgres.
2. Adjust the patch so that foreign keys are created on Postgres but not on MySQL. (Not sure if SQLite supports this or not.)
3. Fix up the code so that everything works on Postgres.
4. Profit: Bugs that are uncovered are fixed, more stuff happens in transactions which decreases the likelihood of corrupted databases, and businesses and developers that are concerned about data integrity will start to take Drupal seriously.

Any downside to doing this?

Crell’s picture

Testbot runs on PostgreSQL take many hours, because PostgreSQL's DDL queries are vastly slower than MySQL's and the test suite runs them by the thousands.

And it is about redesigning architectures, when forcing FK constraints would necessitate changes to the way our entity hooks work.

Liam Morland’s picture

I didn't know the tests run slowly on Postgres. The Postgres DDL is transactional (you can put schema changes in a transaction and roll them back if you want). MySQL doesn't support that, so it makes sense that it would be faster. Normally, this doesn't matter because scheme changes are rare.

Unless I misunderstand something about the entity hooks or DB interface, wrapping everything in a transaction, including the call to module_invoke_all(), would solve any consistency issues without any architecture changes.

Crell’s picture

I don't know the details of why DDL is slower on Postgres, just that empirical data have repeatedly shown that testbot can't handle postgres because our test suite runs zillions of DDL queries.

We still technically support MyISAM tables in MySQL, which are non-transactional. Dropping support for those would be a separate feature request (although one I'd support at this point). Also, I don't know that transactions would resolve the deletion order question. (I've not dug into SQL semantics that deeply; they vary per implementation.)

Liam Morland’s picture

This is how transactions help:

Consider a node with a field on that node. The field record refers to the node record. If there is a foreign key constraint and you try to delete the node record, it will not allow that, because the field record refers to it.

Now, consider that the foreign key constraint is deferable. Start a transaction before the node delete. It will now let you delete the node record. Then you delete the field (and anything else that relates to the node). Now commit the transaction. It won't check the foreign key until commit. At commit, everything is OK, because both the node record and the field record are gone. If you deleted the node record then tried to commit without deleting the field record, the commit would fail and the node would still exist.

Basically, a deferable constraint is allowed to be violated while a transaction is open.

Another suggestion: Consider my suggestion in #39 but instead of testing on Postgres, test on SQLite. I have just read that SQLite also supports deferable foreign key constraints. This wouldn't require dropping support for MyISAM; we just wouldn't create foreign key constraints on MySQL.

Crell’s picture

SQLite's DDL is even slower than Postgres. :-( Does MySQL do deferrable constraints (on InnoDB)?

Given that all of that is SQL-specific, that still requires moving all of the entity/field logic into the storage service. Throwing SQL transactions around elsewhere is asking for trouble. I still don't see how this is viable until the Entity API is much better abstracted.

Damien Tournoud’s picture

Title:Actually use foreign keys in core» Enable foreign keys constraints in core
Status:Needs work» Closed (won't fix)

We actually run all the entity save / delete operations in their own transactions. We have been doing that since Drupal 7 :)

Anyway, that's not the question. Foreign keys constraints are not viable on MySQL because of the lack of deferred foreign keys checking, and I'm totally against only enabling that for PostgreSQL and SQLite, as this would only make them more ghetto then they currently are. So this issue is just won't fix.

As a side note, there is *nothing wrong* with talking about 'foreign keys' in our schema. Foreign keys describe the relationships between tables in a relational schema. Foreign keys constraints (which this issue is about) are a different animal meant to ensure strong consistency of the database schema.

Sylvain Lecoy’s picture

This issue (among several one) in essence is a proof that there is something wrong with the 'foreign keys' keyword. How Damien can you be blind to developers complaints ?

Sylvain Lecoy’s picture

Title:Enable foreign keys constraints in core» Remove 'foreign keys' from schema definition
Status:Closed (won't fix)» Needs work

Secondly, when you write SQL, you are not typing "FOREIGN KEYS CONSTRAINT" isn't it ? You are typing "FOREIGN KEYS" as it the case for "PRIMARY KEY".

The schema have "primary key", and "foreign keys", if the former creates a primary key, you expect that the later creates foreign key constraint: so there is something absolutely wrong pretending we offering a service that we do not support.

If you don't want to implements them, its up to you, now we have to tell people that we are not supporting it.

Damien Tournoud’s picture

Title:Remove 'foreign keys' from schema definition» Rename 'foreign keys' from schema definition

We are definitely not removing this information from the schema. We might want to rename it to reduce confusion, but all this really smells like a documentation issue.

Sylvain Lecoy’s picture

Title:Rename 'foreign keys' from schema definition» Remove 'foreign keys' from schema definition

Same remark for 'unique key', its a constraint but its implemented.

Everything works but 'foreign keys', is there any political choice behind this ? How can we keep something not functional, which confuse developers, and its beyond all logic in this planet ? :)

It is in active use in Views or in a related project ? Can someone tell us the real reasons with this ? It looks like the 'french exception' to me. Anything declared in the schema array is functional, BUT 'foreign keys'. Please Damien, be more convincing because right now, I see no reasons to keep something broken upon repair.

EDIT: And a last word @Crell @Liam, simply changing the order which things are deleted is for me a minor architectural change, and does resolve the problem of constraint violated or not. Typically, instead of deleting the node, trigger hook_node_delete, then deleting the fields. You first delete the field and trigger the hook, then delete the node.

You can give any abstraction layer you want, at some point, it will be leaky. The entity system is build upon relational system, the simple fact we are using identifiers from fields to entities speaks volumes about that.

Damien Tournoud’s picture

Please stop changing the title of this issue. We are not removing this information, it is leveraged by many modules in contrib. I'm fine renaming the key if someone wants to roll this kind of (only marginally useful) patch.

Sylvain Lecoy’s picture

Status:Needs review» Needs work

Then I propose:

'primary key' =>
'unique keys' =>
'meta data' => array(
  'foreign keys' =>
)

And yes we all know that best practices are marginally useful for Drupal, but we trying to change that aren't we ? :)

foreign keys are like brushing your teeth: go ahead, do without it, but careful when you smile.

Sylvain Lecoy’s picture

Title:Remove 'foreign keys' from schema definition» Rename 'foreign keys' from schema definition
Status:Needs work» Needs review
StatusFileSize
new5.56 KB
FAILED: [[SimpleTest]]: [MySQL] 13,668 pass(es), 399 fail(s), and 327 exception(s).
[ View ]

Just to prove that there is a simple alternative that is way cleaner, with a minor architecture change, starts to eliminate a range of data integrity problems just by itself, which does not destroy the work of Liam, and perhaps open eyes for some:

I tested to delete a node with a file field attached - with the first patch it fails - while it passes with the second. I just deleted the node entry after all operations: yes even after postDelete() and hook invocation.

Status:Needs review» Needs work

The last submitted patch, Enable-foreign-keys-constraints-in-core.patch, failed testing.

Sylvain Lecoy’s picture

I also wonder why we are still closing this issue when this feature is asked by several people, and will be, probably asked again and again:

Let's also make a poll:

  • Among contrib module maintainers, how many of them think 'foreign keys' actually creates a foreign key ?
  • How many Drupal installation have orphaned rows ?

I bet we'll have over 70% of yes for both answers. Crell you said that we will not be using SQL anymore for entities... OK. But as we are using it, we should do it properly. In the requirements page, only SQL databases are mentioned. There is a lot of viable approach for using tables and relations, we can also use an object oriented-db, or a semantic-store. But we are absolutely not there right now. Mongo DB for instance supports the entity + field API, so document oriented-database are an exception, but this would not be a problem to bypass the foreign key support for NO SQL databases.

Secondly, it may implicates architecture changes, but the transition from Drupal 7 to Drupal 8 is the only window we have. Why systematically close the related issues and/or postpone to the next major release ? It feels like we are playing with people. This is really not acceptable.

Last, but not least: The ANSI SQL-92 standard specifies four referential integrity actions that define the activity that should occur when the tables involved in the relationship are modified: NO ACTION, CASCADE, SET DEFAULT, and SET NULL.

  • ON DELETE NO ACTION - This foreign key only marks that it is a foreign key; namely for use in OR mappers.
  • ON DELETE CASCADE - If a row in the other table is deleted, delete any rows in this table that reference it.
  • ON DELETE SET DEFAULT - If a row in the other table is deleted, set any foreign keys referencing it to the column's default.
  • ON DELETE SET NULL - If a row in the other table is deleted, set any foreign keys referencing it in this table to null.
  • ON DELETE RESTRICT - Prevents any rows in the other table that have keys in this column from being deleted.

These same actions also apply to ON UPDATE.

Sylvain Lecoy’s picture

Status:Needs work» Needs review
StatusFileSize
new33.55 KB
FAILED: [[SimpleTest]]: [MySQL] 48,063 pass(es), 3 fail(s), and 0 exception(s).
[ View ]

So here a patch aiming to heal the wounds. Just moving 'foreign keys' to a 'meta data' array.

The last submitted patch, 911352-55-Rename-foreign-keys.patch, failed testing.

Sylvain Lecoy’s picture

Status:Needs work» Needs review
StatusFileSize
new33.97 KB
PASSED: [[SimpleTest]]: [MySQL] 48,045 pass(es).
[ View ]

Good bot!

pjcdawkins’s picture

Sylvain Lecoy’s picture

Sure, feel free to update the typo accordingly !

Sylvain Lecoy’s picture

StatusFileSize
new33.91 KB
PASSED: [[SimpleTest]]: [MySQL] 48,073 pass(es).
[ View ]

Here a version with 'metadata' instead of 'meta data'.

Sylvain Lecoy’s picture

I suggest we first commit this, then in a follow-up decide to go for the path of actually use foreign keys or not.

Someone can RTBC this for now ?

sun’s picture

I do not see the point in breaking a shittonne of modules out there, just because a string does not necessarily have the implication that a few might expect.

-1

If there was a functional reason for the proposed change, this could at least be explained to anyone. But the only weirdness this change here translates into is:

Oh, we broke 13,000+ modules, because... hrm, yeah, the definition that was previously used was misleading, since Drupal core does not support foreign keys by default, and although it might be possible to extend a database driver in core to actually introduce foreign key support (just not in core), we thought that foreign keys are not foreign keys, whereas, I mean, they're still foreign key definitions, but you know, you have to change your code to specify the exact same thing in a sub-key. That's why. Got it?

That makes no sense at all to me. "broke" might be overstated (since by design, nothing can break if something isn't used), but we'd "force them to update, or they are Doing It Wrong™." (which is == breakage, from my POV)

Sylvain Lecoy’s picture

Sun, we can't technically break 13.000 modules as they are not yet... well... existing :)

This patch is about restoring the truth: fk are metadata not functionnal. See the issues request about that you'll see that there is a real demand. Althought i am not happy not implementing the fk constraint, I think it would be fair to separates it from functionnal keywords.

Crell’s picture

I have to agree with sun here. I don't really see the value this change adds at this point in time.

Sylvain Lecoy’s picture

Title:Document that foreign keys may not be used by all DB drivers» Rename 'foreign keys' from schema definition
Category:task» bug

So you are both Ok mixing actual schema definition and a metadata ? For me its all about consistency.

Despites the multiple issues opened (see #54) you think its a good idea to leave it as is ? We don't implement fk neither fix the semantic ?

ParisLiakos’s picture

Ok, as one of the people that got confused of this i totally agree with Sylvain Lecoy on this.
And drupal is not backwards compatible, so why care about the 13k modules which will break..there are one thousand more changes in drupal 8 that those module will have to get under account.
When i get some more time i am gonna review this patch

Liam Morland’s picture

We need to offer a way for developers to use foreign key constraints if they want to. If not, then we should drop unique and primary keys too so that the database is just a box for data.

I suggest that foreign keys defined by the foreign key element in the schema array be created in the database and foreign keys defined in the metadata array not be created. As part of this, move existing foreign key definitions to the metadata array. This is sure not to break anything and gives developers access to foreign keys if they want them.

Grayside’s picture

So items inside a metadata array are more obviously not enforced? How about unique keys when using a NoSQL database engine?

I'd say it's cleaner to include all valid storage concepts, and have each db implementation document which ones it uses. That is the agnostic storage approach.

As someone said upthread, contrib can still use foreign key for what it means. And that can include experimentation with proper constraint support.

ParisLiakos’s picture

Category:bug» task

I'd say it's cleaner to include all valid storage concepts, and have each db implementation document which ones it uses. That is the agnostic storage approach.

+1 on that..you are right.

Also, i think this is falls under tasks now as well

Crell’s picture

Title:Rename 'foreign keys' from schema definition» Document that foreign keys may not be used by all DB drivers

Grayside is right in #68, I think. We allow API changes between versions, but that doesn't mean we should make them without cause. This seems more like a documentation issue.

hook_schema defines an abstracted idealized version of what a DB table would be. It is not guaranteed to represent the actual schema definition in the database at any given point in time. If a given driver makes use of foreign keys, that's that driver's decision. The abstract idealized definition of the table includes them, but whether or not a given database makes use of that information is out of our control.

Therefore, retitling.

Sylvain Lecoy’s picture

But Crell we all know that no drivers implement them at all right now... so why this decision? Do you propose we creates a contrib module: "MySQL with FK suport" :s

This is a feature asked since Drupal6 by many. What sort of insanity it is to always treats it as a documentation issue ?

On the other hand, I also agree with #68. So let's fix the drivers and the Drupal architecture for Delete/Update operations ?

Liam Morland’s picture

Title:Rename 'foreign keys' from schema definition» Document that foreign keys may not be used by all DB drivers
Category:bug» task

I don't see this as a documentation problem. The problem is that Drupal claims that certain tables have relationships between them but those relationships are not created in the database. This means that users of Drupal are putting themselves at risk of suffering database corruption. Data integrity is not something that is nice to have for bragging rights. It is an important layer of protection for our data, and therefore protection for the business interests of the users.

In #39, I suggested an approach to implementing foreign keys on Postgres. The only objection that has been raised is that testing all patches against Postgres takes too much computer time. This may not be such a huge problem because most patches don't touch the database anyway, so they don't really need to get tested on Postgres.

In any case, we just need to figure out a testing plan and we will be able to create foreign keys on Postgres. This would include documenting that FKs are supported on Postgres but not in MySQL. As in #70, "that driver's decision".

droog’s picture

What am I missing? The goal to be storage-agnostic makes zero sense to me, because basically as that goal becomes closer to being achieved, the benefits of being storage-agnostic diminish. Why not just call it an RDBMS-backed application and be done with it? To re-implement everything an RDBMS offers at the application level seems like an awful amount of work just to say "Hey, you can run Drupal on NoSQL now. Aren't we cool?"

I'd far rather see it tied more closely to one particular free (as in Free Software) DB backend and leverage all the benefits of the database as fully as possible than derive no database-layer benefit at all. Hell, why not do away with these crufty old socket connections and just go flat-file for storage? Backing up and replication would be a breeze! Yee-haw!

Crell’s picture

droog: That is not the sort of system Drupal is. We favor flexibility. Plus, we don't actually use most features of an RDBMS, and never have. If you want a PHP system that is hard coded to only ever work with MySQL, go use WordPress. Drupal is not that and will not be any time in the foreseeable future.

Damien Tournoud’s picture

@Liam: #45 summarizes why I think it's a bad idea to implement this only for PostgreSQL and/or SQLite. The only practical consequence of doing that is that we are going to have even less modules compatible with those databases.

Sylvain Lecoy’s picture

Issue tags:-Cleanup Initiative

I have to disagree with that statement Crell:

we don't actually use most features of an RDBMS

This is not true, we are using superkey and foreign key concept.

If you want a PHP system that is hard coded to only ever work with MySQL, go use WordPress.

Facts are that most Drupal installation run on MySQL. There is a set of people who want to attract the best available developers and on an other side, there is a set of people constantly telling them that if they are not happy, they should move to another project. I think some of them will finally do it by the time.

Most of the arguments against and for should not be dismissed here, but we have to find a compromise. 6 issues so far has been opened for this, I think its time to find a solution...

Sylvain Lecoy’s picture

Issue tags:+Cleanup Initiative

Also adding tag.

s.Daniel’s picture

There are good reasons for the referential integrity (maintained in an relational database) pattern. However just like the MVC or OO pattern it is one of those buzzwords that you get tought at university as a "if you don't you it your an amateur coder" thing. The truth is, while the use of patterns help in complexity reduction, education of new users and and building trust etc., there is no "one pattern fits all solution". In fact applications that handle large amounts of data often don't use referential integrity managed by the database. Like ebay (2, 3) for example. On the other hand we can't simply ignore the facts that a lot of new developers will consider the lack of database managed referential integrity a weakness and that it is a hurdle to understand how Drupal works.

Now we are at seven title renames in this issue I fear this is getting too emotional.
How about we take a step back and ask us what the problem is at a whole and then how to attack it?

  • Referential integrity is there to reduce the likelihood of corrupted data.
  • If we don't implement it in the database we need to make make sure referential integrity is maintained on an application level and document how we do this. (We probably do this allready and I don't know how.)
  • How can/do we prevent such things from happening in the future: Table {comments}: nid field is not unsigned and has wrong datatype.
  • Another task could be add an ER diagram to this documentation to help developers understand Drupal.
Sylvain Lecoy’s picture

Issue tags:+Cleanup Initiative

Thank for your input s.Daniel and your willing to find a solution, I appreciate. However, referential integrity is not one of those buzzwords like MVC or OO (which is by the way maybe new in PHP world, but not in world: object was a buzzword in 2000's), relational models has been first formulated and proposed in 1969 by Edgar F. Codd. I wasn't even born back to this time...

The point you raised is so true to me, it happen everyday when I speak to collegues and they ask "why Drupal doesn't implement foreign keys ?" In #21 I also pointed to an interesting article from Dries about that, and you are correct when you say that new developers can be reluctant.

I expressed my opinion about that, and I think this would help Drupal getting better if we can find a consensus, which is for me letting drivers implements foreign keys or not. I accept that it remains in the schema (abstract) because some contribs modules relies on it, and "who can the more can the less", letting drivers implements it or not. We ship Postgre SQL driver with a "proof of concept" and if people are happy with it we provides it with MySQL as well.

I can step back from myself on a lot of things, but this issue should have never existed for me. Its all about healing wounds that would have not existed at begining. When I am trying to understand the other side "I think this is an issue of upbringing. If somewhere in your educational or professional career you spent time feeding and caring for databases (or worked closely with talented folks who did), then the fundamental tenets of entities and relationships are well-ingrained in your thought process. Among those rudiments is how/when/why to specify keys in your database (primary, foreign and perhaps alternate). It's second nature.

If, however, you've not had such a thorough or positive experience in your past with RDBMS-related endeavors, then you've likely not been exposed to such information. Or perhaps your past includes immersion in an environment that was vociferously anti-database (e.g., "those DBAs are idiots - we few, we chosen few php/js code slingers will save the day"), in which case you might be vehemently opposed to the arcane babblings of some dweeb telling you that FKs (and the constraints they can imply) really are important if you'd just listen.

Most everyone was taught when they were kids that brushing your teeth was important. Can you get by without it? Sure, but somewhere down the line you'll have less teeth available than you could have if you had brushed after every meal. If moms and dads were responsible enough to cover database design as well as oral hygiene, we wouldn't be having this conversation. :-)"

s.Daniel’s picture

All right replace the term buzzword with something more fitting. English is not my native language so anything I'll come up with is likely to be not optimal. "Iron rule" would be my next shot.
Name it as you will, the point I want to make is: Most people think of this too simple (including myself at some point). What everyone wants is save and secure data or the right™ architecture for my code. Throwing in "the most common solution" helps in most of the cases, however it's not necessarily the best solution.

I think you understand the point I want to make and that I'm not trying to say MVC etc. is bad. I'm just saying: The results matter, less how we get there.

/edit to respond to your edit:
I have a university/java background and I was very surprised when I found out Drupal does not use FKs and agree with the comment on stackoverflow as to it beeing an issue of upbringing. Only the conclusion doesn't make sense to me. What he describes is the believe of one generation given to the next generation. I'f we'd stick to that we'd still be living on a disk. This is like religion - now you can hardly discuss religion X with someone who believes in religion Z. Let's please not discuss on that level and concentrate on the actual issues like the one you described:

new developers can be reluctant.

Now: How can we fix that without changing fundamentals of Drupal?

davidwbarratt’s picture

This has probably been said several times, but I think the lack of foreign key constraints is Drupal's Achilles' heel.

I can't tell you how many times I've had to write a little "fix the damn database now!" script, because all of the sudden half of our users can't login, or the paging of articles is all out of whack, or whatever.

I think it also encourages sloppy database design. I think the Field API is one of the best features of Drupal, however, the way it stores this data in SQL is really sad. Having a foreign key constraints forces developers to really think about how relational data works. For instance, in reality, there shouldn't be a single table for each field, but a table for each instance of that field. Before you go screaming, what's wrong with that? yes you end up with a lot more tables, but how often do you really need to query across entity fields?

I'm highly in favor of having true foreign key constraints for MySQL InnoDB. I'd have a hard time recommending a move to Drupal 8 if one of the biggest problems of Drupal wasn't solved.

Anyways, I hope I didn't step on any toes here. Data integrity is something that I think is really important, and is even more important for large organizations who I think we would all like to attract to the Drupal community.

thanks you guys!

Liam Morland’s picture

I had another idea for how to get around the absence of deferred constraint checking in MySQL: Create the FK constraints with on update/delete cascade, but leave the explicit deletions in place. This gives the best of both worlds: DB consistency for those that want it and compatibility with DBs that don't support it.

Sylvain Lecoy’s picture

I've taken yesterday the ownership of the Doctrine Object Relational Mapper for Drupal.

The driver i'm developing leverage the 'foreign keys' meta-data to generates entities. I am able to correctly fetch a user, format_filter with PHP generated entities classes on Drupal 7.

Next step will be to let doctrine updates the schema database, and replaces StorageControllers (UserController, NodeController, etc.) with doctrine managed controller. I'll see how far I can go with that and i'll let you know :-)

Eventually we'll have real foreign keys in contrib.

davidwbarratt’s picture

I support this issue a lot and I really hope that real foreign key constraints come to Drupal core in the future.

There's been a lot of work in the comment.module recently that has broken the possibility of foreign keys on the comment table.

I'd really appreciate it if you could weigh in on the follow-up here:
#1995944: Remove entity_id and entity_type from the comment table and replace with relationship tables

Thanks!

joemurray’s picture

So Crell and Damien have not really been getting new arguments that might persuade them to change their mind about accepting the implementation of foreign constraints into the RDBMS of the Drupal's predominant data storage engine, InnoDB. At a certain point, isn't it the Drupal way to let alternate approaches try to prove their merit in contrib? Both technically and by popularity?

Are there currently ways that could allow a contrib module that wants to implement RDBMS level foreign key constraints to replace the few sections of core code that deliberately and knowingly violate those constraints temporarily? This seems consistent with the idea of pluggable / swappable storage engines. If so, then this would leave core and most D8 sites with 'foreign keys' that were only so in the sense of aspirational metadata comments (to put it pejoratively), but allow the sizeable tribe of traditionalists / purists / code quality doubters / data quality fanatics to create or use contrib modules that would serve their purposes. Whether this is a rerolled version of the core's InnoDB module or a module that sits beside it with a few strategic hooks isn't the issue here. It is to come up with a way to let developers know that D8 core does not provide any assurance that its DB layer will implement foreign key integrity constraints, and arguably to ensure this is possible for contrib module like ones based on Doctrine to provide.

Crell’s picture

I think this is a prerequisite for foreign key enforcement being at all viable for entities: #1497374: Switch from Field-based storage to Entity-based storage.

Once all SQL queries related to entity storage are fully encapsulated behind one class (or set of closely related classes) that can be swapped out, then we can talk.

With how much in Drupal 8 is being transitioned away from SQL to the config system or the state system, I think a lot of the "purist" argument here goes away anyway since we end up relying on relational concepts less in the first place. (Which is fine; relational databases are not the answer to all problems, and most web apps use SQL as a glorified card catalog/key-value store anyway.)

Liam Morland’s picture

@#85 We could write a module that looks through the hook_schema() implementations and creates the FK constraints declared there. It could be called the Database Integrity module. Getting this to work would require fixing any bugs which cause Drupal to violate the declare schema. I think this would be a worthwhile project. Should we have a Database Integrity sprint on Friday at DrupalCon?

@#86 I don't think Drupal 8 is being transitioned away from relational concepts. Whether or not it is stored in an SQL database, the kind of data that Drupal deals with is inherently relational: Comments are related to entities, people are related to their roles which are related to permissions, menu items are related to their parent menu item, etc. Much of the work around entities is really just reimplementing relational ideas inside Drupal instead of using the database's build-in relationship handling. In any case, I understand that there will still be more than one table that includes a nid column. If so, such columns should have a FK constraint.

joemurray’s picture

StatusFileSize
new3.8 KB
PASSED: [[SimpleTest]]: [MySQL] 55,939 pass(es).
[ View ]

Currently, drivers expose whether they support transactions and whether they support transactional DDL statements. This patch exposes whether a driver supports foreign key constraints. As it is possible to use more than one driver at once, it seems feasible for a driver that supports foreign key constraints to be used by a contrib module even if core changes along the lines of #86 aren't in D8. So this patch, and the resolution of this issue it seems to me, is not dependent on that further work.

Liam Morland’s picture

Developers should be able to create foreign keys in the database if they want to. So, I have opened #2254131: DBTNG should be able to create foreign keys.

euk’s picture

With all these comments we are still not even mentioning this in documentation, or am I missing something?

Liam Morland’s picture

Status:Needs review» Needs work

You are right. This issue is now just about documentation. Making it possible to create FKs is being handled in #2254131: DBTNG should be able to create foreign keys.

euk’s picture

Yeah, but that is D8, right?
So we are ditching all D7 devs/users?
So wouldn't it be right thing to at least mention this in the documentation for D7?

Liam Morland’s picture

Yes, it should be mentioned in the documentation.

Liam Morland’s picture

Category:Task» Bug report
Status:Needs work» Needs review
StatusFileSize
new1.97 KB
PASSED: [[SimpleTest]]: [PHP 5.4 MySQL] 92,262 pass(es).
[ View ]

Here is a documentation patch.

Liam Morland’s picture

StatusFileSize
new2.02 KB
PASSED: [[SimpleTest]]: [PHP 5.4 MySQL] 97,338 pass(es).
[ View ]

Reroll.

Crell’s picture

Status:Needs review» Reviewed & tested by the community
Issue tags:+documentation

Docfix only, and a valid one. Let's do.

  • xjm committed 128b73a on 8.0.x
    Issue #911352 by Liam Morland, Crell: Document that foreign keys may not...
xjm’s picture

Status:Reviewed & tested by the community» Patch (to be ported)
Issue tags:+needs backport to D7

Thanks @Crell and @Liam Morland. That docs improvement makes sense to me (and actually would also have saved me some confusion when I started on D7...) I agree that further discussion can happen in #2254131: DBTNG should be able to create foreign keys but it is good to get this fix in now.

This issue only changes documentation, so per https://www.drupal.org/core/beta-changes, this can be completed any time during the Drupal 8 beta phase. Committed and pushed to 8.0.x.

We should also backport this docs improvement to D7.

I made the following small code style fixes on commit (actually in a followup. Be sure to note these for the D7 version of the patch as well.

diff --git a/core/lib/Drupal/Core/Database/database.api.php b/core/lib/Drupal/Core/Database/database.api.php
index 524d482..3e0fbf8 100644
--- a/core/lib/Drupal/Core/Database/database.api.php
+++ b/core/lib/Drupal/Core/Database/database.api.php
@@ -358,7 +358,8 @@
  *   'unique keys' => array(
  *     'vid' => array('vid'),
  *   ),
- *   // For documentation purposes only; foreign keys are not created in the database.
+ *   // For documentation purposes only; foreign keys are not created in the
+ *   // database.
  *   'foreign keys' => array(
  *     'node_revision' => array(
  *       'table' => 'node_field_revision',
@@ -521,7 +522,8 @@ function hook_schema() {
       'nid_vid' => array('nid', 'vid'),
       'vid'     => array('vid'),
     ),
-    // For documentation purposes only; foreign keys are not created in the database.
+    // For documentation purposes only; foreign keys are not created in the
+    // database.
     'foreign keys' => array(
       'node_revision' => array(
         'table' => 'node_field_revision',

  • xjm committed 3402962 on 8.0.x
    Issue #911352 followup: Fix comment wrapping.
    
Liam Morland’s picture

Thanks very much. I will work in the backport.