Pursuant to Dries's request on the devel list, this is an issue to start a patch to get referential integrity into core.

Working on it now. Will post a bit later.

Files: 
CommentFileSizeAuthor
#36 111011-foreign-keys.patch16.38 KBJosh Waihi
Failed: Failed to apply patch.
[ View ]
#30 111011-foreign-key-docs-3.patch3.41 KBJosh Waihi
Failed: Failed to apply patch.
[ View ]
#27 111011-foreign-key-docs-2.patch3.31 KBJosh Waihi
Failed: Failed to apply patch.
[ View ]
#25 111011-foreign-key-docs-2.patch3.41 KBJosh Waihi
Failed: Failed to apply patch.
[ View ]
#17 111011-foreign-key-docs.patch1.92 KBJosh Waihi
Failed: Failed to apply patch.
[ View ]
#2 referential-integrity_0.patch3.24 KBwebchick
#1 referential-integrity.patch61.79 KBwebchick

Comments

webchick’s picture

Status:Active» Needs work
StatusFileSize
new61.79 KB

Ok, here is aggregator module. This ended up taking a ridiculous amount of time. ;) I figured I would post a partial patch and get some eyes on it to make sure that it's the right approach, and maybe get a couple hands on it.

webchick’s picture

StatusFileSize
new3.24 KB

Eek! That got totally messed up. ;) Let's try this one.

pwolanin’s picture

This seems like the issue to deal with a problem that's bugged me for the last yer- database column names that are reused for no good reason. Does "cid" refer to "cache id" or "comment id"; does "vid" refer to a "vocabulary id" or a "revision id". I think that making these ID fields unique would move us towards being able to figure out more easily the foreign key constraints.

webchick’s picture

Well, no. This issue would be declaring the foreign key constraints explicitly, so you wouldn't need to figure anything out. ;)

However, there seemed to be a lack of general interest in doing this on the dev list, so...

Crell’s picture

webchick, your patch adds cascade on update and delete. Have we considered the effect that has on overall development, and how we could best leverage it?

Don't get me wrong, I'm very much in favor of RI and using cascade on update/delete. I just don't know if we've considered all the ramifications of how we can leverage RI to make more things more automated.

Pancho’s picture

Version:6.x-dev» 7.x-dev

Crell is absolutely right. For this move we need to take our time and make sure we oversee the consequences.
We definitely should do this, but only in D7.

Crell’s picture

Status:Needs work» Postponed

I think it's actually been determined that as long as we support any database configuration that doesn't enforce RI (like MySQL MyISAM tables, the most commonly used type), we really can't use it. So that means somewhere around MySQL 6.0, maybe. :-)

I'm marking this postponed rather than won't fix, because it would be nice to fix in 5 years or so.

moshe weitzman’s picture

Until this reopens, please join the discussion at http://groups.drupal.org/node/4328

Crell’s picture

Status:Postponed» Needs work

Reopening this. Even if we don't use them directly, extra metadata is good and can be helpful for advanced usage (Views, Schema module, and other weirdness).

Josh Waihi’s picture

subscribing, very interested in see this happen.

Josh Waihi’s picture

Issue tags:+database

There are two imediate issue I see here:

  1. writing code in drupal that obeys foreign key contraints, i.e. invoking hook_nodeapi before a delete or update is done and so on.
  2. how to structure the schema api

Both of these are crucial for myisam to be able to play with the big boys, innodb, postgres and sqlite.
I can understand webchicks reasons for wanting to use cascades however, I feel like this defeats the purpose of FKs, they're are suppose to be a constraint to prevent such actions from happening. I suspect most of drupal will crash and burn when these FKs get implemented but will rise from ashes a stronger framework for it.

As for schema api, well thats the fun bit! I notice we seem to like seperating things out into massive schema arrays that take up a huge amount of lines. This could sooo easily be reduced. Naturally, I guessing my drupal maturity (exuse the spelling) is lacking here, that there must be a good reason for it. But my ideal schema example ladies and gentalmen:

<?php
$schema
['table1'] = array(
   
'fields' => array(
       
'tid' => array('type' => 'serial', 'primary key' => TRUE), // primary key auto assumes not null
       
'title' => array('type' => 'varchar', 'not null' => TRUE, 'unique' => TRUE),
       
'body' => array('type' => 'varchar', 'length' => 225),
    ),
  
// no primary key defined here because already defined in field
   // if primary key was made of 2 < columns then here a primary key
   // would sit: 'primary key' => array('tid', 'title')
   // same for unique: 'unique' => array('title')
   // indexs are automatic for unique and primary keys but can still be defined
   // here. 'indexes' => array(......
);

$schema['table2'] = array(
   
'fields' => array(
       
'tid' => array('type' => 'serial', 'primary key' => TRUE),
       
't1id' => array('type' => 'int', 'references' => array('table1' => 'tid')),
       
'revision' => array('type' => 'varchar', 'not null' => TRUE, 'length' => 225),
    ),
   
// this would also work:
    // 'references' => array(
    //     't1id' => array('table1' => 'tid'),
    // ),
);
?>

I'm happy to do the work for this issue at this stage, just need some direction and motivation as in keeping this issue alive with feedback and opinion.
Such transactional features to drupal does beg questions such as, "should the transactional features in the mysql driver be seperated out into innodb and myisam instances?" I mean, with more relational functionality - more work will need to be done in PHP to ensure myisam keeps up - innodb shouldn't suffer because of it. Yes this will make myisam slower through drupal, BUT, if innodb becomes the default, all we have to do is merely support myisam as it will slowly dull in usage.

Lots to think about for this issue, I'm sure this comment doesn't cover it all but is a start.

Crell’s picture

As has been discussed at length, as long as we support MyISAM tables we can't leverage foreign keys natively as foreign keys. I'm OK with that. It's still valuable metadata to include as it can potentially be leveraged by other modules like Views, or for user-space offline integrity checking, etc. Even if we don't send that data down to the database itself, it's still useful data to keep around.

The syntax convention for schema itself is just Drupal's standard "long form" array syntax. Let's not play with that for now. Plenty of other battles to fight first. :-)

Josh Waihi’s picture

Drupal now has support for 3 transactional databases and 1 non-transactional database, it is truly a shame to not be better using these transactional databases because of MyISAM. I believe we can support foreign keys natively as foreign keys and still support MyISAM - however, I'd recommend that the MySQL driver defaults to innodb but can fall back to MyISAM as foreign key support in MyISAM would mean holding the logic in Drupal. The bigger problem here is getting people to let go off MyISAM and excepting innodb. This would make the database world in Drupal a whole lot easier to implement relational logic.

Crell’s picture

It would. But abandoning ISAM is not on the table right now, so we can't base our business logic on assuming that we have foreign key integrity. But we'll need to the schema API support first either way, so let's do that and then revisit the question of what to actually do with that metadata later.

Josh Waihi’s picture

But we'll need to the schema API support first either way

@Crell, I think you missed a key word here I can't quite figure out what.
I didn't say anything about abandoning MyISAM, but merely adding the logic it needs to be relational to its drupal driver. Yeah I know, lots of other worms I'll leave in the can for now.

For now, what is it exactly we need to do to the schema to get things rolling. Yes, that means I putting my hand up to roll up the sleaves and give this thing a hoon!

(if you dont' know what hoon is, ask Druplicon in IRC)

Crell’s picture

1) Extend the Schema API syntax to allow for the definition of foreign keys in an easily-parsable way.

2) Ensure that information is cached along with the rest of the schema.

3) Include FK specifications for core schema definitions wherever appropriate.

4) Profit (in later patches)!

Josh Waihi’s picture

Status:Needs work» Needs review
StatusFileSize
new1.92 KB
Failed: Failed to apply patch.
[ View ]

ok, well lets start with documentation, cause if this is right, then the code will follow. How does this look for starters?

Crell’s picture

Looks sane to me on first glance. Let's get more eyes on it, though.

earnie’s picture

Status:Needs review» Needs work

Subscribing. For non-transactional (MyISAM) could the DB API create a referential table that then can be used to create deletes and/or updates to the appropriate tables? We can eliminate a lot of module code if this is kept by the DB API.

In the example of the node table, you should add vid as a foreign key to the node_revisions table and type as a foreign key to node_type.

Crell’s picture

All this talk of emulating RI on MyISAM is really reaching too far, IMO. First of all, doing so requires that every single table in the database be either MyISAM or InnoDB. No mixing, or we don't know which driver to use. Switching to depending on RI also means that we would have to completely reverse the order of every delete or update operation in core. Right now, for instance, we delete a node from the node table (the primary table), then from each of the dependent tables. If we had forced RI, that would break. If we had delete-cascade, that first delete would also delete all of the other dependent records, even if we logically want to do something else first. In that case, we'd have to completely and totally redesign the operations for the nodeapi hooks. Full on RI is more of a wtf than OOP would be. :-)

And doing that while we still need to deal with decidedly non-trivial task of manually enforcing RI on MyISAM, which is the most commonly used database type. That means we're slowing down the code on the most common use case. Sorry, I can't get behind that.

There's plenty of other things we can get from knowing about FKs besides DB-level enforced RI.

Wow that's a lot of acronyms in this post...

Josh Waihi’s picture

However Crell, MyISAM does not have to be the most used database. If we rose our benchmark up to innodb then things would change dramatically. And for those on crappy hosting plans only with MyISAM support - they would have a slower system - no slower than a relational database I would think though. I like earnie's idea here. And sooner or later the logic in drupal will need to be reversed. Other node tables should have there rows deleted that relate to node first before deleting a record for {node}. Why not add to the schema:

<?php
$schema
['mytable'] = array(
   
'fields' =>array(
     
//......
   
),
   
'primary keys' => array(
    
// .....
   
),
   
// other stuff like foriegn and unique keys + indexes
    // then allow db specfic tweaks to happen
  
'mysql' => array(
       
'engine' => 'innodb',
   )
  
// or ....
  
'engine' => MYSQL_ENGINE_INNODB // only mysql looks for this key.
   // or ....
  
'relational' => TRUE // again, only MySQL could find this key useful.
);
?>

This would help determine which driver to use and even let you use both at the same time.
I'd like to know what David Strauss's thoughts are on this topic.

Frando’s picture

What about first actually adding foreign key definitions to core (what this issue is about), and then discussing what to do with them (in a seperate issue)?

And talking about db-specific extensions like in the last response is a great topic, but shouldn't hold back foreign key definitions IMO.

Referential integrity is just one thing that can be done with foreign key definitions in core, but there certainly are other purposes (e.g. views could likely use them).

chx’s picture

I agree with Frando. Small moves++

earnie’s picture

I agree but the patch given still needs work to add all the foreign keys of the node table example.

Josh Waihi’s picture

Status:Needs work» Needs review
StatusFileSize
new3.41 KB
Failed: Failed to apply patch.
[ View ]

getting the ball rolling on this again. did that last patch get commited to HEAD? for some reason, it didn't show up in this patch when I did cvs diff -up though the I had applied the patch. weird.
anyway, this updates the node schema implementation but it removes the description column.

chx’s picture

Status:Needs review» Needs work

Josh, removing unsigned does not belong to this issue. Also:

+ * 'vid' => array('node_revision' => 'nid'),

this certainly is wrong, this wants to be

+ * 'vid' => array('node_revision' => 'vid'),

Josh Waihi’s picture

StatusFileSize
new3.31 KB
Failed: Failed to apply patch.
[ View ]
Josh Waihi’s picture

Status:Needs work» Needs review

Damn it! why don't comments submit when submitting patches???

I corrected the mistake chx pointed out. however, node.uid is not unsigned as defined by node_schema()

Status:Needs review» Needs work

The last submitted patch failed testing.

Josh Waihi’s picture

StatusFileSize
new3.41 KB
Failed: Failed to apply patch.
[ View ]
Josh Waihi’s picture

Status:Needs work» Needs review

not sure what was wrong with the patch I built this one with cvs diff -up > 111011-foreign-key-docs-3.patch

Dries’s picture

The proposed syntax looks good. Would be good if, in addition to just docs, we'd extend some actual tables.

catch’s picture

Status:Needs review» Needs work

Why the primary key and index changes on {node}?

Josh Waihi’s picture

@catch index changes? I just copy & pasted from node.install.
@Dries, sure, next thing I'll extend some tables

catch’s picture

Josh, ignore me, missed that this only affects schema.inc for now.

Syntax looks good to me as well

Josh Waihi’s picture

Status:Needs work» Needs review
StatusFileSize
new16.38 KB
Failed: Failed to apply patch.
[ View ]

Ok, I spent a bit of time adding foreign key definitions to core, I may have missed a few out or applied them where they are not needed, please revise this, I'm no Drupal all rounder, I'm not 100% on where foreign keys should or should goas far as core is concerned, not that it matters while we're not using them :( Feedback please.

catch’s picture

Status:Needs review» Reviewed & tested by the community

These all look good to me. It covers some basics, we can refine in followup patches.

Dries’s picture

Status:Reviewed & tested by the community» Fixed

Committed to CVS HEAD. Thanks!

yched’s picture

Status:Fixed» Active

menu.inc received those changes in the same commit:

- if ($arg[0] == "admin" && module_exists('help') && $function('admin/help#' . $arg[2], $empty_arg) && $help) {
+ if ($arg[0] == "admin" && user_access('access administration pages') && module_exists('help') && $function('admin/help#' . $arg[2], $empty_arg) && $help) {

which doesn't seem related.

Crell’s picture

Um, yched's comment looks like it's a commit-related snafu? I defer to Drieschick on what if anything to do here...

webchick’s picture

Status:Active» Needs work
Issue tags:+Needs Documentation

Hm. I'm not sure. I guess I'll leave it since it's been there for about a month now and no one seems to have noticed. Might've been another bug fix patch.

However, this change never got documented in the converting 6.x to 7.x modules page, which is dreadfully sad.

jhodgdon’s picture

Status:Needs work» Needs review

I have added this to the 6.x-7.x module upgrade page http://drupal.org/node/224333#foreign-keys-added

I only added text to say that these foreign key descriptions had been added to the core schema, since that is all I saw in this patch.

I wasn't sure if there should also be a recommendation for contrib module developers to similarly add foreign keys descriptions to their schema, or if that recommendation was new for 7.x. Is Drupal doing something with foreign key descriptions, and if so, is it new to 7.x? If the answer to those questions is yes, then please mark this back to Needs Work.

Otherwise, if the text in the upgrade page is correct, probably this can be marked Fixed.

Status:Needs review» Needs work

The last submitted patch failed testing.

jhodgdon’s picture

Status:Needs work» Needs review

Testing bot: Please ignore, this is a doc review not a patch review. :)

Josh Waihi’s picture

Status:Needs review» Fixed

Docs are great thanks

Status:Fixed» Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

JacobSingh’s picture

Title:Add foreign keys to core» Show where foreign keys might exist in core schema definitions. NOTE: NOT YET IMPLEMENTED ANYWHERE.

This is a serious WTF. I don't even know why it is in core, but as long as it is I'm changing the title to reflect reality so other people might not be so confused.

Damien Tournoud’s picture

Title:Show where foreign keys might exist in core schema definitions. NOTE: NOT YET IMPLEMENTED ANYWHERE.» Add foreign keys to core

This array is for documentation purposes only, the same way we have a 'description', but more easily parsable by automatic tools.

JacobSingh’s picture

Okay, but sorry please pick a different title. This issue in no way "Adds foreign keys to core". It simply indicates where they might exist, one day.

I'm sure I'm not the only person to be massively confused by this. Especially if someone comes from a background of having worked on more mature RDBS implementations.

A Foreign Key definition means something, just like primary or unique. How is someone supposed to know that one is for "documentation" purposes and the others actually do what they are supposed to?

Damien Tournoud’s picture

It's just documented wrong.

I edited http://drupal.org/node/224333#foreign-keys-added to clarify that this is for documentation purposes only.

LinuxBloke’s picture

Why oh why is anyone still using MyISAM???? InnoDB has been around for quite a while now, and current and proper Database design simply involves foreign keys, period. I would not want to do a mission critical application without them, except for very limited contexts where it would make sense.

I am currently dealing with a project written in Drupal 6.X, and in the middle of scaling this application up for high-availability, high-performance, high-reliability and high-volume. I don't want to migrate to 7.x because 7.x is not ready for prime time yet.

I have converted all tables in this application from MyISAM to InnoDB for performance reasons, and hesitated putting in the foreign key relationships because I'm not sure what happens if Drupal 6, not being foreign key aware, tries to do the work the database should be doing already. Seems to me Drupal 6 will crash and burn.

Is there a reliable way to use real InnoDB FKs with Drupal 6 code? The application should NOT be dictating relational issues to the database, but it would seem that Drupal commits this sin big time. Or maybe there's a replacement for Drupal's ORM core that does the right thing?

Any help would be helpful!

mikl’s picture

For actually adding foreign keys in the database, see #1011802: Really add foreign keys to core….

Liam Morland’s picture

Issue summary:View changes

Referenced issue has been marked as a duplicate of #911352: Document that foreign keys may not be used by all drivers.