I've been looking through the core documentation on creating entities and the Entity API module but nowhere is it stated (that I could find) that the id for an entity should be an integer. I ran into this when using the data module to create some entities where using a string id as the key works much better. I was able to select the entity from a list but when I try to save it gave me a data integrity error since it is trying to save the string key to the target_id field which is an integer.

So my questions are:

1. Are entity ids required to be integers? If so, where is this documented? I realize that all core entities have integer ids and most custom ones do as well but if this is not a requirement it should not be assumed. The data module currently makes it easy to create entities with string ids.

2. If question #1 is that string ids for entities are allowed, should the field schema for entityreference be updated to allow for it? Is there anywhere else where this will cause a problem?

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

randallknutson’s picture

Project: Entity reference » Data

After digging around other places I noticed that there are many places where the entity_id is assumed to be an integer. I guess this would be a bug to file with the data module then since it allows entities to be created with entity ids that are not integers.

randallknutson’s picture

Title: target_id assumed to be an integer » Data Entity allows entities to be created with entity ids that are not integers

Changing name to better reflect the bug.

joachim’s picture

Status: Active » Postponed (maintainer needs more info)

> I ran into this when using the data module to create some entities

Could you explain more about how you were doing this? Which part of the UI are you using?

And yes, entity IDs are integers. I guess that is so ubiquitous throughout Drupal that I didn't think it needed to be said... :/

randallknutson’s picture

Status: Postponed (maintainer needs more info) » Active

In the data module it is perfectly possible to create an entity with a primary field that is a varchar. The data module would then create an entity type with the entity id of text which is not valid elsewhere.

I guess I had never looked closely enough at entities to realize that was a requirement.

I wouldn't worry too much about fixing it besides when checking to see if a primary field has been defined that the field is also an integer. That would solve the problem.

rooby’s picture

Priority: Normal » Major

This is at least major.

The issue is that when you set up a data table you select one (or more) of the fields as the primary key.
This primary key field can be any unique field, it doesn't have to be numeric and it doesn't have to be a certain size (I ran into this with a primary key field of large integers, which breaks things because they are too long).

The data entity blindly uses the primary key field (in the case multiple fields are primary it takes the first one - there is a todo in the code to fix this aspect).

There are many ways you can break things here because entity id is expected to be very specific data.

I think that for this module the entity ID should be a serial integer that is unrelated to the data table columns.

That's the only safe way I can think to do it because you can't know what kind of data is in the table columns.

rooby’s picture

Status: Active » Needs review
FileSize
2.35 KB

This is the best solution I can think of.

It makes it so that for a data table to be a data entity it must have a serial field, and that serial field is used as the entity id.

It also adds a note to the readme about this.

Works for me. I am now able to import using feeds without issue.

joachim’s picture

Status: Needs review » Needs work

This looks good, but it's too restrictive: as long as a field is an integer and unique, it can serve as an entity ID.

Also, my table gave me this error:

> Invalid argument supplied for foreach() data_entity.module:40 [warning]

rooby’s picture

This looks good, but it's too restrictive: as long as a field is an integer and unique, it can serve as an entity ID.

Not so. My table has a big int column that is unique and it was too big and I had issues.
I guess it has to be normal int or smaller.

So I could make it serial and if there are no serial fields then check for a unique int field normal or smaller.
Or maybe specifically normal int, as that is what is used elsewhere for entity id fields, like in the field api tables.

I will do that and fix the bug and post a new patch shortly.

joachim’s picture

My table's ID field is BIGINT, and the IDs look like '11200033035'.

rooby’s picture

That's odd, I wonder why mine was failing then (I also had 11 digit numbers).

I'll see if I can see why mine was failing before.

imclean’s picture

I added something like this to our dev version a while ago but forgot about this issue.

Main change:


/**
 * Get the name of a table entity's id.
 *
 * @todo: consider moving this to the DataTable class.
 */
function data_entity_get_id_field($table) {
  if (isset($table->table_schema['primary key'])) {
    // Use the first primary key which is an integer or serial
    foreach ($table->table_schema['primary key'] as $id_field) {
      if (($table->table_schema['fields'][$id_field]['type'] == 'int') ||
              ($table->table_schema['fields'][$id_field]['type'] == 'serial')) {
        return $id_field;
      }
    }
  }
}

You can still set multiple primary keys of all kinds and this will return the first pk that's an integer field. It doesn't do any further checking, I'm not sure how much hand-holding people should get when using a module like this.

rooby’s picture

I think it's fine to just say we use the first one.

However I think if there is a serial then that should get preference over any int fields.

Opinions?

imclean’s picture

I'm not sure that's an assumption we should be making. It also takes control from the user. The serial pk field can always be added before any integer fields if that's the required behaviour.

These are primary keys so some thought would've been put into the schema already.

rooby’s picture

I see your point.

I will make a patch like #11

joachim’s picture

Best thing might be to take the same approach as #1538588: Add Entity Label field (Was: allow entity ref fields to point to data items), and let the user select which column to use as the entity ID.

rooby’s picture

Makes perfect sense, why didn't I think of that.

imclean’s picture

Status: Needs work » Needs review
FileSize
2.69 KB

This should be pretty straight forward.

This adds an option to the "Entity Configuration Form", as per the other issue.

Not extensively tested and there's no fallback if an entity ID field is not selected. We could possibly fail more gracefully, however I think it's important people are prompted to select an ID field rather than have one assigned for them.

imclean’s picture

Could probably remove the primary key check as well but let's see if this is the right approach first.

joachim’s picture

Status: Needs review » Needs work
+++ b/data_entity/data_entity.module
@@ -23,21 +23,10 @@ function data_entity_meta_add_defaults(&$meta) {
+    $meta += array('entity_id' => '');

This should go outside the foreach loop, surely?

+++ b/data_entity/data_entity.module
@@ -65,7 +53,7 @@ function data_entity_entity_info() {
+        'id' => isset($table->meta['entity_id']) ? $table->meta['entity_id'] : '',

I'm pretty sure we can't set this to be empty -- lots of the entity system will break.

We probably need to keep data_entity_get_id_field() and keep the existing (lousy) logic as a fallback. Or come up with better :)

imclean’s picture

Makes sense. I've used the fallback from #11 and added an empty check for $table->meta['entity_id'].

imclean’s picture

Status: Needs work » Needs review

Update status.

joachim’s picture

Issue summary: View changes
Status: Needs review » Needs work

There's a few things that seem mixed up now in this patch:

      'entity keys' => array(
        'id' => (isset($table->meta['entity_id']) && !empty($table->meta['entity_id'])) ? $table->meta['entity_id'] : data_entity_get_id_field($table),
      ),

This seems a bit tangled. data_entity_get_id_field() is no longer the authority on the id key -- but its name says it is.

At first glance, it seems to me that data_entity_get_id_field() should hold the logic to see if the admin user has chosen a field to be the entity id.

Similar sort of problem in data_entity_admin_entity_form():

  // The id field has these set automatically; it's just there for show.
  $id_field = data_entity_get_id_field($table);
  $form['fields'][$id_field]['locked']['#default_value']    = TRUE;
  $form['fields'][$id_field]['locked']['#disabled']         = TRUE;
  $form['fields'][$id_field]['required']['#default_value']  = TRUE;
  $form['fields'][$id_field]['required']['#disabled']       = TRUE;

We're letting the user pick a field to use as entity Id here, but then we're using the existing logic to lock one of the DB fields!

Since a change in the id field means a change in the entity info, data_entity_admin_entity_form_submit() should clear the entity cache.

Lastly, we should consider the implications of changing the entity ID field. Or rather, we need to prevent that from happening once there is field data. This is because if you change your ID field, then all your entities effectively change their IDs: a row in your table that used to be entity 5 is now entity 47, and so its field data is now flapping around with no entity.

I'm not sure how best to handle that.

PS. We should keep the README changes that are in an earlier patch on this issue, and also change the 'Configure entity form' tab title as it's no longer just about the form.

joachim’s picture

> * @todo Add an admin UI to request tables for this rather than do all.

I'm thinking this is probably the way to handle it: add a new form where you 'entitize' a table. Here you pick the entity ID field, and confirm you want to define the table as an entity type.

Once the table's entity type has Field API fields defined on it, you can no longer change the entity ID.

joachim’s picture

Status: Needs work » Needs review
FileSize
8.34 KB

Here's a patch with that approach.

(NOTE: patch probably requires #1952884: Fix integration with Field UI to be applied first.)

This is what's in the patch:

- fix for the longstanding TODO to have a UI to explicitly choose to make a table into an entity type
- we no longer assume which field is the entity ID.
- this means there's a change in workflow. You now:
1. enable the table as an entity type and choose the ID field
2. set up the entity form (as before)
3. add Field API fields (as before)
- data_entity_get_id_field() has been removed. The entity ID field is stored in the table's metadata.

You can't change the ID field, or de-entitize a table once you have Field API fields defined in it. That's because it would completely break your field data to change the ID field.

I'll probably commit this patch fairly soon, though obviously if anyone has time to review it that would be great :)

IMPORTANT: If anyone has been running Data with previous patches from this issue, you will need to tweak the settings for your table. This is because your tables will NOT be flagged as entitized, but if you have fields on them you won't be able to use the UI to do so!!

The steps to deal with this are:

1. BEFORE your code is updated to this new patch, visit all your tables and make a note of the field that's being used for the entity ID
2. apply the patch / update to latest dev (once I've committed it!)
3. TEMPORARILY comment out the lines in data_entity_admin_entity_type_form() which disable the form elements:

    $form['is_entity_type']['#disabled'] = TRUE;
    $form['entity_id']['#disabled'] = TRUE;

4. visit the entity type settings form, select the 'entity type' box, and ensure the 'entity ID' is set to the ID you noted earlier
5. repeat for all your tables
6. remove the hack you did in step 3.

joachim’s picture

Status: Needs review » Fixed

Committed. New alpha release shortly!

Issue #1671380 by imclean, joachim, rooby: Fixed data tables without a suitable field to use as an entity ID getting defined as entities; added UI to specify which field to use as ID.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.