Added multilingual support to the standard entity schema

To enable full multilingual support for entities, a new schema standard has been defined and introduced. The new schema allows entity properties to have a different value for each language by mimicking Field API's SQL storage:

The {entity} base table holds the basic entity keys and metadata only:
```
| entity_id* | uuid | revision_id | bundle_name |
```
- Primary key: entity_id.

The {entity_revision} table holds the basic revision entity keys:
```
| entity_id | revision_id* | langcode |
```
- Primary key: revision_id.
- langcode: Stores, per revision, the language the entity was created in.

The new {entity_field_data} table stores the entity property data per language:
```
| entity_id* | revision_id | langcode* | default_langcode | label |
```
- Primary key: entity_id + langcode.
- langcode: Stores the language of the property values.
- default_langcode: Boolean flag that indicates whether the row holds the values for the original language of the entity.

The new {entity_field_revision} table holds the revisions of the property data:
```
| entity_id | revision_id* | langcode* | default_langcode | label |
```
- Primary key: revision_id + langcode.
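
To make the per-language layout concrete, here is a minimal sketch in plain PHP that models a few {entity_field_data} rows as arrays and looks up a label by language. The sample rows and the `sketch_get_label()` helper are hypothetical and exist only for illustration; this is not how the storage controller is implemented.

```php
<?php
// Hypothetical in-memory copy of {entity_field_data} rows for one entity.
// Column names follow the table above; the values are made up.
$field_data = [
  ['entity_id' => 1, 'revision_id' => 3, 'langcode' => 'en', 'default_langcode' => 1, 'label' => 'Hello'],
  ['entity_id' => 1, 'revision_id' => 3, 'langcode' => 'it', 'default_langcode' => 0, 'label' => 'Ciao'],
];

/**
 * Returns the label for one (entity_id, langcode) pair.
 *
 * Hypothetical helper: when $langcode is NULL the row flagged with
 * default_langcode is used; NULL is returned if no row matches.
 */
function sketch_get_label(array $rows, $entity_id, $langcode = NULL) {
  foreach ($rows as $row) {
    if ($row['entity_id'] !== $entity_id) {
      continue;
    }
    // No language requested: pick the original-language row.
    if ($langcode === NULL && $row['default_langcode']) {
      return $row['label'];
    }
    // Specific language requested: entity_id + langcode identifies the row.
    if ($langcode !== NULL && $row['langcode'] === $langcode) {
      return $row['label'];
    }
  }
  return NULL;
}
```

In this sketch, `sketch_get_label($field_data, 1)` reads the original-language row and `sketch_get_label($field_data, 1, 'it')` reads the Italian one, mirroring the entity_id + langcode primary key.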

By default, entity load conditions are applied to the default-language data. Conditions can also be applied to any language or to one specific language.

This is an optimized form of storage for entity properties; it may be kept or dropped once the Property and Field APIs are unified. At that point both fields and properties will be storage-independent, and it will be possible either to switch to a fully normalized schema or to keep this approach, without changing code that uses the Entity Property API. See #1346204: [meta] Drupal 8 Entity API improvements and #1346214: [meta] Unified Entity Field API for more details.

Querying

The end goal of the new standard is to let developers write storage-independent code. For this reason, queries should always be performed through the new Entity Query API, which natively supports the schema defined above and knows how to deal with the various tables involved.

Currently the Entity Query API makes no assumptions about which language conditions should be applied to a query. If the result needs to be language-aware, explicit language conditions have to be set. To apply a condition to a field value in the original language of the entity, whichever language that is, use the default_langcode (meta)field.

<?php
// Retrieves all nodes that have at least one published translation promoted
// to the front page.
$result = \Drupal::entityQuery('node')
  ->condition('promote', 1)
  ->condition('status', 1)
  ->execute();

// Retrieves all nodes whose English translation is published and promoted to
// the front page.
$result = \Drupal::entityQuery('node')
  ->condition('promote', 1)
  ->condition('status', 1)
  ->condition('langcode', 'en')
  ->execute();

// Retrieves all nodes that are published and promoted to the front page in
// their original language.
$result = \Drupal::entityQuery('node')
  ->condition('promote', 1)
  ->condition('status', 1)
  ->condition('default_langcode', 1)
  ->execute();
?>
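
The difference between the three queries can be illustrated with a small plain-PHP model of per-language rows. This is only a sketch: the sample rows and the `sketch_matching_ids()` helper are hypothetical, and it assumes that all conditions of a query are checked against the same translation row. It does not reproduce how the Entity Query API builds its SQL.

```php
<?php
// Hypothetical per-language rows for three nodes (values are made up).
$rows = [
  // Node 1: original language English, promoted in English and French.
  ['entity_id' => 1, 'langcode' => 'en', 'default_langcode' => 1, 'promote' => 1, 'status' => 1],
  ['entity_id' => 1, 'langcode' => 'fr', 'default_langcode' => 0, 'promote' => 1, 'status' => 1],
  // Node 2: original language French (not promoted), English translation promoted.
  ['entity_id' => 2, 'langcode' => 'fr', 'default_langcode' => 1, 'promote' => 0, 'status' => 1],
  ['entity_id' => 2, 'langcode' => 'en', 'default_langcode' => 0, 'promote' => 1, 'status' => 1],
  // Node 3: French only, promoted.
  ['entity_id' => 3, 'langcode' => 'fr', 'default_langcode' => 1, 'promote' => 1, 'status' => 1],
];

/**
 * Returns the IDs of entities with at least one row matching all conditions.
 *
 * Hypothetical helper: $conditions maps column names to required values.
 */
function sketch_matching_ids(array $rows, array $conditions) {
  $ids = [];
  foreach ($rows as $row) {
    foreach ($conditions as $column => $value) {
      if ($row[$column] != $value) {
        // This translation row fails a condition; try the next row.
        continue 2;
      }
    }
    // Key by ID so each entity is reported once.
    $ids[$row['entity_id']] = $row['entity_id'];
  }
  return array_values($ids);
}

// No language condition: any translation may match.
$any = sketch_matching_ids($rows, ['promote' => 1, 'status' => 1]);
// Only English rows may match.
$english = sketch_matching_ids($rows, ['promote' => 1, 'status' => 1, 'langcode' => 'en']);
// Only original-language rows may match.
$original = sketch_matching_ids($rows, ['promote' => 1, 'status' => 1, 'default_langcode' => 1]);
```

With this sample data, node 2 matches the first two condition sets but not the third, because it is promoted only in a translation and not in its original language.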

API Changes

At the data structure level, the main change is that the only reliable way to access entity properties is now through the Entity API accessors.

Before:

<?php
$label = $entity->label;
?>

After:

<?php
// Retrieve the default language label.
$label = $entity->label->value;
// Retrieve the English label.
$label = $entity->getTranslation('en')->label->value;
?>
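
Code that cannot be sure a translation exists may want to check before asking for it. The following is a minimal sketch, assuming the entity object provides `hasTranslation()` alongside `getTranslation()` as content entities do in Drupal 8; the `sketch_translated_label()` helper name and the fallback to the default language are illustrative choices, not part of this change.

```php
<?php
/**
 * Returns the label in $langcode, or the default language label otherwise.
 *
 * Hypothetical helper for illustration only.
 *
 * @param object $entity
 *   Assumed to be a translatable content entity exposing a 'label' property.
 * @param string $langcode
 *   The language code of the wanted translation, e.g. 'en'.
 */
function sketch_translated_label($entity, $langcode) {
  // Only request the translation if it exists.
  if ($entity->hasTranslation($langcode)) {
    return $entity->getTranslation($langcode)->label->value;
  }
  // Fall back to the default language value (a choice made by this sketch).
  return $entity->label->value;
}
```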

Impacts:

Module developers

Updates Done (doc team, etc.)

Online documentation:

Not done

Theming guide:

Not done

Module developer documentation:

Not done

Examples project:

Not done

Coder Review:

Not done

Coder Upgrade:

Not done

Other:

Other updates done

Details:

Updated API change code sample.

Comments

Two tables for revisions

Anonymous (not verified) commented 27 December 2013 at 20:57

Why do we need two tables for revisions? I never understood that approach.
We have the "active revision" data table and the "all other revisions" table.
We could easily have just one revision data table, join it with the entity table on entity id AND revision id, and end up with exactly the same results. That's why we store the active revision id in the entity table in the first place.

Why so complicated??