Problem/Motivation

Data stored in tablefield can be an efficient means of storing viewing and editing tabular data to enable complex structures on drupal entities. However, accessing the data in SQL can be difficult because it is stored in PHP serialized format. Recent advances in PostgreSQL and MySQL have made enabled tabular data storage in JSON that is queryable. I propose to:

  1. Enable a backend data storage options of JSON
  2. Provide functionality to let views use JSON data extracts (i.e., show a specific row/column in Views) for more rapid rendering and analysis in Views

Proposed resolution

Allow users to specify the backend storage type, serialized or json on a per field basis.

Figure 1: Example table values (used in queries in Figure 2).

Figure 2: SQL queries and results using JSON against table in Figure 1.

-- Get the data in the 2nd row (first data row)
select (field_test_value::json->'tabledata')::json->'row_1' from field_data_field_test;
                                     ?column?
----------------------------------------------------------------------------------
 {"col_0":"for","col_1":"20","col_2":"19","col_3":"18","col_4":"17","weight":"2"}
(1 row)

-- Get the header in the 3rd column
drupal.alpha=# select ((field_test_value::json->'tabledata')::json->'row_0')::json->'col_2' from field_data_field_test;
 ?column?
----------
 "1985"
(1 row)

-- Get the value in the 3rd coluimn of 2nd row
drupal.alpha=# select ((field_test_value::json->'tabledata')::json->'row_1')::json->'col_2' from field_data_field_test;
 ?column?
----------
 "19"
(1 row)

Figure 3: Setting page.

Remaining tasks

  1. (done) Add storage format option to field configuration screens
  2. Replace calls to "serialize" and "unserialize" with "tablefield_serialize" and "tablefield_unserialize", and then handle choice of serialization format in helper functions "tablefield_serialize()" and "tablefield_unserialize()".
  3. Develop code to transform data from serialized to json and back if administrator decides to switch backend data formats for a field.
  4. Add Views code to allow extraction of JSON data at the SQL level

User interface changes

n/a

API changes

n/a

Data model changes

Fields may have data stored as JSON if site builder desires it, but default behavior users serialize() so no automatic changes to data storage are required.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

robertwb created an issue. See original summary.

robertwb’s picture

Issue summary: View changes
FileSize
13.2 KB
9.84 KB
lolandese’s picture

Nice.

I guess that at some point a conditional is needed to detect if the used database type and version supports the JSON Data Type.

  // Get the connection object for the specified database key and target.
  $conn = Database::getConnection();

  switch ($conn->databaseType()) {
  case 'mysql':
    if (version_compare(mysql_get_client_info(), '5.7', '>=')) {
      print 'JSON Data Type available with ';
    }
    print 'mysql  ' . mysql_get_client_info();
    break;

  case 'pgsql':
    $v = pg_version();
    if (version_compare( $v['client'], '9.2', '>=')) {
      print 'JSON Data Type available with ';
    }
    print 'pgsql  ' . $v['client'];
    break;

  default:
    print $conn->databaseType();
    break;
  }
robertwb’s picture

Yeah - most definitely. Attached is a first draft - tested a good bit on the latest release, and tested only briefly against the latest dev. It does not include your db capability check, but I will get that in next week.

lolandese’s picture

Status: Active » Needs work

PHP Fatal error: Maximum function nesting level of '256' reached, aborting! in /sites/all/modules/contrib/tablefield/tablefield.module on line 2439

This seems to happen when using xdebug because of the recursive use of the function tablefield_unserialize().

Furthermore it gives an undefined index error because the field setting initially does not exist. This can probably be avoided in the same function with:

  $storage = isset($field['settings']['backend_storage_format']) ? $field['settings']['backend_storage_format'] : 'serialize';
  switch ($storage) {

Also on https://simplytest.me/project/tablefield/7.x-3.x?patch[]=https://www.dru... we get a 500 Internal Server Error. Unsure if this has the same cause.

robertwb’s picture

Thanks -- will take a look at this asap.

robertwb’s picture

Picking through the nesting error, I am a little stumped (but will dig deeper). For what it's worth, the functions tablefield_serialize() and tablefield_unserialize() do not call themselves recursively, so there may indeed be recursion happening, but not directly from those methods. I DO seem to be encountering that error when attempting to over-ride the tablefield contents in a custom form plugin (somewhat unrelated project) -- continuing to dig.

/**
 * Helper function to unserialize data for storage.
 */
function tablefield_unserialize($field, $data) {
  switch ($field['settings']['backend_storage_format']) {
    case 'json':
    $unserialized = json_decode($data, TRUE);
    break;
    
    case 'serialize':
    default:
    $unserialized = unserialize($data);
    break;
  }
  return $unserialized;
}


/**
 * Helper function to serialize data for storage.
 */
function tablefield_serialize($field, $data) {
  switch ($field['settings']['backend_storage_format']) {
    case 'json':
    $serialized = json_encode($data);
    break;
    
    case 'serialize':
    default:
    $serialized = serialize($data);
    break;
  }
  return $serialized;
}

robertwb’s picture

Updated patch due to changes in code base in latest dev version. Do we have test bot coverage on this? Any idea HOW we might do that, or rather, is there a thought on best ways to enable this and test cases we might consider? If there is not yet, I will add a test coverage issue and link back here.

robertwb’s picture

robertwb’s picture

Removed a dpm() call from patch #08 -- this may be why there was an error in the 03 version? Testing on simplytestme and will report back.

robertwb’s picture

Fixed missing $field info in calls to tablefield_unserialize() (discovered in testing).

robertwb’s picture

Status: Needs work » Needs review
FileSize
9.48 KB

Adapted settings detection from the code that @lolandese provided in #3:

  • This has now been tested successfully on my local postgresql 9.5 and PHP 7.0.22 install.
  • Simplytest.me has mysql 5.5 which is NOT sufficient to enable the JSON - so I cannot verify if mysql works (version detection for postgresql from #3 needed a little tweaking).
  • Code in #11 during rendering json_decode was returning an object and caption testing was failing - this is now fixed, json_decode is instructed to return an associative array.
lolandese’s picture

Cool. I had to change the conditional to:

  case 'mysql':
    if ((function_exists('mysql_get_client_info') && version_compare(mysql_get_client_info(), '5.7', '>=')) || !function_exists('mysql_get_client_info')) {
..}

because the function is dropped in PHP 7 (see http://php.net/manual/en/function.mysql-get-client-info.php). In the case the function does not exists we can be pretty sure we're dealing with PHP 7.

EDIT:
We should find an alternative for mysql_get_client_info in PHP 7 to determine the MySQL version.

lolandese’s picture

Uff!! Looks like it can be as simple as:

/**
 * Provides list of storage format options based on database capability.
 */
function tablefield_field_backend_storage_options() {
  // Only provide PHP serialize if the database does not provide JSON column type.
  $backend_options = array(
    'serialize' => 'Serialized (PHP)',
  );
  // Get the connection object for the specified database key and target.
  $conn = Database::getConnection();
  $v = $conn->version();
  switch ($conn->databaseType()) {
  case 'mysql':
    if (version_compare($v, '5.7', '>=')) {
      $backend_options = array(
        'serialize' => 'Serialized (PHP)',
        'json' => 'JSON',
      );
    }
    break;

  case 'pgsql':
    if (version_compare( $v, '9.2', '>=')) {
      $backend_options = array(
        'serialize' => 'Serialized (PHP)',
        'json' => 'JSON',
      );
    }
    break;
  }
  return $backend_options;
}

That should work for both PHP 5 and 7.

lolandese’s picture

Status: Needs review » Needs work

Furthermore:
Warning: Missing argument 2 for watchdog(), called in /home/martin/www/yellow/sites/all/modules/contrib/tablefield/tablefield.module on line 2512.

See https://api.drupal.org/api/drupal/includes%21bootstrap.inc/function/watc....

lolandese’s picture

Also:
Notice: Undefined index: backend_storage_format in tablefield_unserialize() (line 2489 of /home/martin/www/yellow/sites/all/modules/contrib/tablefield/tablefield.module).

This could be solved by using an update hook in the .install file giving a field settings value to all existing table fields. You can use the existing update hooks as examples.

Alternatively use the code below that assigns a value if no field settings have been set.

/**
 * Helper function to unserialize data for storage.
 */
function tablefield_unserialize($field, $data) {
  $backend_storage_format = isset($field['settings']['backend_storage_format']) ? $field['settings']['backend_storage_format'] : 'serialize';
  switch ($backend_storage_format) {
    case 'json':
    $unserialized = json_decode($data, TRUE);
    break;

    case 'serialize':
    default:
    $unserialized = unserialize($data);
    break;
  }
  return $unserialized;
}
lolandese’s picture

Rolled suggestions of #14 and #16 into a new patch.

robertwb’s picture

Awesome @lolandese! I will get this tested on my alpha system asap.

lollypopgr’s picture

What is the status with this?

lolandese’s picture

@lollypopgr:
Nice to see interest in this awesome new feature built by @robertwb.

Feel free to test. Keep in mind your MySQL version should be 5.7+ or PostgreSQL 9.2+.

If this goes in it will likely be a new 7.x-4.x branch if it would have big implications for an upgrade on existing sites. To analyse that as well.

lolandese’s picture

Issue tags: +LutskGSW18