In Drupal 4.7 blobs were only for mysql (but Drupal handled them like 'text' time - at least it used wildcard like LIKE '%%%s%%' etc), not PostgreSQL. But when column types were changed to bytea it had broken Drupal 5.0 because there are no calls to db_encode_blob() / db_decode_blob() functions present.

Currently I'm receiving such errors after importing some language:

    * warning: pg_query(): Query failed: ERROR: index row size 3024 exceeds btree maximum, 2713 HINT: Values larger than 1/3 of a buffer page cannot be indexed. Consider a function index of an MD5 hash of the value, or use full text indexing. in d:\highspeed\htdocs\drupal\drupal-cvs\includes\database.pgsql.inc on line 125.
    * user warning: query: _locale_import_one_string INSERT INTO locales_source (location, source) VALUES (' modules/aggregator.module:15', ' <p>Thousands of sites (particularly news sites and weblogs) publish their latest headlines and/or stories in a machine-readable format so that other sites can easily link to them. This content is usually in the form of an <a href="http://blogs.law.harvard.edu/tech/rss">RSS</a> feed (which is an XML-based syndication standard).</p> <p>You can read aggregated content from many sites using RSS feed readers, such as <a href="http://www.disobey.com/amphetadesk/">Amphetadesk</a>.</p> <p>Drupal provides the means to aggregate feeds from many sites and display these aggregated feeds to your site''s visitors. To do this, enable the aggregator module in site administration and then go to the aggregator configuration page, where you can subscribe to feeds and set up other options.< in d:\highspeed\htdocs\drupal\drupal-cvs\includes\database.pgsql.inc on line 144.

Please also pay attention to the first warning: you can use source(30) when creating index in mysql, but not in postgre. Currently I have removed it from PostgreSQL in patch like it was before in Drupal 4.7.

[note: the error seems clear even from this little and the report was unreadable from repeating the same for pages. chx]

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

sammys’s picture

Is there a reason we're moving towards binary data for the source? If so, we have to modify locale to work properly with binary data by using %b without single quotes in the INSERT/UPDATE queries and adding a db_decode_blob() on SELECT. Comments?

doq’s picture

BYTEA can contain binary data - means there is no problem storing backslash & \0:
A byte whose value is zero (not the character 0, but the null byte) must be escaped, and the backslash character must be escaped.

So why do we need to use it?

Also

Supported Operators
PostgreSQL offers a single BYTEA operator: concatenation. You can append one BYTEA value to another BYTEA value using the concatenation (||) operator.

Note that you can't compare two BYTEA values, even for equality/inequality. You can, of course, convert a BYTEA value into another value using the CAST() operator, and that opens up other operators.

So it looks like impossible to use LIKE '%%%s%%' .

-------------------------
And what about indeces?

doq’s picture

How do we make the proper system_update_* for this?
As I see we only need to make update for 5.0 beta1. Previous 4.7 appears to work correctly.

drumm’s picture

Title: locale_* tables use blobs? Why? » locale_* tables need %b

Instead of changing the db schema this late in the release cycle is generally not a good idea. Have you considered using %b to properly encode the blobs.

doq’s picture

there are HUNDREADS OF PLACES where %s must be changed to %b and where we need to add db_decode_blob().

As I have said before: blobs can't be compared even for equality, so how do we convert LIKE '%%%s' etc. to run properly on PostgreSQL.

drumm’s picture

I count at most 33 places:

> grep -r {locales_target} . | grep translation | grep '%s' | wc -l
19
> grep -r {locales_source} . | grep source | grep '%s' | wc -l
14

And the change is simply changing '%s' to %b.

As far as I can tell, none of these use LIKE:

> grep -r {locales_source} . | grep source | grep LIKE | wc -l
0
> grep -r {locales_target} . | grep translation | grep LIKE | wc -l
0

However, if blob/btyea are still not suitable, then there must be a locale_update_N() added to locale.install to make this table change for existing sites.

chx’s picture

Please enlighten me -- would a character varying be a solution? I know it's late in the release cycle, but... maybe this is the simpler solution and it affect pgsql only.

drumm’s picture

Character varying?

Steven’s picture

We use blobs to enforce case sensitivity on mysql.

chx’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
736 bytes

Following sammys' advice

[06:36] [sammys] chx: so as I see it, we revert the type to text as it was for 4.7
[06:37] [sammys] chx: for pgsql only

I rolled a patch. He will provide some upgrade queries for the beta testers who want to keep their databases but an update will not be provided.

sammys’s picture

Status: Reviewed & tested by the community » Needs work

as chx has already mentioned we're reverting to the 4.7 implementation. Don't commit yet. i'm adding some comments to the schema.

sammys’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
2.23 KB

Here's the patch with comments in the schema so we don't have this problem again.

drumm’s picture

Status: Reviewed & tested by the community » Fixed

Committed to HEAD with stylistic improvements to the comments.

sammys’s picture

Assigned: Unassigned » sammys

Here are the pgsql queries to manually update any alpha/beta 5.0 installs to the new schema.

ALTER TABLE locales_source RENAME source TO source_old;
ALTER TABLE locales_source ADD source text;
UPDATE locales_source SET source = encode(source_old, 'escape');
ALTER TABLE locales_source ALTER source SET NOT NULL;
-- **** NOTE: only do this step if you're quite sure the values match ****
ALTER TABLE locales_source DROP source_old;
sammys’s picture

Might as well give you a pasteable one for both locales_source and locales_target tables.

-- converting locales_source table
ALTER TABLE locales_source RENAME source TO source_old;
ALTER TABLE locales_source ADD source text;
UPDATE locales_source SET source = encode(source_old, 'escape');
ALTER TABLE locales_source ALTER source SET NOT NULL;
--
-- converting locales_target
ALTER TABLE locales_target RENAME translation TO translation_old;
ALTER TABLE locales_target ADD translation text;
UPDATE locales_target SET translation = encode(translation_old, 'escape');
ALTER TABLE locales_target ALTER translation SET NOT NULL;
--
-- **** NOTE: only do this step if you're quite sure the values match ****
--
ALTER TABLE locales_source DROP source_old;
ALTER TABLE locales_target DROP translation_old;
doq’s picture

do we need to implement this as system_update*()?

sammys’s picture

We decided against it as the only people needing this are those that are using the beta release.

Anonymous’s picture

Status: Fixed » Closed (fixed)
Anonymous’s picture

Issue summary: View changes
FileSize
1.85 KB