Now that DB-TNG has landed, we solidly require PHP 5.2 for D7. We also require MySQL 5.0 for Drupal 7.

It seemed like the biggest thing holding mandatory InnoDB back in Drupal 6 was host compatibility, but I'm not aware of any hosts that meet the significantly higher Drupal 7 requirements without providing InnoDB.

Some thoughts:
* Defaulting to InnoDB will improve medium and large site user experience because they won't have to hire a consultant to tell them to switch to InnoDB.
* When I've seen MySQL support tune a system, I haven't noticed them changing any table engines. But, if we default to InnoDB, MySQL support can get large sites much further on the track to scalability.
* The performance detriments to small sites are negligible.
* We can always add a MySQL-specific option to the Schema API to override the InnoDB default and go with MyISAM for things like search tables.
* I tried to make a foreign key API that created triggers to support pseudo-foreign keys on MyISAM tables, but MySQL 5.0 requires the SUPER privilege to manage triggers. Requiring InnoDB is far, far less demanding than requiring the SUPER privilege on the DB server. So, the choice is really InnoDB or nothing approaching foreign keys until we require MySQL 5.1.
* We have transaction support in core now, but it's not going to get utilized by 95%+ of Drupal sites unless *we* make the call to go to InnoDB, at least as a default. Reliable transaction support is key to making tools like permissions and block administration more reliable by ensuring either all changes or no changes are written. We currently have issues with permission and block administration applying partial (corrupt or incomplete) updates that take down a site.
* Database size may even drop if we implement foreign keys because we won't need the nid columns in tables with vid. We would just use vid and a foreign key to node_revisions. This saves us a column in every taxonomy and CCK table, among many others.

Files: 
CommentFileSizeAuthor
#94 innodb_ftw.patch1.05 KBchx
Unable to apply patch innodb_ftw_3.patch View
#90 innodb_ftw.patch915 byteschx
Passed: 10908 passes, 0 fails, 0 exceptions View
#76 innodb_ftw.patch1.01 KBchx
Passed: 10905 passes, 0 fails, 0 exceptions View
#75 innodb_ftw.patch1014 byteschx
Passed: 10905 passes, 0 fails, 0 exceptions View
#70 innodb_ftw.patch982 byteschx
Passed: 10801 passes, 0 fails, 0 exceptions View
#46 innodb_default.patch873 bytesDavid Strauss
Failed: Failed to apply patch. View
#28 mw.patch1.22 KBmoshe weitzman
Failed: Failed to apply patch. View
#16 mw.patch966 bytesmoshe weitzman
Failed: 7199 passes, 4 fails, 0 exceptions View

Comments

Damien Tournoud’s picture

One side node: the upgrade path will have to be planed *very carefully* if we want this to work.

David Strauss’s picture

@Damien Tournoud: For small sites, we can do the upgrade in place. For large sites, we can provide SQL to pipe into MySQL.

David Strauss’s picture

And for *really large* sites, the issue of moving to Drupal 6 or 7 greatly trumps the difficulty of moving to InnoDB.

David Strauss’s picture

Title: Require InnoDB in MySQL » Default to InnoDB in MySQL

Changed to "default" because we may be able to emulate foreign keys in the DB-TNG layer for MyISAM tables.

earnie’s picture

subscribing.

One issue with InnoDB is to be sure to make the index tables separate tables. Else one might end up with one extremely large file that extends beyond many host providers provisions with no easy recourse for the user.

moshe weitzman’s picture

+1. This is the sort of patch that needs no upgrade path. Don't mess with existing sites.

Crell’s picture

I don't know about requiring. Defaulting is debatable. I DO know hosts that offer PHP 5 and PDO and MySQL 5 that do not allow InnoDB, mostly for size reasons. Remember that InnoDB and MyISAM want different server configurations to behave optimally. We need to do some research before we can think about doing that.

That said, I am open to adding a "try InnoDB" flag to schema API as well as a "use InnoDB if possible" key to the database connection info for MySQL. (The new array-based config structure makes database-specific settings like that easy.) Well, both should probably be arbitrary database engine, I guess.

moshe weitzman’s picture

MySQL falls back to MyISAM if the specified Engine is unavailable so we don't need 'try'. 'Use InnoDB if possible' is exactly what we get by specifying InnoDB. There is little downside here and lots to gain, as David articulates in the opening post.

Crell’s picture

There's two issues here.

One is requiring InnoDB for everything so that we can program in FK-based logic. That worries me a great deal without more research.

The other is defaulting to InnoDB for everything. That worries me somewhat, because there are cases where you don't want that and we may end up with shared hosts that get cranky because InnoDB takes up more resources than MyISAM does.

First step is still, IMO, research.

moshe weitzman’s picture

@Crell - the issue is titled 'default to InnoDB'. Lets just limit the scope to that. David mentioned 'require' in passing but lets just agree that thats not under consideration in this issue.

There might be tables where you don't want InnoDB but surely those are fewer than tables where you do. So this is a net win. I'm all for "research", but I'm not sure where to start. Can you elaborate? This is going to be tiny code change, so I really don't see the risk in doing it without extensive "research". We are nowhere near code freeze. The surest way to get testing under different environments is to commit this now rather than near the freeze.

Evidence of successful fallback to MyISAM

earnie’s picture

IMO: All tables in one DB should be the same engine. Issues can abound with transaction ROLLBACK if you have a mix such that the MyISAM tables would still be updated.

Size of InnoDB vs MyISAM: From experience, I decided to convert all tables to InnoDB from MyISAM. I didn't know much about what I was doing. The size of the one index table grew to the point I nearly filled the extended drive. I then converted each table to use its own index file and removed the large index file. I went from 45GB used space to 6GB used space which is roughly the same size I would be with MyISAM.

webchick’s picture

When Drupal.org first moved to InnoDB, the performance was absolutely crushing. IIRC there were many tweaks on the MySQL/PHP/Apache side that needed to be performed in order for the site to perform at anything close to its previous levels. And (again, IIRC), we still have tables (search*, watchdog?) that are not InnoDB for performance reasons.

Most end users care about performance, not fault tolerance, which is the primary reason to go with InnoDB. Most end users do not have the ability to make modifications to php.ini, httpd.conf, and my.cnf.

I'd like to understand more about what happened when Drupal.org first moved to InnoDB, as well as how much of the above is true, rather than just what I remember vaguely.

earnie’s picture

@webchick: Do you know if on for drupal.org the index was left as one big index file or was it set to an index file per table?

hass’s picture

Defaulting to InnoDB is no good idea. The hard coded storage type MyISAM caused some issues in past and every time I reported this to the maintainers - they removed the MyISAM setting in their code. We should only default to the *servers default*. This is normally InnoDB on MySQL 5.x and might become a new storage engine in MySQL 6.x... I think this should be the decision of the Admin.

nnewton’s picture

+1 I'd support this. InnoDB allows for more scalability, performance, data integrity and features (transactions, fk..etc). MyISAM has its place, but it should be avoided when possible.

"When Drupal.org first moved to InnoDB, the performance was absolutely crushing. IIRC there were many tweaks on the MySQL/PHP/Apache side that needed to be performed in order for the site to perform at anything close to its previous levels. And (again, IIRC), we still have tables (search*, watchdog?) that are not InnoDB for performance reasons."

Not really. When drupal.org moved to InnoDB it did so because of the locking on MyISAM. With this implicit locking lifted, the database server was crushed under the increased traffic. In essence, InnoDB did its job too well and it took artificial limits to restrict concurrency until we could upgrade the database cluster. While InnoDB does have higher requirements (especially memory requirements), this wasn't our issue with converting drupal.org. Also, drupal.org is not exactly a normal site. Anyone running a site even close to our traffic levels is either already using InnoDB or has found a really impressive way to sacrifice goats to the database god.

"Size of InnoDB vs MyISAM: From experience, I decided to convert all tables to InnoDB from MyISAM. I didn't know much about what I was doing. The size of the one index table grew to the point I nearly filled the extended drive. I then converted each table to use its own index file and removed the large index file. I went from 45GB used space to 6GB used space which is roughly the same size I would be with MyISAM."

Your talking about the file per table setting. Your confusing things a bit though. In MyISAM, the index is stored in a separate file entirely as you describe here. However, with InnoDB the data is stored in the leaves of the primary key index. The index actually _is_ the table. What your doing with file per table is just storing each table (and its indexes) in their own file instead of the global InnoDB tablespace. This allows you to play some tricks with the file system and means that when you drop a table the file actually gets deleted and you get that space back. With the global tablespace, when you drop a table you don't get that space back (that global file never shrinks).

This setting makes management easier and allows you to get space back when you drop a table as stated above, but it doesn't do anything to reduce the size of InnoDB tables. Your first experience with global tablespace may have been caused by repeated imports expanding the file.

"@webchick: Do you know if on for drupal.org the index was left as one big index file or was it set to an index file per table?"

It uses a global tablespace. We do this so we can allocate a large consecutive block on the disk for the tablespace. This reduces seek time slightly.

moshe weitzman’s picture

Assigned: Unassigned » moshe weitzman
Status: Active » Needs review
FileSize
966 bytes
Failed: 7199 passes, 4 fails, 0 exceptions View

Thanks Narayan. We now have Strauss and NNewton supporting this change. Pretty compelling IMO.

Here is the patch.

webchick’s picture

Thanks Narayan. That's very helpful.

Can we get some testing on some platforms that do and don't support InnoDB (preferably on the crappiest shared hosting possible) please? I want to make sure this change is bulletproof.

dmitrig01’s picture

moshe - I don't think it should be done like this. Not all tables need to be InnoDB, only some tables. There should be a schema api key like 'use myisam' which defaults to false. or schema['table']['database settings']['mysql']['use myisam'] = TRUE.

earnie’s picture

Status: Needs review » Needs work

Yes, the user must have a choice. That choice also should be given during the install process within the UI.

earnie’s picture

Your talking about the file per table setting. Your confusing things a bit though. In MyISAM, the index is stored in a separate file entirely as you describe here. However, with InnoDB the data is stored in the leaves of the primary key index. The index actually _is_ the table. What your doing with file per table is just storing each table (and its indexes) in their own file instead of the global InnoDB tablespace. This allows you to play some tricks with the file system and means that when you drop a table the file actually gets deleted and you get that space back. With the global tablespace, when you drop a table you don't get that space back (that global file never shrinks).

Yes that is what I meant. And With the global tablespace, when you drop a table you don't get that space back (that global file never shrinks) cost me dollars with my host and it was ever expanding. With file per table I gained back the space. I am importing data and creating 10K nodes with taxonomy daily. The question I have about file per table is; can it be set at runtime or is it a configuration that has to happen at mysqld start?

moshe weitzman’s picture

Status: Needs work » Needs review

@earnie - the usability folks will rightfully recoil at adding a UI where admin chooses the engine type for each table at install time. you can make a contrib module for that if you want. please don't change status for this.

dimitri and earnie are wanting more choice for the admin at install time. but the admin has no choice today. this preserves that - we just give a better default. further, the notion that a few tables need to stay MyISAM is pretty weak. When you get to be very large, you are hand tweaking your queries and tables. The default that Drupal installed with is the least of your concerns. And Jeremy debunked the notion that search tables perform hbetter as MyISAM. See his benchmarks on the Assoc or Infra list when we deployed Xapian on drupal.org

I don't think we need to muck with schema api for this, but i am willing to do that if Crell think so (he is DB maintainer).

nnewton’s picture

"The question I have about file per table is; can it be set at runtime or is it a configuration that has to happen at mysqld start?"

Sadly, you need to change this setting and restart MySQL to put it in play. I do like this setting for management though. At Bryght we use file_per_table on most everything, having the global tablespace can be problematic for management.

earnie’s picture

Status: Needs review » Needs work

Sadly, you need to change this setting and restart MySQL to put it in play. I do like this setting for management though. At Bryght we use file_per_table on most everything, having the global tablespace can be problematic for management.

Then this puts a default of InnoDB in jeopardy. The best we can do is "highly suggest" but default to the database configuration.

@moshe: I do not mean to suggest that each table has a choice; it is one or the other not more than one so the UI would only ask once during the install phase. Yes, InnoDB is much better suited for high read traffic. It can be slower for high write traffic based on what I've read at MySql sites. It is also good where you need to support transactions for larger sites but allow smaller sites to use MyISAM anyway.

I want to see InnoDB as the default; I fear that we are going to put too much burden on the small time user whose hosting company isn't very cooperative. Your patch doesn't give an option for anything else so it is a forceful use of InnoDB and that is what I'm against. The global table space issue is an issue I don't want to force on anyone.

webchick’s picture

We do have that "Advanced settings" fieldset we could potentially cram this setting into.

nnewton’s picture

The global tablespace "problem" is not that large an issue and shouldn't block this. It has been the default for a very long time and has worked well for a very long time. I'm unclear on how your DB space ballooned so egregiously, but thats not normal.

Crell’s picture

"Advanced stuff" is a usability faux pas and is already a misnomer. There's nothing "advanced" about selecting a database server, which is in that fieldset.

That said, were we to make this configurable the correct way to do it is not a "use isam" flag but a "default table engine" setting, of which MyISAM and InnoDB are but two options. (MySQL has a half-dozen engines it supports.) I'm not sure off hand if there is a way to ask the server which engines it offers, but if so we could make it dynamic.

While we're at it, it looks like we're still using the version-sensitive charset setting directive. Even Drupal 6 requires 4.1 already, and Drupal 7 will require 5.0. Let's remove the ugly pseudo-comments while we're at it. :-)

RobLoach’s picture

This issue is my favourite issue of the week!

moshe weitzman’s picture

Status: Needs work » Needs review
FileSize
1.22 KB
Failed: Failed to apply patch. View

I think it is an unwise UI choice to ask users to make this decision at install time (even in a collapsed fieldset). Let them have their drupal already. This is a reversible decision.

The attached patch implements a new mysql_engine key like mysql_suffix so that tables can self declare themselves if needed. I also removed the conditional comments for charset per Larry's request.

earnie’s picture

Status: Needs review » Needs work

I like this patch except that you don't give the user a choice except for having to know all tables being installed and declaring the 'mysql_engine' for each table separately. Give the user a choice on the install interface please.

moshe weitzman’s picture

Status: Needs work » Needs review

I addressed this point already. Lets get some more reviews.

catch’s picture

Most users will have absolutely no idea what the difference is so shouldn't get a choice. Having said that, I'd like to see some benchmarks with InnoDB vs. MyISAM on medium sized sites (up to 10k nodes etc.) running on random arse shared hosting / completely untuned MySQL configs.

BioALIEN’s picture

This is brilliant, I am fully supportive of this change. However, I fear we'll piss some people off by this forceful change. I'm with earnie on this one. Default to InnoDB but give the user a chance to have a say during install time and the perfect place for this is "Advanced settings" as webchick pointed out.

An average person would never touch the "Advanced settings" fieldset so you will get your wish with defaulting to InnoDB and all the goodness highlighted here. However, any semi/experienced sys admin will want to be in control of this right from install time.

Here's a use case:
Developer owns a development server, top of the range, it supports InnoDB, has mountains of ram and processing power. The work is being developed for a small client extremely tight budget with very poor server and going with InnoDB would be a resource hog for them. How do you hand over the project - InnoDB or MyISAM?

I don't know about you, but I choose the latter.

jaydub’s picture

subscribe

catch’s picture

Many, many shared hosting providers expect you to provide a port for the MySQL database. Also, usability testing shows that lots of users open every single fieldset on every single form to see what's there. So we can't assume this will be hidden. On the other hand, it's not a choice that can do a great amount of harm (compared to many others when first setting up Drupal), so I'm not opposed to a choice, but we need to fix that form up a bit if we're going to add additional options (and we should rename 'advanced' to 'additional options' or 'optional options' ;)

This should also be looked in the context of maybe having SQLite support and an un-hacked installer - so Drupal runs as soon as you open it on SQLite, and you get to switch to a MySQL database later, if that happens, these arguments are moot.

earnie’s picture

Most users will have absolutely no idea what the difference is so shouldn't get a choice. Having said that, I'd like to see some benchmarks with InnoDB vs. MyISAM on medium sized sites (up to 10k nodes etc.) running on random arse shared hosting / completely untuned MySQL configs.

@catch: Where do you get your data to backup this statement? Good administrators of Drupal will already know and the poor administrators need to learn. If he has an option to choose the poor administrator can query about what it means.

It needs to be an option left to the user as to which engine he wishes to use for his tables.

RobLoach’s picture

I think an option to convert to and from InnoDB/MyISAM would be helpful. Is this something that would be better lived in contrib though? An additional contrib module?

catch’s picture

Contrib module would be pretty easy to do for later conversions. Like I said above, I'm fine with an option, but only if we tidy up that fieldset a little (or refactor the installer completely like chx has suggested) - that could be a followup patch though.

earnie’s picture

Re: convert to and from InnoDB/MyISAM

phpMyAdmin will allow it to happen. Would be a good addition to the dba module, though.

Crell’s picture

What about allowing a table in schema to define its table type, and then an option on the install page (and then a subsequent admin page somewhere) to set the table type where the default value is "let schema define it". 99% of people can just ignore it and the schema definition will do what it wants, and for the edge case where you want to override it you can force an engine site-wide.

I'm still concerned at this point that we're trying to make an over-simplified push button out of something that is fundamentally not a simple push button concept...

moshe weitzman’s picture

@Crell - it is possible that we are oversimplifying. But I am going for the simplest implementation that could possibly work. Lets commit this and gather feedback. If we need a UI, we'll add one. If we need to rollback because of some serious hosting problems, we will. I feel like install UI is really the realm of the install profile.

Crell’s picture

I'm comfortable making engine a table configuration option in Schema API. After that we can figure out a UI later. webchick, are you OK with that approach?

webchick’s picture

I'm much more a fan of "Test this on crappy shared hosting, THEN get feedback, THEN get commit" than the inverse. :P

Implementation-wise, I agree a flag in that custom modules can use hook_schema_alter() to override would be preferable to one hard-coded option every time, regardless of what that hard-coded option is.

moshe weitzman’s picture

Perhaps you guys missed the latest patch. We already have a schema APi flag which can be altered by install profile, another module, etc.

@webchick or anyone else - could you get concrete about what "test on crappy shared hosting really means".

Just to reiterate

  • Hosts which only offer MyISAM will work just like today - their customers get MyISAM
  • Reverting this is a one line change. Well before code freeze is exactly the time when you *want* to commit slightly experimental patches like this.
Crell’s picture

Anonymous’s picture

Status: Needs review » Needs work

The last submitted patch failed testing.

David Strauss’s picture

FileSize
873 bytes
Failed: Failed to apply patch. View

I've re-rolled Moshe's patch, but I haven't tested it.

David Strauss’s picture

Status: Needs work » Needs review
RobLoach’s picture

Is there anything we can do here for Postgres and SQLite?

David Strauss’s picture

@Rob Nope. MySQL is the only one that allows pluggable engines.

Crell’s picture

Component: database system » mysql database

No, this probably should be refiled over to MySQL specifically as it's not really relevant to other databases.

Status: Needs review » Needs work

The last submitted patch failed testing.

David Strauss’s picture

The test failure here is suspicious. It looks like a flaw in the test.

Josh Waihi’s picture

@David Strauss; you're right! see #355225: Inconsistant Insert Queries Between Database Drivers

its a test written specifically for MyISAM.

So..... Can we make innodb default?

chx’s picture

I deleted two posts here because their respected authors forgot to read that "Hosts which only offer MyISAM will work just like today - their customers get MyISAM" because "MySQL falls back to MyISAM if the specified Engine is unavailable".

chx’s picture

Status: Needs work » Needs review

Re-test (again).

Josh Waihi’s picture

some how I don't think this has re-tested. getting this thing going again

chx’s picture

Status: Needs review » Reviewed & tested by the community

Applies with slight offset. BTW. InnoDB will be needed for the job queue module :)

Crell’s picture

If InnoDB is required for job_queue, then you've just cut out a huge number of installs.

Josh Waihi’s picture

@Crell do we really want to encourage non-relational behavior in our logic though? I think this is still a step in the right direction and encourages people to program relationally even it it means having to put the work in to get installs uptodate

chx’s picture

No, we dont require InnoDB for jobqueue, i coded around it :)

Status: Reviewed & tested by the community » Needs work

The last submitted patch failed testing.

Crell’s picture

@Josh: InnoDB vs. MyISAM has nothing to do with "encouraging non-relational behavior". You can write SQL that is relationally braindead regardless of your table type or DB vendor. My point was that any module that relies on behavior of InnoDB that is missing in MyISAM will not function for a potentially large number of users.

earnie’s picture

But while transactional row locking isn't present a MyISAM database would still function without it to the point of potential inconsistency with the data. Only if relational cascading is made a dependency (and I'm not saying it should not be) would MyISAM be at a loss. If Drupal decides that MyISAM is important to support then relational cascading can never be used.

I use InnoDB on my sites since D5. The select queries are faster. Users do not post to my sites but I feed the sites with external data. So while the data load is slower, the data viewing is faster which is where I want the performance.

Josh Waihi’s picture

@Crell, if this patch goes through, won't that radically alter how many people use MyISAM?

Crell’s picture

It won't radically alter how many SERVERS have InnoDB installed in the first place. There's plenty of them in the shared hosting market. There are also those that have both installed but are optimized for MyISAM, not InnoDB. On those, we may just make things worse since the way to optimize a DB server for MyISAM is quite different than for InnoDB. That's my worry.

David Strauss’s picture

In the problem configuration for InnoDB as the default, all of the following must be true:
(1) The user must be installing against MySQL.
(2) MySQL must have InnoDB support enabled.
(3) MyISAM must be tuned.
(4) InnoDB must not be tuned.

MySQL, by default, does not match against #3. So, we are talking about something that a host has to explicitly do. Moreover, we have no idea how often hosts match all these criteria.

On the other hand, we have concrete benefits from using InnoDB in terms of data integrity and (with some introspection in the driver during installation) transaction support on by default.

Given that we would have to do some introspection or testing to detect transaction support, I think it would be reasonable to do a check if InnoDB is "tuned." By "tuned," I mean has more memory allocated to the InnoDB buffer pool than by default. It's a flag to us that the host cared at some level about InnoDB performance.

We really need InnoDB in broad deployment for transactions to matter.

earnie’s picture

The output of ``mysqld -v --help'' would give the required information to make a determination but I suppose that not all users may be able to execute mysqld.

chx’s picture

Status: Needs work » Needs review
FileSize
982 bytes
Passed: 10801 passes, 0 fails, 0 exceptions View

We do not check whether a code cache is installed or it's optimized. We do not check whether the Apache configuration is sane and performant. We do not check whether the kernel is optimalized for the hardware. These are not the responsibilities of a PHP application. Neither is InnoDB configuration.

Josh Waihi’s picture

Status: Needs review » Reviewed & tested by the community

#70++

Josh Waihi’s picture

@chx should this:

    $sql .= $table['mysql_engine'] . $table['mysql_suffix'];

be this:

    $sql .= $table['mysql_engine'] . ' ' . $table['mysql_suffix'];
catch’s picture

Status: Reviewed & tested by the community » Needs review

DEFAULT CHARACTER SET UTF8
Is this cruft now? Any circumstances where we might need it?

David Strauss’s picture

Status: Needs review » Needs work

@catch Yes, you're correct.

chx’s picture

Status: Needs work » Needs review
FileSize
1014 bytes
Passed: 10905 passes, 0 fails, 0 exceptions View

That was simple enough to fix.

chx’s picture

FileSize
1.01 KB
Passed: 10905 passes, 0 fails, 0 exceptions View

And a much better version thanks to David.

David Strauss’s picture

Status: Needs review » Reviewed & tested by the community

#76 gets my stamp of approval. None of the other table options are necessary or relevant to Schema API. Things like row formats should be tuned only by experienced DBAs on highly custom setups.

I'd even argue that we could drop the character set specification and hard-code UTF8 because our MySQL connection always uses UTF8, but that's for another issue.

Damien Tournoud’s picture

Wouldn't it make sense to quote mysql_engine and mysql_character_set? Just concatenating their values worries me a little.

webchick’s picture

Status: Reviewed & tested by the community » Needs review

It concerns me that everyone who's heavily advocating for for this patch are people who work almost exclusively on high-performance sites, and have the requisite knowledge to tune various configuration files to make MySQL hum. Your average user of Drupal does not fall into either of these brackets, and in fact most have absolutely no control over their configuration files.

So in order to commit this, something I really need to see are before/after benchmarks on at least 2-3 of your typical "bargain bin" hosting companies: DreamHost, GoDaddy, etc. I don't really know the best way to do these benchmarks, though, since we're going to want to capture both write and read speed, as well as how many connections can be handled. I know (or think I remember anyway) from the d.o upgrade that InnoDB is going to be slower, but will remain consistent even in high loads. I want to find out how much slower, because Drupal 7 is already much slower than Drupal 6 in its current state.

Something I don't oppose at all is adding the option during installation to choose InnoDB as the default table type, for high performance sites. It's defaulting it to InnoDB for all users (even if there's an option to switch to MyISAM; since no one will touch the defaults without knowing what they're for) that's concerning to me, without some data to back up that this will not have a crushing performance effect on your typical shared host user.

Suggestions on how best to do said benchmarks? ab? Testing how long it takes Devel Generate module to create 5,000 nodes?

webchick’s picture

Status: Needs review » Needs work

Marking needs work for DamZ's comments, as well as to get it out of the 'needs review' queue for this weekend. Unless someone can get those benchmarks sometime today, which would be awesome! :D

moshe weitzman’s picture

Really? This borders on abuse of your development team. You have a +1 from Narayan, Strauss, Weitzman, Damien, Crell. And you say that we know too much and that worries you? For fucks sake. This decision is beyond conservative. Conservative is OK, but this is simply fear of making a mistake. Mistakes are OK. We aren't building an airplane where lives are at stake. If we find that we slow down most sites, we revert.

The developers here think this is a good change for the majority of sites. And we are not trying to improve our own sites - we know how to convert tables. We are trying to improve the same cheap web hosting sites you mention.

Please try to balance the fear of mistake versus the fear that we stagnate. And I don't think the average site admin should have to choose storage engine during install. And anyway then we have the same debate about a default. Defaulting to MyISAM is hardly better than what we have today.

webchick’s picture

There's no need to get personal and start throwing eff-bombs. All I'm merely asking for is for this patch to be tested on shared hosting environments. I've asked for this since September 3, 2008, and have yet to see anyone post any results.

chx’s picture

Status: Needs work » Closed (won't fix)

Let's write another issue which allows per DB config screens -- we need that for SQLite already -- and let's add an engine selectbox there. We are not defaulting to InnoDB. Would have been better to hear this a couple dozen replies before, though.

chx’s picture

To correct myself, webchick did ask for benchmarks. Due to the complicated nature of these tests we did not do them. My problem is this: let's suppose we do the tests on, say, DreamHost. Now, the change will be within the error margin. Is that enough? Or do we need to test N sites for what N? Or which specific sites? And what does that tell us when between now and Drupal 7 release they can change their my.cnf and totally kill innodb performance.

chx’s picture

1.42 requests/sec on both on hostignition. MyISAM:

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:      103  106   1.5    105     109
Processing:   569  600  39.6    593     789
Waiting:      356  389  40.0    381     582
Total:        674  706  39.7    698     894

InnoDB:

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:      103  105   1.6    105     110
Processing:   575  601  24.9    597     701
Waiting:      365  389  24.7    384     490
Total:        680  706  25.3    702     806
Damien Tournoud’s picture

Status: Closed (won't fix) » Reviewed & tested by the community

Very similar too between InnoDB and MySQL on OVH, a French hosting company (testing /node with 2 simple nodes):

MyISAM: Requests per second:    2.24 [#/sec] (mean)
InnoDB: Requests per second:    2.11 [#/sec] (mean)

I see no reason not to commit this.

webchick’s picture

Status: Reviewed & tested by the community » Needs work

Additionally, Dries did benchmarks and came up with the following:

[11:30]  <Dries__> Very similar too between InnoDB and MySQL on OVH, a French hosting company:
[11:30]  <Dries__> MyISAM: Requests per second:    2.24 [#/sec] (mean)
[11:30]  <Dries__> InnoDB: Requests per second:    2.11 [#/sec] (mean)

Edit: Nevermind. I'm an eeediot. :D He just copy/pasted DamZ's.

While 6-9% is not exactly negligible, we are testing shared hosts here, and so the variability is all over the map. The main thing is it's not orders of magnitude slower, which is a good and happy thing.

We still need to document this... somewhere. Hrm. I guess INSTALL.mysql.txt. A CHANGELOG.txt entry would be good too.

Let's get those done real quick, and then I'll commit this.

webchick’s picture

Heh. DreamHost, w/ D7 and front page with 10 nodes.

MyISAM: 0.76 [#/sec] (mean)

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:      102  145  88.6    122     647
Processing:   746 1171 366.8   1080    2903
Waiting:      488  754 300.3    729    2655
Total:        850 1315 437.8   1202    3550

InnoDB: 2.74 [#/sec] (mean)

              min  mean[+/-sd] median   max
Connect:      102  121  23.9    112     217
Processing:   187  244  39.9    244     382
Waiting:      181  239  40.2    238     381
Total:        289  365  52.9    353     547

InnoDB wins! :P In other news, wtf @ 0.76 req/sec?!

stewsnooze’s picture

Tested at 1and1 uk

InnoDB patch

Concurrency Level: 1
Time taken for tests: 23.429 seconds
Complete requests: 50
Failed requests: 0
Write errors: 0
Total transferred: 740600 bytes
HTML transferred: 713900 bytes
Requests per second: 2.13 [#/sec] (mean)
Time per request: 468.577 [ms] (mean)
Time per request: 468.577 [ms] (mean, across all concurrent requests)
Transfer rate: 30.87 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 32 34 1.0 34 37
Processing: 408 435 9.8 435 451
Waiting: 307 313 4.9 312 326
Total: 441 468 10.0 469 486

Plain old head

Concurrency Level: 1
Time taken for tests: 23.662 seconds
Complete requests: 50
Failed requests: 0
Write errors: 0
Total transferred: 688550 bytes
HTML transferred: 661850 bytes
Requests per second: 2.11 [#/sec] (mean)
Time per request: 473.235 [ms] (mean)
Time per request: 473.235 [ms] (mean, across all concurrent requests)
Transfer rate: 28.42 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 33 34 1.1 34 37
Processing: 405 439 30.8 435 604
Waiting: 307 319 25.6 314 486
Total: 440 473 30.8 468 638

chx’s picture

Status: Needs work » Needs review
FileSize
915 bytes
Passed: 10908 passes, 0 fails, 0 exceptions View

First cut at docs.

chx’s picture

The code did not change that's still in #76 so we do not need to wait for the bot.

webchick’s picture

Status: Needs review » Needs work

That doesn't really cover the reasoning behind the decision and some of the rich discussion that was had in this issue. Also, it has typos. ;)

How about something like:

CHANGELOG.txt:
- Default to InnoDB engine on MySQL when available, for increased scalability and data integrity.

Bottom of INSTALL.mysql.txt:
Note that if the InnoDB storage engine is available, it will be used for all database tables. InnoDB provides features over MyISAM such as transaction support, blah dee blah dee blah. For more information, see [link to MySQL docs on storage engines].

Dries’s picture

The proposed changelog entry looks good -- maybe "to InnoDB instead of MyISAM ..." (i.e. explicitly mention MyISAM)?

chx’s picture

Status: Needs work » Needs review
FileSize
1.05 KB
Unable to apply patch innodb_ftw_3.patch View
webchick’s picture

Status: Needs review » Fixed

Great. I got some final touch-ups from David Strauss in IRC who otherwise gave his approval, added those and committed to HEAD.

Thanks, all! :D Way to rally!!

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Vacilando’s picture

Subscribing.

mcurry’s picture

subscribing