Introduction
The ETL (Extract, Load and Transform) module gives the ability to export and denormalize any kind of content. Basically you are able to select content types or custom entities to be exported into a destination database. It also gives you the possibility to alter the data before persist it.
* For a full description of the module, visit the project page:
https://www.drupal.org/sandbox/gabrielmachadosantos/2728473
Requirements
This module requires the following modules:
Automated review results
http://pareview.sh/pareview/httpgitdrupalorgsandboxgabrielmachadosantos2...
False positives:
-
FILE: /var/www/drupal-7-pareview/pareview_temp/inc/etl.admin.inc -------------------------------------------------------------------------- FOUND 0 ERRORS AND 1 WARNING AFFECTING 1 LINE -------------------------------------------------------------------------- 209 | WARNING | Do not use the raw $form_state['input'], use | | $form_state['values'] instead where possible --------------------------------------------------------------------------
Git clone
git clone --branch 7.x-1.x http://git.drupal.org/sandbox/gabrielmachadosantos/2728473.git etl
Comments
Comment #2
PA robot CreditAttribution: PA robot commentedFixed the git clone URL in the issue summary for non-maintainer users.
We are currently quite busy with all the project applications and we prefer projects with a review bonus. Please help reviewing and put yourself on the high priority list, then we will take a look at your project right away :-)
Also, you should get your friends, colleagues or other community members involved to review this application. Let them go through the review checklist and post a comment that sets this issue to "needs work" (they found some problems with the project) or "reviewed & tested by the community" (they found no major flaws).
I'm a robot and this is an automated message from Project Applications Scraper.
Comment #3
gabrielmachadosantos CreditAttribution: gabrielmachadosantos at CI&T for Pfizer, Inc. commentedComment #4
David Fiaty CreditAttribution: David Fiaty commentedHello Gabriel,
Nice module. These are the comments I would make:
1. The etl.install file is empty. Have you planned to add some code there?
If not, it should maybe be removed from the package.
2. In my opinion, there are not enough inline comments in some parts of the code of the module.
For example in etl.admin.inc: etl_entity_settings_form() and etl_get_mapping_config_rows() need a bit of logic commenting.
3. In et.admin.inc, the 2 first functions don't follow the function comments convention.
"Implements hook_[hook_name]" should maybe be added to the comment section of these functions.
It looks like you've done it for the other core hooks.
4. In etl.module line line 37, you are loading entities using the function etl_load_entities().
Since etl_load_entities() returns a value, your code will run faster if you store the result in a variable
before running the foreach loop on it.
As you're loading entities and some of them could be big, this:
Should maybe be changed to this:
Thank you.
Comment #5
gabrielmachadosantos CreditAttribution: gabrielmachadosantos at CI&T for Pfizer, Inc. commentedHi David, thanks for the review.
Please follow the considerations below:
1. Removed the install file.
2. I believe the logic and the variable names are pretty straight forward since it is a simple manipulation values.
3. It does follow the comment function convention, those aren't hooks, both of them are callbacks forms. https://www.drupal.org/coding-standards/docs#forms
4. I've updated it but only for easy reading because it won't affect at the code performance. http://stackoverflow.com/questions/11074141/using-function-call-in-forea...
Thanks again.
Comment #6
kapil.ropalekar CreditAttribution: kapil.ropalekar as a volunteer and at Cybage Software Pvt Ltd. for Cybage Software Pvt Ltd. commentedHi gabrielmachadosantos,
My findings for your modue as follows:
etl.admin.inc
severity: criticalLine 112: table names should be enclosed in {curly_brackets}
$destination_id = $destination_database->query("SELECT column_name FROM information_schema.columns WHERE table_name = '{$destination_table}' ORDER BY ordinal_position LIMIT 1")->fetchField();
Thanks !
Comment #7
klausiHm, that looks like SQL injection since $destination_table is directly concatenated into the SQL string. Always use placeholders for that, see https://www.drupal.org/writing-secure-code
Comment #8
yogeshmpawarPareview.sh is giving following errors.
http://pareview.sh/pareview/httpgitdrupalorgsandboxgabrielmachadosantos2...
Will do a manual review in some time.
Comment #9
gabrielmachadosantos CreditAttribution: gabrielmachadosantos at CI&T for Pfizer, Inc. commentedThanks kapil.ropalekar and klausi for the review. I've already updated to follow the security standards. Can you please review it?
Yogesh Pawar, there is only one error and it is a false positive as I described in the issue. Thanks.
Comment #10
Evgeny_Yudkin CreditAttribution: Evgeny_Yudkin as a volunteer and at DrupalJedi commentedHello,
I`ve reviewed your module, there are some issues:
Also you may add a separate page to import all entities, filtering by bundle/id If no views installed.
Kind Regards.
Comment #11
Evgeny_Yudkin CreditAttribution: Evgeny_Yudkin as a volunteer and at DrupalJedi commentedComment #12
gabrielmachadosantos CreditAttribution: gabrielmachadosantos at CI&T for Pfizer, Inc. commentedHi there,
Please consider the points below:
1. The label is "Package size" inside the "Batch" section and it is entities, it is the number of entities to be exported inside a package. I agree that it can have a field description, please feel free the provide a patch with a better description and I will be glad to move forward.
2. I think there might be a misunderstanding on the module's purpose. This module was not created to move content to other environments, there is already a module to do this (deploy), it gives you the ability to copy your Drupal content to other databases to be consumed mostly for BI reports. Database information is not set on hook, it is a variable and it can be set in different places. About database connection, I can't use multiple database connection because I need to perform both operations simultaneously. Entities are set in hooks, which gives the ability of any module set up an entity to be exported or synced.
3. As I explained above, this is not a Drupal -> Drupal sync module, as noted on the module's description, quoted below:
"The ETL (Extract, Transform and Load) module gives the ability to export and denormalize any kind of content. Basically you are able to select content types or custom entities to be exported into a destination database. It also gives you the possibility to alter the data before persist it."
4. It does support entityreferences and any other field type or custom entity and if doesn't you can easily use hook_etl_entity_load_data_alter or hook_etl_get_field_parser_value_alter to make it support.
5. I was not able to find the unnecessary variables on etl_load_entities(), can you please point the line? http://cgit.drupalcode.org/sandbox-gabrielmachadosantos-2728473/tree/etl...
6. The etl_load_entities() doesn't have any params but I will add the params info for etl_queue_export() in the next iteration since it is not impacting on the module's functionality.
7. "ETL" is a very known term used specially on BI area, it stands for "Entity Transform and Load" which means it is not only an Entity Transfer as you said, it does much more than simply move entity content, this information will denormalized and it can be changed during the transition if the user demands.
Comment #13
aryashreep CreditAttribution: aryashreep as a volunteer commentedPareview.sh is giving following errors.
Review of the 7.x-1.x branch (commit d2fcd84):
This automated report was generated with PAReview.sh, your friendly project application review script. You can also use the online version to check your project. You have to get a review bonus to get a review from me.
Source: http://pareview.sh/ - PAReview.sh online service
Comment #14
klausiOne minor coding standard error is surely not an application blocker, please do a real manual review.
Comment #15
gabrielmachadosantos CreditAttribution: gabrielmachadosantos at CI&T for Pfizer, Inc. commentedComment #16
amaria CreditAttribution: amaria commentedAutomated Review
CodeSniffer errors:
FILE: /root/repos/pareviewsh/pareview_temp/etl.api.php
----------------------------------------------------------------------
FOUND 2 ERRORS AFFECTING 1 LINE
----------------------------------------------------------------------
54 | ERROR | Type hint "array" missing for $entity_data
54 | ERROR | Type hint "array" missing for $info
----------------------------------------------------------------------...
Quite a few of these type hinting errors in Pareview now.
Manual Review
There was an error trying to create the destination table. SQLSTATE[42000]: Syntax error or access violation: 1075 Incorrect table definition; there can be only one auto column and it must be defined as a key
Using MySQL db. I believe you need to create a Primary Key. Unless somehow that's expected to be done in the alter routines? If so, should be documented.The starred items (*) are fairly big issues and warrant going back to Needs Work. Items marked with a plus sign (+) are important and should be addressed before a stable project release. The rest of the comments in the code walkthrough are recommendations.
If added, please don't remove the security tag, we keep that for statistics and to show examples of security problems.
This review uses the Project Application Review Template.
Comment #17
PA robot CreditAttribution: PA robot commentedClosing due to lack of activity. If you are still working on this application, you should fix all known problems and then set the status to "Needs review". (See also the project application workflow).
I'm a robot and this is an automated message from Project Applications Scraper.