What are the differences between these two modules?


cpliakas’s picture

Taxonomy CSV import/export is a full-featured module, complete with a UI, that allows you to import taxonomy terms in many different ways. It is well written, well supported, and well maintained, and it fits the majority of use cases. Taxonomy Builder API is a very small, lightweight utility that gives programmers some basic API functions to more easily create large taxonomy hierarchies from various data sources. Although there is admittedly overlap and duplication, I committed this project to Drupal.org because Taxonomy CSV import/export couldn't handle my use case on the last three projects for which I had to create taxonomy trees. That said, I fully support merging the API and feature sets from Taxonomy Builder API into Taxonomy CSV import/export and actually hope to do so in the future. However, there is enough difference in use cases, architecture, and philosophy that I think a separate project is warranted for now.

The reasons why I added this project to Drupal.org are the following:

  • I required a lightweight API to import extremely large taxonomy trees. Taxonomy CSV import/export doesn't split up the API from the UI, so it is a larger module that has a lot of stuff I don't normally need. The projects I was working on had large enough footprints as it was, so adding more code was not desirable and in some cases problematic.
  • I had to import tens of thousands of terms from very large CSV files. From what I can tell, Taxonomy CSV import/export requires that you upload a file through the UI. Taxonomy Builder API allows developers to specify paths to files and actually has a utility function to import terms via the Batch API. Therefore you can split up the processing of large files even if they are tens of megabytes in size. Files of this size cannot be imported in one page request without increasing the script execution time or memory limit, which is not desirable on production boxes.
  • Taxonomy Builder API utilizes various hooks so that you can handle extra data in a CSV file or take action after terms have been added to the database. Oftentimes I get CSV files that were not created with Taxonomy in mind, and I have to pull out the data that's required for the hierarchy and do something outside of taxonomy with the other pieces of information. Taxonomy CSV import/export doesn't have any hooks you can implement, so it would have been extremely difficult for me to utilize that module without significant custom code or having to rework the source CSV files. Again, I am 100% willing to merge this functionality into Taxonomy CSV import/export, but it is a significant architecture change and in my opinion is best developed and refined in a separate project for now.
  • I have used this code on multiple projects and needed some place to put it. In the spirit of open source, I chose to put it in a place where everyone can use it. If people find it useful and improve on the code, it makes this module a better candidate to get integrated in Taxonomy CSV import/export.
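
To illustrate the batching and hook ideas from the list above, here is a minimal, language-agnostic sketch in Python. It is not the Taxonomy Builder API or any Drupal function: the names `import_terms_in_batches` and `on_term_saved` are hypothetical, and the sketch assumes each CSV row is a path from a root term down to a leaf. In a real Batch API import each batch would run in its own page request; here the batches are simply iterations of a loop.

```python
import csv
import io
from itertools import islice


def import_terms_in_batches(lines, batch_size, on_term_saved=None):
    """Build a nested term tree from CSV rows, reading batch_size rows at a time.

    Assumption: each row lists a path from a root term to a leaf term.
    on_term_saved is a hypothetical callback standing in for a hook that
    fires after a new term has been stored.
    """
    reader = csv.reader(lines)
    tree = {}      # nested dict: term name -> dict of child terms
    batches = 0
    while True:
        # Only batch_size rows are held in memory at once.
        batch = list(islice(reader, batch_size))
        if not batch:
            break
        batches += 1
        for row in batch:
            node = tree
            for name in row:
                name = name.strip()
                if not name:
                    continue  # skip empty cells
                if name not in node:
                    node[name] = {}
                    if on_term_saved is not None:
                        on_term_saved(name)
                node = node[name]
    return tree, batches


# Usage: four rows processed two at a time, recording each newly saved term.
sample = "Animals,Mammals,Dogs\nAnimals,Mammals,Cats\nAnimals,Birds\nPlants,Trees\n"
saved = []
tree, batches = import_terms_in_batches(io.StringIO(sample), 2, saved.append)
```

The callback is where extra CSV columns or follow-up actions would be handled in this sketch, which keeps the hierarchy-building loop itself small.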

In terms of a UI, I have no plans to implement one; however, I wouldn't turn a patch down if one were submitted. The intention of this module is not to compete with Taxonomy CSV import/export, but I am not going to stop people from trying to improve on the codebase. Again, I really do hope to merge this project into Taxonomy CSV import/export once Taxonomy Builder API matures, but as it stands this module saves me a ton of time for my specific use cases and allows me to import large datasets without having to worry about memory consumption or script execution times. If I were to use Taxonomy CSV import/export, I would have to write 90% of the code in this module anyway.

Daniel_KM’s picture

I share this point of view, and I don't think a merge is needed immediately. Both modules have their advantages in different situations. It depends on what you want to do.

In fact, Taxonomy CSV import/export is designed as a one-shot module: you use it when you install your site, and you disable or uninstall it once the import or export is done. On the other hand, Taxonomy Builder API is designed as a permanent module used to synchronize another database of terms with the Drupal one. That's why Taxonomy CSV has a graphical interface and allows multiple import formats and options. That's also why Taxonomy Builder is lighter and faster, supports hooks, and can be a better choice in that situation.

In addition, Taxonomy CSV checks imported terms and gives a lot of information about them: inputs are often buggy and incompatible with the Drupal taxonomy format, so they need to be checked, which requires time and memory. In a continuous import, that check can instead be performed upstream, in the database from which the terms are exported.

A few months ago, I wrote a "line import api" with the same purpose as Taxonomy Builder in mind, but as its design was based on the main Taxonomy CSV code, it's only a temporary solution. Taxonomy Builder is of interest here and can replace it. Some other modules have the same goal, particularly Import/Export API.

I don't think a merge would be useful immediately. Nonetheless, the next release will clarify the structure of Taxonomy CSV. Currently, only the graphical interface and the processing are separated. The next release will separate the graphical interface, input check, import, and export. A module such as Taxonomy Builder could replace or complement the import part. A module such as Taxonomy parser could replace the input check and import parts. A module such as Taxonomy export could replace the export part, etc.


Daniel Berthereau
Knowledge manager

cpliakas’s picture


Thanks for taking the time to share your point of view. Once the GUI and API are split in Taxonomy CSV import/export, I would love to discuss deprecating this module and integrating some of the functionality into your project. Seems like a natural "merging" point to me.

Thanks again, and great work on your project,

cpliakas’s picture

Status: Active » Closed (fixed)