As a newcomer to internet technology, I've been investigating how various CMSs handle content and documents, and weighing the merits of each approach. I'd like to discuss those approaches (and others) from a general design point of view rather than promote any particular CMS.
Possible formats for import, export, publishing, and storage:
plain text, .rtf, html, xhtml, xml (various--incl. DocBook, specialized DTDs, etc.), OpenOffice 1.1, OpenDocument (OO 2.0), .pdf, more?
Design decisions:
flat files or relational database
single or multiple storage formats, and which one(s)
format conversion technology (xslt, etc.)
web publishing techniques (as html, xhtml + css)
WYSIWYG editors: xml, xhtml, html, text (Bitflux, Kupu, TinyMCE, etc.)
Decision Criteria:
* Flexibility of the storage format for import, export, and publishing to and from other formats
* Usability and ease of integration of the editor
* Relative storage space requirements
* Relative performance
I see that the consensus on a few Drupal forum threads addressing this topic was to do conversions outside of the Drupal framework and to avoid doing such conversions in PHP.
My specific project requirements (they carry over to the next page):
* users can upload OpenDocument files to my site
* users can input and edit content in a WYSIWYG editor without needing to know tags or wiki syntax
* automatic import of periodically published xml data from another website that uses a very specialized DTD
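To make the OpenDocument upload requirement a bit more concrete, here is a minimal sketch in Python, run outside the CMS in line with the forum consensus mentioned above. It relies on the fact that an OpenDocument text file is a ZIP archive whose body lives in content.xml, and it pulls the plain text out of each heading and paragraph. The function names extract_paragraphs and make_sample_odt are my own, and the sample document is made-up test data standing in for a real upload. It is only an illustration of the idea, not a full converter: styles, lists, tables, and images are ignored, and a paragraph nested inside another paragraph (for example in a text frame) would be counted twice.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces defined by the OpenDocument format.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def extract_paragraphs(odt_bytes):
    """Return the plain text of each heading and paragraph in an .odt file."""
    # An .odt file is a ZIP archive; the document body is in content.xml.
    with zipfile.ZipFile(io.BytesIO(odt_bytes)) as archive:
        root = ET.fromstring(archive.read("content.xml"))

    wanted = {f"{{{TEXT_NS}}}h", f"{{{TEXT_NS}}}p"}
    paragraphs = []
    for element in root.iter():
        if element.tag in wanted:
            # itertext() also collects text inside inline spans.
            text = "".join(element.itertext()).strip()
            if text:
                paragraphs.append(text)
    return paragraphs


def make_sample_odt():
    """Build a tiny in-memory .odt (hypothetical data standing in for an upload)."""
    content = (
        f'<office:document-content xmlns:office="{OFFICE_NS}" '
        f'xmlns:text="{TEXT_NS}">'
        "<office:body><office:text>"
        "<text:h>Title</text:h>"
        "<text:p>First <text:span>styled</text:span> paragraph.</text:p>"
        "</office:text></office:body></office:document-content>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        archive.writestr("content.xml", content)
    return buffer.getvalue()


if __name__ == "__main__":
    for paragraph in extract_paragraphs(make_sample_odt()):
        print(paragraph)
```

The extracted text could then be wrapped in xhtml and handed to whatever storage format is chosen, which is part of why the single-versus-multiple storage format question above matters.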