@benjy Is this something that you would consider for this module? I developed a custom module for a Drupal 8 application at work that uses the PHPWord project (https://github.com/PHPOffice/PHPWord) to allow users, determined by role(s)/permission(s), to export Node entities to a Word document. I was able to make some assumptions in my custom module because I knew that it would likely only be used on that project. That being said, I think that I could use some of that module's code, along with my recent familiarity with Entity Print, to get us started on the path to this feature. I could implement it in a similar manner to the PDF libraries currently in Entity Print so that the codebase remains modular and extensible.
Let me know what you think. I am completely open to your thoughts and suggestions.
If we are able to combine forces with the Print (and Printable) projects, I think that we will want to support EPUB and email. We already have a task for email support.
| Comment | File | Size | Author |
|---|---|---|---|
| #88 | add-export-to-word-support-2733781-88.patch | 5.94 KB | omarlopesino |
| #70 | entity-print-word-docx-settings-blank.png | 157.71 KB | socialnicheguru |
| #61 | entity_print-2733781-61.patch | 4.53 KB | daniel_j |
| #16 | 2733781-16-1.patch | 3.65 KB | benjy |
| #11 | interdiff-2733781-7-10.txt | 145.52 KB | jordanpagewhite |
Comments
Comment #2
jordanpagewhite commented
There is at least one thing that we would have to make a decision on before working on this feature request. Given PdfEngine, we could either:
1. Implement a new Doc version of PdfEngine*, PdfBuilder*, EntityPrintPdfBuilder, and any other files that are PDF-specific OR
2. Generalize PdfEngine*, PdfBuilder*, and EntityPrintPdfBuilder to something like PrintEngine*, PrintBuilder*, and EntityPrintPrintBuilder, etc.
I think that it would probably be more desirable to try to modularize things, but I'm not sure how best to select which PrintEngine we use for each Print-type. Even if we just support PDF and Doc, we would need to be sure that we have at most one PdfEngine and at most one DocEngine active at a time.
Comment #3
benjy commented
I do like this idea. Initially I was always going to make this module a generic print module but, as you point out, I've moved towards being specific to PDF. Given that I think we should replace Print in D8, I'd like to formalise a proper way forward.
I'm on holiday until 4th June so I have limited availability until then but I will definitely come back to this in the next week or two.
Comment #4
benjy commented
I have a couple of questions that will help architect this change:
This is a pretty exciting change, thanks for suggesting it. We may have to create an 8.x-2.x branch; I'm not sure yet.
Comment #5
jordanpagewhite commented
1. Yes, assuming that we will use these two projects to include WordDoc (https://github.com/PHPOffice/PHPWord) and Epub (https://github.com/Grandt/PHPePub)
2. Agree
3. I think WordDoc and Epub formats are a good place to start. I would like to implement Email functionality too, but I don't think that would necessarily be related to this issue. Ideally, it would be nice to have the option of emailing any print format.
4. Yes
P.S. Have a great vacation!
Comment #6
benjy commented
Comment #7
jordanpagewhite commented
I'm uploading a WIP patch here for the Export to Word feature request. I have included an interdiff of 2735875-2 to 2733781-7 so that it is easier to view the code that is unique to this feature request. This is currently working fine on my machine, but there are a few things to iron out like an issue with adding CSS to the Word doc and images. I will work on those issues and upload another patch, again hopefully tomorrow.
I want to iron out exactly what code should be specific to 2735875 and which code is specific to this Export to Word feature request. That way, other contributors would have the opportunity to develop additional export formats by rebasing against the latest patch in 2735875.
I'm really excited about these changes. As always, I'd be excited to get feedback from anyone, particularly benjy. Please don't spend too much time reviewing this version of the patch though because I am already aware of a few issues that I'd like to work on.
Comment #8
jordanpagewhite commented
Hmm. I think that I'm close to wrapping up this patch, rebased against my latest patch in 2735875. I'm getting the following error though:
Symfony\Component\Routing\Exception\MissingMandatoryParametersException: Some mandatory parameters are missing ("export_type") to generate a URL for route "entity_print.view".
I cannot figure out exactly why, but I've narrowed down the line throwing the error to line 95 in src/Renderer/ContentEntityRenderer.php :
$html = (string) $this->renderer->render($render);
If anyone has any idea why that line might be throwing that error, please let me know. They seem unrelated to me. I wonder if it's just a red herring and I should look somewhere else. Either way, I'll take a peep after work and see what I can do.
Comment #9
benjy commented
What's the URL you're trying to hit?
Comment #10
jordanpagewhite commented
Okay, so I think I've ironed almost everything out now. There is still some issue with the CSS and images on the WordExport. I will try to look into it more later if I have time.
Comment #11
jordanpagewhite commented
Whoops. I accidentally uploaded patches from the wrong branch. Fixed.
Comment #12
benjy commented
@jordanpagewhite now we have the config and the renderer refactoring in, do you want to try re-roll this?
Comment #13
jordanpagewhite commented
@benjy Yeah, I'll give it a shot this weekend. Sorry my development on this project has slowed down a little bit. I've been really preoccupied and I can really only contribute to d.o. projects in my free time. I should have time this weekend to tackle this re-roll though.
Comment #14
Derimagia commented
I can try helping with this but can anyone give me an update on how to apply these? It looks like these were against the early stages of 2735875 which has massively changed?
Comment #15
benjy commented
You have two options, either checkout a commit close to when this was last applied and then rebase/merge HEAD and resolve the conflicts or you can just copy and paste the relevant parts out of the above patch into your project and construct a new patch from scratch.
Thanks for chipping in.
Comment #16
benjy commented
I looked at the patch and there was a lot of stuff that has already been committed, so I pulled out just the plugin into a new patch. I tested quickly and the code executes, but the outputted document seems to be empty. I'm not familiar with the library and didn't look any further.
I think we should leave the new actions to #2762093: Provide VBO actions for each export type
I've also moved the deliver method into the try/catch to make sure we gracefully handle any errors from the print engines.
Comment #18
benjy commented
Comment #20
benjy commented
Comment #23
benjy commented
Fix test.
Comment #25
benjy commented
Comment #26
Derimagia commented
Okay, I tried it out and here is my feedback:
Breaking issues:
- Breaks with twig debug on! (PhpWord complains: Warning: DOMDocument::loadXML(): Double hyphen within comment)
Some other notes that I have. These are more minor:
- Cannot use php7 (upstream issue, but it's broken with the version that's recommended to install) - see https://github.com/PHPOffice/PHPWord/issues/736
- Another issue that I'm pretty sure is just an upstream issue, but right now when I export a node the
- Requires php zip module, just needs documentation
- Should we have different display modes for different view types? (See ContentEntityRenderer::getViewMode)
- Settings when selecting "Word Docx" pops up but there are not any settings
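One possible stopgap for the Twig debug breakage listed above, sketched under assumptions: Twig debug writes template suggestions into HTML comments, and those file names contain `--`, which `DOMDocument::loadXML()` rejects. Stripping comments from the rendered markup before it reaches PHPWord would sidestep that. The helper name `strip_html_comments` is hypothetical; it is not part of Entity Print or PHPWord, and nothing in this thread confirms that this approach was adopted.

```php
<?php

/**
 * Removes HTML comments from a markup string.
 *
 * Hypothetical helper for illustration only; not part of Entity Print or
 * PHPWord. A regex is a blunt tool: it would also strip comment-like text
 * inside <script> or <style> blocks.
 */
function strip_html_comments(string $html): string {
  // Non-greedy match with the "s" flag so multi-line comments are removed.
  $stripped = preg_replace('/<!--.*?-->/s', '', $html);
  // preg_replace() returns NULL on failure; fall back to the input unchanged.
  return $stripped ?? $html;
}

// Example input resembling Twig debug output (the suggestion name is made up).
$rendered = '<div><!-- FILE NAME SUGGESTIONS: node--1--full.html.twig --><p>Body</p></div>';
echo strip_html_comments($rendered), PHP_EOL;
```

This only removes the comments; it does not address the other HTML5 parsing problems raised in this issue.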
Comment #27
chris.smith commented
To elaborate on the issue regarding the HTML tag, it appears that PHPWord uses DOMDocument::loadXML to parse the node. But, loadXML has limited support for HTML5. In many instances during my testing, the parser was unable to read the source code, and the Word document failed to generate.
Dompdf uses the html5lib parser and does not seem to have the same issues as PHPWord.
As a solution, would it make sense to clean the HTML prior to sending to PHPWord? Or at a minimum validate that the file can be parsed by loadXML prior to sending to PHPWord?
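The validation idea in the last question could look roughly like the sketch below. It assumes only PHP's bundled DOM extension; the function name `is_parsable_as_xml` is hypothetical, and where such a check would be called inside Entity Print is left open.

```php
<?php

/**
 * Reports whether a markup string is well-formed enough for loadXML().
 *
 * Hypothetical helper for illustration; not part of Entity Print or PHPWord.
 */
function is_parsable_as_xml(string $markup): bool {
  // loadXML() rejects an empty source, so treat blank input as unparsable.
  if (trim($markup) === '') {
    return FALSE;
  }
  // Collect libxml errors internally instead of emitting PHP warnings.
  $previous = libxml_use_internal_errors(TRUE);
  $document = new DOMDocument();
  $parsed = $document->loadXML($markup);
  libxml_clear_errors();
  // Restore whatever error handling mode was active before.
  libxml_use_internal_errors($previous);
  return $parsed;
}

var_dump(is_parsable_as_xml('<body><meta charset="utf-8"/></body>'));
var_dump(is_parsable_as_xml('<body><meta charset="utf-8"></body>'));
```

A caller could use the result to show a readable error instead of letting the export fail inside the library.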
Comment #28
benjy commented
I'd be open to trying to detect breakages here for now, but ideally PHPWord would throw errors that we could catch in all cases where the parsing fails, and then we'd make an upstream PR to replace their use of ::loadXML with html5lib?
Comment #29
chris.smith commented
I agree that a PR would be best. I've opened an issue with PHPWord (https://github.com/PHPOffice/PHPWord/issues/836) to see if they would be willing to replace the parser.
In the meantime, for those interested, I was successful in altering PHPWord/Shared/HTML to use the Masterminds\HTML5 library to parse the HTML content and generate a Word document.
Comment #30
Derimagia commented
Whoops, I also forgot this in my notes. It sounds like it would be fixed if upstream expands to HTML5, but this is a quick one:
- Was complaining about meta tag not being closed in entity-print.html.twig
(<meta charset="utf-8"> -> <meta charset="utf-8"/>)
Let me know if you don't have time to tackle anything and I'll help out with it.
Comment #31
benjy commented
Feel free to create a new issue for the meta tag and we can get that fixed.
Also, I'm happy to accept workarounds if that makes it possible to get the PHPWord integration committed; the upstream issue could take a while.
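As an illustration of the kind of workaround discussed here, the sketch below normalizes HTML into well-formed markup using only PHP's bundled DOM extension: parse leniently with `loadHTML()`, then serialize with `saveXML()`, which closes void elements such as `<meta>` and `<br>`. This is an assumption-laden sketch, not what any patch in this issue does. The function name `html_to_wellformed_markup` is hypothetical, and `loadHTML()` uses an HTML4-era parser, so encoding and HTML5-specific elements may need extra care.

```php
<?php

/**
 * Re-serializes lenient HTML as well-formed XML-style markup.
 *
 * Hypothetical helper for illustration; not part of Entity Print or PHPWord.
 */
function html_to_wellformed_markup(string $html): string {
  // loadHTML() rejects an empty source, so return early for blank input.
  if (trim($html) === '') {
    return '';
  }
  // Suppress parser warnings (for example about unknown HTML5 tags).
  $previous = libxml_use_internal_errors(TRUE);
  $document = new DOMDocument();
  $document->loadHTML($html);
  libxml_clear_errors();
  libxml_use_internal_errors($previous);

  if ($document->documentElement === NULL) {
    return '';
  }
  // saveXML() on the root element omits the doctype and XML declaration.
  $xml = $document->saveXML($document->documentElement);
  return $xml === FALSE ? '' : $xml;
}

$html = '<html><head><meta charset="utf-8"></head><body><p>Line<br>break</p></body></html>';
echo html_to_wellformed_markup($html), PHP_EOL;
```

The output of a helper like this could then be checked with `DOMDocument::loadXML()` before handing it to the print engine.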
Comment #32
chris.smith commented
Support for the HTML conversion is limited with PhpWord. Here are a few challenges that I'm currently facing:
No support for HTML5
I believe this issue should be resolved in PhpWord, not this module. As mentioned above, introducing a new HTML5 parser (Masterminds) to PhpWord seems to be a good first step. More work needs to be done to fully replace the existing parser. If you want to fix the issue within this module, we will need to convert the HTML created in RendererBase prior to sending to PhpWord.
No support for CSS
PhpWord has a //todo in their HTML reader to create a stylesheet parser. In the meantime, there is no support for CSS. The entity-print.css file is never processed by PhpWord.
Limited support for inline styles
There is an inline CSS parser in PhpWord, although it has limited support for most style types. I've been able to introduce Emogrifier (https://github.com/jjriv/emogrifier) into this module, which converts CSS stylesheets into inline style attributes in the HTML code before sending to PhpWord. I'd like to get your thoughts on this approach. I believe it could be a good approach for handling styling in Word, but also PDF and any other export formats we may want to print in the future.
Comment #33
Derimagia commented
@benjy - The meta tag issue is related to this issue - it's causing issues with phpword.
@chris.smith In terms of the limited html -> css conversion, I don't think that's as much of an issue if we provide ways to hook into the system and create the docx from the entity instead of using the html -> css conversion. With some notes on this as well, as long as each type gets its own display mode (I noted this in #26, but essentially right now it's using the pdf display mode I think)
Comment #34
benjy commented
That is against the architecture of this module. I think if you're going that route then this module doesn't really offer you anything. Simply add the composer libraries you want and go it alone.
Based on chris.smith's comments, I'm wondering whether we either don't add PHPWord at all, or we add it with an experimental warning and document the current issues?
Comment #35
Derimagia commented
Eh, I would argue that not having a hook would be against Drupal's architecture. Not using a module because you want to tweak the output a bit, because the program supports only some of what you want rather than ALL of it, is not the correct way to do it.
Drupal is all about contributing functionality back - I plan to use this module for a project but then implement more functionality to also save the files to the disk. Under that attitude I wouldn't contribute it back because I shouldn't have used this module to begin with because it's possible I need to fix the output?
Comment #36
benjy commented
This isn't what was suggested, building the printed document from the entity is far from tweaking the output (there are already events for that). If you understood the architecture of this module I think you'd see what I mean.
If you wanted to use the entity directly then the controllers, renderers, print plugins and PrintBuilder are all useless to you, what you need is your own route, and an instance of the PHPWord object to build your output document.
If you see somewhere we can simply add a hook and your use case works, then absolutely, create an issue and we'll add it. However, make it an event, Drupal is moving away from hooks.
Finally, the last feature you mentioned has an existing issue if you want to contribute: #2571725: Rules action to save PDF to server
Comment #37
Derimagia commented
Yeah the use case I'm talking about would be altering it during the (2) item (after it gives the external lib the html but before it returns it - generally altering the third party library).
I used hook in a loose term - Also as far as I knew the main reason they aren't more in D8 is because of the speed of them (Hooks are a lot faster) so I'm hoping we can move everything to events but I agree this one should be since it's not called often.
Thanks for the link to the relevant issue, I'll take a look there.
So I think the only things that should be done for this ticket that are left:
1) Close the html tag in that template since it stops it from working. @chris.smith did this not cause an issue for you?
2) @chris.smith Did we ever work out a fix for the twig debugging? Maybe a quick fix for now is to disable twig debugging for the rendering somehow?
3) Mark this as experimental
Then after this is merged in I'll look into:
1) Multiple display modes per library, @benjy any thoughts on this? Right now it looks like there's only "pdf" and for backwards compatibility I guess we shouldn't change that. Is there some other way to have separate twig/display mode files?
2) Maybe adding a hook/Event to modify the third party plugin before it returns the document
Comment #38
benjy commented
This is already supported, see the `PreSendPrintEvent` which is triggered via `PrintEvents::PRE_SEND` in `PrintBuilder`.
The 2.x branch has never had a release and we already have lots of API changes from 1.x so not too worried about that. I think we could easily make a view mode per export type on install and then try and use that? We already do many other things per export type.
Comment #39
Derimagia commented
@benjy Thanks for that! I actually found this and have been writing down more notes. What I found is that this phpdoc library is very buggy (doesn't support tables, even though there's code for it, because of bugs; doesn't support images). Sadly, I had to patch a few things in it that were fixed in issues. But that's not the point of this thread.
The `PreSendPrintEvent` worked pretty well for the most part. The only issue is that there are two properties on the Doc builder, `print` and `section`, that need to be public. Once we do that then I think my request for that is fine.
One major but quick issue I found when running xdebug: the Doc Builder is saving the file as 'tmp-file' in the root of the Drupal site when building, which is a security concern. It should be moved to the true tmp directory.
The last annoying thing is that, at the very least, I wanted to find a workaround for the twig debugging issue: disable twig debugging and then re-enable it after. Not sure how easy that is, and it can probably be moved to a new bug after this is released if you wanted to mark this as experimental.
The view mode is something I'd recommend doing as well, but is there a way in the node/entity rendering process to tell if it's being printed in Entity Print and by what builder? That may work as well, but since there's already the pdf view mode we might as well just add the others, or deprecate 'pdf' and switch to a display mode that's standard for all prints and then just provide a way to see the builder somehow in case developers need this.
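The temp-file concern raised in this comment can be illustrated with plain PHP, assuming nothing about how the Word engine actually writes its output. The sketch creates a uniquely named file in the system temp directory, hands its path to a callback, and always deletes it afterwards. The name `with_temp_file` and the `entity_print_` prefix are hypothetical; a real Drupal fix would more likely go through Drupal's own file system service.

```php
<?php

/**
 * Runs a callback with a unique temp file path, then deletes the file.
 *
 * Hypothetical helper for illustration; not part of Entity Print or PHPWord.
 */
function with_temp_file(callable $callback, string $prefix = 'entity_print_') {
  // tempnam() creates an empty, uniquely named file and returns its path.
  $path = tempnam(sys_get_temp_dir(), $prefix);
  if ($path === FALSE) {
    throw new RuntimeException('Could not create a temporary file.');
  }
  try {
    return $callback($path);
  }
  finally {
    // Clean up even if the callback throws.
    if (is_file($path)) {
      unlink($path);
    }
  }
}

// Usage: write placeholder bytes to the temp file and read them back.
$bytes = with_temp_file(function (string $path): string {
  file_put_contents($path, 'placeholder document bytes');
  return (string) file_get_contents($path);
});
echo strlen($bytes), PHP_EOL;
```

Compared with a fixed name such as 'tmp-file' in the site root, a unique path outside the web root avoids both public exposure and collisions between concurrent exports.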
Comment #40
sidharth_sahu_16 commented
Is there a PHP library for Word doc support?
Comment #41
Deno commented
The documentation claims that this feature is in "active development". Any progress?
Comment #42
antogeorge commented
Is any working library available for .docx (Word)? Please share any developing/developed library for the same.
Comment #43
antogeorge commented
Comment #44
benjy commented
This is far from critical.
Comment #45
antogeorge commented
Could someone provide us the status? Are we able to export to Word like PDF with this module currently?
Comment #46
antogeorge commented
@benjy in the addPage method
public function addPage($content) {
  // @TODO, this only supports adding one page?
  Html::addHtml($this->section, $content, TRUE);
}
only a one-page HTML document is supported. How can we add multiple pages?
Comment #47
andriyun commented
Patch #25 is outdated.
I tried to make it work but without success.
Moreover, I hit at least one code incompatibility: the methods getPrintObject and getBlob have to be implemented as required by PrintEngineInterface.
I also ran into the issue that the PHPWord library returns an empty document when it tries to convert.
Please check the rerolled patch.
Comment #48
andriyun commented
Comment #49
aaronmchale
Is it just me or has anyone else noticed that text in heading tags, e.g. H2, H3, etc., doesn't get converted to headings in Word, and that there appears to be quite a bit of white-space? I'm assuming these are issues with the underlying library but wondered if anyone else had overcome this?
Thanks
Comment #50
diogo_plta commented
Please, can you explain how to test this patch?
I've tried installing entity_print 8.x-2.x-dev, installing PHPWord via Composer, and applying patch #47 (which failed with an error), but there are no options for the Word Document print engine.
[# entity_print]$ patch < 2733781-47.patch
patching file EntityPrintPHPWord.php
patching file WordDocx.php
can't find file to patch at input line 154
Perhaps you should have used the -p or --strip option?
The text leading up to this was:
--------------------------
|diff --git a/tests/src/Kernel/EntityPrintPluginManagerTest.php b/tests/src/Kernel/EntityPrintPluginManagerTest.php
|index 9dda0cb..fd4f204 100644
|--- a/tests/src/Kernel/EntityPrintPluginManagerTest.php
|+++ b/tests/src/Kernel/EntityPrintPluginManagerTest.php
--------------------------
File to patch:
Skip this patch? [y] y
Comment #51
andriyun commented
Please try applying the patch like this:
patch -p1 < path/file.patch
Comment #52
francoud commented
Will this patch be committed to the stable version?
Comment #53
zorax
I have patched the entity_print 8.x-2.x-dev version with 2733781-47.patch.
When I select Word Document = Word docx in admin/config/content/entityprint,
here is the error message:
Then a fieldgroup "Word Docx Settings" appears, but there is nothing inside...
Comment #54
geraldito commented
Export to Word works fine for me using the latest Entity Print 8.x-2.2 with PHPWord 0.18.1, although the "Word Docx Settings" fieldgroup is also empty.
Comment #55
webdrips commented
#47 worked for me, but only with the latest phpword library, so running
composer require "phpoffice/phpword"
seemed to do the trick, accessing /print/word_docx/node/NID after configuring the engine.
Comment #56
VPetroff commented
How do I create a template for docx?
Comment #57
maxilein commented
Patch applied to 8.x-2.4 and prints docs from entity on D 9.2.9. Thank you.
(Haven't tested views, yet)
Comment #58
maxilein commented
#57 activating the PDF display mode breaks the print for Word.
Is there no display mode for Word?
Comment #59
geraldito commented
Patch from #47 is outdated and doesn't apply to Entity Print 8.x-2.4. Should this work out of the box now with PHPWord 0.18.3? "Word Docx Settings" fieldgroup is empty for me and I get:
Comment #60
maxilein commented
"Patch from #47 is outdated and doesn't apply to ..." If you have previously applied the patch, two of the files already exist.
Errors are only that they cannot be regenerated, because they are already there.
Comment #61
daniel_j commented
This is a re-roll of the patch in #47, with one or two minor bugs fixed, and brought up to Drupal coding standards (more or less). It applies cleanly to version 2.5.
Comment #62
maxilein commented
Thank you Daniel_j.
That patch is applying well.
Comment #63
gngn commented
I tried #61 and added a word export to a view.
It generates a Word document - but an empty one.
Comment #64
maxilein commented
The Word document may come out empty if you have controls that are not standard fields - in my experience.
But I never had the time to thoroughly investigate the issues.
Comment #65
julien tekrane commented
Based on #61, I added handling for the case where no `<body>` is present.
Comment #66
julien tekrane commented
Comment #67
julien tekrane commented
Comment #68
julien tekrane commented
Comment #69
julien tekrane commented
Comment #70
socialnicheguru commented
I do not see any of the settings when I select the Word Docx library.
Comment #71
abhinand gokhala k commented
I have applied the patch #69 and selected the engine for word export.
But got this error while exporting the entity as a Word doc.
TypeError: Cannot assign array to property DOMDocument::$preserveWhiteSpace of type bool in PhpOffice\PhpWord\Shared\Html::addHtml() (line 83 of
Comment #72
piotrsmykaj commented
Added the output escaping option and removed the problematic code related to the styling to fix the error from #71.
Comment #73
seutje commented
When I try to set the Word Document engine to Word Docx and submit the settings form, I am greeted with the following error on Drupal 9.5.4 on PHP 8.0.27:
I assume there's something missing in entity_print.schema.yml.
Comment #74
guardiola86 commented
entity-print-2733781-export-word-72.patch isn't working for me, I get this error:
Comment #75
guardiola86 commented
Actually it works, but it breaks this page on submit: /admin/config/content/entityprint
Comment #76
cestmoi commented
Patch #72 caused these errors on saving the default `Word Document` at /admin/config/content/entityprint:
Warning: array_flip(): Can only flip string and integer values, entry skipped in Drupal\Core\Entity\EntityStorageBase->loadMultiple() (line 278 of pathto\web\core\lib\Drupal\Core\Entity\EntityStorageBase.php).
Form error: The submitted value 1 in the Word Document element is not allowed.
Drupal 10.1.2
entity_print 8.x-2.x-dev
php 8.1.10
Comment #77
rimbu002 commented
Patch 72 on entity_print:2.x-dev on Drupal 9.5.10 and PHP 8.1.21 adds the Word Docx engine, but saving the configuration at admin/config/content/entityprint produces 3 errors
1. TypeError: Illegal offset type in Drupal\Core\Entity\EntityStorageBase->load() (line 297 of /mnt/www/html/umnpcid8stg/docroot/core/lib/Drupal/Core/Entity/EntityStorageBase.php)
2. Warning: array_flip(): Can only flip string and integer values, entry skipped in Drupal\Core\Entity\EntityStorageBase->loadMultiple() (line 312 of /mnt/www/html/umnpcid8stg/docroot/core/lib/Drupal/Core/Entity/EntityStorageBase.php)
3. Illegal choice 1 in Word Document element.
Comment #78
rimbu002 commented
Patch #72 on entity_print:2.x-dev, Drupal 9.5.10 and PHP 8.1.21 adds the Word Docx engine, but saving the configuration at admin/config/content/entityprint produces 3 errors
1. TypeError: Illegal offset type in Drupal\Core\Entity\EntityStorageBase->load() (line 297 of /mnt/www/html/umnpcid8stg/docroot/core/lib/Drupal/Core/Entity/EntityStorageBase.php)
2. Warning: array_flip(): Can only flip string and integer values, entry skipped in Drupal\Core\Entity\EntityStorageBase->loadMultiple() (line 312 of /mnt/www/html/umnpcid8stg/docroot/core/lib/Drupal/Core/Entity/EntityStorageBase.php)
3. Illegal choice 1 in Word Document element.
Comment #79
hmohan commented
I get the same errors as rimbu002 mentioned. Is there any update on this?
Thanks
Comment #80
tonka67 commented
I'm working with Patch #72 on entity_print:2.x-dev, Drupal 10.2.1 and PHP 8.1.26.
After installing phpoffice/phpword:^1.0 per installation prompts, I set the Word engine, but it threw a "The website encountered an unexpected error. Try again later." error on the config page that would not clear. I uninstalled both Entity Print and Entity Print Views, then reinstalled. Now I can't set the Word engine at all: when I click save, it throws the same error on the page, and while I can now refresh, none of the changes are saved.
This produces similar errors to the previous folk, with some differences:
Comment #81
tonka67 commented
I retried Patch #72 after upgrading to Drupal 10.2.2. Now I'm down to 1 error:
TypeError: Cannot access offset of type string on string in Drupal\Component\Utility\NestedArray::setValue() (line 155 of /code/web/core/lib/Drupal/Component/Utility/NestedArray.php).
Looking at the console inspector, I also found the following (which does not exist without the patch):
[DOM] Found 2 elements with non-unique id #edit-submit: (More info: https://goo.gl/9p2vKq) <input data-drupal-selector="edit-submit" type="submit" id="edit-submit" name="op" value="Save configuration" class="button button--primary js-form-submit form-submit">
Comment #82
tonka67 commented
I'm back to the errors I reported in #80.
Has anyone found a solution yet?
Comment #83
hoporr commented
Patch #72 throws the error as reported above.
However, I did get it to work like this, clearly a workaround.
1) Install entity_print and the word engine.
2) Install patch #69
3) configure the word print-engine in the interface.
4) SAVE
5) (if you print now, it has the previous errors as reported in #71)
6) NOW after you turned on the engine, SWITCH the patch to #72.
7) If you visit the configuration page, it still shows the engine as turned on, with an extra config.
8) Now you can print the doc.
So the point is: you must have the word print engine installed and set in the interface BEFORE you install patch #72.
Clearly only a workaround.
Comment #84
tonka67 commented
Took me a while to circle back, but #83 works great. Thanks so much.
Comment #85
ramlev commented
I don't know if this is in scope for this issue, but I have updated the #72 patch to support inline styles.
This patch has pelago/emogrifier as a suggestion; if it is added to the project, the patch will parse the HTML and convert CSS to inline styles.
Comment #86
philippjor commented
#83 is a good workaround, but it does not work in Drupal 10.3.
I got an error in this function of WordDocx.php
For testing I used "TCPDF();" instead of "EntityPrintPhpWord();" and the config page of the entityprint module worked for me:
/admin/config/content/entityprint
Has anyone tested it in Drupal 10?
Comment #87
fishfree commented
For Drupal 10.4.3, the latest version of this module with the #85 patch does not work. Even after running composer require phpoffice/phpword both under the Drupal project root and in this module's folder, no options can be selected for the "Word Document" field on /admin/config/content/entityprint
Comment #88
omarlopesino
I've just re-rolled the patch from #85 so it works with the 8.x-2.x branch. I've also renamed the plugin ID and class name because they were conflicting in the settings form, producing fatal errors. Now it works fine.
With the patch I've just attached I can successfully enable phpword plugin and also export it to docx.
Please review, thanks!
Comment #89
omarlopesino
Comment #91
daniel_j commented
I created an MR from the patch in #88, with a little bit of coding-standards cleanup. If this gets merged, please be sure to credit @omarlopesino.
Comment #92
sendor commented
I used patch #88, but it produced errors when some HTML elements weren't properly closed as required for conversion to XML, and images without a path on the server generated PHPWord errors. That's why I added this code to patch #88.
developed on
drupal core: 10.6.2
php ext: tidy
PhpWord: 1.4.0
Comment #93
sendor commented