Add a command-line utility to export content in YAML format [#3532694]

Problem/Motivation

Recipes -- and especially site templates -- can include content. Core can import the format generated by the venerable Default Content contrib module.

However, exporting content is a gap. You need the Default Content module to do that. Default Content has some problems, namely that it is not extensible for export -- new types of fields and data structures can't be handled by it without patching the module.

Besides, it's not really feasible to require recipe authors, and especially site template creators, to use Default Content to put content into their recipes. It needs to work, and it needs to be able to handle all core field types, and it needs to be able to handle exotic contrib field types (Entity Reference Revisions, Smart Date, Experience Builder's stuff, and so on).

Proposed resolution

I propose we add a new content:export command to the core/scripts/drupal script.

Initially, to keep things simple, it should support exporting entities one at a time, with no handling of dependencies, and only in YAML format. For example:

$ php core/scripts/drupal content:export node 42
... YAML DUMP HERE ...

To generate the export, it should use the Serialization module's normalization API to normalize the entity and all of its fields. This means the command will have to exit with an error if Serialization is not installed -- but that's probably okay for the time being. This is a developer-facing command anyway, and we can lift that restriction when and if Serialization is turned into a core subsystem (which was discussed in #2296029: Move Serialization module back into a core/lib component).

The exported content should be, pretty much, exactly what you'd get out of the Default Content module. We don't need to handle normalization for all core field types right away; that can happen in follow-ups, as long as the normalization is pluggable.

Indeed, this command will not, initially, be as robust as Default Content is -- both because Default Content is more battle-tested, and crucially, because it has a lot of hard-coded handling of various special cases and field types (both core and contrib), much of which will need to be ported into core piecemeal after this first issue is committed. But we can start here.

In a follow-up issue, we should add support for exporting an entity and its dependencies into a folder structure. We should also support doing the export as a specific user (maybe a --user=N option), rather than merely "the first one with an administrative role".

User interface changes

None, but there will be a new content:export command for the drupal script.

Introduced terminology

None.

API changes

No API changes as such, but a slew of additions to the experimental default content API (including a new event) and a change to the Serialization module's normalizers in order to support passing callback functions to the normalizers' $context parameter, which necessitates a new interface for those callbacks. Fields and data types will be able to specify a setting that lets them opt into, or out of, being exported.

None of these changes have BC implications.

Data model changes

None.

Release notes snippet

TBD


Comments

Comment #1

26 June 2025 at 21:40

phenaproxima created an issue. See original summary.

Comment #2

larowlan

🇦🇺🏝.au GMT+10

commented 26 June 2025 at 21:42

+1 for this, it will likely require a dependency on the serialization module but I think that is fine

Comment #3

phenaproxima

he/him

English

Massachusetts

commented 26 June 2025 at 21:44

Issue summary:

View changes

Comment #4

phenaproxima

commented 26 June 2025 at 21:51

Just to give a little context, @larowlan linked me to https://www.previousnext.com.au/blog/we-could-add-default-content-drupal..., which is from almost a decade ago. It outlines four tricky problems that prevent the addition of default content capabilities to core. I want to quickly shoot these down.

Adding default content to standard profile would mean that none of the profile's configuration would have been imported yet.

Core's default content import is done by recipes, and works well. The concern about shipping content with modules is moot. Recipes' job is to put all necessary configuration in place before any content is created, and the recipe system's strong, straightforward configuration handling means that default content can come in easily. This concern is definitely no longer applicable.

The second issue here is that the default content module relies on the Rest, HAL and serialization modules in core.

This was true for v1 of the Default Content module. We would still need to have Serialization enabled for export, that's true...but there are no special modules required for import, which is the end user-facing case. Recipe authors are unlikely to be bothered by the need to install Serialization before exporting content.

There are shortcomings in core's normalizers. The main ones are around fields that resemble entity references but really aren't and fields with calculated values. And then there's normalizing files and images.

This is probably legitimate, although the situation is likely significantly better now than it was at the time the blog post was written, thanks to the advent of JSON:API and the core improvements that it brought us. But yes, normalization is the meat and potatoes of doing default content export correctly.

Just because we could add default content to the standard profile - does that mean we should?

We already put default content in recipes, which puts profile support on the back burner (where it belongs). This is not a problem anymore.

So there you go. If you ask me, the time to do this is now!

Comment #5

thejimbirch commented 26 June 2025 at 23:21

Component:

recipe system

» default content system

Moving to the Default content system.

Comment #6

27 June 2025 at 00:47

phenaproxima opened merge request !12512

Comment #7

nicxvan commented 27 June 2025 at 00:55

Slightly different use case, but Tome can export content to JSON for keeping it in Git.

Comment #8

phenaproxima

commented 27 June 2025 at 01:47

Issue summary:

View changes

Comment #9

phenaproxima

commented 27 June 2025 at 15:42

Issue summary:

View changes

Updating API changes based on my current progress.

Comment #10

phenaproxima

commented 27 June 2025 at 18:29

Status:

Active

» Needs review

This is now reviewable and has a passing test.

My deep journey into the Serialization module (which I've not used before) has shown me several things:

Default Content's export format is actually very close -- in most cases, identical -- to what this MR produces, using the Serialization module API. There are some small differences, but those are usually because either the fixture files were incomplete as generated, or because Default Content does certain things a little differently, but with the same net effect (for example, always casting primitives, which results in empty strings for undefined path aliases, which can also be NULL -- the import will work the same either way).
Layout Builder is a significant sore point, due to the fact that it unconditionally denies access to its data for serialization. That needs to be fixed in #2942975: [PP-1] Expose Layout Builder data to REST and JSON:API, which is a long-running issue and appears to be quite the dragon, but need not block this feature.
The test case is to import the default content we test our importer with (which includes all core entity types, and a translation, plus a couple of edge cases), immediately export it, and then confirm that the exported version matches the stuff that we imported. This proves that the exporter is producing material that our importer understands. This doesn't necessarily prove that all exported content will be flawlessly importable in all cases, but it's an excellent foundation.
The export does not write anything to disk (yet), or export dependencies. However, there is a mechanism here (ExportMetadata) which makes it easy for normalizers to flag other entities as dependencies.
As is true with importing content, exporting also requires administrator access, so that you have maximum visibility of all fields, and also have the ability to include hashed passwords in exported users. This is the same as what Default Content does, so I presume it's okay from a security standpoint, but might be worth a sign-off anyway.
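The round-trip test described in the list above (import the fixtures, export them right away, compare) can be sketched in a language-neutral way. This is only a minimal Python illustration, not the MR's actual test: `import_content`, `export_content`, the in-memory `store`, and the fixture data are all hypothetical stand-ins for the real importer, exporter, and site.

```python
import copy

def import_content(store, fixtures):
    """Hypothetical importer: load each fixture into an in-memory 'site'."""
    for fixture in fixtures:
        store[fixture["_meta"]["uuid"]] = copy.deepcopy(fixture)

def export_content(store, uuid):
    """Hypothetical exporter: return the exportable data for one entity."""
    return copy.deepcopy(store[uuid])

def assert_round_trip(fixtures):
    """Import everything, export it again, and require an exact match."""
    store = {}
    import_content(store, fixtures)
    for fixture in fixtures:
        uuid = fixture["_meta"]["uuid"]
        exported = export_content(store, uuid)
        assert exported == fixture, f"export differs from fixture for {uuid}"

# Made-up fixture in the same general shape as the exported YAML.
fixtures = [
    {
        "_meta": {"version": "1.0", "entity_type": "node", "uuid": "example-uuid"},
        "default": {"title": [{"value": "Hello"}]},
    },
]
assert_round_trip(fixtures)
```

A passing round trip shows that the exporter emits something the importer understands for the fixtures at hand; as noted above, it does not prove that every possible entity exports cleanly.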

So...onward! Let's get this crucially important feature shipshape, and merged in.

Comment #11

nicxvan commented 27 June 2025 at 19:07

Great work! And this went so quickly.

I like how you can set individual properties as not exportable.

It's kind of sad that Layout Builder is left out again, but I think fixing that issue is out of scope, as you mentioned. I assume this will not work with Experience builder either then?

I know I've objected to final and private on several issues now, but a content exporter feels explicitly like the kind of thing you want to extend and this precludes that.

That method of testing is pretty clever!

Haven't deeply reviewed this yet.

Comment #12

phenaproxima

commented 27 June 2025 at 19:27

I assume this will not work with Experience builder either then

Not out of the box, but by hooking into the serialization system, it gives XB a way to become exportable: all it needs to do is implement a normalizer that can normalize its various data structures. That's the single biggest advantage of this approach over Default Content's -- modules can handle exporting their own data.

a content exporter feels explicitly like the kind of thing you want to extend

The exporter should not be extensible. If you want to change how it operates, you should implement a normalizer. To me, that feels like the correct amount of API surface here; what would the use case be for directly extending the exporter itself?

Comment #13

nicxvan commented 28 June 2025 at 03:21

Everything is final, though, not just the exporter.

what would the use case be for directly extending the exporter itself?

We shouldn't limit contrib based on my lack of creativity. My point is, just as with everything else marked final, you have no recourse beyond unfinalize or reflection.

It's marked final you can't extend it or decorate it, so if someone wants to experiment with the complex data normalizer, the solution is to just copy everything and fork it.

Comment #14

phenaproxima

commented 28 June 2025 at 03:30

It's marked final you can't extend it or decorate it

(emphasis added)

That's not true. You can decorate anything with an interface, and the export normalizer has an interface (NormalizerInterface). It is a decorator itself. Decoration is the correct way to add more things to final classes.
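To illustrate the point about decoration, here is a minimal, language-neutral sketch in Python. `NormalizerInterface` is the interface named above; `ExportNormalizer` and `TaggingNormalizer` are hypothetical names, and the `decorated` key is made up for the example. In a Symfony-style container the wrapping would normally be wired up through service decoration rather than by hand.

```python
from abc import ABC, abstractmethod
from typing import final

class NormalizerInterface(ABC):
    """Stand-in for the interface that the final class implements."""

    @abstractmethod
    def normalize(self, data, context=None):
        ...

@final
class ExportNormalizer(NormalizerInterface):
    """Hypothetical final class: it cannot be subclassed."""

    def normalize(self, data, context=None):
        return {"value": data}

class TaggingNormalizer(NormalizerInterface):
    """Decorator: implements the same interface and wraps an inner normalizer."""

    def __init__(self, inner: NormalizerInterface):
        self.inner = inner

    def normalize(self, data, context=None):
        # Delegate to the wrapped object, then add to its result.
        result = self.inner.normalize(data, context)
        result["decorated"] = True
        return result

normalizer = TaggingNormalizer(ExportNormalizer())
print(normalizer.normalize(42))
```

The decorator never extends the final class; it only depends on the interface, which is why final does not block this kind of extension.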

I am not going to die on the final/private hill in this issue; if a committer tells me to mark it non-final and make the private members protected, I'll do that. But I will insist that the class be marked internal with a clearly-worded warning, because it is part of an experimental subsystem. If someone extends an internal class and it breaks them, they deserve what they get. 😈

Comment #15

nicxvan commented 28 June 2025 at 03:44

That's not true. You can decorate anything with an interface, and the export normalizer has an interface (NormalizerInterface).

I will explore that further. I may have missed something in my testing; it's been a bit.

But I will insist that the class be marked internal with a clearly-worded warning, because it is part of an experimental subsystem.

I 100% agree.

If someone extends an internal class and it breaks them, they deserve what they get. 😈

Also agreed, it's caveat emptor.
My point of contention is that we should not block it, but rather warn them not to extend it.

It's why I want to begin using @final. It's an even stronger warning.

I'm just wary of final after running into blockers with rector and symfony.

Comment #16

phenaproxima

commented 28 June 2025 at 04:03

Issue summary:

View changes

Comment #17

murz

English

Yerevan, Armenia

commented 28 June 2025 at 04:44

As an alternative, until this feature is in core, we can use this module: https://www.drupal.org/project/single_content_sync - it can export Layout Builder too, and also integrate reference entities like menu and path_alias into the single yaml file together with node.

Comment #18

phenaproxima

commented 28 June 2025 at 12:53

Issue tags:

+Needs followup

A few follow-ups have been suggested to me privately by interested parties, plus a couple of ideas of my own:

Support exporting an entity and all of its dependencies into a folder structure on disk. This includes exporting physical files that are part of File entities -- we'll probably want to introduce the concept of an "attachment" to ExportMetadata to facilitate this.
Support straight-up JSON when exporting and importing. The importer needs no changes for this (only a one-line adjustment to the Finder class) and it would be trivial to make the export command write JSON instead of YAML, maybe with a --format=json option.
Allow the export to be done as a specific user, not just "whoever has the administrative role". The importer already supports this; the exporter should too. It'd be a pretty easy change, and we could add a --as-user=N option to the export command for that.
Ambitious: convert demo_umami_content to use the default content system! That would be a really strong test of its capabilities and would exercise more nooks and crannies than just our comparatively sad little test fixture. :)

Comment #19

nicxvan commented 28 June 2025 at 13:13

One thing that I'm not sure how to flag for infrastructure is that if this includes media, then the Git repos for recipes using these exports may get enormous.

I've been using tome to manage https://nlighteneddevelopment.com for a couple of years.

I don't have a lot of content or images and I deploy once or twice a year and that repo is currently a gigabyte.

Compare that with a Drupal 11 site with hundreds of deploys, whose repo is about 50 megabytes.

I've been considering if there is a way to set up git lfs for my site.

I don't think this is a blocker by any means but infrastructure should prepare.

Comment #20

phenaproxima

commented 28 June 2025 at 13:24

I'm not sure how to flag for infrastructure

Infra has its own issue queue: https://www.drupal.org/project/issues/infrastructure?categories=All

Comment #21

nicxvan commented 28 June 2025 at 13:43

Thanks!

Comment #22

phenaproxima

commented 28 June 2025 at 18:27

Issue tags:

-Needs followup

Related issues:

+#3532950: Support importing default content in JSON format
+#3532954: [PP-2] Use core's default content system for Umami
+#3532952: [PP-1] The content:export command should allow you to specify which user account to use while exporting
+#3532951: Support exporting content and its dependencies to a folder structure on disk

Filed the follow-ups from #18.

Comment #23

phenaproxima

commented 28 June 2025 at 23:32

Issue tags:

+Contributed project soft blocker

Tagging as a contributed project soft blocker because without this, we can't easily build site templates.

Comment #24

phenaproxima

commented 28 June 2025 at 23:36

Issue tags:

+Experience Builder

Opened #3532961: [PP-1] Add a normalizer for component tree items to take advantage of this in Experience Builder.

Comment #25

phenaproxima

commented 29 June 2025 at 00:52

Adding a related issue that will absolutely impact this one -- or will be impacted by this one -- depending on which gets committed first.

Comment #26

phenaproxima

commented 29 June 2025 at 11:29

Adding #3533005: Allow fields to be marked as non-exportable as related, which would implement field-level export access control in core.

Comment #27

larowlan

commented 29 June 2025 at 21:29

Status:

Needs review

» Needs work

Left some comments on the MR, nice work!

Comment #28

phenaproxima

commented 30 June 2025 at 23:20

Status:

Needs work

» Needs review

Comment #29

thejimbirch commented 1 July 2025 at 13:21

I reviewed and made a minor suggestion. Leaving as needs review for someone more technical than I to review.

Comment #30

phenaproxima

commented 3 July 2025 at 02:49

Change record drafted: https://www.drupal.org/node/3533854

Comment #31

phenaproxima

commented 3 July 2025 at 03:13

Issue summary:

View changes

Comment #32

3 July 2025 at 03:19

phenaproxima credited alexpott.

Comment #33

phenaproxima

commented 3 July 2025 at 03:19

Adjusting credit.

Comment #34

larowlan

commented 4 July 2025 at 04:55

Status:

Needs review

» Needs work

Gave this a manual test against umami, works well.

 php core/scripts/drupal content:export node 3


 [ERROR] The Serialization module is required to export content.


d31d89485d9d:/data/app$ drush en -y serialization
>  [notice] The configuration was successfully updated. 205 configuration objects updated.
>  [notice] Message: The configuration was successfully updated. There are /205/ configuration
> objects updated.
>
 [success] Module serialization has been installed.
d31d89485d9d:/data/app$ php core/scripts/drupal content:export node 3
default:
  revision_uid:
    -
      entity: 1ad0846a-d9f4-4a5b-9a99-ffb9ae6a05c4
  status:
    -
      value: true
  uid:
    -
      entity: 1ad0846a-d9f4-4a5b-9a99-ffb9ae6a05c4
  title:
    -
      value: 'Super easy vegetarian pasta bake'
  created:
    -
      value: 1750135677
  promote:
    -
      value: true
  sticky:
    -
      value: false
  path:
    -
      alias: /recipes/super-easy-vegetarian-pasta-bake
      langcode: en
  content_translation_source:
    -
      value: und
  content_translation_outdated:
    -
      value: false
  field_cooking_time:
    -
      value: 20
  field_difficulty:
    -
      value: easy
  field_ingredients:
    -
      value: '400g wholewheat pasta'
    -
      value: ' 1 onion'
    -
      value: ' 2 garlic cloves'
    -
      value: ' 1 pack vegetarian sausages'
    -
      value: ' 400g chopped tomatoes'
    -
      value: ' 50g sliced sun dried tomatoes'
    -
      value: ' 1 pinch sugar'
    -
      value: ' 3 tbsp red pesto'
    -
      value: ' 50g cheddar cheese'
    -
      value: ' Basil or mixed herbs'
    -
      value: ' 100g mozzarella'
  field_media_image:
    -
      entity: e936d651-a50d-4426-9ab9-fe79d1cac01c
  field_number_of_servings:
    -
      value: 4
  field_preparation_time:
    -
      value: 5
  field_recipe_category:
    -
      entity: 8cb1b744-b885-4885-b0a5-cc8d2fe1498e
  field_recipe_instruction:
    -
      value: |
        <ol>
          <li>In a large pan, boil the pasta in plenty of water until cooked.</li>
          <li>Whilst the pasta is cooking, chop the onion and gently fry it with the garlic in a little oil until soft and the onion looks clear.</li>
          <li>Add the vegetarian sausages. Once browned, remove and chop into chunky bites.</li>
          <li>Pop the sausages back into the pan and add the tomatoes, sugar, pesto and sun dried tomatoes. Season to taste. Simmer until most of the water from the chopped tomatoes has gone.</li>
          <li>Drain the pasta and add to the pan with the sausages and tomatoes. Stir in half of the cheddar and transfer to a shallow dish. Sprinkle with the rest of the cheddar and dot the sliced mozzarella over the top.</li>
          <li>Grill for 10 minutes or until the cheese has melted and started to brown. Serve with basil leaves.</li>
        </ol>
      format: basic_html
  field_summary:
    -
      value: 'A wholesome pasta bake is the ultimate comfort food. This delicious bake is super quick to prepare and an ideal midweek meal for all the family.'
      format: basic_html
  field_tags:
    -
      entity: 2a30ad8a-2e29-4ee0-ad4a-cb7790233ac2
    -
      entity: ec239227-ecb0-47a9-bb5a-af275483ec59
    -
      entity: 8f933857-2858-4092-bf5b-94105148852f
translations:
  es:
    revision_uid:
      -
        entity: 1ad0846a-d9f4-4a5b-9a99-ffb9ae6a05c4
    status:
      -
        value: true
    uid:
      -
        entity: 1ad0846a-d9f4-4a5b-9a99-ffb9ae6a05c4
    title:
      -
        value: 'Pasta vegetariana al horno súper fácil'
    created:
      -
        value: 1750135677
    promote:
      -
        value: true
    sticky:
      -
        value: false
    revision_translation_affected:
      -
        value: true
    path:
      -
        alias: /recipes/pasta-vegetariana-horno-super-facil
        langcode: es
    content_translation_source:
      -
        value: und
    content_translation_outdated:
      -
        value: false
    field_cooking_time:
      -
        value: 20
    field_difficulty:
      -
        value: easy
    field_ingredients:
      -
        value: '400g pasta de trigo integral'
      -
        value: ' 1 cebolla'
      -
        value: ' 2 dientes de ajo'
      -
        value: ' 1 paquete de salchichas vegetarianas'
      -
        value: ' 400g tomates picados'
      -
        value: ' 50g rodajas de tomates secados al sol'
      -
        value: ' 1 pizca de azúcar'
      -
        value: ' 45g pesto rojo'
      -
        value: ' 50g queso cheddar'
      -
        value: ' Albahaca o hierbas mixtas'
      -
        value: ' 100g queso mozzarella'
    field_media_image:
      -
        entity: e936d651-a50d-4426-9ab9-fe79d1cac01c
    field_number_of_servings:
      -
        value: 4
    field_preparation_time:
      -
        value: 5
    field_recipe_category:
      -
        entity: 8cb1b744-b885-4885-b0a5-cc8d2fe1498e
    field_recipe_instruction:
      -
        value: |
          <ol>
            <li>En una sartén grande, hervir la pasta en abundante agua hasta que esté cocida.</li>
            <li>Mientras se cocina la pasta, pica la cebolla y fríela suavemente con el ajo en un poco de aceite hasta que esté suave y la cebolla se vea clara.</li>
            <li>Añadir las salchichas vegetarianas. Una vez dorado, retirar y picar en trozos grandes.</li>
            <li>Pon las salchichas en la sartén y agrega los tomates, el azúcar, el pesto y los tomates secos. Sazone al gusto. Cocine a fuego lento hasta que la mayor parte del agua de los tomates picados se haya ido.</li>
            <li>Escurrir la pasta y agregar a la sartén con las salchichas y los tomates. Agregue la mitad del queso cheddar y transfierelo a un plato poco profundo. Espolvoree con el resto del queso cheddar y salpique la mozzarella en rodajas por encima.</li>
            <li>Asar durante 10 minutos o hasta que el queso se derrita y comience a dorarse. Servir con hojas de albahaca.</li>
          </ol>
        format: basic_html
    field_summary:
      -
        value: 'Una pasta al horno es la comida más fácil y saludable. Este delicioso plato es súper rápido de preparar y una comida ideal entre semana para toda la familia.'
        format: basic_html
    field_tags:
      -
        entity: 2a30ad8a-2e29-4ee0-ad4a-cb7790233ac2
      -
        entity: ec239227-ecb0-47a9-bb5a-af275483ec59
      -
        entity: 8f933857-2858-4092-bf5b-94105148852f
_meta:
  version: '1.0'
  entity_type: node
  uuid: 119ecac6-9dd3-44ea-9ed8-a798a42fcac5
  bundle: recipe
  default_langcode: en
  depends:
    1ad0846a-d9f4-4a5b-9a99-ffb9ae6a05c4: user
    e936d651-a50d-4426-9ab9-fe79d1cac01c: media
    8cb1b744-b885-4885-b0a5-cc8d2fe1498e: taxonomy_term
    2a30ad8a-2e29-4ee0-ad4a-cb7790233ac2: taxonomy_term
    ec239227-ecb0-47a9-bb5a-af275483ec59: taxonomy_term
    8f933857-2858-4092-bf5b-94105148852f: taxonomy_term
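
In the dump above, every UUID that appears under an `entity:` key is also listed under `_meta.depends`. A minimal Python sketch of a checker for that property follows. It assumes the export has already been parsed into nested dicts and lists; the helper names are hypothetical, and whether the format guarantees this property in general is an assumption based only on this one sample.

```python
def referenced_uuids(fields):
    """Collect UUIDs from field items that carry an 'entity' key."""
    found = set()
    for items in fields.values():
        for item in items:
            if "entity" in item:
                found.add(item["entity"])
    return found

def missing_dependencies(export):
    """Return referenced UUIDs that are not declared in _meta.depends."""
    declared = set(export["_meta"].get("depends", {}))
    referenced = referenced_uuids(export.get("default", {}))
    for translation in export.get("translations", {}).values():
        referenced |= referenced_uuids(translation)
    return referenced - declared

# Made-up export with one declared and one undeclared reference.
sample = {
    "default": {"uid": [{"entity": "user-uuid"}], "title": [{"value": "Hi"}]},
    "translations": {"es": {"field_media_image": [{"entity": "media-uuid"}]}},
    "_meta": {"depends": {"user-uuid": "user"}},
}
print(missing_dependencies(sample))
```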

Tested it with user 3 and I think we need to look into how password hashing works

content:export user 1
default:
  preferred_langcode:
    -
      value: en
  name:
    -
      value: admin
  mail:
    -
      value: admin@example.com
  timezone:
    -
      value: UTC
  status:
    -
      value: true
  created:
    -
      value: 1751604450
  access:
    -
      value: 1751604506
  login:
    -
      value: 1751604506
  init:
    -
      value: admin@example.com
  roles:
    -
      target_id: administrator
  pass:
    -
      value: $2y$10$XGC/VsNAmnEfIbaIBC7rm.WbYfoeDNgAnV0dx2cwmgx/9V7rE2SNy
      existing: ''
      pre_hashed: false
_meta:
  version: '1.0'
  entity_type: user
  uuid: c71bd884-8881-4d58-b59d-f116082a6117
  default_langcode: en
d31d89485d9d:/data/app$ php core/scripts/drupal content:export user 3
default:
  preferred_langcode:
    -
      value: en
  preferred_admin_langcode:
    -
      value: en
  name:
    -
      value: 'Margaret Hopper'
  mail:
    -
      value: margaret.hopper@example.com
  timezone:
    -
      value: UTC
  status:
    -
      value: true
  created:
    -
      value: 1751604450
  access:
    -
      value: 0
  login:
    -
      value: 0
  roles:
    -
      target_id: editor
  pass:
    -
      value: $2y$10$zF3I5I5J1ILAD/GoXFRxzeqI1sijFx2zjfif1u.QQJXLrj/aQK872
      existing: ''
      pre_hashed: false  # 👈️👈️👈️👈️👈️
_meta:
  version: '1.0'
  entity_type: user
  uuid: 703b9efb-d3d8-4a08-bd63-601f739b1f81
  default_langcode: en

Because pre_hashed is set to FALSE, when the user is imported their password will get re-hashed (see \Drupal\Core\Field\Plugin\Field\FieldType\PasswordItem::preSave) and they won't be able to log in.
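
The failure mode can be reproduced with a small, self-contained sketch. This is Python rather than Drupal code, and it makes loud assumptions: `hash_password` is an unsalted SHA-256 stand-in for a real password hasher, and `pre_save` only imitates the idea described above (hash the value unless it is flagged as pre-hashed), not the actual `PasswordItem` implementation.

```python
import hashlib

def hash_password(plain: str) -> str:
    # Stand-in hasher for illustration only; real systems use salted hashes.
    return hashlib.sha256(plain.encode("utf-8")).hexdigest()

def pre_save(item: dict) -> str:
    """Return what gets stored: the value as-is if pre-hashed, else its hash."""
    if item.get("pre_hashed"):
        return item["value"]
    return hash_password(item["value"])

def can_log_in(stored: str, attempt: str) -> bool:
    return stored == hash_password(attempt)

exported_hash = hash_password("correct horse")

# Exported with pre_hashed: false, so the hash itself gets hashed on import.
broken = pre_save({"value": exported_hash, "pre_hashed": False})
# Exported with pre_hashed: true, so the hash is stored untouched.
fixed = pre_save({"value": exported_hash, "pre_hashed": True})

print(can_log_in(broken, "correct horse"))  # False
print(can_log_in(fixed, "correct horse"))   # True
```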

I think we probably want to fix that and add test-coverage.

Other than that, I think this is looking good to go

Comment #35

phenaproxima

commented 4 July 2025 at 13:12

Status:

Needs work

» Needs review

Ooooh, great catch. Fixed with a test (there's no Christmas-ey CI run for this one since it's not fixing a pre-existing bug in HEAD).

Comment #36

berdir

German

Switzerland

commented 7 July 2025 at 21:30

Started this on the MR, where I added some more comments, but moving this to an issue comment.

I'm not really sold on the serializer + callbacks structure.

When I created default_content v2, I explicitly avoided the serialization module and the Symfony Serializer component. The existing importer in core avoids it too. \Drupal\Core\DefaultContent\Importer::setFieldValues expects values that it can set as-is with a few specific known exceptions.

You mention this is explicit in regards to entity fields, but this relies on an existing arbitrary normalization format ($format is NULL), we have no idea what we get back from that normalization process.

You work around this by adding several callbacks that explicitly undo the specific normalizers in core, such as timestamp and entity references. What about field types you're missing? What if those normalizers have been customized?

What exactly does using serialization provide if we undo half of it?

95% of the normalization in default_content is two methods, normalizeTranslation (30loc) and getValueFromProperty() (66loc). I wrote it 5 years ago with no changes since then (there are a bunch of open issues to add support for some additional field types, but it can handle _a lot_ out of the box). It's specifically built to match the import logic and builds on content entity and typed data API and to handle entity types generically. I think we can find some extension points for this (possibly either those callbacks, or maybe tagged services, which I think would be more direct).

Comment #37

phenaproxima

commented 7 July 2025 at 22:02

I explicitly avoided the serialization module and the Symfony Serializer component. The existing importer in core avoids it too.

It does not avoid Serialization because of any specific problem in Serialization; it avoids it because it is a module, and the core importer is a subsystem (which it needs to be, since recipes can be applied even with no other modules installed).

What exactly does using serialization provide if we undo half of it?

We aren't undoing as much as you think. Serialization, as it exists in core, gets us 95% of the way there. Most of the reason we need the callbacks is to match the stuff that Default Content puts out -- which, again, is the short-term goal of this MR. With a coherent import and export system in core, we can begin to evolve the "format" a little bit and remove some of these workarounds.

this relies on an existing arbitrary normalization format ($format is NULL), we have no idea what we get back from that normalization process

That's fair. We could send it an actual value (I had previously been using raw, or raw:1.0) so that at least a format is defined and we can build on that. Happy to restore that if you want; it won't hurt anything, and it would certainly be prudent to make the desired output format explicit.

But the whole point of the callback system (which was @alexpott's idea) is so that the normalizers themselves don't need to know anything special about output format, and just focus on downcasting our data structures to simple arrays and primitives.

Indeed, in previous versions of the diff, I did change the normalizers to know about the specific export data format, and act accordingly -- this way is much cleaner and far less prone to getting stuck with edge cases that cannot be worked around in contrib (if Default Content doesn't know how to handle a specific field type correctly, you're screwed; with the callback system, you can do something about it). Apart from the new setting on data definitions, export logic is confined to export-related code.
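
As a rough illustration of the callback idea (not the MR's actual code), here is a minimal Python sketch. The `callbacks` context key, the `timestamp` type name, and both function names are assumptions made up for the example. The point is only that the normalizer stays ignorant of the export format while the caller decides, through the context, how a given data type should come out.

```python
from datetime import datetime, timezone

def normalize_timestamp(value: int, context: dict) -> dict:
    """Generic normalizer that defers to a caller-supplied callback, if any."""
    callback = context.get("callbacks", {}).get("timestamp")
    if callback is not None:
        return callback(value)
    # Default behaviour: a formatted date suitable for arbitrary API clients.
    formatted = datetime.fromtimestamp(value, tz=timezone.utc).isoformat()
    return {"value": formatted}

def export_raw_timestamp(value: int) -> dict:
    """Export-side callback: keep the raw UNIX timestamp."""
    return {"value": value}

# Without callbacks the client-facing format is produced.
print(normalize_timestamp(0, {}))
# With the export callback the raw value survives.
print(normalize_timestamp(0, {"callbacks": {"timestamp": export_raw_timestamp}}))
```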

Comment #38

berdir

commented 7 July 2025 at 23:52

It does not avoid Serialization because of any specific problem in Serialization; it avoids it because it is a module, and the core importer is a subsystem (which it needs to be, since recipes can be applied even with no other modules installed).

I said "I", not "It", meaning when I wrote default_content 2.x. The exporter here specifically doesn't avoid it; it absolutely depends on it.

We aren't undoing as much as you think. Serialization, as it exists in core, gets us 95% of the way there

What gets you there is the specific implementations of the ContentEntity, List and Field normalizers. We know how content entities are built, they are containers of lists of field items with properties. Serialization/Normalization is a super generic API capable of dealing with arbitrary data structures, which we are not working with.

I don't see how it makes sense to use a generic normalize API when we know that we do not support generic denormalization.

With a coherent import and export system in core, we can begin to evolve the "format" a little bit and remove some of these workarounds.

The workarounds are because serialization is used. default_content doesn't need any special handling for timestamp fields, or date fields, which this doesn't handle yet. It just exports the raw values. The serialization normalizers were added to remove drupalisms from our data structures and allow arbitrary clients to consume our data. They want formatted, standardized dates (for example), not UNIX timestamps.

default content export and import was purpose-built for a compact, simple and stable export/import format of default content in Drupal. 1.x used hal_json and it was pretty annoying to work with. The reason hal_json was used is that it deals with dependencies, which is useful for us and why I specifically added that as well. The default normalizer doesn't do that, so you have to add that back.

Field definitions don't really have a way to distinguish serial identifiers from non-serial ones; our storage basically just assumes that integers are serial, while strings are not (\Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema::processIdentifierSchema). We can easily add a check for that in \Drupal\default_content\Normalizer\ContentEntityNormalizer::getFieldsToNormalize().

if Default Content doesn't know how to handle a specific field type correctly, you're screwed; with the callback system, you can do something about it

... with the callback system that was specifically invented for this. There is absolutely no reason why we couldn't add something similar to the default_content logic. As mentioned, it could be built on tagged services, so we wouldn't need an event listener to register a callback that we then pass around through a magic array key:

  something:
    class: Drupal\Core\Entity\Something
    tags:
      - { name: default_content_export, type: field_item:image }

We can register them directly on the exporter, check if we have something matching the type we have and call it. And we can add something to support import as well.
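As a rough illustration of that lookup, here is a plain-PHP sketch with no Drupal container involved. Every name in it (ExportHandlerInterface, ImageItemHandler, Exporter::addHandler()) is hypothetical, and dropping width and height is only a stand-in for whatever a real handler would do; it is not taken from core or default_content.

```php
<?php

declare(strict_types=1);

// Hypothetical interface. In a real container, each implementation would be
// a service tagged with the type it handles.
interface ExportHandlerInterface {
  public function export(array $item): array;
}

// Hypothetical handler for image field items. Removing width and height is
// purely illustrative.
final class ImageItemHandler implements ExportHandlerInterface {
  public function export(array $item): array {
    unset($item['width'], $item['height']);
    return $item;
  }
}

// Hypothetical exporter that a service collector would populate from tags.
final class Exporter {
  /** @var array<string, ExportHandlerInterface> */
  private array $handlers = [];

  public function addHandler(string $type, ExportHandlerInterface $handler): void {
    $this->handlers[$type] = $handler;
  }

  public function exportItem(string $type, array $item): array {
    // Use the matching handler if one is registered, else export raw values.
    return isset($this->handlers[$type])
      ? $this->handlers[$type]->export($item)
      : $item;
  }
}

$exporter = new Exporter();
$exporter->addHandler('field_item:image', new ImageItemHandler());

$image = $exporter->exportItem('field_item:image', [
  'target_id' => 7,
  'alt' => 'Logo',
  'width' => 100,
  'height' => 50,
]);
$text = $exporter->exportItem('field_item:string', ['value' => 'Hello']);
```

Types without a registered handler fall through to the raw values, which matches the "just export the raw values" default described above.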

Serialization absolutely does respect access. There are at least a couple of normalizers (I think FieldItemNormalizer is one) that call $field->access('view') before normalizing.

Right, I forgot about that. The existing default_content logic does not, because it was specifically designed to export and import raw data and not worry about access and users. So this is another workaround that's needed because you use serialization.

Comment #39

phenaproxima

he/him

English

Massachusetts

commented 8 July 2025 at 00:41

Here's the thing: I personally do not, at the end of the day, actually care whether this uses Serialization or not.

The goal of this MR is to do whatever it takes to get core to export content in the format it knows how to import (which was lifted from Default Content). Whether it does that with a dedicated normalizer, or a tagged service collector, or straight-up magic, is not very important to me. I have two needs here that I'm trying to fulfill:

* I need to be able to export content in a way that is customizable for specific field types (this is what's missing from Default Content), without having to change core. Without that, I cannot build site templates, full stop.
* The exported content needs to match what the importer knows how to handle, because that is what recipes already use in the wild.

Context also matters: the default content API is experimental and we have a great deal of latitude to change it. There is plenty of time between now and 11.3.0 to work out the architecture.

This feature is strategically necessary. I am doing whatever needs to be done to get it in at all, and this has already been refactored too many times.

My vote: merge it more or less as-is, and then open follow-ups to make whatever architectural changes we want (tagged services? an officially supported "exportable" setting for fields? etc.) before we call the core default content API stable. If that means we don't need to bother with Serialization, great -- so be it. Until 11.3.0 reaches beta, we can do whatever we like.

I'm certainly open to some of what you propose!

But site templates need this feature, and they need it now.

Comment #40

phenaproxima

he/him

English

Massachusetts

commented 8 July 2025 at 03:22

@berdir, I did a little bit of experimenting and indeed, not using Serialization definitely does show the potential to simplify a number of things quite a bit. Registering additional field types could be done with, say, a PreExportEvent subscriber that does something like:

\Drupal::service(Exporter::class)->setCallback('field_type', function (MyFieldItem $item, ExportMetadata $metadata): array {
  // ...something special...
});

As I've said, I'm mostly agnostic to how it works. But, with that being said, this has been refactored four times and if I'm going to make further architectural changes, I'd really like there to be alignment between those who feel strongly about that architecture, so that the next refactor is the last one.

Comment #41

berdir

German

Switzerland

commented 8 July 2025 at 05:42

Noted on the time constraints.

I don't have as much time to keep up with this (wrote my previous comment at 1am) but I've been thinking about how to add extension points to the current default content export code. I'll try to create an MR to show my ideas ASAP and then we could discuss in Slack or a call with the others?

On stability: also noted, but you also change stable APIs with the field settings and the callbacks, where it gets more complicated with stability. IMHO, an approach that works more out of the box and requires fewer adjustments in core and contrib will make your life easier. Your site templates will need to play nice with contrib entity types too.

Comment #42

alexpott

he/they

English

🇪🇺🌍

commented 8 July 2025 at 13:44

Status:

Needs review

» Needs work

@berdir thanks for the reviews.

Discussed with @phenaproxima - we agreed to remove the dependency on serialization as suggested by @berdir. @phenaproxima and I still think there is value in allowing fields to have some say in how they are exported via the setSetting() capability. Re 11.3.x vs previous releases - I think hardcoding a string is fine in this situation - we can create an enum from the string when we use it and error if it is wrong. Also @phenaproxima has a test showing that these settings do not end up in base field override configuration so I don't think we need to worry about configuration schema. @berdir maybe you have another suggestion that we could leverage for a field to give the exporter extra information. We're trying to avoid the exporter making too many assumptions about fields and give some control to the fields.

Comment #43

phenaproxima

he/him

English

Massachusetts

commented 8 July 2025 at 15:49

Status:

Needs work

» Needs review

Comment #44

berdir

German

Switzerland

commented 8 July 2025 at 19:54

Thanks for considering my feedback. I really like the direction, this is way more isolated to the component and requires fewer overrides and customizations.

I do have a "few" more thoughts; we can decide which of them to look into in follow-ups and whether there are things we want to change before this gets in. (Warning, still a long comment, because I'm me and like writing many words).

* I'm still not too fond of the settings approach, but I can live with it. What I'd suggest is that we explore allowing this but also start off with sane defaults: basically what \Drupal\default_content\Normalizer\ContentEntityNormalizer::getFieldsToNormalize() does, as a fallback if nothing is explicitly specified. This could also be used for the changed field instead of a callback. Kind of what Exportable::ignore() does, but in the default case, we'd check those for the entity keys and so on. The current API might not have enough context to access that, though. What I would have done in default_content is add a specific event to the mentioned method. There will be plenty of contrib and custom entity types that do not use \Drupal\Core\Entity\ContentEntityBase::baseFieldDefinitions() (as that was added after 8.0 IIRC) and those will be broken. Their IDs will be exported, resulting in conflicts and so on. There are also still a bunch of "useless" fields being exported now, such as revision affected (calculated on save) and content translation metadata fields (less clear, but IMHO not useful for default content).

* For the event, now that we have control over it, what I had thought about exploring in default_content is to make it "active": basically just pass in the field (or even property) and metadata and call it for all of them, instead of using it to register callbacks based on the field type. It would be slower, but events are pretty fast once initialized and performance isn't really a concern here. Just an idea, didn't fully think this through. Advantages would be that multiple events can possibly deal with fields of the same type and they're not limited to acting on the type. pathauto could do something about its weird flag, scheduler could act on all the base fields it adds, and so on. On tagged services, I definitely don't feel strongly about that, especially now that we need far fewer of those and many are provided by default.

* In default_content, I specifically pushed a lot of the customization to the property level, because it allows field types to be handled more generically. default_content doesn't need any special handling for file, image or dynamic entity reference fields, for example, because they all use an EntityReference property. It doesn't always work (there are a bunch of issues about layout paragraphs for example), but it does work nicely for those.

* Files: there are no changes on file yml files but I assume you did verify to recreate them. However, if you delete the fixture folder you'll notice that the actual files won't be exported. This is currently missing and implementing it isn't really compatible with the current stream wrapper approach. default_content handles this in \Drupal\default_content\ContentFileStorage::writeEntity. This, beside references, is why I recommend extracting the output part.

* A feature that was kept in the importer is the ability to have nested entities such as paragraphs (in ERR, this is called composite entities). It is clear that the decision to do this embedding would live in the ERR module. But for it to work, ERR needs an API to do the normalization into an array. There would be workarounds I suppose (with the current API, let it export to YAML into memory, then parse that again), but it's pretty awkward. That's one reason why in default_content, the normalization is a separate API/service.

* On UserInterface vs AccountInterface. The distinction is vague and I'm not sure we even should have it, but UserInterface is an entity. AccountInterface is a more abstract concept; it is in theory possible that another entity type that isn't users could implement it, and this logic might not apply to that. That's why UserInterface for me is the correct interface. Not a big deal because it's rather theoretical.
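To make the "active" event idea from the second point a bit more concrete, here is a rough plain-PHP sketch (PHP 8.1+). It uses no Symfony dispatcher and no Drupal APIs. ExportFieldEvent, export_fields() and both listeners are hypothetical, and the field data is made up for illustration.

```php
<?php

declare(strict_types=1);

// Hypothetical event object, created once for every field being exported.
final class ExportFieldEvent {
  public bool $skip = FALSE;

  public function __construct(
    public readonly string $entityType,
    public readonly string $fieldName,
    public readonly string $fieldType,
    public array $values,
  ) {}
}

// Hypothetical listeners. Each one sees every field and decides for itself
// whether to act, so none of them is limited to a single field type.
$listeners = [
  // Skip a value that is recalculated on save, whatever the entity type.
  static function (ExportFieldEvent $event): void {
    if ($event->fieldName === 'revision_translation_affected') {
      $event->skip = TRUE;
    }
  },
  // Strip an illustrative module-specific flag from path field values.
  static function (ExportFieldEvent $event): void {
    if ($event->fieldType === 'path') {
      $event->values = array_map(static function (array $value): array {
        unset($value['pathauto']);
        return $value;
      }, $event->values);
    }
  },
];

// Calls every listener for every field, then collects what was not skipped.
function export_fields(string $entityType, array $fields, array $listeners): array {
  $export = [];
  foreach ($fields as $name => [$type, $values]) {
    $event = new ExportFieldEvent($entityType, $name, $type, $values);
    foreach ($listeners as $listener) {
      $listener($event);
    }
    if (!$event->skip) {
      $export[$name] = $event->values;
    }
  }
  return $export;
}

$export = export_fields('node', [
  'title' => ['string', [['value' => 'Hello']]],
  'revision_translation_affected' => ['boolean', [['value' => TRUE]]],
  'path' => ['path', [['alias' => '/hello', 'pathauto' => 0]]],
], $listeners);
```

The cost is one listener call per field per listener, which is the "slower, but not really a concern" trade-off mentioned above.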

Comment #45

phenaproxima

he/him

English

Massachusetts

commented 8 July 2025 at 22:12

I'm still not too fond of the settings approach, but I can live with it.

I experimented with having the event be the thing that carries a list of what to export and what to skip, and was very quickly convinced that it's better than the setting. Having this be something the event can decide is much more flexible, allows sane defaults, and can be easily overridden by modules that need to do something different. It also doesn't introduce a new setting with an ambiguous relationship to config. An added bonus is that doing this allowed me to remove the Exportable enum, further reducing API surface.

Overriding exportability for an individual property can still be done (and it is) -- you just need to write a custom export callback for it (the Path module's DefaultContentSubscriber is an example). I think this is a reasonable balance.

there are no changes on file yml files but I assume you did verify to recreate them

The fixtures were originally created with Default Content. :) There are some minor differences between them now and how Default Content generates them, but those differences are due to subtle shifts in how field values are exported. Functionally, they should not affect how content is imported.

UserInterface vs AccountInterface

I agree that the difference is largely academic, so here's an equally academic reason for keeping AccountInterface: this way, we aren't having a core subsystem depend on a module (even though it's a required one). A minor point of cleanliness, but a solid one.

But for it to work, ERR needs an API to do the normalization into an array.

Thought about this for a bit and decided it makes sense to support this. Exporter::export() will just return the array so that you can export recursively if needed; the final serialization can take place in the command.
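A minimal sketch of why returning an array helps, in plain PHP. export_entity() and the array layout here are hypothetical and much simpler than the real export format, and json_encode() only stands in for the YAML dump so that the example has no dependencies.

```php
<?php

declare(strict_types=1);

// Hypothetical exporter: returns a plain array, not a serialized string, so
// a caller can nest the result inside another export.
function export_entity(array $entity): array {
  $export = [
    '_meta' => ['entity_type' => $entity['type'], 'uuid' => $entity['uuid']],
  ];
  foreach ($entity['fields'] as $name => $value) {
    // A composite (embedded) child is exported recursively into its parent.
    $export[$name] = is_array($value) && isset($value['embedded'])
      ? ['entity' => export_entity($value['embedded'])]
      : $value;
  }
  return $export;
}

$paragraph = [
  'type' => 'paragraph',
  'uuid' => 'child-uuid',
  'fields' => ['text' => 'Hi'],
];
$node = [
  'type' => 'node',
  'uuid' => 'parent-uuid',
  'fields' => ['title' => 'Hello', 'section' => ['embedded' => $paragraph]],
];

// The exporter only builds arrays. Serialization happens once, at the end,
// in whatever plays the role of the command.
$array = export_entity($node);
$output = json_encode($array, JSON_PRETTY_PRINT);
```

With a string-returning exporter, the nested case would need the awkward round trip described earlier: export the child to YAML, then parse it back into an array.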

Comment #46

mstrelan commented 9 July 2025 at 04:46

Status:

Needs review

» Needs work

I think it's possible we might be trying to access ->uuid() on null in the link event subscriber. Other than that most of the other comments I made are nits.

Comment #47

phenaproxima

he/him

English

Massachusetts

commented 9 July 2025 at 13:42

Status:

Needs work

» Needs review

All outstanding feedback is resolved.

Comment #48

phenaproxima

he/him

English

Massachusetts

commented 9 July 2025 at 13:48

Issue summary:

View changes

Comment #49

phenaproxima

he/him

English

Massachusetts

commented 9 July 2025 at 16:09

Crediting @mstrelan for his review.

Comment #50

berdir

German

Switzerland

commented 10 July 2025 at 05:13

Issue summary:

View changes

* Files: there are no changes on file yml files but I assume you did verify to recreate them. However, if you delete the fixture folder you'll notice that the actual files won't be exported. This is currently missing and implementing it isn't really compatible with the current stream wrapper approach. default_content handles this in \Drupal\default_content\ContentFileStorage::writeEntity. This, beside references, is why I recommend extracting the output part.

The fixtures were originally created with Default Content. :) There are some minor differences between them now and how Default Content generates them, but those differences are due to subtle shifts in how field values are exported. Functionally, they should not affect how content is imported.

I suspected I was too verbose with this. The important bit is in the second part of my paragraph, the first was just an intro, a preemptive reply to "I verified file entities by re-exporting them". I know they were created with default_content. My point is that what this is missing is the logic to export the actual files, not the content of the file entities. That needs special handling. Try deleting the whole folder and then re-exporting them, not just overwriting the existing files. And it requires hardcoding file entities in the drush command or wherever the logic for dealing with the output will be. Fine with a follow-up for this, as it will require that we write the files directly to a folder, but I think it would be good to have those follow-ups ready for the next steps (such as references as well)

Comment #51

phenaproxima

he/him

English

Massachusetts

commented 10 July 2025 at 09:56

Oh! Gotcha. We do have that follow-up; file export will be handled as part of adding dependency export capabilities.

Comment #52

mstrelan commented 11 July 2025 at 01:23

I had some additional questions and suggestions for improving the docs, otherwise this is looking really good.

Comment #53

mstrelan commented 11 July 2025 at 03:45

Status:

Needs review

» Reviewed & tested by the community

Thanks for addressing that feedback. I've only now actually tested it, rather than just reading the code, and it works well. One thing that stands out to me that could be addressed in a follow-up is whether it makes sense to export the created timestamp. It would be a bit weird to have a node or user that was created before a site was created.

Comment #54

berdir

German

Switzerland

commented 11 July 2025 at 05:41

I didn't review every detail but I don't have any objections anymore to this being RTBC.

On created: mixed thoughts. Default content can get old, but it can also look weird if it's all the same, especially on articles, which are sorted by date. It can also introduce a random factor in tests. I'd keep it; it's easy to remove through an event or by hand.

What I'd appreciate is if someone creates issues for pathauto and ERR and maybe even tries to implement the provided events to replicate the current logic in default_content. That would also help to verify this is extensible enough.

Comment #55

phenaproxima

he/him

English

Massachusetts

commented 11 July 2025 at 14:00

Done.

Comment #56

alexpott

he/they

English

🇪🇺🌍

commented 15 July 2025 at 16:21

Status:

Reviewed & tested by the community

» Fixed

Committed 80fd4ba and pushed to 11.x. Thanks!

We also need to open an issue to make a version of default content for 11.3.x and up that uses all the core stuff.

Comment #57

15 July 2025 at 16:22

alexpott committed 80fd4ba0 on 11.x

Issue #3532694 by phenaproxima, alexpott, nicxvan, berdir, mstrelan,...

Comment #58

29 July 2025 at 16:24

Status:

Fixed

» Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

Comment #59

gábor hojtsy

he/him

Hungarian

Hungary

commented 2 December 2025 at 13:16

Issue tags:

+11.3.0 release highlights
