Building a custom migration in Drupal 8, Part 5: Paragraphs

 

In the last post, we migrated our uploaded images and attachments by creating a custom file migration. We explored the process section of our migration *.yml, and enhanced it with custom mappings. We leveraged Drupal 8's powerful process plugin system to even further customize our migrations. We found out we could chain migrations together through the migration_lookup plugin. Finally, we created and ran a simple node migration. So that's it, right? Series over? Heck no! In this part, we throw a contrib module into the migration mix, and combine multiple source fields into a single, multi-part field in Drupal 8.

Breaking up the body

One of the original motivations for writing this series was to write about migrating content into Paragraphs. Paragraphs is a relatively new module that offers an alternative way to organize your content. Instead of having a single, monster body field with an ever-growing WYSIWYG editor, Paragraphs allows you to break up the body into separate paragraph entities. Each paragraph can have its own paragraph type, its own fields, theme template, and so on. Paragraphs can even be reorganized with a click and drag. When I started looking into building my new Drupal 8 site, I wanted to enhance the blogging experience on it through custom paragraphs, combining it with a fluid-width, stripe-centric design.

In my Drupal 7 site, my blog content type relied on fields provided by Drupal core, such as the image field and the body field. The problem with this approach is that I didn't display the image field as an image. Instead, I used it as an attachment field, embedding the image manually using WYSIWYG. While I could have preserved this method when migrating to Drupal 8, I wanted to migrate any attached images directly to a paragraph type that displayed the image. The body field would be migrated to a paragraph type that displays formatted text. The Paragraphs module, however, poses a number of unique migration challenges that can be difficult to wrap your head around. It took me the better part of two entire weekends to figure it out.

Installing and setting up Paragraphs

Before we can get to migrating into paragraph entities, let's install and set up the Paragraphs module. Paragraphs depends on another module named Entity Reference Revisions (ERR), so we'll need to install that too. Using the command line, navigate to the root directory of your Drupal 8 site and use composer to install the module:

$ cd path/to/yoursite
$ composer require "drupal/entity_reference_revisions:^1.2"
$ composer require drupal/paragraphs
$ drush en -y paragraphs

It's really, really important to install at least version 1.2 of ERR, as it has migration enhancements we'll need later. Now we need to create some new paragraph types under Admin > Structure > Paragraph types. It's really tempting to make a paragraph type to suit every possible need, but it's better to have no more than a handful. Since my site is mostly blogging and images, I only needed a small number of paragraph types:

  • image_gallery
  • text
  • preformatted_text
  • text_with_image
  • image_and_description
  • media_embed

For migration purposes, we only need the first two paragraph types, image_gallery and text. This makes things a bit simpler when writing the migration, although we may want to edit and reformat older posts manually later. Each paragraph type has its own list of fields too: the image_gallery type has a core image field named field_para_images, and the text paragraph type has a formatted text field named field_para_text. With that set up, we can start thinking about migrations.

Creating the body migration

As I mentioned earlier, I wanted to use Paragraphs primarily to improve the blogging experience on my site. As such, I have a blog content type. This type exists on both my Drupal 7 site and my Drupal 8 site. The only difference is that on Drupal 7, I have a separate field_blog_images and body field, whereas in Drupal 8 I only have a single paragraph field, field_add_sections. You might think that because the paragraphs are in a field, we can start by writing the blog migration and take care of the paragraph types in a mapping. This isn't quite correct. There is a very popular post on Stack Overflow that suggests you write a custom process plugin. While this method can work, it's no longer recommended for Entity Reference Revisions 1.2 or higher.

The problem is that each paragraph is itself a separate entity. Instead of treating them like a field, we need to treat them as a migration in their own right! To do this we start, paradoxically, with the d7_node.yml migration template. Copy it from core/modules/node/migration_templates/d7_node.yml into a new editor. Change the id and label, and add a migration_group:

id: yoursite_blog_text_paragraph
migration_tags:
  - 'Drupal 7'
migration_group: yoursite
label: 'yoursite blog text paragraph'

Why the node template? First of all, neither the ERR nor the Paragraphs module provides a migration_templates directory. Secondly, we need to process each field in each blog post we intend to migrate into a new paragraph. That data still comes from the d7_node source plugin as it would for any node migration. Let's start with the text paragraph type as it'll be a little simpler and won't depend on our file migration. Specify the source node type in the source section:

source:
  plugin: d7_node
  node_type: blog

Unlike the entity:node destination we used before, we have to use an altogether different destination plugin for paragraphs. The ERR module provides a destination plugin that creates paragraphs for us, so change the destination section to the following:

destination:
  plugin: 'entity_reference_revisions:paragraph'
  default_bundle: your_paragraph_machine_name
migration_dependencies:
  required: {  }
  optional: {  }

Notice that we removed any migration_dependencies while we were at it. The default_bundle parameter specifies the paragraph type machine name. For my migration, I'm migrating the body field into a paragraph with the machine name of text, so my default_bundle is text. This gives us the basic outline for any of our paragraph migrations. We pull in from the source node and target a specific paragraph type. Finally, we need to deal with the field mapping.

Since we copied from the d7_node.yml migration template, we need to delete everything from the process section to start. We're only interested in migrating the node's body field into field_para_text in our paragraph entity. We can't use a direct mapping, though, since we know the body field is a multi-part field. Luckily for us we've already migrated multi-part fields using the iterator plugin:

process:
  field_para_text:
    plugin: iterator
    source: body
    process:
      value: value
      format:
        plugin: default_value
        default_value: full_html

We map the body's value part to the corresponding value part of field_para_text. Since I want to preserve all the formatting from my posts, I used the default_value plugin to set the format part of field_para_text to the full_html text format. Save the migration as migrate_plus.migration.the_migration_id.yml in your sync directory. Use drush cim to import it, and drush ms to check the migration status. For our text paragraph migration, we should get an unprocessed count that equals the total number of nodes for our source content type:

$ drush cim -y
...
$ drush ms

Group: YourSite group (yoursite)    Status  Total  Imported  Unprocessed  Last imported       
yoursite_role                      Idle    4      0         4            N/A
yoursite_user                      Idle    18     0         18           N/A
yoursite_file                      Idle    507    0         507          N/A
yoursite_gallery                   Idle    134    0         134          N/A
yoursite_blog_text_paragraph       Idle    1050   0         1050         N/A

We could run our new paragraph migration now, but it wouldn't show up in any UI. Why? Remember that our destination wasn't a node, but a paragraph type. Since there's no UI to see "detached" paragraph entities, we won't see any new content. This is to be expected.
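
Putting the pieces from this section together, the finished file might look something like the following. This is only a sketch assembled from the snippets above. It assumes the file is saved as migrate_plus.migration.yoursite_blog_text_paragraph.yml, and that your text paragraph type uses the machine name text and the field field_para_text, as mine does:

```yaml
# Sketch of the complete text paragraph migration, assembled from the
# snippets in this section. Swap in your own IDs and machine names.
id: yoursite_blog_text_paragraph
migration_tags:
  - 'Drupal 7'
migration_group: yoursite
label: 'yoursite blog text paragraph'
source:
  # Rows still come from Drupal 7 blog nodes.
  plugin: d7_node
  node_type: blog
process:
  field_para_text:
    plugin: iterator
    source: body
    process:
      value: value
      # Every migrated body gets the full_html text format.
      format:
        plugin: default_value
        default_value: full_html
destination:
  # Each row becomes a paragraph entity instead of a node.
  plugin: 'entity_reference_revisions:paragraph'
  default_bundle: text
migration_dependencies:
  required: {  }
  optional: {  }
```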

Creating the image paragraph migration

That takes care of our body to text paragraph migration, but what about the migration from field_blog_images to image_gallery paragraphs? Instead of starting over from the d7_node.yml template, copy your new text paragraph migration into a new editor. Change the id, the label, and the default_bundle of the destination accordingly:

id: yoursite_blog_image_paragraph
migration_tags:
  - 'Drupal 7'
migration_group: yoursite
label: 'yoursite blog image paragraph'
source:
  plugin: d7_node
  node_type: blog
process:
destination:
  plugin: 'entity_reference_revisions:paragraph'
  default_bundle: image_gallery

Notice that we've emptied the process section again. We'll need a completely different one for this migration as we're working with different fields. Since we're going to be migrating an image field, we also need to make our migration dependent on our file migration:

migration_dependencies:
  required:
    - yoursite_file
  optional: {  }

Now we need to write our field mapping. We're migrating from field_blog_images to the image field in our image_gallery paragraph type, field_para_images. Like the body field, the core image field is a multi-part field, so we'll need to use the iterator plugin:

process:
  field_para_images:
    plugin: iterator
    source: field_blog_images
    process:
      alt: alt
      title: title
      height: height
      width: width

Then we run into the problem of the file entities. In the Drupal 7 field_blog_images field, the file entity ID is specified in a part named fid. In our Drupal 8 field_para_images, it's called target_id. We don't want to directly map the two parts together, as migrated files may have a different ID. Instead, like the Gallery migration we wrote in the last part of this series, we map the field through the file migration using the migration_lookup plugin:

process:
  field_para_images:
    plugin: iterator
    source: field_blog_images
    process:
      target_id:
        plugin: migration_lookup
        migration: yoursite_file
        source: fid
      alt: alt
      title: title
      height: height
      width: width

Save the new migration *.yml to the sync directory, import it, and check the migration status:

$ drush cim -y
...
$ drush ms

Group: YourSite group (yoursite)    Status  Total  Imported  Unprocessed  Last imported       
yoursite_role                      Idle    4      0         4            N/A
yoursite_user                      Idle    18     0         18           N/A
yoursite_file                      Idle    507    0         507          N/A
yoursite_gallery                   Idle    134    0         134          N/A
yoursite_blog_text_paragraph       Idle    1050   0         1050         N/A
yoursite_blog_image_paragraph      Idle    1050   0         1050         N/A

Note that both our paragraph migrations have the same number of items to import. This is because only the source plugin determines which items to import; the process section isn't relevant for this display.
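
For reference, here is the image paragraph migration with all of its sections in one place. It is a sketch assembled from the snippets above, and it assumes the Drupal 7 image field is named field_blog_images and the file migration from the last part has the ID yoursite_file:

```yaml
# Sketch of the complete image paragraph migration, assembled from the
# snippets in this section. Swap in your own IDs and field names.
id: yoursite_blog_image_paragraph
migration_tags:
  - 'Drupal 7'
migration_group: yoursite
label: 'yoursite blog image paragraph'
source:
  plugin: d7_node
  node_type: blog
process:
  field_para_images:
    plugin: iterator
    source: field_blog_images
    process:
      # Translate the Drupal 7 fid into the ID of the migrated file.
      target_id:
        plugin: migration_lookup
        migration: yoursite_file
        source: fid
      alt: alt
      title: title
      height: height
      width: width
destination:
  plugin: 'entity_reference_revisions:paragraph'
  default_bundle: image_gallery
migration_dependencies:
  # Files must exist before the paragraphs that reference them.
  required:
    - yoursite_file
  optional: {  }
```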

Creating the blog migration

Now that we have both our fields migrated into separate paragraphs, we can start writing the migration to create our destination nodes. Copy the core/modules/node/migration_templates/d7_node.yml migration template into a new editor. And, you guessed it, update the id and label, and add a migration_group. Add a node_type to the source section for our source node type:

id: yoursite_blog
migration_tags:
  - 'Drupal 7'
migration_group: yoursite
label: 'yoursite blog'
source:
  plugin: d7_node
  node_type: blog

Our blog migration is going to be dependent on our user migration and both paragraph migrations, so add that to the bottom of the file:

destination:
  plugin: 'entity:node'
migration_dependencies:
  required:
    - yoursite_user
    - yoursite_blog_image_paragraph
    - yoursite_blog_text_paragraph

Under the process section, we add a mapping for the node type, using the default_value plugin:

  type:
    plugin: default_value
    default_value: blog

We may also want to map the author uid through our user migration by changing the default mapping to:

uid:
  plugin: migration_lookup
  migration: yoursite_user
  source: node_uid

Adding the paragraph migrations

Whew! And we still haven't gotten to adding our paragraph migrations yet! Our target blog content type has no body and one paragraph field, field_add_sections. The paragraph field is a multi-part field like our images and body field. So it makes perfect sense to start with an iterator plugin, but we quickly run into a new problem:

field_add_sections:
  plugin: iterator
  source:

Ummm... What do we do there? Unlike our other fields, we have two sources that need to go into one field: one for the text paragraph, and another for the image paragraph. Furthermore, we don't know what the paragraph entity IDs are; those are in the paragraph migrations we created earlier. Fixing this requires us to leverage another hidden ability of Drupal 8 migrations. While the process section primarily lets us specify field mappings and process plugins, there's nothing preventing us from defining additional, non-field mappings:

  para_image:
    plugin: migration_lookup
    migration: yoursite_blog_image_paragraph
    source: nid
  para_text:
    plugin: migration_lookup
    migration: yoursite_blog_text_paragraph
    source: nid

Wait, what? There's no field called para_image or para_text on our Drupal 8 nodes! That's right, these are pseudofields. They exist only in the migration. Our paragraph migrations used the d7_node plugin as a source, but created a new paragraph entity instead. Drupal 8 is smart enough to create a database table for us that maps the Drupal 7 node ID to the paragraph entity ID for each of our paragraph types. Here we have two pseudofields. Each uses the node ID as the source and returns a paragraph entity.
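
To make that concrete, here is a rough sketch of what the two pseudofields might hold partway through processing a single row. The node and paragraph IDs are made up for illustration. The two-value shape is my assumption: it supposes the ERR destination records both an entity ID and a revision ID for each paragraph it creates, so each lookup hands back a pair.

```yaml
# Hypothetical intermediate values for one source row (Drupal 7 nid 42).
# Nothing here is saved to the node; it only exists while the row is processed.
para_image:
  - 310   # entity ID of the image_gallery paragraph created from node 42
  - 310   # revision ID of that paragraph
para_text:
  - 311   # entity ID of the text paragraph created from node 42
  - 311   # revision ID of that paragraph
```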

This solves one of our two problems. We now have the correct paragraph entities in our migration, but how do we combine them into a single value to save to field_add_sections? It turns out that the source property of the iterator plugin can accept multiple entries:

  field_add_sections:
    plugin: iterator
    source:
      - '@para_image'
      - '@para_text'
    process:
      target_id: '0'
      target_revision_id: '1'

Notice how we prefix the pseudofields with '@' in the source parameter. This tells Drupal 8 to look for the value among the values already computed in the process section, rather than pull it directly from the source data. The iterator plugin will first look in the para_image pseudofield for the image paragraph and add that as a field value to field_add_sections, then repeat the process for the text paragraph. I admit, I'm not entirely sure about target_id and target_revision_id. As far as I can tell, each lookup returns a pair of values, the paragraph's entity ID and its revision ID, and '0' and '1' pick those out by position. Either way, they are necessary for the migration to work. With that finished we can save the migration *.yml, import it, and check to see if everything looks good in the migration status:

$ drush cim -y
...
$ drush ms

Group: YourSite group (yoursite)    Status  Total  Imported  Unprocessed  Last imported       
yoursite_role                      Idle    4      0         4            N/A
yoursite_user                      Idle    18     0         18           N/A
yoursite_file                      Idle    507    0         507          N/A
yoursite_gallery                   Idle    134    0         134          N/A
yoursite_blog_text_paragraph       Idle    1050   0         1050         N/A
yoursite_blog_image_paragraph      Idle    1050   0         1050         N/A
yoursite_blog                      Idle    1050   0         1050         N/A

Awesome! Now, to migrate our blog entries we have to work up to it through the dependency chain. Provided we've migrated our roles, users, and files, we now have to migrate our paragraphs (both kinds) and finally the target content type.
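
Since the blog migration was built up in several pieces, here is a sketch of how the whole file might fit together. It uses the migration_lookup plugin and the yoursite_ migration IDs used elsewhere in this series. The mappings inherited from the d7_node.yml template are left out for brevity, apart from title, so treat this as an outline rather than a finished file:

```yaml
# Sketch of the blog node migration, assembled from the snippets above.
id: yoursite_blog
migration_tags:
  - 'Drupal 7'
migration_group: yoursite
label: 'yoursite blog'
source:
  plugin: d7_node
  node_type: blog
process:
  # The other mappings copied from the d7_node.yml template stay as they
  # are; only title is repeated here.
  title: title
  type:
    plugin: default_value
    default_value: blog
  uid:
    plugin: migration_lookup
    migration: yoursite_user
    source: node_uid
  # Pseudofields: look up the paragraphs created from this node.
  para_image:
    plugin: migration_lookup
    migration: yoursite_blog_image_paragraph
    source: nid
  para_text:
    plugin: migration_lookup
    migration: yoursite_blog_text_paragraph
    source: nid
  # Combine both lookups into the single paragraph field.
  field_add_sections:
    plugin: iterator
    source:
      - '@para_image'
      - '@para_text'
    process:
      target_id: '0'
      target_revision_id: '1'
destination:
  plugin: 'entity:node'
migration_dependencies:
  required:
    - yoursite_user
    - yoursite_blog_image_paragraph
    - yoursite_blog_text_paragraph
```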

Summary

We're well on our way to becoming migration wizards now! In this part we installed and configured the Paragraphs module. Instead of using a chunky body field, we have a new field in our target blog content type that can use multiple paragraph types. We not only migrated content, but we transformed it into something else entirely! Now we no longer have to face a lengthy, manual process of converting each body and image field into a paragraph; the migration does it for us. In Part 6, we're going to dig even deeper and write our own source plugins to reorganize our content even further.

Thanks to our sponsors!

This post was created with the support of my wonderful supporters on Patreon:

  • Alina Mackenzie
  • Karoly Negyesi
  • Chris Weber

If you like this post, consider becoming a supporter at patreon.com/socketwench.

Thank you!!!