Migrate Data from a CSV File in Drupal 8/9
The process of migrating data into a Drupal database from a CSV file can be fulfilled through Drupal’s integrated Migrate API and three extra custom modules (Migrate Source CSV, Migrate Plus and Migrate Tools).
This is known as the ETL (Extract – Transform – Load) process, in which data is fetched from one source in the first step, transformed in the second step, and finally loaded to its destination on the Drupal database in the third step.
This tutorial will explain the creation of 12 book nodes for a library database. Keep reading to learn how!
Step # 1 – Install Drush and the Required Modules
To execute migrations in Drupal, we need Drush. Drush is not a Drupal module, but a command-line interface to execute Drupal commands. To install the latest version of Drush,
- Open the terminal application of your system
- Place the cursor outside the
/web
directory. - Type:
composer require drush/drush
This will install Drush inside the vendor/bin/drush
directory of your Drupal installation. However, it is cumbersome to type vendor/bin/drush
instead of drush
, each time you want to execute a drush command.
Drush launcher makes it possible to execute the specific Drush version of each Drupal installation on a per-project basis.
Why does this make sense?
Each project has different requirements, specific Drush versions help to avoid dependency issues. Some contrib modules may not work properly with the latest version of Drush.
The specific instructions for OSX and Windows systems can be found here: https://github.com/drush-ops/drush-launcher#installation—phar
For a Linux system:
- Type the following to download the file named drush.phar from GitHub:
wget -O drush.phar
https://github.com/drush-ops/drush-launcher/releases/latest/download/drush.phar
This downloads the file named drush.phar
from GitHub.
- Type the following to make the file executable:
chmod +x drush.phar
- Type:
sudo mv drush.phar /usr/local/bin/drush
That will move the .phar file to your $PATH and run Drush as a global command.
It is now time to install the required contributed modules to perform the migration.
- Type the following:
composer require drupal/migrate_tools
composer require drupal/migrate_source_csv
Once Composer has finished downloading the modules,
- Open the Drupal backend in your browser
- Click Extend
- Enable Migrate, Migrate Plus, Migrate Tools and Migrate Source CSV
- Click Install
Step # 2 – More About the ETL Process
The process of Extracting, Transforming, and Loading data can be achieved by defining the migration in a .yml file, and then executing it with a Drush command, so the Drupal database can correctly be populated.
There are some important facts to notice:
- Each one of the steps is performed through Drupal plugins.
- You are only allowed to use one plugin In the first step (source definition i.e. Extract) and one plugin in the last step (destination definition, i.e. Load) of the process.
- In other words, you may only fetch data from one source (CSV file, JSON feed, etc) and store it in Drupal only under a particular entity bundle, for example, Article, Page, a custom content type, a user, or a configuration entity, as well.
- You are allowed to use as many plugins as necessary to model the data so that it matches the format expected by Drupal.
- Drupal has by default a list of source/process/destination plugins that can be used within the definition file.
To see a list of all source plugins,
- Open your terminal window
- Type:
drush ev "print_r(array_keys(\Drupal::service('plugin.manager.migrate.process')->getDefinitions()));"
** note - the above command should be all one line - it's wrapped in the browser.
Notice: the csv plugin is there because we already have enabled the Migrate Source CSV module.
To see a list of all process plugins,
- Type the following:
drush ev
"print_r(array_keys(\Drupal::service('plugin.manager.migrate.source')->getDefinitions()));"
That list is a little bit longer, remember that you can use as many plugins as you need in the process step.
To see a list of all destination plugins,
- Type the following:
drush ev "print_r(array_keys(\Drupal::service('plugin.manager.migrate.destination')->getDefinitions()));"
Step # 3 – Create the Content Type
- Click Structure > Content types > Add Content type
- Create the ‘Book’ content type
- Click Save and manage fields
- Use the title row of the CSV file to create the fields
I am going to concatenate the values of the columns edition_number and editor, so I need only one field in the database for this purpose.
Notice: the field names (machine names) do not have to match exactly to the column names of the CSV file, yet it makes sense, to at least relate them with similar words – that eases the field mapping in the process step.
The title
field is mandatory for every Drupal node, so there is no need to create a title field. You can leave the body field untouched or you can delete it, it depends on what you are planning to do with your particular content type.
Step # 4 – The Migration Definition File
Source Step
- Open your preferred code editor
- Type the following:
id: my_first_migration
label: Migrate terms from a CSV source
source:
plugin: csv
path: public://csv/library.csv
header_row_count: 1
ids:
[id]
delimiter: ';'
enclosure: "'"
The id of the file matches its name.
We are using the csv
plugin of the contrib module Migrate Source CSV in the source section.
The number 1 at the header_row_count
option indicates the value of the column titles (they are all placed on the first row).
The ids
definition provides the unique identifier for each record on the CSV file, in this case, the id
column, which is of type integer
. Don’t forget the brackets, since the module is expecting an array here.
delimiter
and enclosure
refer to the structure of the CSV file, in my particular case, it is delimited by “;” characters, whereas strings are enclosed between “‘“ single quotation marks.
Notice also, the definition of a path. That is where you have to place the CSV file, so Drupal will be able to read the data from it.
- Open your terminal application.
- Type:
mkdir web/sites/default/files/csv
cp /home/path/to/library.csv web/sites/default/files/csv/
chmod -R 777 web/sites/default/files/csv/
This will:
- create a directory called
csv
inside the public folder of your Drupal installation. - place a copy of the CSV file inside that directory.
- make the file accessible to everyone in the system (that includes the source plugin).
Process Step
This is where we map each one of the columns of the CSV file with the fields in the content type:
process:
title: title
field_id: id
field_author: author
field_location: location
field_availability: availability
field_editor:
plugin: concat
source:
- editor
- edition_number
delimiter: ' '
type:
plugin: default_value
default_value: book
The first 5 key/value pairs in the form drupal_machine_name: csv_column_name map the CSV records to the database fields without performing any changes.
The field_editor field will be the result of performing the concatenation of 2 strings (the values inside the editor and edition_number columns).
The delimiter option makes it possible to set a delimiter between both strings, in this case, a blank space.
The default_value plugin helps us to define an entity type since this information is not available in the source data.
Destination Step
The final part of this process is the destination step.
destination:
plugin: entity:node
We are migrating content and each record will be a node.
Step # 5 – Execute the Migration
- Click Configuration > Configuration synchronization > Import > Single item
- Select Migration from the dropdown
- Paste the code from the .yml file into the textarea
- Click Import
- Click Confirm to synchronize the configuration. You will get the message “The configuration was imported successfully”
- Change to the terminal application.
- Type the following:
drush migrate:import my_first_migration
You can now check the content on your site.
You have learned the basic principles of migrating data from a CSV file to Drupal 8/9.
As you have already seen, the migration process requires attention to the details, so make sure that you work first on a staging server because one little mistake could break the whole site. I hope you liked this tutorial.
Thanks for reading!
I am getting You must declare “ids” as a unique array of fields in your source settings error after implementing this. Any solution for this.
Hi Dipikarani,
did you map the [id] in the source step between square brackets? So the system knows it is an array.
Same issue I faced.Thanks for sharing in right time.
I really appreciate this tutorial!! I was struggling trying to figure out what process plugin to use and came across this. THANK YOU!
I did find a correction. The command you stated for finding the process plugins uses ‘sources’ again in your example:
To see a list of all process plugins,
I think it should probably read:
lol, I kept coping the first one and getting the sources. Maybe this can save the next novice, like me, a little time.
thank you so much! I added a note just beneath the command to save anyone else the headache 🙂
Hi,
there is a way to insert data into an Entity reference revision field ?
I have a node with an Entity reference revision field, that reference 4 different types of paragraph and each one have their own fields.
For example :
Node : article
field : paragraph (type1 or type2)
Paragraph type1 :
field : title ; textrich
Paragraph type2 :
field : title ; textrich; number, date
How we can do an import with this structure ?
You’re using the paragraphs module? I’m not sure that’s possible – have you looked at the Feeds module? https://www.drupal.org/docs/contributed-modules/feeds-paragraphs