CSV: Support one to many relationships [#759966]

Comment	File	Size	Author
#41	feeds-2.0-alpha8-auto_split_multi_value_cell_for_csv.patch	666 bytes	breeze_man

#31	Screen Shot 2012-04-06 at 12.05.15 PM.png	37.84 KB	arkjoseph
#20	feeds_multivaluecsv_parser.tgz	2.46 KB	rodrigogarcia
#14	feeds-multivaluecsvparser-759966-14.patch	9.83 KB	Nephele

Comment #1

alex_b CreditAttribution: alex_b commented 2 April 2010 at 12:53

Title:	For mulitple value fields (1-to-many relationships) like node reference, what is the syntax for CSV files?	» CSV: Support one to many relationships
Version:	6.x-1.0-alpha12	» 6.x-1.x-dev
Component:	Documentation	» Code
Category:	support	» feature

That's not supported at the moment. What you're looking for is support for a nest operation that transforms the CSV file's flat format to nested arrays.

Log in or register to post comments

Comment #2

alex_b CreditAttribution: alex_b commented 2 April 2010 at 15:27

One to many relationships in CSV are properly captured in first normal form:

name,phone_number
mark,2023151234
mark,2033334155
mark,3153152120

The CSV parser currently transforms this into an array that looks like this:

array(
  array(
    'name' => 'mark',
    'phone_number' => '2023151234',
  ),
  array(
    'name' => 'mark',
    'phone_number' => '2033334155',
  ),
  array(
    'name' => 'mark',
    'phone_number' => '3153152120',
  ),
);

What we need though is this:

array(
  array(
    'name' => 'mark',
    'phone_number' => array(
      '2023151234',
      '2033334155',
      '3153152120',
    ),
  ),
);

The way we can achieve this is to allow for specifying unique fields on the parser. The parser would then attempt to nest rows where the same field value occurs twice. This would have to work with multiple unique fields, consider a CSV file with three fields:

name,location,phone_number

Now we specify name and location as unique:

name,location

This should cause CSV parser to nest all entries (i. e. the phone_number entry) under location and all location entries under name.

I've not thought this all the way through. e. g. open question is how mappers would consume a multi level nested arrays.

Log in or register to post comments

Comment #3

Anonymous (not verified) CreditAttribution: Anonymous commented 3 April 2010 at 00:44

Thanks for the feedback Alex.

One small problem I see with this though. How will the Parser (or Mapper?) know the difference between an new node and a multiple field entry. For example:

0:  name,phone_number
1:  mark,2023151234
2a: mark,2033334155
2b: mark,2033334127

or, if this is in a file:

0: name,phone_number
3: mark,2033335543

Line1: New node with some data
Line2a: Same node, new data set, please overwrite if it already exists
Line2b: Same node, additional entry for multiple value phone_number field
Line 3: New, *additional data*, for already existing node in DB. Please *add* the phone_number data to any existing node, do not delete any current phone_number relationships.

How do we distinguish between 2a and 2b and 3?

Log in or register to post comments

Comment #4

clemens.tolboom

Dutch

Groningen, 🇳🇱/🇪🇺

CreditAttribution: clemens.tolboom commented 15 June 2010 at 17:37

I have solved this for multiple locations by 'defining' a multiple value separator.

Multiple locations.country codes AM\nRM\nRU\nUA

For this i only needed to patch the location mapper but it would be better to patch the CSV parser. This way it would be easy to define a multifield separator through the CSV parser settings.

Or maybe through the field mapper? That sounds not better but is an option.

What do you think about this?

Log in or register to post comments

Comment #5

alex_b CreditAttribution: alex_b commented 15 June 2010 at 17:42

#3: by specifying one or more group-by fields (in our example that would be 'name'). Records would be grouped by the fields specified.

#4: I agree that we should support one-to-many relationships on the parser level. My current state of thinking about this problem is captured in #2. I'd like to avoid supporting sub-seperators that effectively make one field to multiple fields (I think this is what you suggest). However, splitting such fields is really simple and can be done easily on a per-use case basis.

Log in or register to post comments

Comment #6

clemens.tolboom

Dutch

Groningen, 🇳🇱/🇪🇺

CreditAttribution: clemens.tolboom commented 16 June 2010 at 10:32

However, splitting such fields is really simple and can be done easily on a per-use case basis.

How easy would that be? My case now is having a csv with multiple terms and locations. taxonomy fails. Location i patched for now with explode("\n", $value)

Have the parser allow field splits could be a quick win. But risky too ... having to check for the special split value. So I guess grouping would work better.

Log in or register to post comments

Comment #7

alex_b CreditAttribution: alex_b commented 16 June 2010 at 14:05

Have the parser allow field splits could be a quick win. But risky too ... having to check for the special split value. So I guess grouping would work better.

Agreed. I meant "easy" as in easy for a one-off like you did that for Location. Splitting out from within fields in a generalized way is actually hard to do.

Log in or register to post comments

Comment #8

clemens.tolboom

Dutch

Groningen, 🇳🇱/🇪🇺

CreditAttribution: clemens.tolboom commented 22 June 2010 at 09:09

I'm still puzzling how to spit out a csv from say three tables node (2 nodes), taxonomy (3 terms per node), location (2 location per node).

Version 1: Doing a full join (easy from db perspective) result into 2 * 3 * 2 = 12 rows.

Importing Version 1 would mean to filter out duplicates of Terms and Location. A travel round trip (twice New York=duplicate location) is ruled out to get imported or we must make this configurable.

Version 2: Creating the csv by programming would result into 6 rows with 2 empty location values. I'm not sure my data provider is willing to program this.

n-1 t-1 l-1
n-1 t-2 l-1
n-1 t-3 ---
n-2 t-1 l-1
n-2 t-2 l-1
n-2 t-3 ---

Importing Version 2 would be easier I guess. Node is the repeating indicator. Terms and Locations are unique values to aggregate into an array.

Log in or register to post comments

Comment #9

infojunkie

Vancouver

CreditAttribution: infojunkie commented 26 June 2010 at 01:56

subscribe

Log in or register to post comments

Comment #10

stella CreditAttribution: stella commented 9 September 2010 at 16:17

subscribe

Log in or register to post comments

Comment #11

geek-merlin

they/omg

German

Freiburg, Germany

CreditAttribution: geek-merlin commented 16 October 2010 at 11:28

this might benefit from #943344: Transformers

Log in or register to post comments

Comment #12

OddSim CreditAttribution: OddSim commented 12 November 2010 at 04:52

My hack around this was to delimit multi-values with double pipes, then alter the CSV parser to explode fields containing the double pipes.

Example input:

    name,phone_number
    mark,2023151234||2033334155||3153152120

Change to ParserCSV.inc

 -  $row[$columnName] = isset($field) ? $field : '';
 +  $row[$columnName] = isset($field)
 +      ? (strpos($field, '||') !== false)
 +          ? explode('||', $field)
 +          : $field
 +      : '';

Log in or register to post comments

Comment #13

alex_b CreditAttribution: alex_b commented 16 November 2010 at 22:53

I did something similar here: https://gist.github.com/658222

Log in or register to post comments

Comment #14

Nephele CreditAttribution: Nephele commented 22 December 2010 at 04:57

Status:

Active

» Needs review

File	Size
feeds-multivaluecsvparser-759966-14.patch	9.83 KB

I find the approach in #12/#13 more natural than the approach in #2. So I took the liberty of expanding the github snippet into a full parser plugin that could hopefully be added to the Feeds module. It seems to me that if the approach in #2 is also wanted, it could be implemented as a separate parser, and users could then choose which parser they prefer.

The attached patch makes a new Multivalue CSV Parser available when defining a feed. The new parser includes options to specify the multivalue separator, and whether or not to strip spaces around the separator. With the default options, the string '|| one || two || three' would be split into three values: 'one', 'two', and 'three'. (Note, though, that the patch was written for 7.x rather than 6.x, since I'm working in Drupal 7).

The patch also includes a couple minor bug fixes that I noticed while testing this code (although they're not strictly part of the new parser):
* In 'FeedsSource.inc', fixed a bug that appears if you change the parser (e.g., from CSV Parser to Multivalue CSV Parser) for a feed definition that has previously been used to import a file -- the bug is that when you go to import another file, any options for the new parser are ignored.
* In 'FeedsCSVParser.inc', fixed the template so that it uses the specified delimiter instead of always using ','

Log in or register to post comments

Comment #15

jlab CreditAttribution: jlab commented 4 January 2011 at 15:26

I tried patching with feeds-6.x-1.x-dev

patch -p0 < feeds-multivaluecsvparser-759966-14.patch

But I'm getting the following output:

patching file feeds.plugins.inc
patching file includes/FeedsSource.inc
Hunk #1 FAILED at 504.
1 out of 1 hunk FAILED -- saving rejects to file includes/FeedsSource.inc.rej
patching file plugins/FeedsCSVParser.inc
Hunk #1 FAILED at 195.
1 out of 1 hunk FAILED -- saving rejects to file plugins/FeedsCSVParser.inc.rej
patching file plugins/FeedsMultiValueCSVParser.inc

Log in or register to post comments

Comment #16

Nephele CreditAttribution: Nephele commented 4 January 2011 at 22:16

The patch I provided in #14 is for feeds-7.x-2.0-alpha2, which is presumably why you're getting errors. It probably wouldn't be too much work to create a 6.x patch, too -- I just wasn't sure how much interest there'd be, plus it's harder for me to test 6.x at this point.

Log in or register to post comments

Comment #17

serg.remote CreditAttribution: serg.remote commented 31 January 2011 at 05:27

#12 is small and elegant - works perfectly for me

Log in or register to post comments

Comment #18

videographics CreditAttribution: videographics commented 15 March 2011 at 06:28

The patch in #14 looks good. From an interface point of view, why not present the user with a choice of delimiters for multi-value fields just like a choice of field delimiters is offered?

Log in or register to post comments

Comment #19

Ayesh CreditAttribution: Ayesh commented 22 March 2011 at 14:39

For most comments here, look at http://drupal.org/project/feeds_tamper

Log in or register to post comments

Comment #20

rodrigogarcia CreditAttribution: rodrigogarcia commented 24 March 2011 at 16:53

File	Size
feeds_multivaluecsv_parser.tgz	2.46 KB

Hello,
I took #14 patch and made a plugin module for Drupal 6, in case somebody finds it useful.

Log in or register to post comments

Comment #21

johnv

Dutch

CreditAttribution: johnv commented 25 March 2011 at 15:45

@Ayesh, #19, thank you, I solved the multi-value case without problems and without patches with Feeds Tamper (on D7).

Log in or register to post comments

Comment #22

Ayesh CreditAttribution: Ayesh commented 8 May 2011 at 16:43

#12 worked for me too.
I think single pipe is even better because most standard applications' outputs are delimited by single pipes(or semi colons).
Thank you oddsim!

Log in or register to post comments

Comment #23

rfay

English

Palisade, CO, USA

CreditAttribution: rfay commented 9 May 2011 at 17:49

subscribe

Log in or register to post comments

Comment #24

dkroneb CreditAttribution: dkroneb commented 27 July 2011 at 15:25

Version:

6.x-1.x-dev

» 7.x-2.x-dev

Thank-you Nephele, I did the #14 attached patch on "feeds-7.x-2.x-dev". I did the code paste by hand, slight change on line numbers for the dev version.
Just what I needed to import for "unlimited" multiply items. I highly recommend bringing the "Multivalue CSV parser" into feeds versions.

Log in or register to post comments

Comment #25

chrisjlee CreditAttribution: chrisjlee commented 2 September 2011 at 21:53

@dkroneb So is there documented process for importing multiple values into a single field yet?

Or has this patch been committed into feed versions yet?

Log in or register to post comments

Comment #26

rfay

English

Palisade, CO, USA

CreditAttribution: rfay commented 3 September 2011 at 06:34

The real answer here remains the Feeds Tamper module.

Log in or register to post comments

Comment #27

verres CreditAttribution: verres commented 8 September 2011 at 21:40

@rfray,

how can the feeds tamper module create the multi-value cck field? is there a plugin for this already? if so - it's not listed on the plugin list. I could easily create a 'pipeline' separated import field - explode it into an array...but then how would I take each array element and create the multiple values for the cck node reference?

for example - if I imported a field such as: "222|333|444" -- and I want each value to be treated as a separate value in an unlimited value node-reference, where/how would I do that? thanks

UPDATE.... I didn't notice you had it posted here: http://drupal.org/node/1122422 -- worked like a charm!

Log in or register to post comments

Comment #28

rfay

English

Palisade, CO, USA

CreditAttribution: rfay commented 9 September 2011 at 00:26

@verres, sorry! I thought I'd mentioned how to do it, not just feeds tamper. The last screencast at http://www.drupalcommerce.org/node/467 demonstrates how to do multivalued fields using feeds tamper. In this case it's multivalued product reference fields.

Log in or register to post comments

Comment #29

carn1x CreditAttribution: carn1x commented 19 September 2011 at 11:07

Unfortunately the Feeds Tamper method requires that the creator of the CSV is technically competent. I have clients who I would not trust to merge fields into a "CSV cell", yet the alternative is that I need to manually convert each time the client updates their product list.

The node_import module from D6 made multi-value field mapping very simple however it did so by analyzing the CSV first and then mapping before the actual import. For multi-value fields, it provided multi-select lists where you could select multiple columns. It would be great it somebody could recreate similar functionality in Feeds to allow for great automation.

Log in or register to post comments

Comment #30

plamenD CreditAttribution: plamenD commented 17 February 2012 at 15:48

The approach in #2 it seems easer to implement, so if is of any help hear is me solution which works for me so far.
in feeds/libraries/ParserCSV.inc after line 300

+     $instancesof = array_count_values($this->columnNames);
        foreach ($this->columnNames as $columnName) {
          $field = array_shift($fields);
-         $row[$columnName] = isset($field) ? $field : '';
+        if ($instancesof[$columnName] > 1) {
+         $row[$columnName][] = isset($field) ? $field : '';
+       }
+       else {
+          $row[$columnName] = isset($field) ? $field : '';
+       }
        }

And CSV format have to be:

guid,title,image,image,keys,keys,keys
1,"Title","[link to image]","[link to image]","first cloud link","Second cloud link","Third cloud link"

Please Excuse my english is not my native.

Log in or register to post comments

Comment #31

arkjoseph CreditAttribution: arkjoseph commented 6 April 2012 at 19:08

File	Size
Screen Shot 2012-04-06 at 12.05.15 PM.png	37.84 KB

Is there an a solution to the original question?

Our client data csv file will be formatted in a way that mutli-value fields will need to be on a per-row basis. We need the csv file to grow vertically and '||' separated values will become cumbersome and require too much interaction with the csv.
The attached file is a sample of how the data will be coming to us.

Am I asking to much for this feature to be integrated into feeds? Is there a possibility that feeds tamper can help out with this import?

Suggestions or tips ?

Log in or register to post comments

Comment #32

zeezhao CreditAttribution: zeezhao commented 9 April 2012 at 07:09

See the patch for field collections to see if can help:
drupal.org/node/1063434

Log in or register to post comments

Comment #33

arkjoseph CreditAttribution: arkjoseph commented 9 April 2012 at 17:46

Thanks zeezhao,
I will investigate the patch. It doesn't look like its for 6.x but im sure it will get some attention soon enough.
Also, I am al little confused as to how Field Collection plays a role in Feeds?

Thanks again!

Log in or register to post comments

Comment #34

clemens.tolboom

Dutch

Groningen, 🇳🇱/🇪🇺

CreditAttribution: clemens.tolboom commented 13 April 2012 at 08:40

Link (Commerce Feeds) #1122422: CSV Import multiple product reference to product display from #27
Link (Field Collection) #1063434: Add Feeds integration to FieldCollection from #32

Log in or register to post comments

Comment #35

arski CreditAttribution: arski commented 1 October 2012 at 11:10

hmm, should the notation in first normal form work for entity references too? e.g. I have data in the form like:

1;foo
1;bar

where foo and bar are entity labels.. however, when I import the CSV, only one of them ends up being referenced :(

Log in or register to post comments

Comment #36

twistor CreditAttribution: twistor commented 6 October 2012 at 06:40

The first normal form described by Alex has never been implemented.

Log in or register to post comments

Comment #37

windmaomao CreditAttribution: windmaomao commented 8 February 2013 at 21:45

I see same thing as #35,

#14 patch doesn't work with entityreference :nid :label mapping either. Wonder how to make it work ?

Log in or register to post comments

Comment #38

jephree CreditAttribution: jephree commented 18 March 2013 at 00:17

Status:

Needs work

» Needs review

I tried solving this problem with this module...
https://github.com/jephree/drupal_aggregate_feed

Log in or register to post comments

Comment #39

18 March 2013 at 00:13

Status:

Needs review

» Needs work

The last submitted patch, feeds-multivaluecsvparser-759966-14.patch, failed testing.

Log in or register to post comments

Comment #40

2 January 2014 at 19:06

Status:

Needs review

» Needs work

Correcting issue status, see #2125263: Issue status not taken from last comment during D7 migration.

Log in or register to post comments

Comment #41

breeze_man CreditAttribution: breeze_man commented 24 October 2014 at 17:07

Issue summary:

View changes

File	Size
feeds-2.0-alpha8-auto_split_multi_value_cell_for_csv.patch	666 bytes

If use feeds_tamper, you need to setup rule for every multi value field. Why not let CSV parser detect and split multi value automatically.
I made a patch which is the same as #12, please feel free to use it.

Log in or register to post comments

CSV: Support one to many relationships

Comments