Problem/Motivation
The explode
process plugin looks too strict, it only accepts strings. For example if receiving a NULL it fails with "NULL is not a string". But sending NULLs to pipeline looks like a common case when the source just lacks a value where the processor should convert it to an empty array. Also an integer or any scalar should be casted to a string instead of just crashing. Empty values, even casted to strings, will produce a one-element arrays: ['']
. But you expect that if you pass *something empty* you'll get also *something empty* (an empty array). While with the actual behavior, of strict checking the source value, this looks legit, in the case we accept also values casting to strings, we should accept as business logic to convert *empty values* into *empty arrays*.
Proposed resolution
Make explode
useful for real cases:
- Add a new boolean configuration:
strict
. - With
strict: false
, allow values that can be casted to strings as input values and cast them to strings before split. - With
strict: false
, values that cannot be casted to strings (as an array) will raise a MigrationException. - With
strict: false
, values casted to empty strings will produce empty arrays ([]
) as opposite tostrict: true
when source '' (empty string) is producing an array with a single empty element (['']
). - Default
strict
to TRUE to assure BC for existing migrations.
Remaining tasks
None.
User interface changes
None.
API changes
None.
Data model changes
None.
Comment | File | Size | Author |
---|---|---|---|
#39 | interdiff-35-39.txt | 1.24 KB | jofitz |
#39 | 2853823-39.patch | 7.02 KB | jofitz |
#35 | interdiff-33-35.txt | 1.42 KB | jofitz |
#35 | 2853823-35.patch | 7.09 KB | jofitz |
#33 | interdiff-29-33.txt | 5.76 KB | jofitz |
Comments
Comment #2
claudiu.cristeaPatch.
Comment #3
claudiu.cristeaComment #4
claudiu.cristeaAdded unit test.
Comment #6
claudiu.cristeaHm. Now the question that is rising is: Is it correct to return an array with an empty string in the case of empty input values?
Even with HEAD the behavior is the same: If the input value is
''
(empty string) will return['']
, but I think it should return an empty array:[]
.Comment #7
claudiu.cristeaYes, we need a practical explode, that handles correctly the empty values.
Comment #8
claudiu.cristeaComment #10
claudiu.cristeaOuch!
EDIT: I wrongly added the .patch extension to the interdiff
Comment #12
claudiu.cristeaI wrongly added the .patch extension to the interdiff
Comment #13
phenaproximaSelf-assigning for review.
Comment #14
phenaproximaLovely patch with thorough test coverage. This gives me a warm and fuzzy feeling. I just have two complaints, one of which is minor.
This line is confusing to grok. Can it be refactored a bit for clarity?
Delimiter should start with a capital D.
Comment #15
jofitz CreditAttribution: jofitz at ComputerMinds commentedHow's this for grok-ability?
Comment #16
phenaproximaI'm sorry to be such a complete pain about this, but I think I still find it kind of confusing. I think we should take advantage of the PHP string-casting rules:
1. All scalar values can be string-cast, even NULL.
2. Arrays can never be cast to strings.
3. Objects can only be cast to strings if they implement __toString().
4. Resources can never be cast to strings.
So what if we did something like this:
Comment #17
jofitz CreditAttribution: jofitz at ComputerMinds commentedJust a slight tweak to the exception in #16.
Comment #18
phenaproxima<3 it. Preemptively RTBC when tests pass.
Comment #19
claudiu.cristeaIn this case there's no need for $original, right?
Comment #20
jofitz CreditAttribution: jofitz at ComputerMinds commented@claudiu.cristea you are absolutely right - now corrected in code.
Comment #21
phenaproximaGreat stuff, and we have now fixed the hole left by is_scalar() (it doesn't consider NULL to be scalar). RTBC if the tests pass.
Comment #22
alexpottI'm not sure that this makes a lot of sense... like an integer
0
will return an empty array but1
will return[1]
. I would have thought it might be more consistent when strict is false to cast to a string and if it is equal to an empty string return an empty array.Comment #23
claudiu.cristea@alexpott,
Yeah, totally make sense. So, with strict equals FALSE:
Comment #24
alexpott@claudiu.cristea hmmm... I'm not sure you're right. PHP considers FALSE, 0, '0', '', NULL and [] all to be empty. See https://secure.php.net/manual/en/function.empty.php
Comment #25
alexpottHere's how the explode function actually treats this:
Comment #26
claudiu.cristeaYes, make sense. I wrongly thought that (string) FALSE === '0'. In fact it's empty string.
We need to replace this with:
Comment #27
claudiu.cristeaImplemented #22.
Comment #28
Munavijayalakshmi CreditAttribution: Munavijayalakshmi at Valuebound commentedWord 'The' is mentioned twice in a sequence.
Comment #29
Munavijayalakshmi CreditAttribution: Munavijayalakshmi at Valuebound commentedComment #30
claudiu.cristeaUpdating the IS to reflect #22.
Comment #31
heddnI think I prefer not changing the wording of the doxygen to account for an edge case. Let's focus on the normal 'node/1' explosion here.
Let's use the data provider phpunit approach here.
Comment #32
jofitz CreditAttribution: jofitz at ComputerMinds commentedComment #33
jofitz CreditAttribution: jofitz at ComputerMinds commentedAlso re-instated the use of setExpectedException() which appears to have been accidentally dropped after #20.
Comment #34
heddnOne small nit, because I think the committers would kick it back if we didn't add descriptive array keys for the data provider.
Rather than using a comment, name the array key for the test cases with the test name.
ex.
// Check that an empty source string returns an empty array.Comment #35
jofitz CreditAttribution: jofitz at ComputerMinds commentedUse data provider array keys rather than comments.
Comment #36
claudiu.cristeaGreat! +1 for RTBC
Comment #37
heddnComment #38
alexpottThe early return means that we need to move the empty delimiter check to be before we do anything.
Comment #39
jofitz CreditAttribution: jofitz at ComputerMinds commentedMoved the empty delimiter check to the beginning of the method.
Comment #40
phenaproxima'delimiter' should be 'Delimiter', but that's fixable on commit. Otherwise, it looks like @alexpott's concern has been addressed.
Comment #41
alexpottCommitted and pushed 6bad867 to 8.4.x and 9c87940 to 8.3.x. Thanks!
This is BC safe improvement to experimental code - therefore committing to 8.3.x