Problem/Motivation
Use case: we need to store HTML in a field and still filter and sort search results on that field, ignoring the HTML markup.
This can be achieved in Elasticsearch by adding a custom analyzer that uses the html_strip character filter, for example:
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "keyword",
          "char_filter": ["html_strip"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          },
          "plain_text": {
            "type": "text",
            "analyzer": "my_analyzer"
          }
        }
      }
    }
  }
}
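One way to check that the analyzer strips markup as intended is Elasticsearch's _analyze API, which returns the tokens an analyzer produces for a given text. The sketch below only builds the request payloads and does not contact a cluster. The index name "my_index", the sample markup, and the helper analyze_request are placeholders for illustration and are not part of the issue.

```python
import json

# Index body mirroring the JSON example above.
INDEX_BODY = {
    "settings": {
        "analysis": {
            "analyzer": {
                "my_analyzer": {
                    "tokenizer": "keyword",
                    "char_filter": ["html_strip"],
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "name": {
                "type": "text",
                "fields": {
                    "keyword": {"type": "keyword"},
                    "plain_text": {"type": "text", "analyzer": "my_analyzer"},
                },
            }
        }
    },
}


def analyze_request(index: str, text: str, analyzer: str = "my_analyzer") -> dict:
    """Describe a call to the _analyze API for a named analyzer on an index.

    Hypothetical helper: it returns the method, path and body to send,
    leaving the HTTP client up to the caller.
    """
    return {
        "method": "POST",
        "path": f"/{index}/_analyze",
        "body": {"analyzer": analyzer, "text": text},
    }


if __name__ == "__main__":
    # "my_index" and the sample markup are placeholders, not from the issue.
    request = analyze_request("my_index", "<p>Hello <b>World</b></p>")
    print(json.dumps(INDEX_BODY, indent=2))
    print(json.dumps(request, indent=2))
```

Sending INDEX_BODY as the body of a create-index request and then issuing the _analyze request should return a single token with the tags removed, since the keyword tokenizer emits the whole stripped string as one term.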
Steps to reproduce
N/A
Proposed resolution
Allow the analysis configuration to be provided in a pipeline YAML file:
my_pipeline:
  label: 'My Pipeline'
  destinationSettings:
    elasticsearch:
      settings:
        analysis:
          analyzer:
            my_analyzer:
              tokenizer: keyword
              char_filter:
                - html_strip
      mappings:
        ...
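As a rough illustration of the proposal, the sketch below shows how the elasticsearch block of a pipeline definition could be turned into an index-creation body. It is not the module's implementation (the module is PHP and its code is not shown here). The function name index_body_from_pipeline and the choice to forward only the settings and mappings keys are assumptions. The pipeline is given as an already-parsed dictionary, and the mappings entry is left out because it is elided in the YAML above.

```python
from copy import deepcopy

# Assumption: only these top-level keys are forwarded to Elasticsearch.
ALLOWED_KEYS = ("settings", "mappings")


def index_body_from_pipeline(pipeline: dict) -> dict:
    """Build an index-creation body from a parsed pipeline definition.

    Hypothetical helper: reads destinationSettings.elasticsearch and copies
    the allowed keys, returning an empty body when nothing is configured.
    """
    destination = pipeline.get("destinationSettings", {}).get("elasticsearch", {})
    return {key: deepcopy(destination[key]) for key in ALLOWED_KEYS if key in destination}


# Parsed form of the "my_pipeline" entry from the YAML above.
PIPELINE = {
    "label": "My Pipeline",
    "destinationSettings": {
        "elasticsearch": {
            "settings": {
                "analysis": {
                    "analyzer": {
                        "my_analyzer": {
                            "tokenizer": "keyword",
                            "char_filter": ["html_strip"],
                        }
                    }
                }
            },
            # "mappings" is elided in the source, so it is omitted here.
        }
    },
}

if __name__ == "__main__":
    print(index_body_from_pipeline(PIPELINE))
```

Copying the values keeps later changes to the request body from leaking back into the pipeline definition, and a pipeline without Elasticsearch settings simply yields an empty body.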
Remaining tasks
Test coverage, reviews, etc.
User interface changes
API changes
Data model changes
Issue fork data_pipelines_elasticsearch-3512930
Comments
Comment #5
nterbogt commented: Adding contributors from previous issue.
Comment #6
nterbogt commented
Comment #7
nterbogt commented: Can I please have a review of this one?
Comment #9
nterbogt commented
Comment #12
nterbogt commented: This is now fixed. It will be rolled out in the next release.