Re: Is the "@" character a delimiter for Elasticsearch index, like bla...

JeanDo · Posted 09-08-2020 10:28 AM

Hello,

I'm working with Visual Investigator 10.7 and experimenting with how the Elasticsearch index works. I have a "person" entity with a "mail" variable set to "john.dow@gmail.com". When I search for "john" (typing "john" in the search bar), my entity doesn't show up. If I search for "john*", it works and my person does show up; likewise if I search for "john.dow". My conclusion is that "@" works as a delimiter, but the period does not.

So my question is: is "@" a separator for the ES index, and is the period a separator? More generally, which characters are separators for the index, and can the administrator configure the list of separators (if that is possible at all)?

Thank you for your help,

JD

Rachel_ · Posted 09-17-2020 04:01 AM

Hi,

In general, search behaviour in VI depends on how the data is analyzed when it is indexed in Elasticsearch, which is configurable on a per-field basis.

In the “Manage Investigate and Search” app, when editing the field properties for a field on an entity, there is an “Advanced Search” section. For String fields, this section contains the following options:

Include in free text searches: whether the content of the field is searchable without prefixing query terms with the field name (colloquially called “unfielded free text searching” as opposed to “fielded searching” where query terms are prefixed with the field name).
Analyze for search: whether the content of the field is analyzed or not. Note this only affects “fielded searching”. If the field is not analyzed, then “fielded searching” supports only exact match or wildcard searching. If it is analyzed, then the behaviour depends on the value of the “Index Analyzer” option.
Index Analyzer: which analyzer to use when indexing the content of the field. “(use system default analyzer)” uses the application locale (note this is separate from the HTTP Accept-Language header used to localise the UI) to pick an appropriate analyzer, e.g. english for en-US, french for fr-FR. These language-specific analyzers tokenize text on word boundaries and perform language-appropriate operations like stemming, as well as converting tokens to lowercase. In contrast, the “Standard” analyzer tokenizes on Unicode word boundaries and converts to lowercase but doesn’t perform any additional operations like stemming. As another example, “Whitespace” simply splits the text on whitespace without converting tokens to lowercase.
Available for phonetic searches: whether the content of this field should be analyzed by the VI phonetic analyzer in addition to the “Index Analyzer” to support phonetic searching.
Available for synonym searches: whether the content of this field should be analyzed by the VI synonym analyzer (the list of synonyms is configured globally in the “Properties” tab of the “Manage Investigate and Search” app) in addition to the “Index Analyzer” to support synonym searching.
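
To see which tokens each of the analyzers named above produces for a given value, Elasticsearch's `_analyze` API can be queried directly. The following is a minimal Python sketch, assuming the Elasticsearch instance behind VI is reachable at `http://localhost:9200` without authentication (an assumption; a real deployment may secure or restrict access differently). The helper names `build_analyze_request` and `analyze` are illustrative only and are not part of VI or Elasticsearch.

```python
import json
import urllib.request
from typing import List

# Assumption: address of the Elasticsearch instance; adjust for your deployment.
ES_URL = "http://localhost:9200"


def build_analyze_request(analyzer: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST /_analyze request for one analyzer."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{ES_URL}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(analyzer: str, text: str) -> List[str]:
    """Send the request and return the token strings from the response."""
    with urllib.request.urlopen(build_analyze_request(analyzer, text)) as resp:
        return [t["token"] for t in json.load(resp)["tokens"]]


# Show the requests that would be sent; nothing goes over the network here.
for name in ("english", "standard", "whitespace"):
    req = build_analyze_request(name, "john.dow@gmail.com")
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Calling `analyze("english", "john.dow@gmail.com")` against a reachable instance returns the list of tokens that analyzer would index, which makes it easy to compare the analyzers side by side.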

More specifically, it does indeed look like the english analyzer (which is the default, as the default locale is en-US) tokenizes text on “@” (in addition to other word boundaries) but not on “.”. Achieving custom search behaviour beyond what’s available via the “Index Analyzer” option would require a custom analyzer. If you need help defining a custom analyzer and configuring VI so that it appears as an option under “Index Analyzer”, it would be best to raise a Technical Support track.
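
The search results reported in the question follow from this tokenization. Here is a rough Python sketch of the effect. It is an approximation for illustration only, not the actual Elasticsearch implementation: it ignores stemming and stop words, and the names `TOKEN_RE`, `tokenize` and `matches` are made up for this example. A query term has to equal a whole indexed token, or match it as a wildcard pattern, for the entity to be found.

```python
import re
from fnmatch import fnmatchcase

# Approximation of the word-boundary rule described above: runs of
# letters/digits, where a "." may join two such runs, while "@"
# (like whitespace) always ends a token.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+(?:\.[A-Za-z0-9]+)*")


def tokenize(text):
    """Split text into lowercase tokens, breaking on "@" but not on "."."""
    return [m.group(0).lower() for m in TOKEN_RE.finditer(text)]


def matches(query, text):
    """True if the query equals, or as a wildcard pattern matches, a whole token."""
    pattern = query.lower()
    return any(fnmatchcase(token, pattern) for token in tokenize(text))


mail = "john.dow@gmail.com"
print(tokenize(mail))             # ['john.dow', 'gmail.com']
print(matches("john", mail))      # False: "john" is not a whole token
print(matches("john*", mail))     # True: wildcard matches "john.dow"
print(matches("john.dow", mail))  # True: exact token match
```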

JeanDo · Posted 09-17-2020 11:21 AM

Thank you for your response! I understand that I can't have a custom index for the moment, so I have to be very precise with my search queries if I want the result to show up. That's a big problem for the end user in our context, but I see no other choice available.

Regards,

JD

Is the "@" character a delimiter for Elasticsearch index, like blank character?
