
  • Elasticsearch Blueprints
  • Vineeth Mohan
  • 632 words
  • 2021-07-16 13:39:32

Searching your documents

A search starts with a large set of documents, of which only a subset interests us. That subset is defined by a set of constraints and conditions. A search does not stop there, though: you might also want a summary view of the query result. In our case, a user who searches for dell might want to see the distinct product types among the matches and the number of documents of each type. This is called an aggregation. It enriches the search experience and makes the results easier to explore. Here, we look at the querying options through which we can express such requirements to Elasticsearch.
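As a rough sketch of what such a request might look like, the following Python snippet builds a search body that matches dell and asks for a document count per product type. The field name type, the aggregation label product_types, and the helper name are assumptions made for illustration, and the match on _all assumes an Elasticsearch version in which that field is available.

```python
import json


def build_search_with_counts(keyword, facet_field="type"):
    """Return a search body: a match query plus a terms aggregation.

    `facet_field` is a hypothetical field holding the product type;
    "product_types" is just a label chosen for the aggregation.
    """
    return {
        "query": {"match": {"_all": keyword}},
        "aggs": {
            "product_types": {
                # One bucket per distinct value, each with a doc count
                "terms": {"field": facet_field}
            }
        },
    }


body = build_search_with_counts("dell")
print(json.dumps(body, indent=2))
```

The printed JSON is the body that would be sent to the search endpoint; the response would then carry the matching documents together with one bucket per product type.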

In our search application, we expose a single search box. The user does not need to know which fields are searched or what precedence each field has. Let's see which query types suit this search box best.

A match query

A match query is the ideal place to start. It can be used to search many field types, such as numbers, strings, or even dates. Let's see how we can use it to back the search box. Assume that a user searches for the keyword laptop. It makes sense to search the name and description fields for this keyword, but not the price or date fields.

Note

Elasticsearch, by default, maintains an additional search field called _all, which combines the values of all the fields in a document. Hence, to do a document-level search, it's good to use _all. (The _all field was deprecated in Elasticsearch 6.0 and later removed, so this applies to earlier versions.)

A simple match query for the word laptop across all the fields is as follows:

{
  "query": {
    "match": {
      "_all": "laptop"
    }
  }
}

Wait, won't a search on _all also cover the date and price fields, which we don't intend here? Not in this case. Remember, we set include_in_all to false for all fields other than name and description. This makes sure that the values of those other fields don't flow into _all.
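For reference, a mapping along these lines might look like the following sketch. The type name product, the field types (string, double, date), and the helper name are assumptions for illustration; the string type and include_in_all apply only to the older Elasticsearch versions that support _all.

```python
import json


def build_mapping(searchable, excluded, type_name="product"):
    """Build a mapping in which only `searchable` fields feed _all.

    `searchable` and `excluded` map field names to field types.
    `type_name` is a hypothetical mapping type.
    """
    properties = {}
    for field, field_type in searchable.items():
        # include_in_all defaults to true, so it is left implicit
        properties[field] = {"type": field_type}
    for field, field_type in excluded.items():
        # Keep these values out of the _all field
        properties[field] = {"type": field_type, "include_in_all": False}
    return {"mappings": {type_name: {"properties": properties}}}


mapping = build_mapping(
    searchable={"name": "string", "description": "string"},
    excluded={"price": "double", "date": "date"},
)
print(json.dumps(mapping, indent=2))
```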

Sweet, we are able to search on the string fields that make sense to us, and we get neat results. However, the requirement from the management has now changed. Rather than treating the name and description fields with equal precedence, we would like to give more weight to the name field. This means that a document that matches on the name field should rank as more relevant than a document that matches only on the description field. Let's see how we can achieve this using a variant of the match query.

Multifield match query

A multifield match query searches on multiple fields rather than a single field. It doesn't stop there: you can also assign a precedence, or boost, to each field. This lets us tell Elasticsearch to weight matches on certain fields higher than others:

{
  "query": {
    "multi_match": {
      "query": "laptop",
      "fields": [
        "name^2",
        "description"
      ]
    }
  }
}

Here, we ask Elasticsearch to match the word laptop on both the name and description fields, but to give greater relevance to a match on name (boosted by a factor of 2 through the ^2 suffix) than to a match on description.

Let's consider the following documents:

  • Document A:
    • Name: Lenovo laptop
    • Description: This is a great product with very high rating from Lenovo
  • Document B:
    • Name: Lenovo bags
    • Description: These are great laptop bags with very high rating from Lenovo

A search on the word laptop will rank Document A above Document B, which makes perfect sense in a real-world scenario.
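If the field weights need to change often, the fields list can be generated rather than written by hand. The following is a minimal sketch, assuming the weights are kept in a Python dictionary; the helper name build_multi_match is hypothetical and not part of any Elasticsearch client.

```python
import json


def build_multi_match(keyword, boosts):
    """Build a multi_match body from a {field: boost} dictionary.

    A boost of 1 is left implicit, so {"name": 2, "description": 1}
    produces the fields list ["name^2", "description"].
    """
    fields = [
        field if boost == 1 else f"{field}^{boost}"
        for field, boost in boosts.items()
    ]
    return {"query": {"multi_match": {"query": keyword, "fields": fields}}}


body = build_multi_match("laptop", {"name": 2, "description": 1})
print(json.dumps(body, indent=2))
```

With these weights, the generated body is the same multi_match query shown earlier, so raising or lowering the precedence of name becomes a one-value change.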
