  • Elasticsearch Blueprints
  • Vineeth Mohan
  • 651 words
  • 2021-07-16 13:39:32

Choosing between a query and a filter

The basic idea of a search is to narrow down to a subset of the documents that you have. In the Elasticsearch context, this means that, based on various conditions, you might want to select a set of documents from an index or a set of indices. Queries and filters facilitate this for you.

If you have already gone through the reference guide or some other Elasticsearch documentation, you might have noticed that the same set of operations is often available for both queries and filters. So, what differentiates a query from a filter when the operations they offer are almost the same? Let's find out.

In a query, a matched document can be a better match than another matched document. In a filter, all the matched documents are treated equally.

This means that a document matched by a query can be scored, or ranked, against the other matched documents. This is done by computing a score value that tells you how good a match a particular document is for the query. A document that is a better match gets a higher score, and a weaker match gets a lower score. This way, we can identify the best matches and show them on the first pages of the results.
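The difference between ranking and plain matching can be sketched in a few lines of Python. This is a toy model only: the documents, the `toy_score` function, and its term-counting rule are all made up for illustration and are not how Elasticsearch computes relevance.

```python
# Toy illustration only: this is NOT Elasticsearch's scoring model.
docs = [
    {"id": 1, "title": "red running shoes", "in_stock": True},
    {"id": 2, "title": "red shoes red laces", "in_stock": False},
    {"id": 3, "title": "blue running jacket", "in_stock": True},
]


def toy_score(doc, terms):
    """Count how many times the search terms occur in the title."""
    words = doc["title"].split()
    return sum(words.count(term) for term in terms)


def query(docs, terms):
    """Query-like behaviour: keep the matches and rank them by score."""
    scored = [(toy_score(doc, terms), doc["id"]) for doc in docs]
    matches = [(score, doc_id) for score, doc_id in scored if score > 0]
    # Highest score first; ties are broken by document id.
    return sorted(matches, key=lambda pair: (-pair[0], pair[1]))


def filter_docs(docs, predicate):
    """Filter-like behaviour: a plain yes/no test with no ranking."""
    return [doc["id"] for doc in docs if predicate(doc)]


ranked = query(docs, ["red", "shoes"])
in_stock = filter_docs(docs, lambda doc: doc["in_stock"])

print(ranked)    # (score, id) pairs, best match first
print(in_stock)  # ids only, all treated equally
```

The query-like function returns an ordering that can be cut into pages, whereas the filter-like function only says which documents are in or out.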

For an e-commerce site, success depends on what percentage of incoming traffic is converted into purchases. A customer searches for something they are interested in buying, and if we can't show them the most relevant results on the first page itself, the chance of converting the search into a purchase is slim. Most customers will not look at the second or subsequent pages for better options. They will assume that the products on later pages are less relevant than those on the current page and will drop the search there. Hence, we have to use queries to make the order of our results more relevant to the user.

But wait, what are the advantages of filters? Let's explore them.

Note

Filters don't compute a match score per document and hence, they are faster. The results are also cached, which means that the second and subsequent searches that use the same filter will be considerably faster.

So, for structured searches, such as a date range, a number range, and so on, where scoring doesn't come into the picture, filters are the right tool. Filters can be used in several places:

  • Queries: A filter can be used for querying. Note that while a query has its own section called query in the Query DSL (domain-specific language), there is no separate section for filters. Rather, you need to embed your filter inside the constant_score query type or the filtered query type.
  • Scoring: Elasticsearch provides you with a query type called the function_score query. Using the capabilities of this query type, we can use a filter and boost the score based on the filter match.
  • A post filter: This is applied to the search results, or hits, but not to the input of the aggregations. This means that even though the scope of an aggregation is its query, we can narrow the hits that are returned by adding a post filter without changing what the aggregations are computed on.
  • Aggregations: We can also specify filters inside aggregations to filter documents inside a bucket.
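The four places listed above can be sketched as request bodies built from Python dictionaries. This is a minimal sketch, not a definitive recipe: every field name and value (`created_at`, `title`, `brand`, `in_stock`, `price`, `color`) is hypothetical, and the `weight` key assumes a release that supports it (older 1.x releases used `boost_factor` for the same purpose).

```python
import json

# All field names and values below are hypothetical examples.

# 1. Queries: a range filter wrapped in a constant_score query.
structured_body = {
    "query": {
        "constant_score": {
            "filter": {
                "range": {
                    "created_at": {"gte": "2014-01-01", "lt": "2015-01-01"}
                }
            }
        }
    }
}

# 2-4. Scoring, post filter, and aggregation filter in one request.
combined_body = {
    "query": {
        "function_score": {
            "query": {"match": {"title": "running shoes"}},
            # Documents that also match the filter get their score boosted.
            "functions": [
                {"filter": {"term": {"brand": "acme"}}, "weight": 2}
            ],
        }
    },
    "aggs": {
        # Only in-stock documents flow into this bucket.
        "in_stock_only": {
            "filter": {"term": {"in_stock": True}},
            "aggs": {"avg_price": {"avg": {"field": "price"}}},
        }
    },
    # Narrows the returned hits; the aggregation above still sees
    # every document matched by the query.
    "post_filter": {"term": {"color": "red"}},
}

print(json.dumps(structured_body, indent=2))
print(json.dumps(combined_body, indent=2))
```

Either dictionary could be sent as the JSON body of a search request with the HTTP client of your choice.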

A very interesting point to note here is that filters are cached and used independently of the context. This means that once you use a filter in a query and then reuse the same filter in an aggregation or a post filter, the same cache is hit instead of the results being computed again.
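The idea of a context-independent cache can be pictured with a toy model in Python. This is only a conceptual sketch: the in-memory `docs`, the `cache` dictionary keyed by the filter definition, and the `run_term_filter` helper are all hypothetical and say nothing about how Elasticsearch implements its cache internally.

```python
import json

# Toy data and cache; purely illustrative.
docs = {1: {"brand": "acme"}, 2: {"brand": "other"}, 3: {"brand": "acme"}}
cache = {}
stats = {"evaluations": 0}


def run_term_filter(term_filter):
    """Return the ids matching a {"term": {field: value}} filter.

    The result is cached under the filter's definition, so the caller's
    context (query, aggregation, post filter) does not matter.
    """
    key = json.dumps(term_filter, sort_keys=True)
    if key not in cache:
        stats["evaluations"] += 1
        (field, value), = term_filter["term"].items()
        cache[key] = frozenset(
            doc_id for doc_id, doc in docs.items() if doc.get(field) == value
        )
    return cache[key]


brand_filter = {"term": {"brand": "acme"}}
hits_for_query = run_term_filter(brand_filter)        # first use: computed
hits_for_aggregation = run_term_filter(brand_filter)  # reuse: served from cache

print(sorted(hits_for_query), stats["evaluations"])
```

The second call does no matching work at all, which is the effect the paragraph above describes.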

Note

Hence, make sure that you always use a mixture of filters and queries, with as many constraints as possible moved to filters, depending on the situation. This will avoid unwanted computation of scores.
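As a rough sketch of such a mixture, the full-text part of a search can stay in a scored match query while the structured constraints move into filters. The `filtered_search` helper and the `title`, `in_stock`, and `price` fields are hypothetical, and the body assumes an Elasticsearch version that still offers the filtered query type discussed in this section.

```python
import json


def filtered_search(text, filters):
    """Build a filtered query: a scored match plus unscored filters.

    `text` is matched (and scored) against a hypothetical `title` field;
    every entry in `filters` must match, but none of them affects the score.
    """
    return {
        "query": {
            "filtered": {
                "query": {"match": {"title": text}},
                "filter": {"bool": {"must": filters}},
            }
        }
    }


body = filtered_search(
    "running shoes",
    [
        {"term": {"in_stock": True}},
        {"range": {"price": {"gte": 20, "lte": 100}}},
    ],
)

print(json.dumps(body, indent=2))
```

Only the match clause contributes to relevance here; the stock and price constraints simply include or exclude documents.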
