[[sorting]] == Sorting and Relevance
By default, results are returned sorted by relevance—with the most
relevant docs first.((("sorting", "by relevance")))((("relevance", "sorting results by"))) Later in this chapter, we explain what we mean by
relevance and how it is calculated, but let's start by looking at the sort
parameter and how to use it.
=== Sorting
In order to sort by relevance, we need to represent relevance as a value. In
Elasticsearch, the relevance score is represented by the floating-point
number returned in the search results as the _score, ((("relevance scores", "returned in search results score")))((("score", "relevance score of search results")))so the default sort
order is _score descending.
Sometimes, though, you don't have a meaningful relevance score. For instance,
the following query just returns all tweets whose user_id field has the
value 1:
[source,js]
GET /_search { "query" : { "filtered" : { "filter" : { "term" : { "user_id" : 1 } } } }
}
Filters have no bearing on _score, and the((("score", seealso="relevance; relevance scores")))((("match_all query", "score as neutral 1")))((("filters", "score and"))) missing-but-implied match_all
query just sets the _score to a neutral value of 1 for all documents. In
other words, all documents are considered to be equally relevant.
==== Sorting by Field Values
In this case, it probably makes sense to sort tweets by recency, with the most
recent tweets first.((("sorting", "by field values")))((("fields", "sorting search results by field values")))((("sort parameter"))) We can do this with the sort parameter:
[source,js]
GET /_search { "query" : { "filtered" : { "filter" : { "term" : { "user_id" : 1 }} } }, "sort": { "date": { "order": "desc" }}
}
// SENSE: 056_Sorting/85_Sort_by_date.json
You will notice two differences in the results:
[source,js]
"hits" : { "total" : 6, "max_score" : null, <1> "hits" : [ { "_index" : "us", "_type" : "tweet", "_id" : "14", "_score" : null, <1> "_source" : { "date": "2014-09-24", ... }, "sort" : [ 1411516800000 ] <2> }, ...
}
<1> The _score is not calculated, because it is not being used for sorting.
<2> The value of the date field, expressed as milliseconds since the epoch,
is returned in the sort values.
The first is that we have ((("date field, sorting search results by")))a new element in each result called sort, which
contains the value(s) that was used for sorting. In this case, we sorted on
date, which internally is((("milliseconds-since-the-epoch (date)"))) indexed as milliseconds since the epoch. The long
number 1411516800000 is equivalent to the date string 2014-09-24 00:00:00
UTC.
The second is that the _score and max_score are both null. ((("score", "not calculating"))) Calculating
the _score can be quite expensive, and usually its only purpose is for
sorting; we're not sorting by relevance, so it doesn't make sense to keep
track of the _score. If you want the _score to be calculated regardless,
you can set((("track_scores parameter"))) the track_scores parameter to true.
[TIP]
As a shortcut, you can ((("sorting", "specifying just the field name to sort on")))specify just the name of the field to sort on:
[source,js]
"sort": "number_of_children"
Fields will be sorted in ((("sorting", "default ordering")))ascending order by default, and
the _score value in descending order.
==== Multilevel Sorting
Perhaps we want to combine the _score from a((("sorting", "multilevel")))((("multilevel sorting"))) query with the date, and
show all matching results sorted first by date, then by relevance:
[source,js]
GET /_search { "query" : { "filtered" : { "query": { "match": { "tweet": "manage text search" }}, "filter" : { "term" : { "user_id" : 2 }} } }, "sort": [ { "date": { "order": "desc" }}, { "_score": { "order": "desc" }} ]
}
// SENSE: 056_Sorting/85_Multilevel_sort.json
Order is important. Results are sorted by the first criterion first. Only
results whose first sort value is identical will then be sorted by the
second criterion, and so on.
Multilevel sorting doesn't have to involve the _score. You could sort
by using several different fields,((("fields", "sorting by multiple fields"))) on geo-distance or on a custom value
calculated in a script.
[NOTE]
Query-string search((("sorting", "in query string searches")))((("sort parameter", "using in query strings")))((("query strings", "sorting search results for"))) also supports custom sorting, using the sort parameter
in the query string:
[source,js]
GET /_search?sort=date:desc&sort=_score&q=search
====
==== Sorting on Multivalue Fields
When sorting on fields with more than one value,((("sorting", "on multivalue fields")))((("fields", "multivalue", "sorting on"))) remember that the values do not have any intrinsic order; a multivalue field is just a bag of values. Which one do you choose to sort on?
For numbers and dates, you can reduce a multivalue field to a single value
by using the min, max, avg, or sum sort modes. ((("sum sort mode")))((("avg sort mode")))((("max sort mode")))((("min sort mode")))((("sort modes")))((("dates field, sorting on earliest value")))For instance, you
could sort on the earliest date in each dates field by using the following:
[role="pagebreak-before"]
[source,js]
"sort": { "dates": { "order": "asc", "mode": "min" }