Here is the sample data:

In type blog_comments, I have some comment data with the following structure:

{"blog_id": 1, "comments": "Apple", "comment_id": 1}

For blogs #1 and #2, there are 6 comments in total in type blog_comments:

{"blog_id": 1, "comments": "Apple", "comment_id": 1}
{"blog_id": 1, "comments": "Orange", "comment_id": 2}
{"blog_id": 1, "comments": "Fruit", "comment_id": 3}
{"blog_id": 2, "comments": "Apple", "comment_id": 1}
{"blog_id": 2, "comments": "Orange", "comment_id": 2}
{"blog_id": 2, "comments": "Earth", "comment_id": 3}

Question: Is it possible, using some "magic" query, to get #1 as the result when I search for "Apple Fruit" and #2 when I search for "Apple Earth"?

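I am considering joining all the comments of each blog into one new record (a new type) and then searching on that new type. But there are too many comments (roughly 12 million), and they are already indexed by Elasticsearch, so I would prefer to make use of the existing data as far as possible.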

1 Answer

Ideally, you would change the mapping of your index so that all the comments of one blog post can be searched together. You can't really search for documents and say that one particular blog ID (which is a field in the documents) matched across multiple documents at the same time. Elasticsearch knows how to match across multiple fields of the same document, not across multiple documents.

There is one workaround, though. Whether it fits depends on what else you need from this query, apart from getting back just the blog ID.

GET /blog/entries/_search?search_type=count
{
  "query": {
    "match": {
      "comments": "Apple Earth"
    }
  },
  "aggs": {
    "unique": {
      "terms": {
        "field": "blog_id",
        "min_doc_count": 2
      }
    }
  }
}

The query above will return something like this:

"aggregations": {
  "unique": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": 2,
        "doc_count": 2
      }
    ]
  }
}

The idea of the query is to return just the blog_id ("key": 2 under buckets), which is why it uses an aggregation of type terms. You set min_doc_count to the number of terms you search for (Apple Earth counts as two terms), meaning that at least two comment documents of the same blog must match. This is not exactly what your example asks for: a match query hits any document containing at least one of the terms, so a single comment that reads apple earth is also a hit, and the aggregation only counts matching documents per blog. It does not check that apple is in one document and earth in another.
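
To make the counting behaviour concrete, here is a small Python sketch that imitates the workaround on the six sample comments from the question. It does not talk to Elasticsearch. The function names (tokens, blogs_matching) are made up for this illustration, and lowercasing plus whitespace splitting is only a rough stand-in for the real analyzer.

```python
from collections import Counter

# The six sample comments from the question.
DOCS = [
    {"blog_id": 1, "comments": "Apple", "comment_id": 1},
    {"blog_id": 1, "comments": "Orange", "comment_id": 2},
    {"blog_id": 1, "comments": "Fruit", "comment_id": 3},
    {"blog_id": 2, "comments": "Apple", "comment_id": 1},
    {"blog_id": 2, "comments": "Orange", "comment_id": 2},
    {"blog_id": 2, "comments": "Earth", "comment_id": 3},
]


def tokens(text):
    """Assumed stand-in for analysis: lowercase and split on whitespace."""
    return set(text.lower().split())


def blogs_matching(query, docs, min_doc_count):
    """Imitate a match query followed by a terms aggregation on blog_id."""
    terms = tokens(query)
    # A document is a hit if it shares at least one term with the query.
    hits = [d for d in docs if tokens(d["comments"]) & terms]
    # Count the hits per blog_id, like the buckets of the terms aggregation.
    counts = Counter(d["blog_id"] for d in hits)
    # Keep only the buckets that reach min_doc_count.
    return {blog_id: n for blog_id, n in counts.items() if n >= min_doc_count}


print(blogs_matching("Apple Earth", DOCS, 2))  # {2: 2}
print(blogs_matching("Apple Fruit", DOCS, 2))  # {1: 2}
```

The sketch also shows the limitation: a blog with two comments that both say Apple would reach min_doc_count for "Apple Earth" as well, because only matching documents are counted, not distinct terms.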

But, as I said, ideally you'd want to change the mapping of your index.
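
The answer does not say what the changed mapping should look like. As one possible reading, here is a Python sketch of the idea from the question itself: join the comments of each blog into a single record and require every search term to appear in it. The helper names (combine_by_blog, blogs_with_all_terms) are hypothetical, and the whitespace tokenization is an assumption, not Elasticsearch's analysis.

```python
from collections import defaultdict

# The six sample comments from the question.
DOCS = [
    {"blog_id": 1, "comments": "Apple", "comment_id": 1},
    {"blog_id": 1, "comments": "Orange", "comment_id": 2},
    {"blog_id": 1, "comments": "Fruit", "comment_id": 3},
    {"blog_id": 2, "comments": "Apple", "comment_id": 1},
    {"blog_id": 2, "comments": "Orange", "comment_id": 2},
    {"blog_id": 2, "comments": "Earth", "comment_id": 3},
]


def combine_by_blog(docs):
    """Join all comments of a blog into one text, keyed by blog_id."""
    grouped = defaultdict(list)
    for doc in docs:
        grouped[doc["blog_id"]].append(doc["comments"])
    return {blog_id: " ".join(texts) for blog_id, texts in grouped.items()}


def blogs_with_all_terms(query, combined):
    """Return the blog_ids whose combined text contains every query term."""
    terms = set(query.lower().split())
    return sorted(
        blog_id
        for blog_id, text in combined.items()
        if terms <= set(text.lower().split())
    )


combined = combine_by_blog(DOCS)
print(blogs_with_all_terms("Apple Fruit", combined))  # [1]
print(blogs_with_all_terms("Apple Earth", combined))  # [2]
```

With one record per blog, requiring all terms in the same record gives exactly the result the question asks for, at the cost of building and maintaining the combined records.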

Answered 2015-01-29T13:36:04.530