Elasticsearch: Using field collapsing to reduce search results based on a single field

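This article shows how to use field collapsing in Elasticsearch. Through examples, it demonstrates how to collapse search results so that only the top-ranked item for each value of a field is shown, and how to expand the collapsed results to retrieve more related information.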


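Field collapsing lets you collapse search results based on the values of a field. Collapsing works by selecting only the top-sorted document for each collapse key. This is not hard to understand. Take a Baidu Music page as an example.

Such a page shows many popular songs, and each of them may simply be the most prominent track of an album. When building the page, there is no need to put every song of an album in the cover position. We may want to show only the most-clicked or most popular song as the representative of its album. When the user clicks on the album, the other songs in that album become visible.

Field collapsing was made for exactly this. The same pattern applies to news headlines shown in a title bar: after clicking through, the reader sees more news from the same category.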

Let's walk through an example to see how to use it.

Preparing the data

The data used here is a dataset of best games. It can be downloaded from my GitHub project:

git clone https://github.com/liu-xiao-guo/best_games_json_data

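Then import the downloaded JSON data into Elasticsearch, for example with the file upload of Kibana's Data Visualizer (the created_by entry in the mapping below records ml-file-data-visualizer), and name the index best_games.

With that, the data is ready. The index contains 500 documents in total, and each document looks like this: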

{"id":"madden-nfl-2002-ps2-2001","name":"Madden NFL 2002","year":2001,"platform":"PS2","genre":"Sports","publisher":"Electronic Arts","global_sales":3.08,"critic_score":94,"user_score":7,"developer":"EA Sports","image_url":"http://www.mobygames.com/images/covers/l/202684-madden-nfl-2002-playstation-2-back-cover.png"}

Its mapping is:

{
  "best_games" : {
    "mappings" : {
      "_meta" : {
        "created_by" : "ml-file-data-visualizer"
      },
      "properties" : {
        "critic_score" : {
          "type" : "long"
        },
        "developer" : {
          "type" : "text"
        },
        "genre" : {
          "type" : "keyword"
        },
        "global_sales" : {
          "type" : "double"
        },
        "id" : {
          "type" : "keyword"
        },
        "image_url" : {
          "type" : "keyword"
        },
        "name" : {
          "type" : "text"
        },
        "platform" : {
          "type" : "keyword"
        },
        "publisher" : {
          "type" : "keyword"
        },
        "user_score" : {
          "type" : "long"
        },
        "year" : {
          "type" : "long"
        }
      }
    }
  }
}
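The import above was done through a UI. As an alternative, here is a minimal sketch of how the request body for the _bulk API could be assembled in Python, assuming the downloaded data is already loaded as a list of dicts. The helper name build_bulk_body and the shortened sample document are illustrative assumptions, not part of the original project.

```python
import json

def build_bulk_body(docs, index_name="best_games"):
    """Return an NDJSON string usable as the body of a POST _bulk request."""
    lines = []
    for doc in docs:
        # Action line: index the next document into index_name (auto-generated _id).
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself, serialized on a single line.
        lines.append(json.dumps(doc))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"

# Shortened copy of the sample document shown above (assumption: fields trimmed for brevity).
sample_docs = [
    {
        "id": "madden-nfl-2002-ps2-2001",
        "name": "Madden NFL 2002",
        "year": 2001,
        "platform": "PS2",
        "genre": "Sports",
        "publisher": "Electronic Arts",
        "critic_score": 94,
        "user_score": 7,
    },
]

bulk_body = build_bulk_body(sample_docs)
print(bulk_body)
```

The resulting string would be sent with the Content-Type application/x-ndjson. Sending it is left out here because it depends on how your cluster is reached and secured.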

Field collapsing

Now let's search the data using collapsing:

GET best_games/_search
{
  "query": {
    "match": {
      "name": "Final Fantasy"
    }
  },
  "collapse": {
    "field": "publisher"
  }, 
  "sort": [
    {
      "critic_score": {
        "order": "desc"
      }
    }
  ]
}

The search result is:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 11,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "best_games",
        "_type" : "_doc",
        "_id" : "E3JzF28BjrINWI3xtt80",
        "_score" : null,
        "_source" : {
          "id" : "final-fantasy-ix-ps-2000",
          "name" : "Final Fantasy IX",
          "year" : 2000,
          "platform" : "PS",
          "genre" : "Role-Playing",
          "publisher" : "SquareSoft",
          "global_sales" : 5.3,
          "critic_score" : 94,
          "user_score" : 8,
          "developer" : "SquareSoft",
          "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"
        },
        "fields" : {
          "publisher" : [
            "SquareSoft"
          ]
        },
        "sort" : [
          94
        ]
      },
      {
        "_index" : "best_games",
        "_type" : "_doc",
        "_id" : "wnJzF28BjrINWI3xtt40",
        "_score" : null,
        "_source" : {
          "id" : "final-fantasy-vii-ps-1997",
          "name" : "Final Fantasy VII",
          "year" : 1997,
          "platform" : "PS",
          "genre" : "Role-Playing",
          "publisher" : "Sony Computer Entertainment",
          "global_sales" : 9.72,
          "critic_score" : 92,
          "user_score" : 9,
          "developer" : "SquareSoft",
          "image_url" : "https://r.hswstatic.com/w_907/gif/finalfantasyvii-MAIN.jpg"
        },
        "fields" : {
          "publisher" : [
            "Sony Computer Entertainment"
          ]
        },
        "sort" : [
          92
        ]
      },
      {
        "_index" : "best_games",
        "_type" : "_doc",
        "_id" : "_nJzF28BjrINWI3xtt40",
        "_score" : null,
        "_source" : {
          "id" : "final-fantasy-xii-ps2-2006",
          "name" : "Final Fantasy XII",
          "year" : 2006,
          "platform" : "PS2",
          "genre" : "Role-Playing",
          "publisher" : "Square Enix",
          "global_sales" : 5.95,
          "critic_score" : 92,
          "user_score" : 7,
          "developer" : "Square Enix",
          "image_url" : "https://m.media-amazon.com/images/M/MV5BM2I4MDMyMDQtNjM2OC00ZWNkLTg0ODQtNzYxZjY0M2QxODQyXkEyXkFqcGdeQXVyNjY5NTM5MjA@._V1_.jpg"
        },
        "fields" : {
          "publisher" : [
            "Square Enix"
          ]
        },
        "sort" : [
          92
        ]
      },
      {
        "_index" : "best_games",
        "_type" : "_doc",
        "_id" : "FXJzF28BjrINWI3xtt80",
        "_score" : null,
        "_source" : {
          "id" : "final-fantasy-x-2-ps2-2003",
          "name" : "Final Fantasy X-2",
          "year" : 2003,
          "platform" : "PS2",
          "genre" : "Role-Playing",
          "publisher" : "Electronic Arts",
          "global_sales" : 5.29,
          "critic_score" : 85,
          "user_score" : 6,
          "developer" : "SquareSoft",
          "image_url" : "https://upload.wikimedia.org/wikipedia/en/thumb/6/6c/FFX-2_box.jpg/220px-FFX-2_box.jpg"
        },
        "fields" : {
          "publisher" : [
            "Electronic Arts"
          ]
        },
        "sort" : [
          85
        ]
      }
    ]
  }
}

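The result above shows:

  • We search for all games whose name matches Final Fantasy and sort them by critic_score in descending order.
  • Because we use collapse on the publisher field, each publisher contributes only one hit, even though a publisher may have several matching games.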

For example, SquareSoft alone has three games whose name contains Final Fantasy, as the following query shows:

GET best_games/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "Final Fantasy"
          }
        },
        {
          "match": {
            "publisher": "SquareSoft"
          }
        }
      ]
    }
  },
  "sort": [
    {
      "critic_score": {
        "order": "desc"
      }
    }
  ]
}

The result of the query above:

    "hits" : [
      {
        "_index" : "best_games",
        "_type" : "_doc",
        "_id" : "E3JzF28BjrINWI3xtt80",
        "_score" : null,
        "_source" : {
          "id" : "final-fantasy-ix-ps-2000",
          "name" : "Final Fantasy IX",
          "year" : 2000,
          "platform" : "PS",
          "genre" : "Role-Playing",
          "publisher" : "SquareSoft",
          "global_sales" : 5.3,
          "critic_score" : 94,
          "user_score" : 8,
          "developer" : "SquareSoft",
          "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"
        },
        "sort" : [
          94
        ]
      },
      {
        "_index" : "best_games",
        "_type" : "_doc",
        "_id" : "0nJzF28BjrINWI3xtt40",
        "_score" : null,
        "_source" : {
          "id" : "final-fantasy-viii-ps-1999",
          "name" : "Final Fantasy VIII",
          "year" : 1999,
          "platform" : "PS",
          "genre" : "Role-Playing",
          "publisher" : "SquareSoft",
          "global_sales" : 7.86,
          "critic_score" : 90,
          "user_score" : 8,
          "developer" : "SquareSoft",
          "image_url" : "https://gamingheartscollection.files.wordpress.com/2018/02/final-fantasy-8.png?w=585"
        },
        "sort" : [
          90
        ]
      },
      {
        "_index" : "best_games",
        "_type" : "_doc",
        "_id" : "SHJzF28BjrINWI3xtuA1",
        "_score" : null,
        "_source" : {
          "id" : "final-fantasy-tactics-ps-1997",
          "name" : "Final Fantasy Tactics",
          "year" : 1997,
          "platform" : "PS",
          "genre" : "Role-Playing",
          "publisher" : "SquareSoft",
          "global_sales" : 2.45,
          "critic_score" : 83,
          "user_score" : 8,
          "developer" : "SquareSoft",
          "image_url" : "https://www.thefinalfantasy.com/gallery/screenshots/ff-tactics/dynamic_previews/ff-tactics-screenshot-1_scale_800_700.jpg"
        },
        "sort" : [
          83
        ]
      }
    ]
  }

With collapse, however, only one of these games is returned for SquareSoft: the one with the highest critic_score.
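The collapsing behaviour can be mimicked in plain Python to make the idea concrete. The sketch below is only an illustration of the semantics, not how Elasticsearch implements it: it sorts a handful of documents copied from the responses in this article (a subset of the 11 matching documents) and keeps the first one seen for each publisher. The function name collapse is an assumption made for this example. Note also that hits.total.value in the collapsed response is still 11 while only 4 hits are returned, because the total counts matching documents before collapsing.

```python
# A subset of the matching documents, copied from the responses in this article.
games = [
    {"name": "Final Fantasy VIII", "publisher": "SquareSoft", "critic_score": 90},
    {"name": "Final Fantasy VII", "publisher": "Sony Computer Entertainment", "critic_score": 92},
    {"name": "Final Fantasy Tactics", "publisher": "SquareSoft", "critic_score": 83},
    {"name": "Final Fantasy IX", "publisher": "SquareSoft", "critic_score": 94},
    {"name": "Final Fantasy XII", "publisher": "Square Enix", "critic_score": 92},
    {"name": "Final Fantasy X-2", "publisher": "Electronic Arts", "critic_score": 85},
]

def collapse(docs, field, sort_field):
    """Keep only the top-sorted document for each distinct value of `field`."""
    # Sort descending; Python's sort is stable, so ties keep their input order.
    ranked = sorted(docs, key=lambda d: d[sort_field], reverse=True)
    seen = set()
    collapsed = []
    for doc in ranked:
        key = doc[field]
        if key not in seen:
            # First (best-ranked) document for this key becomes its representative.
            seen.add(key)
            collapsed.append(doc)
    return collapsed

top_per_publisher = collapse(games, "publisher", "critic_score")
for doc in top_per_publisher:
    print(doc["publisher"], "->", doc["name"], doc["critic_score"])
```

For this sample the representatives are Final Fantasy IX, VII, XII and X-2, matching the four hits of the collapsed search. The order of the two games tied at 92 here comes from the input order of the list, which is a property of this sketch only.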

Note: the field used for collapsing must be a single-valued keyword or numeric field with doc_values enabled.
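As a rough way to see which fields of the best_games index qualify, the following sketch filters the mapping shown earlier by type. It assumes that doc_values is enabled unless a field explicitly sets "doc_values": false, and it cannot tell from the mapping whether a field is single-valued in the actual documents. The helper name collapsible_fields is made up for this example.

```python
# Numeric field types in Elasticsearch mappings.
NUMERIC_TYPES = {
    "long", "integer", "short", "byte", "double",
    "float", "half_float", "scaled_float", "unsigned_long",
}

def collapsible_fields(properties):
    """Return the names of fields whose type allows them to be used in collapse."""
    eligible = []
    for name, spec in properties.items():
        field_type = spec.get("type")
        # Assumption: doc_values defaults to true when not stated in the mapping.
        has_doc_values = spec.get("doc_values", True)
        if (field_type == "keyword" or field_type in NUMERIC_TYPES) and has_doc_values:
            eligible.append(name)
    return sorted(eligible)

# The "properties" section of the best_games mapping shown earlier.
properties = {
    "critic_score": {"type": "long"},
    "developer": {"type": "text"},
    "genre": {"type": "keyword"},
    "global_sales": {"type": "double"},
    "id": {"type": "keyword"},
    "image_url": {"type": "keyword"},
    "name": {"type": "text"},
    "platform": {"type": "keyword"},
    "publisher": {"type": "keyword"},
    "user_score": {"type": "long"},
    "year": {"type": "long"},
}

print(collapsible_fields(properties))
```

The two text fields, developer and name, are filtered out, so collapsing on developer would require a keyword sub-field or a changed mapping.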

Expanding collapse results

We can also expand each collapsed top hit with the inner_hits option:

GET best_games/_search
{
  "query": {
    "match": {
      "name": "Final Fantasy"
    }
  },
  "collapse": {
    "field": "publisher",
    "inner_hits": {
      "name": "top 3 games",
      "size": 3,
      "sort": [{"user_score": "desc"}]
    }
  }, 
  "sort": [
    {
      "critic_score": {
        "order": "desc"
      }
    }
  ]
}

Running it returns:

  "hits" : [
      {
        "_index" : "best_games",
        "_type" : "_doc",
        "_id" : "E3JzF28BjrINWI3xtt80",
        "_score" : null,
        "_source" : {
          "id" : "final-fantasy-ix-ps-2000",
          "name" : "Final Fantasy IX",
          "year" : 2000,
          "platform" : "PS",
          "genre" : "Role-Playing",
          "publisher" : "SquareSoft",
          "global_sales" : 5.3,
          "critic_score" : 94,
          "user_score" : 8,
          "developer" : "SquareSoft",
          "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"
        },
        "fields" : {
          "publisher" : [
            "SquareSoft"
          ]
        },
        "sort" : [
          94
        ],
        "inner_hits" : {
          "top 3 games" : {
            "hits" : {
              "total" : {
                "value" : 3,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "best_games",
                  "_type" : "_doc",
                  "_id" : "0nJzF28BjrINWI3xtt40",
                  "_score" : null,
                  "_source" : {
                    "id" : "final-fantasy-viii-ps-1999",
                    "name" : "Final Fantasy VIII",
                    "year" : 1999,
                    "platform" : "PS",
                    "genre" : "Role-Playing",
                    "publisher" : "SquareSoft",
                    "global_sales" : 7.86,
                    "critic_score" : 90,
                    "user_score" : 8,
                    "developer" : "SquareSoft",
                    "image_url" : "https://gamingheartscollection.files.wordpress.com/2018/02/final-fantasy-8.png?w=585"
                  },
                  "sort" : [
                    8
                  ]
                },
                {
                  "_index" : "best_games",
                  "_type" : "_doc",
                  "_id" : "E3JzF28BjrINWI3xtt80",
                  "_score" : null,
                  "_source" : {
                    "id" : "final-fantasy-ix-ps-2000",
                    "name" : "Final Fantasy IX",
                    "year" : 2000,
                    "platform" : "PS",
                    "genre" : "Role-Playing",
                    "publisher" : "SquareSoft",
                    "global_sales" : 5.3,
                    "critic_score" : 94,
                    "user_score" : 8,
                    "developer" : "SquareSoft",
                    "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"
                  },
                  "sort" : [
                    8
                  ]
                },
                {
                  "_index" : "best_games",
                  "_type" : "_doc",
                  "_id" : "SHJzF28BjrINWI3xtuA1",
                  "_score" : null,
                  "_source" : {
                    "id" : "final-fantasy-tactics-ps-1997",
                    "name" : "Final Fantasy Tactics",
                    "year" : 1997,
                    "platform" : "PS",
                    "genre" : "Role-Playing",
                    "publisher" : "SquareSoft",
                    "global_sales" : 2.45,
                    "critic_score" : 83,
                    "user_score" : 8,
                    "developer" : "SquareSoft",
                    "image_url" : "https://www.thefinalfantasy.com/gallery/screenshots/ff-tactics/dynamic_previews/ff-tactics-screenshot-1_scale_800_700.jpg"
                  },
                  "sort" : [
                    8
                  ]
                }
              ]
            }
          }
        }
      },

As we can see, for each publisher the inner_hits section contains up to three games under the name top 3 games, sorted by user_score in descending order.
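On the client side, the nested response has to be unpacked before it can be rendered. The sketch below shows one possible way to do that in Python, assuming the response has already been parsed into a dict. The response literal is a heavily trimmed copy of the output above, and group_inner_hits is a hypothetical helper name.

```python
# Trimmed copy of the response above: only the keys used by the helper are kept.
response = {
    "hits": {
        "hits": [
            {
                "_source": {"name": "Final Fantasy IX"},
                "fields": {"publisher": ["SquareSoft"]},
                "inner_hits": {
                    "top 3 games": {
                        "hits": {
                            "hits": [
                                {"_source": {"name": "Final Fantasy VIII", "user_score": 8}},
                                {"_source": {"name": "Final Fantasy IX", "user_score": 8}},
                                {"_source": {"name": "Final Fantasy Tactics", "user_score": 8}},
                            ]
                        }
                    }
                },
            }
        ]
    }
}

def group_inner_hits(response, collapse_field, inner_name):
    """Map each collapse key to the names of the games in its inner hits."""
    groups = {}
    for hit in response["hits"]["hits"]:
        # The collapse key is reported under "fields" as a one-element list.
        key = hit["fields"][collapse_field][0]
        inner = hit["inner_hits"][inner_name]["hits"]["hits"]
        groups[key] = [h["_source"]["name"] for h in inner]
    return groups

print(group_inner_hits(response, "publisher", "top 3 games"))
```

The second argument must match the collapse field of the request, and the third must match the name given to the inner_hits entry.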

It is also possible to request several inner_hits for each collapsed hit. This is useful when you want multiple representations of the collapsed hits.

GET best_games/_search
{
  "query": {
    "match": {
      "name": "Final Fantasy"
    }
  },
  "collapse": {
    "field": "publisher",
    "inner_hits": [
      {
        "name": "top user liked",
        "size": 3,
        "sort": [
          {
            "user_score": "desc"
          }
        ]
      },
      {
        "name": "top most recent games",
        "size": 3,
        "sort": [
          {
            "year": "desc"
          }
        ]
        
      }
    ]
  },
  "sort": [
    {
      "critic_score": {
        "order": "desc"
      }
    }
  ]
}

The result is:

    "hits" : [
      {
        "_index" : "best_games",
        "_type" : "_doc",
        "_id" : "E3JzF28BjrINWI3xtt80",
        "_score" : null,
        "_source" : {
          "id" : "final-fantasy-ix-ps-2000",
          "name" : "Final Fantasy IX",
          "year" : 2000,
          "platform" : "PS",
          "genre" : "Role-Playing",
          "publisher" : "SquareSoft",
          "global_sales" : 5.3,
          "critic_score" : 94,
          "user_score" : 8,
          "developer" : "SquareSoft",
          "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"
        },
        "fields" : {
          "publisher" : [
            "SquareSoft"
          ]
        },
        "sort" : [
          94
        ],
        "inner_hits" : {
          "top user liked" : {
            "hits" : {
              "total" : {
                "value" : 3,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "best_games",
                  "_type" : "_doc",
                  "_id" : "0nJzF28BjrINWI3xtt40",
                  "_score" : null,
                  "_source" : {
                    "id" : "final-fantasy-viii-ps-1999",
                    "name" : "Final Fantasy VIII",
                    "year" : 1999,
                    "platform" : "PS",
                    "genre" : "Role-Playing",
                    "publisher" : "SquareSoft",
                    "global_sales" : 7.86,
                    "critic_score" : 90,
                    "user_score" : 8,
                    "developer" : "SquareSoft",
                    "image_url" : "https://gamingheartscollection.files.wordpress.com/2018/02/final-fantasy-8.png?w=585"
                  },
                  "sort" : [
                    8
                  ]
                },
                {
                  "_index" : "best_games",
                  "_type" : "_doc",
                  "_id" : "E3JzF28BjrINWI3xtt80",
                  "_score" : null,
                  "_source" : {
                    "id" : "final-fantasy-ix-ps-2000",
                    "name" : "Final Fantasy IX",
                    "year" : 2000,
                    "platform" : "PS",
                    "genre" : "Role-Playing",
                    "publisher" : "SquareSoft",
                    "global_sales" : 5.3,
                    "critic_score" : 94,
                    "user_score" : 8,
                    "developer" : "SquareSoft",
                    "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"
                  },
                  "sort" : [
                    8
                  ]
                },
                {
                  "_index" : "best_games",
                  "_type" : "_doc",
                  "_id" : "SHJzF28BjrINWI3xtuA1",
                  "_score" : null,
                  "_source" : {
                    "id" : "final-fantasy-tactics-ps-1997",
                    "name" : "Final Fantasy Tactics",
                    "year" : 1997,
                    "platform" : "PS",
                    "genre" : "Role-Playing",
                    "publisher" : "SquareSoft",
                    "global_sales" : 2.45,
                    "critic_score" : 83,
                    "user_score" : 8,
                    "developer" : "SquareSoft",
                    "image_url" : "https://www.thefinalfantasy.com/gallery/screenshots/ff-tactics/dynamic_previews/ff-tactics-screenshot-1_scale_800_700.jpg"
                  },
                  "sort" : [
                    8
                  ]
                }
              ]
            }
          },
          "top most recent games" : {
            "hits" : {
              "total" : {
                "value" : 3,
                "relation" : "eq"
              },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "best_games",
                  "_type" : "_doc",
                  "_id" : "E3JzF28BjrINWI3xtt80",
                  "_score" : null,
                  "_source" : {
                    "id" : "final-fantasy-ix-ps-2000",
                    "name" : "Final Fantasy IX",
                    "year" : 2000,
                    "platform" : "PS",
                    "genre" : "Role-Playing",
                    "publisher" : "SquareSoft",
                    "global_sales" : 5.3,
                    "critic_score" : 94,
                    "user_score" : 8,
                    "developer" : "SquareSoft",
                    "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"
                  },
                  "sort" : [
                    2000
                  ]
                },
                {
                  "_index" : "best_games",
                  "_type" : "_doc",
                  "_id" : "0nJzF28BjrINWI3xtt40",
                  "_score" : null,
                  "_source" : {
                    "id" : "final-fantasy-viii-ps-1999",
                    "name" : "Final Fantasy VIII",
                    "year" : 1999,
                    "platform" : "PS",
                    "genre" : "Role-Playing",
                    "publisher" : "SquareSoft",
                    "global_sales" : 7.86,
                    "critic_score" : 90,
                    "user_score" : 8,
                    "developer" : "SquareSoft",
                    "image_url" : "https://gamingheartscollection.files.wordpress.com/2018/02/final-fantasy-8.png?w=585"
                  },
                  "sort" : [
                    1999
                  ]
                },
                {
                  "_index" : "best_games",
                  "_type" : "_doc",
                  "_id" : "SHJzF28BjrINWI3xtuA1",
                  "_score" : null,
                  "_source" : {
                    "id" : "final-fantasy-tactics-ps-1997",
                    "name" : "Final Fantasy Tactics",
                    "year" : 1997,
                    "platform" : "PS",
                    "genre" : "Role-Playing",
                    "publisher" : "SquareSoft",
                    "global_sales" : 2.45,
                    "critic_score" : 83,
                    "user_score" : 8,
                    "developer" : "SquareSoft",
                    "image_url" : "https://www.thefinalfantasy.com/gallery/screenshots/ff-tactics/dynamic_previews/ff-tactics-screenshot-1_scale_800_700.jpg"
                  },
                  "sort" : [
                    1997
                  ]
                }
              ]
            }
          }
        }
      },

This way, for each publisher we get both its three games with the highest user_score and its three most recent games.
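When several inner_hits are needed, writing the request body by hand gets repetitive. The following sketch builds the same body as the request above from a small dict of names and sort fields. The function build_collapse_query and its parameters are assumptions made for illustration. It only produces the JSON body and does not talk to a cluster.

```python
import json

def build_collapse_query(text, collapse_field, inner_hit_sorts, size=3):
    """Build a search body that collapses on a field and requests several inner_hits."""
    # One inner_hits entry per (name, sort field) pair, each sorted descending.
    inner_hits = [
        {"name": name, "size": size, "sort": [{sort_field: "desc"}]}
        for name, sort_field in inner_hit_sorts.items()
    ]
    return {
        "query": {"match": {"name": text}},
        "collapse": {"field": collapse_field, "inner_hits": inner_hits},
        "sort": [{"critic_score": {"order": "desc"}}],
    }

body = build_collapse_query(
    "Final Fantasy",
    "publisher",
    {"top user liked": "user_score", "top most recent games": "year"},
)
print(json.dumps(body, indent=2))
```

Adding a third view of each group, for example by global_sales, would then only need one more entry in the dict.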

