Elasticsearch Relationship Models: Nested Types and Parent-Child Documents - ElasticSearch from Zero to Practice


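Elasticsearch offers two ways to model relationships between documents: nested types and parent-child documents. Both let you define complex data structures within an index. This article describes how to use each of them and provides example code.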

Nested types

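A nested type embeds documents inside another document as the value of one of its fields. A single document can contain multiple nested documents, and searches and filters can target those nested documents individually.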
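
Queries can target each nested document on its own because Elasticsearch indexes every nested object as a separate hidden document, whereas a plain array of objects is flattened into one list of values per sub-field. The pure-Python sketch below only imitates that difference; it does not call Elasticsearch, and the function name and the sample authors are invented for illustration.

```python
def flatten_object_array(field, objects):
    """Imitate a plain object array: one multi-value list per sub-field."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(f'{field}.{key}', []).append(value)
    return flat


# Hypothetical sample data
authors = [
    {'name': 'Alice', 'age': 30},
    {'name': 'Bob', 'age': 45},
]
flat = flatten_object_array('authors', authors)

# Flattened: the link between a name and its age is lost, so a search for
# name == 'Alice' AND age == 45 succeeds although no such author exists.
flat_match = 'Alice' in flat['authors.name'] and 45 in flat['authors.age']

# Nested: each author is checked on its own, so the same search fails.
nested_match = any(a['name'] == 'Alice' and a['age'] == 45 for a in authors)
```

This cross-object match is the usual reason to map a field as nested when its sub-fields need to be queried together.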

The following example uses the Elasticsearch Python client library to create an index with a nested type:

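from elasticsearch import Elasticsearch
# Assumes a local, unsecured node; adjust the URL for your cluster
es = Elasticsearch('http://localhost:9200')
index_name = 'my_index'
mapping = {
    'properties': {
        'title': {
            'type': 'text'
        },
        # Each entry in 'authors' is indexed as its own nested document
        'authors': {
            'type': 'nested',
            'properties': {
                'name': {
                    'type': 'text'
                },
                'age': {
                    'type': 'integer'
                }
            }
        }
    }
}
# Create the index with the mapping above
es.indices.create(index=index_name, body={'mappings': mapping})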

The example above creates an index named "my_index" whose mapping includes a nested field, "authors". Each nested author has two fields: name and age.
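
Searching inside nested documents requires the "nested" query, which runs its inner query against each nested object separately. The sketch below only builds the request body for the "authors" field; the helper name build_nested_author_query and the sample values are made up for illustration.

```python
def build_nested_author_query(name, min_age):
    """Build a search body matching documents where a single author
    satisfies both the name and the age condition."""
    return {
        'query': {
            'nested': {
                'path': 'authors',  # the nested field to search in
                'query': {
                    'bool': {
                        'must': [
                            {'match': {'authors.name': name}},
                            {'range': {'authors.age': {'gte': min_age}}}
                        ]
                    }
                }
            }
        }
    }


# Hypothetical sample values
query = build_nested_author_query('Alice', 30)
```

With a running cluster, the body could be passed to es.search(index='my_index', body=query) in the same way as the other queries in this article.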

Parent-child documents

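In a parent-child relationship, one document acts as the parent of another, and the relationship is defined in the index through the parent document's ID. Parent and child are stored as separate documents, so related documents can be grouped under one parent while the children carry different kinds of data, and a child can be added or changed without reindexing its parent.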

The following example uses the Elasticsearch Python client library to create an index that holds parent and child documents:

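from elasticsearch import Elasticsearch
# Assumes a local, unsecured node; adjust the URL for your cluster
es = Elasticsearch('http://localhost:9200')
# Delete 'my_index' first if it already exists, e.g. from the nested example
index_name = 'my_index'
mapping = {
    'properties': {
        'title': {
            'type': 'text'
        },
        'category': {
            'type': 'keyword'
        },
        # Since Elasticsearch 6.0 the relation is declared with a join field;
        # the field name 'relation' is chosen for this example
        'relation': {
            'type': 'join',
            'relations': {
                'parent': 'child'
            }
        }
    }
}
es.indices.create(index=index_name, body={'mappings': mapping})
parent_doc = {
    'title': 'Parent Document',
    'category': 'category1',
    'relation': 'parent'
}
# Add the parent document
parent_doc_id = es.index(index=index_name, body=parent_doc)['_id']
child_doc1 = {
    'title': 'Child Document 1',
    'content': 'This is the content of child document 1',
    'relation': {'name': 'child', 'parent': parent_doc_id}
}
child_doc2 = {
    'title': 'Child Document 2',
    'content': 'This is the content of child document 2',
    'relation': {'name': 'child', 'parent': parent_doc_id}
}
# Add child document 1; routing places it on the parent's shard
es.index(index=index_name, body=child_doc1, routing=parent_doc_id)
# Add child document 2
es.index(index=index_name, body=child_doc2, routing=parent_doc_id)

The example above creates an index named "my_index" that holds two kinds of document: parents and children. Since Elasticsearch 6.0, the relation is declared with a join field in the mapping rather than with separate mapping types and a "parent" parameter. We first index a parent document and then add two child documents to it. Note that each child names its parent's ID in the join field and is indexed with the "routing" parameter set to that ID, because a child must be stored on the same shard as its parent.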

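Searching parent-child documents requires dedicated queries. The following example finds the parent documents that have at least one child matching a query: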

from elasticsearch import Elasticsearch
# Assumes a local, unsecured node; adjust the URL for your cluster
es = Elasticsearch('http://localhost:9200')
index_name = 'my_index'
# Build the query
query = {
    'query': {
        'has_child': {
            'type': 'child',
            'query': {
                'match': {
                    'content': 'child'
                }
            }
        }
    }
}
# Run the query
results = es.search(index=index_name, body=query)
# Print the ID of each matching parent document
for hit in results['hits']['hits']:
    print(hit['_id'])

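The example above uses Elasticsearch's "has_child" query, which matches parent documents through their children of the given type. Here it looks for child documents whose content contains the term "child" and returns the IDs of their parent documents.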
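
To go the other way, from a parent to its children, Elasticsearch provides the "parent_id" and "has_parent" queries. The sketch below only builds the request bodies; it assumes the relation names "parent" and "child" are declared in a join field in the mapping, and the helper names and the sample ID are hypothetical.

```python
def build_parent_id_query(parent_id, child_type='child'):
    """Build a search body matching the children of one specific parent."""
    return {
        'query': {
            'parent_id': {
                'type': child_type,
                'id': parent_id
            }
        }
    }


def build_has_parent_query(category, parent_type='parent'):
    """Build a search body matching children whose parent has the category."""
    return {
        'query': {
            'has_parent': {
                'parent_type': parent_type,
                'query': {
                    'term': {
                        'category': category
                    }
                }
            }
        }
    }


# Hypothetical sample values
children_of_one = build_parent_id_query('hypothetical-parent-id')
children_by_category = build_has_parent_query('category1')
```

Either body could be passed to es.search(index='my_index', body=...) against a running cluster; both return child documents rather than parents.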

Summary

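Nested types and parent-child documents are both powerful tools for handling complex data structures in Elasticsearch. They help you organize and search data more effectively while avoiding data redundancy. This article provided example code showing how to use the Elasticsearch client library in Python to create indices with nested types and parent-child documents and to search them.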
