Elasticsearch

What is ES

ES official site: https://www.elastic.co/cn/
ES mirror: https://elasticsearch.cn/download/
Past releases: https://www.elastic.co/cn/downloads/past-releases

Chinese documentation:
https://elasticsearch.bookhub.tech/

ES (stores and searches data), Kibana (visualizes data), and Beats and Logstash (collect and ship data) together make up the ELK Stack.

Install a JDK; a JDK mirror site in China:
https://repo.huaweicloud.com/java/jdk/

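Installing ES

1. Download
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.3.2-linux-x86_64.tar.gz
2. Extract
tar -zxvf elasticsearch-8.3.2-linux-x86_64.tar.gz -C /opt/module/
3. Run
./elasticsearch   # start ES (run from the bin directory)
To stop ES, end the process with kill. A plain kill (SIGTERM) lets ES shut down cleanly; keep kill -9 as a last resort.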

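4. Directory layout
bin: ES startup scripts (elasticsearch.bat / elasticsearch)
config: configuration files
data: data directory
jdk, lib: bundled Java runtime and dependency libraries
logs: log directory
modules, plugins: modules and plugins. In old ES versions the head plugin could be placed under plugins; ES 5 and later no longer support site plugins, so head has to run separately.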

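Handling startup errors

1. java.lang.RuntimeException: can not run elasticsearch as root
Create a non-root user and start ES as that user:
groupadd esgroup                        # create the group first
useradd esroot -g esgroup               # create the user esroot in that group
passwd esroot                           # set the password interactively (useradd -p expects an already-encrypted password)
chown -R esroot:esgroup /opt/module/    # give the user ownership of the install directory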

Turning off the firewall

systemctl status firewalld   # check the firewall status
systemctl stop firewalld     # stop the firewall

Disabling the SSL access that ES enables by default

Reference: https://blog.csdn.net/qq_17229141/article/details/123106584
Edit the elasticsearch.yml configuration file
and set xpack.security.enabled to false.

Memory usage with and without ES (version 8.3.2 started with the default configuration; ES accounts for roughly 3.2G):

With ES running:
[root@MiWiFi-CR6609-srv tmp]# free -h
              total        used        free      shared  buff/cache   available
Mem:           5.7G        3.5G        120M        8.2M        2.0G        1.9G
Swap:          819M        520K        819M

With ES stopped:
[root@MiWiFi-CR6609-srv tmp]# free -h
              total        used        free      shared  buff/cache   available
Mem:           5.7G        318M        3.4G        8.2M        2.0G        5.1G
Swap:          819M        520K        819M

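Which Java ES uses

On startup, ES 8.x checks the ES_JAVA_HOME environment variable first; if it is not set, ES uses its bundled JDK. Older 7.x releases also honored JAVA_HOME, which is why a system-installed JDK was sometimes picked up.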

ES structure

Mapping ES concepts to MySQL

Index -> Database (now that Type is gone, an index is in practice closer to a table)

Type -> Table (Type was deprecated in ES 7 and removed in ES 8)

Document -> Row

Field -> Column

Inverted index

A conventional structured database stores:

id  content
1   name is zhang san
2   name is li si

An inverted index stores:

term   ids
name   1,2
is     1,2
zhang  1
san    1
li     2
si     2
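
The two tables above can be reproduced with a few lines of Python. This is only a sketch of the idea: it splits on whitespace, whereas ES runs a configurable analyzer, and the function name build_inverted_index is made up for this illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each whitespace-separated term to the sorted ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, content in docs.items():
        for term in content.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# The two rows from the "conventional database" table above.
docs = {1: "name is zhang san", 2: "name is li si"}
inverted = build_inverted_index(docs)

for term, ids in inverted.items():
    print(term, ids)
```

Looking up a term is then a single dictionary access, which is why searching by word is cheap compared with scanning every row.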

REST-style operations in ES

Index: create (comparable to a table in a relational database)

Create an index named shopping:

curl --location --request PUT '192.168.2.69:9200/shopping'

Response:

{
    "acknowledged": true,
    "shards_acknowledged": true,
    "index": "shopping"
}

Sending the same request with POST returns an error:

curl --location --request POST '192.168.2.69:9200/shopping'

Response:

{
    "error": "Incorrect HTTP method for uri [/shopping] and method [POST], allowed: [DELETE, HEAD, GET, PUT]",
    "status": 405
}

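Idempotency of ES requests

The request above uses PUT. The four methods allowed on this URI (GET, PUT, HEAD and DELETE) are all idempotent: repeating the request leaves the server in the same state. For example, if the PUT that creates the index is sent twice, the second request returns a 400 error with resource_already_exists_exception, but the end state is identical: exactly one shopping index (table) exists.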
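
To make the distinction between "same response" and "same state" concrete, here is a toy in-memory model. ToyCluster and put_index are invented for this sketch and have nothing to do with a real client; the status codes simply mirror the behavior described above.

```python
class ToyCluster:
    """In-memory stand-in for a cluster; it tracks index names only."""

    def __init__(self):
        self.indices = set()

    def put_index(self, name):
        """Model of PUT /<name>: create the index, or report that it already exists."""
        if name in self.indices:
            return 400, "resource_already_exists_exception"
        self.indices.add(name)
        return 200, "acknowledged"


cluster = ToyCluster()
first = cluster.put_index("shopping")
state_after_first = set(cluster.indices)
second = cluster.put_index("shopping")

print(first, second)
# The responses differ, but the repeated PUT did not change the state.
print(cluster.indices == state_after_first)
```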

ES index versus relational table

An index plays the role of a table.

Index: view details

curl --location --request GET '192.168.2.69:9200/shopping'

Response:

{
    "shopping": {
        "aliases": {},
        "mappings": {},
        "settings": {
            "index": {
                "routing": {
                    "allocation": {
                        "include": {
                            "_tier_preference": "data_content"
                        }
                    }
                },
                "number_of_shards": "1",
                "provided_name": "shopping",
                "creation_date": "1658840843423",
                "number_of_replicas": "1",
                "uuid": "DSpChW2oQ5mUQS4S9YXjTQ",
                "version": {
                    "created": "8030299"
                }
            }
        }
    }
}

Index: list all indices

curl --location --request GET '192.168.2.69:9200/_cat/indices?v'

Response:

health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   shopping DSpChW2oQ5mUQS4S9YXjTQ   1   1          0            0       225b           225b

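Meaning of the health column

Health	Condition	Description
red	Not all primary shards are available.	At least one index in the cluster has lost a primary shard.
yellow	All primary shards are available, but not all replica shards.	At least one index has missing or unassigned replica shards.
green	All primary and replica shards are available.	Every index is healthy and no shard is missing.

The shopping index above is yellow most likely because this is a single-node setup, where its one replica shard has nowhere to be allocated.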
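
The rule behind the three colors can be written down as a small function. This is a simplified sketch with a made-up name (index_health) and made-up parameters; the real cluster computes health per shard and then rolls it up.

```python
def index_health(primaries_total, primaries_active, replicas_total, replicas_active):
    """Return red, yellow or green following the table above."""
    if primaries_active < primaries_total:
        return "red"      # at least one primary shard is unavailable
    if replicas_active < replicas_total:
        return "yellow"   # primaries are fine, some replica is unavailable
    return "green"        # every primary and replica is available


# pri=1, rep=1 as in the _cat/indices output, with the replica unassigned.
print(index_health(1, 1, 1, 0))
```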

Index: delete

curl --location --request DELETE '192.168.2.69:9200/shopping'

Document: create

curl --location --request POST '192.168.2.69:9200/shopping/_doc' \
--header 'Content-Type: application/json' \
--data-raw '{
    "title": "小米手机",
    "category": "小米",
    "price": "3999.00"
}'

Response:

{
    "_index": "shopping",
    "_id": "xtyxOoIBZus9zN4wUsVr",
    "_version": 1,
    "result": "created",
    "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
    },
    "_seq_no": 0,
    "_primary_term": 1
}

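Documents with identical content can be added repeatedly: each request creates a new document and returns its own generated _id. Because sending the same data twice produces two different results, this is a non-idempotent request and uses POST.

POST is not required to be idempotent, whereas PUT is: a repeated PUT has to address the same _id every time. Creating a document without specifying an ID is therefore done with POST.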

Document: create with a specified ID

curl --location --request POST '192.168.2.69:9200/shopping/_doc/123' \
--header 'Content-Type: application/json' \
--data-raw '{
    "title": "小米手机",
    "category": "小米",
    "price": "3999.00"
}'

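_create can be used in place of _doc in the request. The difference is that _create fails with a 409 version conflict if the ID already exists, whereas _doc overwrites the existing document.

When an ID is supplied, ES has to check whether that ID already exists before indexing, which slows writes down, so this form is not used much.

With a specified ID, the _id in the response is always the one supplied, so the request becomes idempotent and PUT can be used as well.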

Document: get by ID

Fetch the document whose _id is 123:

curl --location --request GET '192.168.2.69:9200/shopping/_doc/123'

Response:

{
    "_index": "shopping",
    "_id": "123",
    "_version": 1,
    "_seq_no": 2,
    "_primary_term": 1,
    "found": true,
    "_source": {
        "title": "小米手机",
        "category": "小米",
        "price": "3999.00"
    }
}

When no document has the requested ID:

{
    "_index": "shopping",
    "_id": "888",
    "found": false
}

Document: search all

Returns 10 documents by default. In the response, took is the time spent, in milliseconds.

curl --location --request GET '192.168.2.69:9200/shopping/_search'

Response:

{
    "took": 4,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 3,
            "relation": "eq"
        },
        "max_score": 1.0,
        "hits": [
            {
                "_index": "shopping",
                "_id": "xtyxOoIBZus9zN4wUsVr",
                "_score": 1.0,
                "_source": {
                    "title": "小米手机",
                    "category": "小米",
                    "price": "3999.00"
                }
            },
            {
                "_index": "shopping",
                "_id": "x9y0OoIBZus9zN4wGsWM",
                "_score": 1.0,
                "_source": {
                    "title": "小米手机",
                    "category": "小米",
                    "price": "3999.00"
                }
            },
            {
                "_index": "shopping",
                "_id": "123",
                "_score": 1.0,
                "_source": {
                    "title": "小米手机",
                    "category": "小米",
                    "price": "3999.00"
                }
            }
        ]
    }
}

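Document: update

Method 1: full overwrite. This is the same command as creating a document with a specified ID: the whole document is replaced. It is idempotent, so PUT can be used. It is not used often.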

curl --location --request PUT '192.168.2.69:9200/shopping/_doc/123' \
--header 'Content-Type: application/json' \
--data-raw '{
    "title": "小米手机",
    "category": "小米",
    "price": "5999.00"
}'

Method 2: partial update, which changes only the fields listed under doc. This is the usual way to modify data.

curl --location --request POST '192.168.2.69:9200/shopping/_update/123' \
--header 'Content-Type: application/json' \
--data-raw '{
    "doc": {
        "title": "小米手机",
        "category": "小米",
        "price": "5999.00"
    }
}'
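
The difference between the two methods shows up as soon as the request body carries only some of the fields. The following toy model uses a plain dict as the document store; overwrite and partial_update are names invented here, and the merge is shallow (top-level fields only), which is a simplification of what ES does.

```python
def overwrite(store, doc_id, body):
    """Model of PUT /<index>/_doc/<id>: the stored document is replaced as a whole."""
    store[doc_id] = dict(body)


def partial_update(store, doc_id, doc):
    """Model of POST /<index>/_update/<id> with {"doc": ...}: only the given fields change."""
    if doc_id not in store:
        raise KeyError(doc_id)
    store[doc_id] = {**store[doc_id], **doc}


original = {"title": "小米手机", "category": "小米", "price": "3999.00"}

store_a = {"123": dict(original)}
partial_update(store_a, "123", {"price": "5999.00"})
print(store_a["123"])   # title and category survive

store_b = {"123": dict(original)}
overwrite(store_b, "123", {"price": "5999.00"})
print(store_b["123"])   # only price is left
```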

Document: delete

curl --location --request DELETE '192.168.2.69:9200/shopping/_doc/123'

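Search: conditional queries

By default (standard analyzer), ES splits Chinese query text into single characters, looks each one up in the inverted index, and returns every document that matches any of them.

URL form: find documents whose `category` is 小米

curl --location --request GET '192.168.2.69:9200/shopping/_search?q=category:小米'

The index contains 15 matching documents, but only the first 10 are returned.

Request-body form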

curl --location --request GET '192.168.2.69:9200/shopping/_search' \
--header 'Content-Type: application/json' \
--data-raw '{
    "query" : {
        "match" : {
            "category" : "小米"
        }
    }
}'

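query: the search to run

match: a full-text match query

Request-body form: no condition

With match_all and no condition in the body, every document is matched: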

curl --location --request GET '192.168.2.69:9200/shopping/_search' \
--header 'Content-Type: application/json' \
--data-raw '{
    "query" : {
        "match_all" : {
        }
    }
}'

This still returns only the default 10 documents.

Pagination

{
    "query" : {
        "match" : {
            "category" : "小米"
        }
    },
    "from": 1,
    "size" : 3 
}

from: offset of the first hit to return, counted from 0 (default 0)

size: number of hits to return (default 10)
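
Since from is an offset rather than a page number, a client usually converts between the two. A minimal sketch, assuming 1-based page numbers; page_params is a hypothetical helper, not part of any ES client.

```python
def page_params(page, size=10):
    """Translate a 1-based page number into the from/size pair used in the request body."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be at least 1")
    return {"from": (page - 1) * size, "size": size}


print(page_params(1))
print(page_params(2))
print(page_params(3, size=5))
```

The returned dict can be merged into the search body next to query.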

With the default page size,

the first page is:
"from": 0,
"size" : 10 

the second page is:
"from": 10,
"size" : 10

In general, page n starts at from = (n - 1) * size.

Pagination + selecting the fields to return

curl --location --request GET '192.168.2.69:9200/shopping/_search' \
--header 'Content-Type: application/json' \
--data-raw '{
    "query" : {
        "match" : {
            "category" : "小米"
        }
    },
    "from": 1,
    "size" : 3 ,
    "_source": ["title"]
}'

Adding a _source parameter to the body, whose value is an array, selects the fields to return.

Response:

{
    "took": 10,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 15,
            "relation": "eq"
        },
        "max_score": 0.063497394,
        "hits": [
            {
                "_index": "shopping",
                "_id": "ytzFOoIBZus9zN4wIsXi",
                "_score": 0.063497394,
                "_source": {
                    "title": "小米手机"
                }
            },
            {
                "_index": "shopping",
                "_id": "y9zFOoIBZus9zN4wJMX8",
                "_score": 0.063497394,
                "_source": {
                    "title": "小米手机"
                }
            },
            {
                "_index": "shopping",
                "_id": "zNzFOoIBZus9zN4wJ8UA",
                "_score": 0.063497394,
                "_source": {
                    "title": "小米手机"
                }
            }
        ]
    }
}

Pagination + sorting

Add a sort field to the body (from is the starting offset, size the number of hits per page):

curl --location --request GET '192.168.2.69:9200/shopping/_search' \
--header 'Content-Type: application/json' \
--data-raw '{
    "query" : {
        "match" : {
            "category" : "小米"
        }
    },
    "from": 1,
    "size" : 3 ,
    "sort": {
        "price" : {
            "order" : "desc"
        }
    }
}'

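Sorting on a text field raises an error, because fielddata is disabled on text fields by default; sort on a numeric or keyword field instead.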

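Compound queries

Every condition in the must array has to match; here the only condition is that category matches 小米. In the body, query is the search, bool combines conditions, must lists conditions that all have to hold (like AND; the OR counterpart is should), and each match is one matching rule.

{
    "query" : {
        "bool" : {
            "must" : [
                { "match": { "category" : "小米" } }
            ]
        }
    }
}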

AND conditions

Condition: category is 小米 and price is 3889.00

{
    "query" : {
        "bool" :{
            "must" : [
                { "match": {"category" : "小米"} },
                { "match": {"price" : "3889.00"} }
            ]
        }
    }
}

OR conditions

Condition: category is 华为 or price is 3889.00

{
    "query" : {
        "bool" :{
            "should" : [
                { "match": {"category" : "华为"} },
                { "match": {"price" : "3889.00"} }
            ]
        }
    }
}

Price range query

Price greater than 5000, with category matching 华为

{
    "query" : {
        "bool" :{
            "must" : [
                { "match": {"category" : "华为"} }
            ],
            "filter":{
                "range" :{"price" : {"gt": 5000}}
            }
        }
    }
}
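
When these bodies are assembled in code rather than typed by hand, small helpers keep the nesting straight. The sketch below only builds the JSON; the helper names (match, range_gt, bool_query) are invented for this example, and sending the body to ES is left out.

```python
import json


def match(field, value):
    """One match clause."""
    return {"match": {field: value}}


def range_gt(field, value):
    """One range clause with a strict lower bound."""
    return {"range": {field: {"gt": value}}}


def bool_query(must=(), should=(), filters=()):
    """Build a bool query body; empty clause lists are left out."""
    clauses = {}
    if must:
        clauses["must"] = list(must)
    if should:
        clauses["should"] = list(should)
    if filters:
        clauses["filter"] = list(filters)
    return {"query": {"bool": clauses}}


# Same conditions as the range example above: category 华为 and price > 5000.
body = bool_query(must=[match("category", "华为")], filters=[range_gt("price", 5000)])
print(json.dumps(body, ensure_ascii=False, indent=4))
```

Here filter is emitted as an array, which ES accepts just like the single-object form used above.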

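Single-character matching: the match query analyzes the query text and searches for each resulting token, so any document containing one of the tokens is a hit. Switching to match_phrase requires the tokens to appear adjacent and in the same order, which behaves like an exact match here.

Searching "小米": 15 results

Searching "小" or "米": 15 results

Searching "华为": 1 result

Searching "华" or "为": 1 result

Searching "米华": 16 results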

curl --location --request GET '192.168.2.69:9200/shopping/_search' \
--header 'Content-Type: application/json' \
--data-raw '{
    "query" : {
        "match" : {
            "category" : "小华"
        }
    }
}'

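Loose matching versus exact matching

In ES, Chinese text is split into characters before it goes into the inverted index, so 小米 can be found through either 小 or 米.

A query for 小华 returns data as well: ES looks up 小 and 华 separately and returns every document that matches either one.

Exact matching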

curl --location --request GET '192.168.2.69:9200/shopping/_search' \
--header 'Content-Type: application/json' \
--data-raw '{
    "query" : {
        "match_phrase" : {
            "category" : "华为"
        }
    }
}'
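
The counts reported above follow directly from how the two query types treat tokens. The toy model below assumes one token per character (a stand-in for what the standard analyzer does with Chinese) and an index of 15 documents with category 小米 plus one with 华为, as the result counts suggest; the function names are made up and scoring is ignored.

```python
def tokenize(text):
    """Stand-in for the analyzer on Chinese text: one token per character."""
    return [ch for ch in text if not ch.isspace()]


def match(query, field_value):
    """True if any query token occurs in the field (OR semantics)."""
    field_tokens = set(tokenize(field_value))
    return any(tok in field_tokens for tok in tokenize(query))


def match_phrase(query, field_value):
    """True only if the query tokens occur adjacent and in order in the field."""
    q, f = tokenize(query), tokenize(field_value)
    return any(f[i:i + len(q)] == q for i in range(len(f) - len(q) + 1))


# Assumed data set, inferred from the result counts in the text.
categories = ["小米"] * 15 + ["华为"]

for query in ["小米", "小", "华为", "小华"]:
    loose = sum(match(query, c) for c in categories)
    exact = sum(match_phrase(query, c) for c in categories)
    print(query, loose, exact)
```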

Highlighting

curl --location --request GET '192.168.2.69:9200/shopping/_search' \
--header 'Content-Type: application/json' \
--data-raw '{
    "query" : {
        "match_phrase" : {
            "category" : "华为"
        }
    },
    "highlight": {
        "fields": {
            "category" : {}
        }
    }
}'

Response (abridged):

{
    "took": 53,
    "timed_out": false,
    "_shards": {

    },
    "hits": {

        "max_score": 4.8554964,
        "hits": [
            {
                "highlight": {
                    "category": [
                        "<em>华</em><em>为</em>"
                    ]
                }
            }
        ]
    }
}
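
Each character is wrapped in its own <em> tag because highlighting works per matched token, and each Chinese character is a separate token here. A rough imitation of that output, assuming one token per character; highlight is a hypothetical helper, and real ES highlighters do considerably more (fragments, offsets, custom tags).

```python
def highlight(query, field_value, pre="<em>", post="</em>"):
    """Wrap every character of field_value that is also a query token in pre/post tags."""
    tokens = {ch for ch in query if not ch.isspace()}
    return "".join(pre + ch + post if ch in tokens else ch for ch in field_value)


print(highlight("华为", "华为"))  # <em>华</em><em>为</em>
```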

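Analytics: group-by aggregation (terms)

In the request body, aggs introduces the aggregations, price_group is an arbitrary name for this one, terms groups documents by the value of the given field, and "size": 0 keeps the raw documents out of the response. The notes are kept outside the body so that it stays plain JSON.

curl --location --request GET '192.168.2.69:9200/shopping2/_search' \
--header 'Content-Type: application/json' \
--data-raw '{
    "aggs" : {
        "price_group" : {
            "terms" : {
                "field" : "price"
            }
        }
    },
    "size" : 0
}'

Note that aggregating on a non-numeric text field raises an error: text fields cannot be aggregated by default.

By default the response also contains the raw documents under hits; adding "size": 0 leaves them out so that only the aggregation result is shown.

Aggregation response: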

{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 5,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "price_group": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": 3999.0,
                    "doc_count": 4
                },
                {
                    "key": 13999.0,
                    "doc_count": 1
                }
            ]
        }
    }
}
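
What the terms aggregation computes is essentially a count per distinct value. A small sketch that reproduces the buckets above; the five prices are inferred from the response (four documents at 3999.0 and one at 13999.0), and terms_buckets is a name made up for this illustration.

```python
from collections import Counter


def terms_buckets(docs, field):
    """Count documents per distinct value of field, largest bucket first."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    return [{"key": key, "doc_count": count} for key, count in counts.most_common()]


# Assumed contents of shopping2, reconstructed from the response above.
docs = [{"price": 3999.0}] * 4 + [{"price": 13999.0}]
print(terms_buckets(docs, "price"))
```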

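Analytics: average aggregation (avg)

Here price_avg is an arbitrary name, avg computes the average of the given field, and "size": 0 keeps the raw documents out of the response. The notes are kept outside the body so that it stays plain JSON.

curl --location --request GET '192.168.2.69:9200/shopping2/_search' \
--header 'Content-Type: application/json' \
--data-raw '{
    "aggs" : {
        "price_avg" : {
            "avg" : {
                "field" : "price"
            }
        }
    },
    "size" : 0
}'

Response: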

{
    "took": 2,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 5,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "price_avg": {
            "value": 5999.0
        }
    }
}
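
The value 5999.0 can be checked by hand against the bucket counts from the group-by response: (4 × 3999 + 13999) / 5 = 29995 / 5 = 5999. The same check as a sketch, again assuming those five prices; avg_agg is a hypothetical helper.

```python
def avg_agg(docs, field):
    """Average of field over the documents that have it, in the shape of the avg response."""
    values = [doc[field] for doc in docs if field in doc]
    return {"value": sum(values) / len(values) if values else None}


# Assumed contents of shopping2: four documents at 3999.0 and one at 13999.0.
docs = [{"price": 3999.0}] * 4 + [{"price": 13999.0}]
print(avg_agg(docs, "price"))
```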

Field settings (mapping)

{
    "properties": {
        "name": {
            "type": "text",
            "index": true
        },
        "sex": {
            "type": "keyword",
            "index": true
        },
        "tel": {
            "type": "keyword",
            "index": false
        }
    }
}

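type: text is ordinary full-text content; it is analyzed, so it can be searched by token.

type: keyword is an exact value; it is not analyzed, so only the whole value matches.

index: false means the field is not indexed. According to these notes, querying such a field raises an error on 7.x, while on 8.x the field can still be searched, though without tokenized matching.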

Integrating ES with Java

Deploying an ES cluster

Install and extract the software
Create a user
Edit the configuration files
