0

生产环境中偶尔会出现以下异常,

2020-01-29 17:10:46.085 ERROR 2852 --- [o-8022-exec-258] c.c.p.common.dao.SearchDao               : Search person by id failed

java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-832 [ACTIVE]
        at org.elasticsearch.client.RestClient.extractAndWrapCause(RestClient.java:789) ~[elasticsearch-rest-client-7.1.1.jar!/:7.1.1]
        at org.elasticsearch.client.RestClient.performRequest(RestClient.java:225) ~[elasticsearch-rest-client-7.1.1.jar!/:7.1.1]
        at org.elasticsearch.client.RestClient.performRequest(RestClient.java:212) ~[elasticsearch-rest-client-7.1.1.jar!/:7.1.1]
        at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1433) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]
        at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1403) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]
        at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1373) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]
        at org.elasticsearch.client.RestHighLevelClient.get(RestHighLevelClient.java:699) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]

Caused by: java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-832 [ACTIVE]
        at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.timeout(HttpAsyncRequestExecutor.java:387) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:92) ~[httpasyncclient-4.1.4.jar!/:4.1.4]
        at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:39) ~[httpasyncclient-4.1.4.jar!/:4.1.4]
        at org.apache.http.impl.nio.reactor.AbstractIODispatch.timeout(AbstractIODispatch.java:175) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.reactor.BaseIOReactor.sessionTimedOut(BaseIOReactor.java:263) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.timeoutCheck(AbstractIOReactor.java:492) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.reactor.BaseIOReactor.validate(BaseIOReactor.java:213) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:280) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591) ~[httpcore-nio-4.4.11.jar!/:4.4.11]

但这只是一个简单的查询,而不是一个复杂的查询

 curl 'http://localhost:9201/person/_doc/30154410564?pretty'

在这段时间内,负载非常低 在此处输入图像描述

那么为什么会存在这些超时异常呢?并且有很多搜索查询但是为什么只有这个简单的query by id可能会导致这个异常?

人物索引是从Oracle DB同步的,有一个定时任务,每隔10分钟会同步变化的人物索引,如果在这段时间内访问人物索引,会导致30,000 milliseconds timeout. 那么如何解决呢?而且好像Java客户端访问会出现这种现象,而curl命令行访问则不会出现这种现象。

PS:

health status index               uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   person              jb3msRw5S9ixgXN5SLd6bw   1   0  140754205     19239587     19.8gb         19.8gb

并且在这个时候有为人员索引写的索引 在此处输入图像描述

休息客户端配置:

private final RestHighLevelClient restHighLevelClient;
restHighLevelClient = new RestHighLevelClient(RestClient.builder(new HttpHost(host, port)));
4

1 回答 1

1

通过调用 hot_threads api

 curl 'http://localhost:9201/_nodes/hot_threads?pretty'

得到以下信息:

   100.9% (504.4ms out of 500ms) cpu usage by thread 'elasticsearch[node-1][get][T#6]'
     8/10 snapshots sharing following 33 elements
       app//org.apache.lucene.index.SingletonSortedNumericDocValues.nextDoc(SingletonSortedNumericDocValues.java:53)
       app//org.apache.lucene.codecs.lucene80.IndexedDISI.writeBitSet(IndexedDISI.java:196)
       app//org.apache.lucene.codecs.lucene80.Lucene80DocValuesConsumer.writeValues(Lucene80DocValuesConsumer.java:214)
       app//org.apache.lucene.codecs.lucene80.Lucene80DocValuesConsumer.addNumericField(Lucene80DocValuesConsumer.java:111)
       app//org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.addNumericField(PerFieldDocValuesFormat.java:109)
       app//org.apache.lucene.index.ReadersAndUpdates.handleDVUpdates(ReadersAndUpdates.java:368)
       app//org.apache.lucene.index.ReadersAndUpdates.writeFieldUpdates(ReadersAndUpdates.java:570)
       app//org.apache.lucene.index.ReaderPool.writeAllDocValuesUpdates(ReaderPool.java:228)
       app//org.apache.lucene.index.IndexWriter.writeReaderPool(IndexWriter.java:3308)
       app//org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:520)
       app//org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:294)
       app//org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:269)
       app//org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:259)
       app//org.apache.lucene.index.FilterDirectoryReader.doOpenIfChanged(FilterDirectoryReader.java:112)
       app//org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:140)
       app//org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:156)
       app//org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:58)
       app//org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:176)
       app//org.apache.lucene.search.ReferenceManager.maybeRefreshBlocking(ReferenceManager.java:253)
       app//org.elasticsearch.index.engine.InternalEngine.refresh(InternalEngine.java:1548)
       app//org.elasticsearch.index.engine.InternalEngine.get(InternalEngine.java:652)
       app//org.elasticsearch.index.shard.IndexShard.get(IndexShard.java:916)
       app//org.elasticsearch.index.get.ShardGetService.innerGet(ShardGetService.java:169)
       app//org.elasticsearch.index.get.ShardGetService.get(ShardGetService.java:93)
       app//org.elasticsearch.index.get.ShardGetService.get(ShardGetService.java:84)
       app//org.elasticsearch.action.get.TransportGetAction.shardOperation(TransportGetAction.java:106)
       app//org.elasticsearch.action.get.TransportGetAction.shardOperation(TransportGetAction.java:45)

似乎当通过id查询人时,内部自动执行刷新,并且从官方文档中得到

默认情况下,get API 是实时的,不受索引刷新率的影响(当数据对搜索可见时)。如果文档已更新但尚未刷新,get API 将就地发出刷新调用以使文档可见。这也将使自上次刷新后更改的其他文档可见。为了禁用实时 GET,可以将实时参数设置为 false。

注意:每次访问人员详情页面,都会更新此人的viewCount。

所以我明确禁用了实时

        GetRequest getRequest = new GetRequest(personIndex, id.toString());
        getRequest.realtime(false);

这样做时,超时问题就解决了。

于 2020-02-01T08:30:22.553 回答