1

我有以下格式,并希望使用 elasticsearch 进行批量预处理。

{"title":"April","url":"https://simple.wikipedia.org/wiki/April", "abstract":"April is the 4th month of the year, and comes between March and May. It is one of four months to have 30 days.","sections":["The Month","April in poetry","Events in April","Fixed Events","Moveable Events","Selection of Historical Events","Trivia","References"]}
{"title":"August","url":"https://simple.wikipedia.org/wiki/August", "abstract":"August (Aug.) is the 8th month of the year in the Gregorian calendar, coming between July and September.","sections":["The Month","August observances","Fixed observances and events","Moveable and Monthlong events","Selection of Historical Events","Trivia","References"]}

我正在尝试在我的每一行之前添加索引、类型行。

{"index":{"_index":"myindex","_type":"wiki","_id":"1"}}

在阅读之前的帖子时,我使用的是Kevin Marsh 的帖子,如下所示:

cat file.json jq -c '.[] | {"index": {"_index": "myindex", "_type": "wiki", "_id": .id}}, .' 

我没有使用管道,因为我试图找出之前的错误。我收到错误 jq:no such file or directory。然后我用jq --version and get jq-1.5-1-a5b5cbe.

任何帮助深表感谢。

4

2 回答 2

1

干得好。这对我有用。让我知道这是否有帮助。

cat data.json | jq -c '. | {"index": {"_index": "json", "_type": "json"}}, .'  | curl -XPOST localhost:9200/_bulk --data-binary @-

了解有关jq 的更多信息:轻量级且灵活的命令行 JSON 处理器

于 2017-10-17T19:57:17.920 回答
0

We're found it is necessary to specify Content-Type in the curl header; the suggested solution should be of the form:

cat data.json | jq -c '. | {"index": {"_index": "json", "_type": "json"}}, .' | curl -H "Content-Type: application/json" -XPOST localhost:9200/_bulk --data-binary @-

于 2018-10-22T14:52:54.930 回答