python - Piping ripgrep output to Python for filtering (separating the filename from the match)

Question

I need to use ripgrep to find a certain pattern: a string that describes a chemical reaction. The ripgrep output looks like this:

~ rg -U --only-matching --vimgrep --replace='$1' '```smiles\n(.+)\n```'

Testing Smiles.md:5:1:OC(=O)CCC(=O)O>CCO.[H+]>CCOC(=O)CCC(=O)OCC
Another Smiles.md:5:1:CO>BrP(Br)Br>CBr

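Cool! But now I need to filter these results with a Python script, so I can pipe them into Python and read them from standard input. There is one problem, though: how can I rely on the delimiter? If my Python script takes everything after the third colon as the input string, how can I be sure the filename itself does not contain a colon? When piping to Python, how do I correctly separate the filename from the match?

Thanks,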

score 0 · Accepted Answer

How about adding a pre-check stage before running ripgrep:

dir="."        # set this to your target directory
badlist=()     # filenames that contain a colon
for f in "$dir"/*.md; do
    if [[ $f = *:* ]]; then             # if the filename contains ":"
        badlist+=("$f")                 # then add it to the badlist
    fi
done
if (( ${#badlist[@]} > 0 )); then       # if the badlist is not empty...
    echo "These file(s) contain a colon character. Rename them and run again."
    printf "    %s\n" "${badlist[@]}"
    exit 1                              # stop with a non-zero status
fi

rg -U --only-matching --vimgrep --replace='$1' '```smiles\n(.+)\n```' "$dir"/*.md | python-script

The code above stops before the main ripgrep stage if any filename contains a ":" character. If it finds any, rename those files and run it again.
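On the Python side, once the filenames are known to be colon-free, the first three colons on each line are guaranteed to be the --vimgrep delimiters. Here is a minimal sketch of the parsing step, assuming the path:line:column:match layout shown in the question. The names Hit and parse_vimgrep are made up for illustration. Splitting at most three times matters because the match itself may contain colons (SMILES atom-map labels such as [CH3:1], for example).

```python
from typing import Iterable, Iterator, NamedTuple


class Hit(NamedTuple):
    """One parsed record (hypothetical helper type, not part of ripgrep)."""
    path: str
    line: int
    column: int
    match: str


def parse_vimgrep(lines: Iterable[str]) -> Iterator[Hit]:
    """Yield a Hit for each `path:line:column:match` record.

    Assumes no path contains a colon, which the pre-check enforces.
    """
    for raw in lines:
        record = raw.rstrip("\n")
        if not record:
            continue  # skip blank lines
        # maxsplit=3 leaves any colons inside the match text untouched.
        path, line, column, match = record.split(":", 3)
        yield Hit(path, int(line), int(column), match)


# Sample lines taken from the question's output.
sample = [
    "Testing Smiles.md:5:1:OC(=O)CCC(=O)O>CCO.[H+]>CCOC(=O)CCC(=O)OCC\n",
    "Another Smiles.md:5:1:CO>BrP(Br)Br>CBr\n",
]
for hit in parse_vimgrep(sample):
    print(hit.path, hit.match, sep="\t")
```

In the real script you would pass sys.stdin instead of the sample list, e.g. parse_vimgrep(sys.stdin), and apply whatever filter you need to hit.match.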
