There are two different kinds of wikitext hyperlinks:

[[stack]]
[[heap (memory region)|heap]]

I want to remove the hyperlinks but keep the visible text:

stack
heap

Currently I run two passes, using two different regular expressions:

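import java.util.regex.Pattern;

public class LinkRemover
{
    // Matches [[target|text]] and captures the visible text.
    private static final Pattern
    renamingLinks = Pattern.compile("\\[\\[[^\\]]+?\\|(.+?)\\]\\]");

    // Matches [[text]] and captures the text.
    private static final Pattern
    simpleLinks = Pattern.compile("\\[\\[(.+?)\\]\\]");

    public static String removeLinks(String input)
    {
        // First pass: renaming links. Second pass: the remaining simple links.
        String temp = renamingLinks.matcher(input).replaceAll("$1");
        return simpleLinks.matcher(temp).replaceAll("$1");
    }
}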

Is there a way to "fuse" the two regular expressions into one that achieves the same result?

If you want to check the correctness of your proposed solution, here is a simple test class:

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class LinkRemoverTest
{
    @Test
    public void test()
    {
        String input = "A sheep's [[wool]] is the most widely used animal fiber, and is usually harvested by [[Sheep shearing|shearing]].";
        String expected = "A sheep's wool is the most widely used animal fiber, and is usually harvested by shearing.";
        String output = LinkRemover.removeLinks(input);
        assertEquals(expected, output);
    }
}

1 Answer

You can make the part up to and including the pipe optional:

\\[\\[(?:[^\\]|]*\\|)?([^\\]]+)\\]\\]

And to make sure a match always stays between one pair of square brackets, use character classes instead of the dot.
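
To illustrate why the character classes matter, here is a small sketch that compares the pattern above with a dot-based variant. The dot-based pattern and the class name `CharClassDemo` are made up for this comparison and are not part of the original answer.

```java
import java.util.regex.Pattern;

class CharClassDemo
{
    // Hypothetical dot-based variant, shown only for contrast.
    static final Pattern dotBased =
        Pattern.compile("\\[\\[(?:.*?\\|)?(.+?)\\]\\]");

    // Character-class pattern from the answer.
    static final Pattern classBased =
        Pattern.compile("\\[\\[(?:[^\\]|]*\\|)?([^\\]]+)\\]\\]");

    static String strip(Pattern p, String input)
    {
        return p.matcher(input).replaceAll("$1");
    }

    public static void main(String[] args)
    {
        String input = "A [[wool]] and [[Sheep shearing|shearing]].";
        // The dot can run past the first "]]" while looking for a pipe,
        // so both links collapse into one match.
        System.out.println(strip(dotBased, input));   // A shearing.
        // The character classes cannot cross a "]", so each link is matched on its own.
        System.out.println(strip(classBased, input)); // A wool and shearing.
    }
}
```

With the dot, the optional group is free to consume `wool]] and [[Sheep shearing` on its way to the first pipe, which is why the first link disappears.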

Pattern details:

\\[\\[         # literal opening square brackets
(?:            # open a non-capturing group
    [^\\]|]*   # zero or more characters that are not a ] or a |
    \\|        # literal |
)?             # make the group optional
([^\\]]+)      # capture everything up to the closing square bracket
\\]\\]         # literal closing square brackets
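
As a sketch of how this could replace the two passes in the question, the class below uses the single pattern with one `replaceAll` call. The class name `SinglePassLinkRemover` and the field name `links` are illustrative, not from the original post.

```java
import java.util.regex.Pattern;

class SinglePassLinkRemover
{
    // Optional "target|" prefix, then the visible text captured in group 1.
    private static final Pattern
    links = Pattern.compile("\\[\\[(?:[^\\]|]*\\|)?([^\\]]+)\\]\\]");

    public static String removeLinks(String input)
    {
        // One pass handles both [[text]] and [[target|text]].
        return links.matcher(input).replaceAll("$1");
    }
}
```

Run against the input from the question's test, this should produce the expected sentence with `wool` and `shearing` left as plain text.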
answered 2015-04-28T11:47:47.337