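I'm trying to parse a text format. I want to mark inline code with backticks (`), just like SO does. The rule should be that if you want to use a backtick inside an inline code element, you wrap the inline code in double backticks.

Like this:

``Mark inline code with backticks (`)``

For some reason my parser seems to skip the double backticks entirely. Here is the code of the function that does the inline code parsing: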

    private string ParseInlineCode(string input)
    {
        for (int i = 0; i < input.Length; i++)
        {
            if (input[i] == '`' && input[i - 1] != '\\')
            {
                if (input[i + 1] == '`')
                {
                    string str = ReadToCharacter('`', i + 2, input);
                    while (input[i + str.Length + 2] != '`')
                    {
                        str += ReadToCharacter('`', i + str.Length + 3, input);
                    }
                    string tbr = "``" + str + "``";
                    str = str.Replace("&", "&amp;");
                    str = str.Replace("<", "&lt;");
                    str = str.Replace(">", "&gt;");
                    input = input.Replace(tbr, "<code>" + str + "</code>");
                    i += str.Length + 13;
                }
                else
                {
                    string str = ReadToCharacter('`', i + 1, input);
                    input = input.Replace("`" + str + "`", "<code>" + str + "</code>");
                    i += str.Length + 13;
                }
            }
        }
        return input;
    }

If I use single backticks around something, it is correctly wrapped in <code> tags.

2 Answers

In the while loop

    while (input[i + str.Length + 2] != '`')
    {
        str += ReadToCharacter('`', i + str.Length + 3, input);
    }

you are checking the wrong index, `i + str.Length + 2` instead of `i + str.Length + 3`, and in turn you have to add the backtick back into the body. It should be

    while (input[i + str.Length + 3] != '`')
    {
        str += '`' + ReadToCharacter('`', i + str.Length + 3, input);
    }

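But there are more bugs in your code. The following line throws an `IndexOutOfRangeException` if the first character of the input is a backtick.

    if (input[i] == '`' && input[i - 1] != '\\')

The following line throws an `IndexOutOfRangeException` if the input contains an odd number of separate backticks and the last character of the input is a backtick.

    if (input[i + 1] == '`')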
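The two out-of-range reads described above could be guarded with small helpers along these lines. This is only a minimal sketch: the class name `BacktickChecks` and both method names are hypothetical, and the escape rule (a single preceding backslash) is taken over from the question unchanged.

```csharp
using System;

public static class BacktickChecks
{
    // True when input[i] is a backtick that is not preceded by a backslash.
    // The i == 0 test short-circuits the look-behind, so input[-1] is never read.
    public static bool IsUnescapedBacktick(string input, int i)
    {
        return input[i] == '`' && (i == 0 || input[i - 1] != '\\');
    }

    // True when a character exists after position i and it is a backtick.
    // The length test keeps the look-ahead inside the string.
    public static bool IsFollowedByBacktick(string input, int i)
    {
        return i + 1 < input.Length && input[i + 1] == '`';
    }
}
```

With helpers like these, the two conditions in the loop would read `BacktickChecks.IsUnescapedBacktick(input, i)` and `BacktickChecks.IsFollowedByBacktick(input, i)`, and neither throws for a backtick at the very start or very end of the input.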

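You should probably refactor your code into smaller methods instead of handling many cases in a single method, which is very error-prone. If you have not written unit tests for the code yet, I strongly suggest doing so. And because a parser is not easy to test, since you have to be prepared for all kinds of invalid input, you could have a look at PEX, a tool that automatically generates test cases for your code by analyzing all branching points and trying to exercise every possible code path.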
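One way such a split into smaller methods might look is sketched below. It is not the asker's method and makes several assumptions: the names `InlineCodeParser`, `Parse`, `IsEscaped`, `CountBackticks`, `FindClosingRun` and `EscapeHtml` are hypothetical, a code span is assumed to close at the next backtick run of the same length as the opening run, an unterminated run is emitted as literal text, and a null input is rejected with `ArgumentNullException`.

```csharp
using System;
using System.Text;

public static class InlineCodeParser
{
    public static string Parse(string input)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));

        var output = new StringBuilder();
        int i = 0;
        while (i < input.Length)
        {
            // Ordinary characters and escaped backticks are copied through.
            if (input[i] != '`' || IsEscaped(input, i))
            {
                output.Append(input[i]);
                i++;
                continue;
            }

            int run = CountBackticks(input, i);
            int bodyStart = i + run;
            int close = FindClosingRun(input, bodyStart, run);
            if (close < 0)
            {
                // No matching closing run: keep the backticks as literal text.
                output.Append('`', run);
                i = bodyStart;
                continue;
            }

            string body = input.Substring(bodyStart, close - bodyStart);
            output.Append("<code>").Append(EscapeHtml(body)).Append("</code>");
            i = close + run;
        }
        return output.ToString();
    }

    // A backtick counts as escaped when a backslash directly precedes it.
    private static bool IsEscaped(string input, int i)
    {
        return i > 0 && input[i - 1] == '\\';
    }

    // Length of the run of consecutive backticks starting at 'start'.
    private static int CountBackticks(string input, int start)
    {
        int end = start;
        while (end < input.Length && input[end] == '`') end++;
        return end - start;
    }

    // Index of the next backtick run with exactly 'length' backticks, or -1.
    private static int FindClosingRun(string input, int start, int length)
    {
        int i = start;
        while (i < input.Length)
        {
            if (input[i] != '`') { i++; continue; }
            int run = CountBackticks(input, i);
            if (run == length) return i;
            i += run; // shorter or longer runs belong to the body
        }
        return -1;
    }

    // Same three replacements as in the question; '&' has to go first.
    private static string EscapeHtml(string text)
    {
        return text.Replace("&", "&amp;").Replace("<", "&lt;").Replace(">", "&gt;");
    }
}
```

Each helper takes explicit indices and checks them against `input.Length`, so it can be unit-tested separately, and building the result in a `StringBuilder` avoids calling `Replace` on the string that is being iterated.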

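I quickly fired up PEX and ran it against the code. It found the `IndexOutOfRangeException`s I had in mind and a few more. And of course PEX found the obvious `NullReferenceException` when the input is a null reference. These are the inputs PEX found to cause exceptions.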

case1 = "`"

case2 = "\0`"

case3 = "\0``"

case4 = "\0`\0````````````\u0001``````````````\0\0\0\0\0\0\0\0\0\0\0````"

case5 = "\0`\0````````````\u0001``````````````\0\0\0\0\0\0\0\0\0\0\0```\0````````````\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0`"

case6 = "\0`\0````````````\u0001``````````````\0\0\0\0\0\0\0\0\0\0\0```\0````````````\0\0\0\0\0\0\0\0\0\0``<\0\0`````````````````````````````````````````````````````````````````````````````````````\0\0\0\0\0\0\0\0\0\0``<\0\0```````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````\0\0\0\0\0\0\0\0\0`\0```````````````"

My "fix" to your code changed which inputs cause exceptions (and may have introduced new bugs as well). PEX caught the following in the modified code.

case7 = "\0```"

case8 = "\0`\0````````````\u0001``````````````\0\0\0\0\0\0\0\0\0\0\0```\0`\0"

case9 = "\0`\0````````````\u0001``````````````\0\0\0\0\0\0\0\0\0\0\0```\0````````````\0\0\0\0\0\0\0\0\0\0``<\0\0`````````````````````````````````````````````````````````````````````````````````````\0\0\0\0\0\0\0\0\0\0``\0`\0`\0``"

None of these three inputs caused an exception in the original code, while cases 4 and 6 no longer cause an exception in the modified code.

answered 2010-05-25T20:29:57.730

Here is a small snippet, tested in LinqPad, to help you get started

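    void Main()
    {
        string test = "here is some code `public void Method( )` but ``this is not code``";
        // Match a run of non-backtick characters enclosed in single backticks.
        Regex r = new Regex( @"(`[^`]+`)" );

        MatchCollection matches = r.Matches( test );

        foreach( Match match in matches )
        {
            Console.Out.WriteLine( match.Value );
            // A match preceded by another backtick came from a double-backtick span.
            // The Index > 0 guard avoids reading test[-1] when the match starts the string.
            if( match.Index > 0 && test[match.Index - 1] == '`' )
                Console.Out.WriteLine( "NOT CODE" );
            else
                Console.Out.WriteLine( "CODE" );
        }
    }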

Output:

`public void Method( )`
CODE
`this is not code`
NOT CODE
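If double-backtick spans are meant to be code as well, as the question describes, the regex idea could be extended with a backreference so that the closing run has to be as long as the opening run. The following is a rough sketch rather than a tested production pattern: the names `InlineCodeRegex`, `ToHtml` and `EscapeHtml` are hypothetical, and backslash-escaped backticks are not handled.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InlineCodeRegex
{
    // Group 1: the full opening run of backticks (the look-arounds stop it
    // from starting or ending in the middle of a longer run).
    // Group 2: the body, matched lazily.
    // \1: a closing run identical to the opening run, again not part of a longer run.
    private static readonly Regex Span =
        new Regex(@"(?<!`)(`+)(?!`)(.+?)(?<!`)\1(?!`)");

    public static string ToHtml(string input)
    {
        return Span.Replace(input,
            m => "<code>" + EscapeHtml(m.Groups[2].Value) + "</code>");
    }

    // Minimal HTML escaping; '&' has to be replaced first.
    private static string EscapeHtml(string text)
    {
        return text.Replace("&", "&amp;").Replace("<", "&lt;").Replace(">", "&gt;");
    }
}
```

Applied to the test string above, `InlineCodeRegex.ToHtml(test)` wraps both `public void Method( )` and `this is not code` in `<code>` elements, and a single backtick inside a double-backtick span stays part of the body.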
answered 2010-05-25T19:50:41.413