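This regex stuff is getting old. :( One more question: I need to count the number of words and the number of sentences in a paragraph. The code I tried looks like this: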

my $sentencecount = $file =~ s/((^|\s)\S).*?(\.|\?|\!)/$1/g;
my $count = $file =~ s/((^|\s)\S)/$2/g;
print "Input file $ARGV[1] contains $sentencecount sentences and $count words.";

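My result returns 63 for both counts. I know that is incorrect, at least for the word count. Is this a consequence of using substitution to do the counting? If so, how can I correct it?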

3 Answers

I'd suggest looking at Perl's split function; see perlfunc(1):

           If EXPR is omitted, splits the $_ string.  If PATTERN is also
           omitted, splits on whitespace (after skipping any leading
           whitespace).  Anything matching PATTERN is taken to be a
           delimiter separating the fields.  (Note that the delimiter may
           be longer than one character.)
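
As a rough sketch of how that could be applied to the word count, the snippet below splits on whitespace and counts the resulting fields. The sample string and the variable names are made up for illustration; in the question the text would come from $file.

```perl
use strict;
use warnings;

# Hypothetical sample text standing in for the contents of $file.
my $text = "  Hello there. How are you?  ";

# split ' ' (a single-space string) is the special case that skips leading
# whitespace and splits on runs of whitespace, as the quoted documentation says.
my @words      = split ' ', $text;
my $word_count = scalar @words;

print "$word_count words\n";
```

Note that this builds the full list of words in memory, which is fine for a paragraph but may matter for a very large file.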
answered 2011-01-31T01:19:24.397

my $wordCount = 0;
++$wordCount while $file =~ /\S+/g;

my $sentenceCount = 0;
++$sentenceCount while $file =~ /[.!?]+/g;

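Matching with //g in scalar context, as we do here, avoids building a huge list of all the words or all the sentences, which saves memory if the file is large. The sentence-counting code treats any run of sentence-ending punctuation as a single sentence end (for example, Hello... world! counts as 2 sentences).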
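
Here is a small self-contained run of the same idea, with a made-up sample string in place of $file. It also sidesteps one likely cause of the identical counts in the question: s///g rewrites $file, so the second substitution there runs on text that has already been modified, whereas a plain match leaves the string untouched.

```perl
use strict;
use warnings;

# Hypothetical sample text standing in for $file.
my $text = "Hello... world! Is this two sentences? Yes.";

# Scalar context: each successful //g match advances pos($text),
# and the loop ends when no further match is found.
my $words = 0;
$words++ while $text =~ /\S+/g;

# A run of ., ! or ? counts as one sentence end.
my $sentences = 0;
$sentences++ while $text =~ /[.!?]+/g;

# Matching never modifies $text, so both counts see the original string.
print "$words words, $sentences sentences\n";
```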

answered 2011-01-31T01:51:02.633

This counts the words and the non-whitespace characters in $file:

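my $file = "This is praveen worki67ng in RL websolutions";
# The empty list assignment forces list context; in scalar context it yields the number of matches.
my $count = () = $file =~ /\S+/g;   # runs of non-whitespace, i.e. words
my $counter = () = $file =~ /\S/g;  # individual non-whitespace characters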
answered 2013-04-27T09:06:20.730