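If you look at all the tokenizers, token filters, and other analysis steps applied in the text_en fieldType, you can see why it is a poor fit for sorting: the value is split into multiple tokens and altered before it is indexed. For sorting on string values, it is better to use a dedicated fieldType. In the past, I have used the following fieldType to sort string fields.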
<fieldType name="lowercase_sort" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.KeywordTokenizerFactory" />
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.TrimFilterFactory" />
</analyzer>
</fieldType>
The Solr example schema also includes the following field type intended for sorting:
<fieldType name="alphaOnlySort" class="solr.TextField"
sortMissingLast="true" omitNorms="true">
<analyzer>
<!-- KeywordTokenizer does no actual tokenizing, so the entire
input string is preserved as a single token
-->
<tokenizer class="solr.KeywordTokenizerFactory"/>
<!-- The LowerCase TokenFilter does what you expect, which can be
useful when you want your sorting to be case insensitive
-->
<filter class="solr.LowerCaseFilterFactory" />
<!-- The TrimFilter removes any leading or trailing whitespace -->
<filter class="solr.TrimFilterFactory" />
<!-- The PatternReplaceFilter gives you the flexibility to use
Java Regular expression to replace any sequence of characters
matching a pattern with an arbitrary replacement string,
which may include back references to portions of the original
string matched by the pattern.
See the Java Regular Expression documentation for more
information on pattern and replacement string syntax.
http://java.sun.com/j2se/1.6.0/docs/api/java/util/regex/package-summary.html
-->
<filter class="solr.PatternReplaceFilterFactory"
pattern="([^a-z])" replacement="" replace="all"
/>
</analyzer>
</fieldType>
Then define an additional field for sorting, for example:
<field name="Name_Sort" type="lowercase_sort" indexed="true" stored="false"/>
Populate this field using a copyField:
<copyField src="Name" dest="Name_Sort"/>
Then sort on this new Name_Sort field in your queries.
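For a single request, sorting on the new field is just a matter of passing the standard sort parameter, for example sort=Name_Sort asc. If you want this ordering applied by default, one option is to set it in a request handler in solrconfig.xml. The following is only a sketch under assumptions: the handler name /select_by_name is hypothetical, and your solrconfig.xml may already define handlers with other defaults.

```xml
<!-- Hypothetical handler: the name "/select_by_name" is an assumption
     made for illustration, not part of the original schema. -->
<requestHandler name="/select_by_name" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Sort on the copyField target, not on the analyzed Name field -->
    <str name="sort">Name_Sort asc</str>
  </lst>
</requestHandler>
```

Because the value is placed under defaults, a sort parameter supplied on an individual request should still override it.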