sparql - Comparing labels in a federated query

Question

I have a running Wikibase instance, and I can successfully run federated queries against Wikidata. Some of my queries compare labels, like this:

PREFIX xwdt: <http://www.wikidata.org/prop/direct/>
PREFIX xwd: <http://www.wikidata.org/entity/>
PREFIX xpq: <http://www.wikidata.org/prop/qualifier/>
PREFIX xps: <http://www.wikidata.org/prop/statement/>
PREFIX xp: <http://www.wikidata.org/prop/>

select ?item  ?wditem ?itemLabel ?wid ?wditemlabel
where {
  ?item wdt:P17 wd:Q39.
  ?item wdt:P31 wd:Q5.
  optional {
    ?item wdt:P14 ?wid .
  }
  ?item rdfs:label ?itemLabel.   
  SERVICE <https://query.wikidata.org/sparql> {
    ?wditem xwdt:P27 xwd:Q258.
    ?wditem xwdt:P106 xwd:Q937857.
    ?wditem rdfs:label ?wditemlabel.
    filter(LANGMATCHES(LANG(?wditemlabel), "en")).
  }
  filter(contains(?wditemlabel, ?itemLabel))
}
group by ?item ?itemLabel ?wid ?wditem ?wditemlabel

The query above works and matches items by label. However:

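1) I originally had filter(contains(?wditemlabel, ?itemLabel)) inside the SERVICE clause, and it returned no results. However, if I replace one of the variables with a static string (e.g. filter(contains("test string", ?itemLabel))), it seems to work. Why does comparing a variable with a string work when comparing two variables does not?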

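2) I would like the query to work without the GROUP BY at the end. Without it, though, some kind of cross join / Cartesian product seems to occur, and each matching item is repeated many times over (n * n). Which part of the query causes this?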

score 1 · Accepted Answer

When it executes the federated query, your local Blazegraph sends a query like this to Wikidata:

SELECT ?wditem ?wditemlabel
WHERE {
    ?wditem wdt:P27 wd:Q258.
    ?wditem wdt:P106 wd:Q937857.
    ?wditem rdfs:label ?wditemlabel.
    filter(LANGMATCHES(LANG(?wditemlabel), "en"))
    filter(contains(?wditemlabel, ?itemLabel))
}
VALUES () {
( ) ( ) ( ) ( ) ( )  ( ) ( ) ( ) ( ) ( )  ( ) ( ) ( ) ( ) ( )  ( ) ( ) ( ) ( ) ( )
( ) ( ) ( ) ( ) ( )  ( ) ( ) ( ) ( ) ( )  ( ) ( ) ( ) ( ) ( )  ( ) ( ) ( ) ( ) ( )
( ) ( ) ( ) ( ) ( )  ( ) ( ) ( ) ( ) ( )  ( ) ( ) ( ) ( ) ( )  ( ) ( ) ( ) ( ) ( )
( ) ( ) ( ) ( ) ( )  ( ) ( ) ( ) ( ) ( )  ( ) ( ) ( ) ( ) ( )  ( ) ( ) ( ) ( ) ( )
( ) ( ) ( ) ( ) ( )  ( ) ( ) ( ) ( ) ( )  ( ) ( ) ( ) ( ) ( )  ( ) ( ) ( ) ( ) ( )
} # 100 values

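As you can see, Blazegraph "forgets" to pass the local bindings for ?itemLabel into VALUES (probably because ?itemLabel does not occur in the remote triple patterns), but "thinks" they have been passed.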

This bug causes both of your problems:

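Try the above query on Wikidata (0 results, because ?itemLabel is unbound on the remote side)
Try the above query on Wikidata without contains (82800 results instead of 828, because each of the 828 solutions is repeated once for each of the 100 empty VALUES rows)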
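
The multiplication in the second case can be reproduced in isolation. The following is a minimal sketch, not the query Blazegraph actually sends: the inline data for ?n is made up for illustration. Under standard SPARQL 1.1 join semantics, a trailing VALUES block with no variables and three empty rows is joined with every solution, so each solution should come back three times. With 100 empty rows, 828 solutions become 82800.

```sparql
# Hypothetical inline data standing in for the remote matches.
SELECT ?n
WHERE {
  VALUES ?n { 1 2 }
}
# Three empty rows bind nothing, so each one is compatible with every
# solution and the join repeats each value of ?n once per row.
VALUES () { ( ) ( ) ( ) }
```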

Workaround

Use query hints to force the execution order, so that the SERVICE call runs first:

select ?item ?wditem ?itemLabel ?wditemlabel
where {
  hint:Query hint:optimizer "None"
  SERVICE <https://query.wikidata.org/sparql> {
    ?wditem wdt:P27 wd:Q258.
    ?wditem wdt:P106 wd:Q937857.
    ?wditem rdfs:label ?wditemlabel.
    filter(lang(?wditemlabel)= "en").
  } 
  ?item wdt:P17 wd:Q39.
  ?item wdt:P31 wd:Q5.
  ?item rdfs:label ?itemLabel.
  filter(contains(?wditemlabel, ?itemLabel))
}

Or:

select ?item ?wditem ?itemLabel ?wditemlabel
where {
  ?item wdt:P17 wd:Q39.
  ?item wdt:P31 wd:Q5.
  ?item rdfs:label ?itemLabel.
  SERVICE <https://query.wikidata.org/sparql> {
    ?wditem wdt:P27 wd:Q258.
    ?wditem wdt:P106 wd:Q937857.
    ?wditem rdfs:label ?wditemlabel.
    filter(lang(?wditemlabel)= "en").
  }
  hint:Prior hint:runFirst true .
  filter(contains(?wditemlabel, ?itemLabel))
}
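
Both workarounds use wdt: and wd: inside the SERVICE block. In the question's setup those prefixes refer to the local Wikibase, and Wikidata is addressed through the explicitly declared xwdt: and xwd: prefixes. A sketch of the second workaround adapted to that setup follows. It assumes that wdt:, wd:, rdfs: and hint: are predefined on the local Blazegraph endpoint, and it has not been tested against the asker's instance.

```sparql
PREFIX xwdt: <http://www.wikidata.org/prop/direct/>
PREFIX xwd: <http://www.wikidata.org/entity/>

SELECT ?item ?wditem ?itemLabel ?wditemlabel
WHERE {
  # Local Wikibase patterns (IDs taken from the question).
  ?item wdt:P17 wd:Q39 .
  ?item wdt:P31 wd:Q5 .
  ?item rdfs:label ?itemLabel .
  SERVICE <https://query.wikidata.org/sparql> {
    # Wikidata patterns use the xwdt:/xwd: prefixes from the question.
    ?wditem xwdt:P27 xwd:Q258 .
    ?wditem xwdt:P106 xwd:Q937857 .
    ?wditem rdfs:label ?wditemlabel .
    FILTER(LANG(?wditemlabel) = "en")
  }
  # Blazegraph-specific hint: evaluate the preceding SERVICE call first.
  hint:Prior hint:runFirst true .
  # The label comparison stays outside SERVICE, so it is evaluated locally.
  FILTER(CONTAINS(?wditemlabel, ?itemLabel))
}
```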

By the way, in your original query you could use DISTINCT instead of GROUP BY, or add an extra local filter such as filter(lang(?itemLabel)='ast').
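
As a sketch of the DISTINCT suggestion, here is the question's query with GROUP BY dropped and DISTINCT added to the SELECT clause. Nothing else is changed. The extra local language filter is left as a comment, since 'ast' is only an example tag and the right value depends on the labels in the local instance.

```sparql
PREFIX xwdt: <http://www.wikidata.org/prop/direct/>
PREFIX xwd: <http://www.wikidata.org/entity/>

# DISTINCT collapses the duplicated rows that GROUP BY was hiding.
SELECT DISTINCT ?item ?wditem ?itemLabel ?wid ?wditemlabel
WHERE {
  ?item wdt:P17 wd:Q39 .
  ?item wdt:P31 wd:Q5 .
  OPTIONAL { ?item wdt:P14 ?wid . }
  ?item rdfs:label ?itemLabel .
  # Optional extra local filter; adjust the language tag to your data.
  # FILTER(LANG(?itemLabel) = "ast")
  SERVICE <https://query.wikidata.org/sparql> {
    ?wditem xwdt:P27 xwd:Q258 .
    ?wditem xwdt:P106 xwd:Q937857 .
    ?wditem rdfs:label ?wditemlabel .
    FILTER(LANGMATCHES(LANG(?wditemlabel), "en"))
  }
  FILTER(CONTAINS(?wditemlabel, ?itemLabel))
}
```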

Comparison

In GraphDB, the original query works fine, but contains(?wditemlabel, ?itemLabel) should be replaced with contains(str(?wditemlabel), str(?itemLabel)).
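
A likely reason for the str() calls is that SPARQL 1.1 requires the two arguments of CONTAINS to be compatible, and two literals with different language tags are not, so a strict engine raises an error and the FILTER drops the row. The self-contained sketch below uses made-up labels to show the pattern. It assumes that the local and remote labels carry different language tags, which the question does not state.

```sparql
SELECT ?wditemlabel ?itemLabel
WHERE {
  # Hypothetical labels with different language tags.
  VALUES (?wditemlabel ?itemLabel) { ("Douglas Adams"@en "Adams"@ast) }
  # STR() strips the language tags, so both arguments are plain strings
  # and the comparison no longer depends on tag compatibility.
  FILTER(CONTAINS(STR(?wditemlabel), STR(?itemLabel)))
}
```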

See also

Federated Query (Blazegraph wiki)
Speed up federated query (question on SO)
