I am trying to get quantiles of a variable in a grouped BigQuery table, but I get this error:

Error: Job 'xxxxx' failed
Syntax error: Expected end of input but got keyword WITHIN at [1:45] [invalidQuery]

A reprex follows.

# NOTE: for reprex to work, you must have BIGQUERY_TEST_PROJECT envvar set to name of project which has billing set up and to which you have write access

library(DBI)
library(bigrquery)
library(dplyr)

billing <- bq_test_project()

con <- dbConnect(
  bigrquery::bigquery(),
  project = "publicdata",
  dataset = "samples",
  billing = billing
)

natality <- tbl(con, "natality")
   
natality %>%
  group_by(year) %>%
  summarize(q25 = quantile(weight_pounds,0.25),
            q50 = median(weight_pounds),
            q75 = quantile(weight_pounds,0.75)
  )

Does anyone know a workaround, perhaps by supplying SQL code via sql() inside the summarise() call?

Thanks!

1 Answer

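A colleague found the answer: supply the SQL code directly via sql() inside the summarize() call, so that BigQuery's approx_quantiles() is used instead of the translated quantile()/median() calls: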

# NOTE: for reprex to work, you must have BIGQUERY_TEST_PROJECT envvar set to name of project which has billing set up and to which you have write access

library(DBI)
library(bigrquery)
library(dplyr)

billing <- bq_test_project()

con <- dbConnect(
  bigrquery::bigquery(),
  project = "publicdata",
  dataset = "samples",
  billing = billing
)

natality <- tbl(con, "natality")
   
natality %>%
  group_by(year) %>%
  summarize(q25 = sql("approx_quantiles(weight_pounds,4)[offset(1)]"),
            q50 = sql("approx_quantiles(weight_pounds,2)[offset(1)]"),
            q75 = sql("approx_quantiles(weight_pounds,4)[offset(3)]")
  )
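
To make the offsets above easier to read: in BigQuery, approx_quantiles(x, n) returns an array of n + 1 approximate boundaries, where offset(0) is the approximate minimum and offset(n) the approximate maximum. With n = 4, offsets 1, 2 and 3 are therefore the approximate 25th, 50th and 75th percentiles. A minimal sketch of a helper that builds such expressions is shown here; the name bq_quantile_expr and its arguments are hypothetical and not part of bigrquery or dplyr.

```r
# Hypothetical helper (not part of bigrquery/dplyr): build the BigQuery
# expression for the k-th of n approximate quantile boundaries.
# Assumption: `column` is a valid, trusted column name; it is pasted in as-is.
bq_quantile_expr <- function(column, n, k) {
  # approx_quantiles(x, n) yields n + 1 elements, so valid offsets are 0..n
  stopifnot(n >= 1, k >= 0, k <= n)
  sprintf("approx_quantiles(%s, %d)[offset(%d)]",
          column, as.integer(n), as.integer(k))
}

# Build the same three expressions used in the answer
q25 <- bq_quantile_expr("weight_pounds", 4, 1)
q50 <- bq_quantile_expr("weight_pounds", 2, 1)
q75 <- bq_quantile_expr("weight_pounds", 4, 3)
print(c(q25, q50, q75))
```

Each resulting string can be wrapped in sql() inside summarize(), for example q25 = sql(bq_quantile_expr("weight_pounds", 4, 1)). Note that the values are approximate, so they may differ slightly from exact quantiles computed in R.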

Answered 2020-09-29T16:14:33.850