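There are (at least) two ways to achieve the desired result.
Casting with multiple value.vars
Recent versions of data.table allow multiple columns to be passed as the value.var parameter of dcast():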
library(data.table) # version 1.10.4 used
dcast(DT, P ~ Stat, value.var = list("V", "Points"))
#   P V_Assists V_Goals Points_Assists Points_Goals
#1: 1         1       2              3           10
#2: 2         1       1              3            5
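If only a single Points column is needed, the points have to be added up and the superfluous columns removed. With chaining, this can be done in one statement, although it is not very concise.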
dcast(DT, P ~ Stat, value.var = list("V", "Points"))[
, Points := Points_Assists + Points_Goals][
, c("Points_Assists", "Points_Goals") := NULL][]
#   P V_Assists V_Goals Points
#1: 1         1       2     13
#2: 2         1       1      8
Cast and join
Alternatively, the dcast of V and the aggregation of Points can be done in separate steps, and the results joined afterwards:
# dcast
temp1 <- dcast(DT, P ~ Stat, value.var = "V")
temp1
#   P Assists Goals
#1: 1       1     2
#2: 2       1     1
# sum points by P
temp2 <- DT[, .(Points = sum(Points)), by = P]
temp2
#   P Points
#1: 1     13
#2: 2      8
Now the two results need to be joined:
temp1[temp2, on = "P"]
#   P Assists Goals Points
#1: 1       1     2     13
#2: 2       1     1      8
Or, combined in one statement:
dcast(DT, P ~ Stat, value.var = "V")[DT[, .(Points = sum(Points)), by = P], on = "P"]
This code looks more straightforward and concise than the first variant.
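As a quick sanity check, the two variants can be compared directly. The following is only a sketch under a few assumptions: the sample data is rebuilt inline with data.table() rather than fread(), value.var is given as a character vector instead of a list (which should yield the same prefixed column names here), and the V_ columns of the first variant are renamed so that both results use the same column names. The names res1 and res2 are illustrative only.

```r
library(data.table)

# Sample data, rebuilt inline (assumption: same values as in the Data section)
DT <- data.table(
  P      = c(1L, 1L, 2L, 2L),
  Stat   = c("Goals", "Assists", "Goals", "Assists"),
  V      = c(2L, 1L, 1L, 1L),
  Points = c(10L, 3L, 5L, 3L)
)

# Variant 1: cast both value columns, then collapse the Points_* columns
res1 <- dcast(DT, P ~ Stat, value.var = c("V", "Points"))[
  , Points := Points_Assists + Points_Goals][
  , c("Points_Assists", "Points_Goals") := NULL][]
# Drop the "V_" prefix so the column names match variant 2
setnames(res1, c("V_Assists", "V_Goals"), c("Assists", "Goals"))

# Variant 2: cast V only, then join the per-P sum of Points
res2 <- dcast(DT, P ~ Stat, value.var = "V")[
  DT[, .(Points = sum(Points)), by = P], on = "P"]

# Compare values only; keys and other attributes may differ between the two
all.equal(res1, res2, check.attributes = FALSE)
```

Attributes are ignored in the comparison because the two results may carry different keys even when their contents agree.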
Data
library(data.table)
DT <- fread(
"P Stat V Points
1 Goals 2 10
1 Assists 1 3
2 Goals 1 5
2 Assists 1 3")
Note that fread() returns a data.table object by default. If DT is still a data.frame, it needs to be coerced to a data.table:
setDT(DT)
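For illustration, here is a minimal sketch of that coercion, assuming the data arrived as a plain data.frame (constructed by hand here with the same values as the sample data). setDT() converts the object by reference, so no reassignment is needed.

```r
library(data.table)

# Hypothetical starting point: the same data as a plain data.frame
DT <- data.frame(
  P      = c(1L, 1L, 2L, 2L),
  Stat   = c("Goals", "Assists", "Goals", "Assists"),
  V      = c(2L, 1L, 1L, 1L),
  Points = c(10L, 3L, 5L, 3L),
  stringsAsFactors = FALSE
)

is.data.table(DT)  # still a data.frame at this point

setDT(DT)          # converts in place, without copying the data

is.data.table(DT)  # now a data.table, usable with dcast() and [ ... , by = ]
```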