我用数据框的一个子集成功地做到了这一点,但我似乎无法让它与我的另一个子集一起工作。有大约 4000 个订单的信息,范围为 0-8 个月,情绪为 0-5。
目标是融合 id 为“order”和“month.of.service”的数据,并汇总该月的平均情绪。数据框如下所示:
order | month | sentiment
123 | 0 | 3
123 | 0 | 4
123 | 1 | 3
124 | 0 | 2
我希望它看起来像这样:
123 | 0 | 3.5
123 | 1 | 3
124 | 0 | 2
这是我使用的实际代码:
sentiment.md <- melt(sentiment, id = c('Related.order', 'Lifespan'))
sentiment.dc <- dcast(sentiment.md, Related.order + Lifespan ~ value, sum)
> head(sentiment.md)
Related.order Lifespan variable value
1 12771 0 Sentiment 5
2 11188 1 Sentiment 3
3 12236 3 Sentiment 5
4 12925 0 Sentiment 5
5 12151 3 Sentiment 5
6 12338 0 Sentiment 5
> head(sentiment.dc)
Related.order Lifespan 0 1 2 3 4 5
1 4976 0 NaN NaN NaN 3 NaN NaN
2 4976 1 NaN NaN NaN 3 NaN NaN
3 4976 2 NaN NaN NaN NaN 4 NaN
4 4976 3 NaN NaN NaN NaN 4 NaN
5 4976 4 NaN NaN NaN NaN 4 NaN
6 4976 5 NaN NaN NaN NaN 4 NaN
为了进一步演示我希望它看起来像什么,这里是完全相同的东西,使用我想要的格式的数据框中的唯一其他列,交互:
interactions.md <- melt(interactions, id = c('Related.order', 'Lifespan'))
interactions.dc <- dcast(interactions.md, Related.order + Lifespan ~ value, sum)
> head(interactions.md)
Related.order Lifespan variable value
1 12771 0 Event 1
2 11188 1 Event 1
3 12236 3 Event 1
4 12925 0 Event 1
5 12151 3 Event 1
6 12338 0 Event 1
> head(interactions.dc)
Related.order Lifespan 1
1 4976 0 6
2 4976 1 3
3 4976 2 3
4 4976 3 1
5 4976 4 2
6 4976 5 2
我想也许我使用了错误的结构或其他东西,但无法识别任何东西。作为参考,这里是 R-studio 的截图: