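I'm having trouble reordering the columns for the different groups in my dataset by their relative counts within each group. The dataset, as a tibble, is shown below. It has 3 groups, each with different device types and frequency counts: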

library(tidyverse) 
library(ggplot2)

dy2 <- tibble(generation = c("All Devices","All Devices","All Devices","All Devices","All Devices","All Devices", 
                      "First Gen", "First Gen","First Gen","First Gen","First Gen","First Gen",
                      "Subsequent Gen","Subsequent Gen","Subsequent Gen","Subsequent Gen","Subsequent Gen"),
       device_type = as.factor(c("Accessories", "Aspiration_catheter", "Guidewire","Microcatheter", "Sheath", "Stentretriever",
                                 "Accessories", "Aspiration_catheter", "Guidewire","Microcatheter", "Sheath", "Stentretriever",
                                 "Accessories", "Aspiration_catheter", "Guidewire", "Sheath", "Stentretriever")),
       N = c(6,36,26,4,18,39,3,20,17,4,8,14,3,16,9,10,25))

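When I plot the dataset in ggplot, I am trying to order the device types within each group by increasing N and to place a geom_text label above each column. I can only get this to work for the first group ("All Devices"). The code for the plot is: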

dy2 %>% 
  ggplot(aes(x= generation, y= N, fill= reorder(device_type,N, function(x){sum(x)}))) +
  geom_bar(position= position_dodge(), alpha= 0.85, stat = "identity")+
  geom_text(data= ~ subset(.x, generation %in% c("All Devices")), position=position_dodge(0.9), aes(y= N+0.8, label= N), size= 3, show.legend= FALSE)+
  geom_text(data= ~ subset(.x, generation %in% c("First Gen")), position=position_dodge(0.9), aes(y= N+0.8, label= N), size= 3, show.legend= FALSE)+ 
  geom_text(data= ~ subset(.x, generation %in% c("Subsequent Gen")), position=position_dodge(0.9), aes(y= N+0.8, label= N), size= 3, show.legend= FALSE)+
  scale_fill_manual(name= NULL,
                    values = c("blue", "black", "red", "green3", "cyan4", "purple"),
                    breaks = c("Accessories", "Aspiration_catheter", "Guidewire",
                               "Microcatheter", "Sheath", "Stentretriever"),
                    labels = c("Accessories", "Aspiration catheter", "Guidewire",
                               "Microcatheter", "Sheath", "Stentretriever")) +
  #scale_x_discrete(breaks= c("All Devices", "First Gen", "Subsequent Gen"),
                 #  labels= c("All<br>Devices", "First<br>Gen", "Sub<br>Gen"))+
  theme_classic()

This gives the following plot: [plot 1]

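As you can see, the device type columns for "First Gen" and "Subsequent Gen" are not in the correct order, and the geom_text labels showing N above each column are at the correct height but do not match their associated columns.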

I have tried splitting up the dataset and using different reordering commands, all to no avail.

Nothing I try in scale_fill_manual helps either.

I'm sure there is some factor issue that I'm missing, but any help would be greatly appreciated.

1 Answer

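One option is to use a helper column:

  1. Arrange your data by generation and N.
  2. Create a helper column. I simply paste generation and device_type together.
  3. Set the levels of the helper column in the order of the dataset, e.g. using forcats::fct_inorder.
  4. Map the helper column on the group aes.
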
library(dplyr)
library(forcats)
library(ggplot2)

dy2 <- dy2 %>%
  arrange(generation, N) %>%
  mutate(
    device_type2 = paste(generation, device_type, sep = "_"),
    device_type2 = fct_inorder(device_type2)
  )

ggplot(dy2, aes(x = generation, y = N, fill = device_type, group = device_type2)) +
  geom_bar(position = position_dodge(), alpha = 0.85, stat = "identity") +
  geom_text(position = position_dodge(0.9), aes(y = N + 0.8, label = N), size = 3, show.legend = FALSE) +
  scale_fill_manual(
    name = NULL,
    values = c("blue", "black", "red", "green3", "cyan4", "purple"),
    breaks = c(
      "Accessories", "Aspiration_catheter", "Guidewire",
      "Microcatheter", "Sheath", "Stentretriever"
    )
  ) +
  theme_classic()
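
As a rough check of why this helps, the sketch below compares the two orderings without drawing anything. It rebuilds the question's data, shows that reorder() with sum() produces one ordering shared by all generations, and then confirms that the helper column orders N within each generation. The names global_levels, n_first_gen_global, dy2_ordered and check are only illustrative and are not part of the original code; the data is rebuilt with rep() for brevity, which is assumed to be equivalent to the original tibble.

```r
library(dplyr)
library(forcats)

# Same data as in the question (generation built with rep() for brevity)
dy2 <- tibble::tibble(
  generation = rep(c("All Devices", "First Gen", "Subsequent Gen"), times = c(6, 6, 5)),
  device_type = factor(c(
    "Accessories", "Aspiration_catheter", "Guidewire", "Microcatheter", "Sheath", "Stentretriever",
    "Accessories", "Aspiration_catheter", "Guidewire", "Microcatheter", "Sheath", "Stentretriever",
    "Accessories", "Aspiration_catheter", "Guidewire", "Sheath", "Stentretriever"
  )),
  N = c(6, 36, 26, 4, 18, 39, 3, 20, 17, 4, 8, 14, 3, 16, 9, 10, 25)
)

# reorder() with sum() scores each device type once, over all generations,
# so every generation gets the same bar order
global_levels <- levels(reorder(dy2$device_type, dy2$N, FUN = sum))
print(global_levels)

# N for "First Gen" laid out in that shared order: not necessarily increasing
first_gen <- filter(dy2, generation == "First Gen")
n_first_gen_global <- first_gen$N[order(match(first_gen$device_type, global_levels))]
print(n_first_gen_global)

# Helper column: one level per generation/device pair, in order of increasing N
dy2_ordered <- dy2 %>%
  arrange(generation, N) %>%
  mutate(device_type2 = fct_inorder(paste(generation, device_type, sep = "_")))

# In helper-level order, N should be non-decreasing within every generation
check <- dy2_ordered %>%
  arrange(device_type2) %>%
  group_by(generation) %>%
  summarise(increasing = !is.unsorted(N), .groups = "drop")
print(check)
```

Because position_dodge() places the bars and labels according to the group aesthetic, mapping group to the helper column lets each generation carry its own order, while fill still uses device_type for the colours and legend.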

answered 2022-03-03T19:33:02.133