一种方法是一次使用mapply
每个单词并将其传递给qdap::synonyms
. 'synonyms' 的结果可以使用paste0
函数 with折叠在列中collapse = "|"
。现在数据准备好了。用于tidyr::separate
将列分隔为Syn1
等Syn2
。
注意: synonyms
用两个参数调用return.list = FALSE, multiwords = FALSE
下面的代码对最大10
同义词有限制,但可以改进解决方案以动态处理数字。
library(tidyverse)
library(qdap)
df %>%
mutate(Synonyms =
mapply(function(x)paste0(
head(synonyms(x, return.list = FALSE, multiwords = FALSE),10), collapse = "|"),
tolower(.$Word))) %>%
separate(Synonyms, paste("Syn",1:10), sep = "\\|", extra = "drop" )
结果:
# SlNO Word Syn 1 Syn 2 Syn 3 Syn 4 Syn 5 Syn 6 Syn 7 Syn 8 Syn 9 Syn 10
# 1 1 get achieve acquire attain bag bring earn fetch gain glean inherit
# 2 2 Free buckshee complimentary gratis gratuitous unpaid footloose independent liberated loose uncommitted
# 3 3 Joshi <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 4 4 Hello <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
# 5 5 New advanced all-singing all-dancing contemporary current different fresh ground-breaking happening latest
数据
df <- read.table(text =
"SlNO Word
1 get
2 Free
3 Joshi
4 Hello
5 New",
header = TRUE, stringsAsFactors = FALSE)