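I have a list of university names that contains typos and inconsistencies. I need to match them against an official list of university names so that I can link my data together.
I know fuzzy matching/joining is the way to go, but I'm a bit lost on the right approach. Any help would be greatly appreciated.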
# Names as entered, with typos and inconsistencies
d <- data.frame(name = c("University of New Yorkk", "The University of South Carolina",
                         "Syracuuse University", "University of South Texas",
                         "The University of No Carolina"),
                score = c(1, 3, 6, 10, 4))
# Official names to match against
y <- data.frame(name = c("University of South Texas", "The University of North Carolina",
                         "University of South Carolina", "Syracuse University",
                         "University of New York"),
                distance = c(100, 400, 200, 20, 70))
I would like an output that matches them up as closely as possible:
matched <- data.frame(name = c("University of New Yorkk", "The University of South Carolina",
                               "Syracuuse University", "University of South Texas",
                               "The University of No Carolina"),
                      correctmatch = c("University of New York", "University of South Carolina",
                                       "Syracuse University", "University of South Texas",
                                       "The University of North Carolina"))
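For reference, one possible starting point is sketched below. It uses base R's `adist()` (Levenshtein edit distance) and picks, for each messy name, the official name with the smallest distance. The `normalize_name()` helper is hypothetical and encodes an assumption: names are lowercased and a leading "The" is dropped before comparing. Without that step, "The University of South Carolina" is only 2 edits from "The University of North Carolina" but 4 edits from "University of South Carolina", so raw edit distance would pick the wrong school. This is a sketch for these five names, not a general solution.

```r
# Sample data, repeated here so the sketch is self-contained
d <- data.frame(name = c("University of New Yorkk", "The University of South Carolina",
                         "Syracuuse University", "University of South Texas",
                         "The University of No Carolina"),
                score = c(1, 3, 6, 10, 4),
                stringsAsFactors = FALSE)
y <- data.frame(name = c("University of South Texas", "The University of North Carolina",
                         "University of South Carolina", "Syracuse University",
                         "University of New York"),
                distance = c(100, 400, 200, 20, 70),
                stringsAsFactors = FALSE)

# Hypothetical helper (an assumption, not part of any package):
# lowercase, trim whitespace, and drop a leading "the "
normalize_name <- function(x) {
  sub("^the\\s+", "", trimws(tolower(x)))
}

# Edit-distance matrix: one row per messy name, one column per official name
dist_mat <- adist(normalize_name(d$name), normalize_name(y$name))

# For each messy name, the index of the closest official name
best <- apply(dist_mat, 1, which.min)

result <- data.frame(name = d$name,
                     correctmatch = y$name[best],
                     edit_distance = dist_mat[cbind(seq_along(best), best)],
                     stringsAsFactors = FALSE)
print(result)
```

The `edit_distance` column is kept so that weak matches can be reviewed by hand. On a larger list a cutoff would probably be needed, since `which.min` always returns some match, even for a name that has no real counterpart in the official list.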