
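So, I'm trying to get arules to work on my data, which has transaction_ID, Item_name and Item_ID. If I call the apriori function with item_name and transaction_ID it is too slow, but if I call it with item_id and transaction_ID it is really fast. So, is there a way to create the rules with item_id and then replace the ids with the real names? Here is a code sample that can be used: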

library(arules)
library(arulesViz)

products <- c(1,1,1,3,4,5,6,4)
transaction_id <- c(2,2,3,3,3,4,4,4)
dataset <- data.frame(products ,transaction_id)
dataset
transaction <- as(split(dataset[,"products"],dataset[,"transaction_id"]), "transactions")

rule <- apriori(transaction, parameter = list(supp = 0.001, conf = 0.8))

inspect(rule)

products_id <- c(1,3,4,5,6)
names <- c("nail","Black Hammer 127","White desk 12","green desk","pink car")

cod <- data.frame(products = products_id, names)

1 Answer


The best way is to replace the item labels in the transactions using the itemLabels function:

itemLabels(transaction)
  [1] "1" "3" "4" "5" "6"
itemLabels(transaction) <- c("nail","Black Hammer 127","White desk 12","green desk","pink car")
rule <- apriori(transaction, parameter = list(supp = 0.001, conf = 0.8))
inspect(rule)
      lhs                                 rhs                support   confidence
 [1]  {Black Hammer 127}               => {nail}             0.3333333 1 
 ...
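
If the ID-to-name mapping already lives in a lookup table such as the cod data frame from the question, the labels can be assigned by matching instead of typing the vector by hand. This is only a minimal sketch, assuming every item ID in the transactions appears in cod$products. The unique() call is an added precaution, not part of the original code, because the sample data lists product 1 twice in transaction 2.

```r
library(arules)

# Sample data from the question
dataset <- data.frame(
  products       = c(1, 1, 1, 3, 4, 5, 6, 4),
  transaction_id = c(2, 2, 3, 3, 3, 4, 4, 4)
)
cod <- data.frame(
  products = c(1, 3, 4, 5, 6),
  names    = c("nail", "Black Hammer 127", "White desk 12", "green desk", "pink car"),
  stringsAsFactors = FALSE
)

# Precaution (assumption): drop repeated (product, transaction) rows before coercing
dataset <- unique(dataset)
transaction <- as(split(dataset[, "products"], dataset[, "transaction_id"]), "transactions")

# Look up each current label (an ID stored as a string) in the lookup table
idx <- match(itemLabels(transaction), as.character(cod$products))
stopifnot(!anyNA(idx))  # fail early if some ID has no name
itemLabels(transaction) <- cod$names[idx]

rule <- apriori(transaction, parameter = list(supp = 0.001, conf = 0.8))
inspect(rule)
```

Matching on the ID keeps the names aligned even if the lookup table is in a different order than the item labels.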

split is rather slow. The example in ?transactions has this note about using split:

## Note: This is very slow for large datasets. It is much faster to 
## read transactions in this format from disk using read.transactions() 
## with format = "single".
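
As a rough illustration of that note, the sample data can be written to a file with one transaction ID and item pair per line and read back with read.transactions(). The temporary file, the comma separator and the column order are assumptions made for this sketch. In practice the file would already exist on disk.

```r
library(arules)

# Sample data from the question
dataset <- data.frame(
  products       = c(1, 1, 1, 3, 4, 5, 6, 4),
  transaction_id = c(2, 2, 3, 3, 3, 4, 4, 4)
)

# Write one "transaction_id,item" pair per line to a temporary file
# (unique() drops the repeated pair in the sample data)
tmp <- tempfile(fileext = ".csv")
write.table(unique(dataset[, c("transaction_id", "products")]),
            file = tmp, sep = ",", row.names = FALSE, col.names = FALSE)

# cols = c(1, 2): column 1 holds the transaction IDs, column 2 the items
transaction <- read.transactions(tmp, format = "single", sep = ",", cols = c(1, 2))
summary(transaction)
unlink(tmp)
```

The resulting object can be passed to apriori() exactly like the one built with split.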
Answered 2018-01-09T04:48:08.583