A first attempt might look like this:
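// Requires: using System.Collections.Generic; using System.Linq;
// First, count how often each ordered product pair was bought together.
// Time and memory are roughly the sum over all customers of Customer.BoughtProducts.Count^2.
var boughtTogether = new Dictionary<(ProductID First, ProductID Second), int>();
foreach (var customer in Customers)
{
    foreach (var product1 in customer.BoughtProducts)
    foreach (var product2 in customer.BoughtProducts)
    {
        if (product1.Equals(product2)) continue; // do not pair a product with itself
        boughtTogether.TryGetValue((product1, product2), out int counter); // counter is 0 if the pair is missing
        boughtTogether[(product1, product2)] = counter + 1;
    }
}
// Then, for each product, keep the 20 products most often bought with it,
// stored in a dictionary keyed by product ID.
var related = boughtTogether
    .GroupBy(entry => entry.Key.First)
    .ToDictionary(
        group => group.Key,
        group => group.OrderByDescending(entry => entry.Value)
                      .Take(20)
                      .Select(entry => new { ProductID = entry.Key.Second, Count = entry.Value })
                      .ToList());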
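First I count how often each pair of products was bought together. Then I group those counts by product and select the top 20 other products bought with it. The result should go into some kind of dictionary keyed by product ID.
For a large database, this may become too slow or use too much memory.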
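To get a feel for when that happens, here is a rough back-of-the-envelope estimate. The symbols n_c (number of products bought by customer c) and P (number of distinct products) are notation introduced here. The figures of one million customers and 20 products each are purely hypothetical and are not measurements from any real database.

```latex
% Number of dictionary updates performed by the nested loops (approximate)
\text{updates} \approx \sum_{c \in \text{Customers}} n_c^2

% Hypothetical example: 10^6 customers, each with n_c = 20
\text{updates} \approx 10^6 \cdot 20^2 = 4 \times 10^8

% Number of dictionary entries: one per distinct ordered pair actually seen
\text{entries} \le \min\left( \sum_{c} n_c^2,\; P^2 \right)
```

Under these assumptions the loops perform a few hundred million updates. The dictionary can never hold more entries than there are updates, nor more than P^2. A few customers with very long purchase histories can dominate the sum, because their contribution grows quadratically.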