Not same generation as paper #21

Romathonat · 2018-01-08T16:41:55Z

In the paper by Agrawal, candidate generation is done differently (it is formulated as an SQL query).
It says that from the itemsets (1,2,3), (1,2,4), (1,3,4), (1,3,5), (2,3,4), only (1,2,3,4) and (1,3,4,5) are generated. Your approach generates one more, (1,2,3,5).

def joinSet(itemSet, length):
  """Join a set with itself and returns the n-element itemsets"""
  return set([i.union(j) for i in itemSet for j in itemSet if len(i.union(j)) == length])
  
  
itemset = [frozenset({1, 2, 3}), frozenset({1, 2, 4}), frozenset({1, 3, 4}), frozenset({1, 3, 5}), frozenset({2, 3, 4})]

print(joinSet(itemset, 4))

{frozenset({1, 2, 3, 4}), frozenset({1, 2, 3, 5}), frozenset({1, 3, 4, 5})}
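
For comparison, here is a minimal sketch of candidate generation as I read it in the paper: two large itemsets are joined only when they agree on all items but the last, and a candidate is then pruned if any of its subsets one item smaller is not large. The function name `apriori_gen` and the tuple representation are my own choices for illustration, not code from this repository.

```python
from itertools import combinations


def apriori_gen(large_itemsets, k):
    """Illustrative sketch: join on a shared prefix, then prune.

    large_itemsets holds itemsets of size k-1; candidates have size k.
    """
    # Sort each itemset so that "all items but the last" is well defined.
    sorted_sets = sorted(tuple(sorted(s)) for s in large_itemsets)
    known = set(sorted_sets)

    # Join step: merge p and q when they share the first k-2 items
    # and p's last item is smaller than q's last item.
    joined = [
        p + (q[-1],)
        for p in sorted_sets
        for q in sorted_sets
        if p[:-1] == q[:-1] and p[-1] < q[-1]
    ]

    # Prune step: drop a candidate if any (k-1)-subset is not large.
    pruned = [
        c for c in joined
        if all(sub in known for sub in combinations(c, k - 1))
    ]
    return joined, pruned


itemset = [frozenset({1, 2, 3}), frozenset({1, 2, 4}), frozenset({1, 3, 4}),
           frozenset({1, 3, 5}), frozenset({2, 3, 4})]

joined, pruned = apriori_gen(itemset, 4)
print(joined)  # [(1, 2, 3, 4), (1, 3, 4, 5)]
print(pruned)  # [(1, 2, 3, 4)]
```

With this join, (1,2,3,5) never appears, because (1,2,3) and (1,3,5) do not share their first two items. The prune step then removes (1,3,4,5), since (1,4,5) is not among the input itemsets.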

Moreover, I think you swapped the names of the variables currentLSet and currentCSet (L for Large itemset and C for Candidate itemset) here

wangych6 · 2021-10-10T16:41:05Z

I had the same confusion about the code; I think you're right.
