It’s been a long time since my last post, here I am back again.

**Introduction to Data Mining** by Pang-Ning Tan, Michael Steinbach and Vipin Kumar [1] is a good introductory textbook in Data Mining. The book has been translated into Hungarian and will hopefully be published in my country this year. Actually, I am one of the translators of the Hungarian edition.

Apriori is a classic algorithm for mining association rules. Chapter 6 of the book discusses the Apriori algorithm. Unfortunately, I found that the pseudocodes for the rule generation step (see Algorithm 6.2 and 6.3 on pages 351 and 352) do not work as expected. These two pseudocodes are the following:

Here denotes the support count of the itemset and the function `apriori-gen`

generates the set of frequent -itemsets from the set of frequent -itemsets.

The main problem is that Algorithm 6.2 and 6.3 above will never generate rules with 1-item consequents. In the original paper that introduces the Apriori algorithm [2] the set on line 2 of Algorithm 6.2 is defined as the set of consequents of rules derived from with one item in the consequent. However, this implicitly assumes that rules with 1-item consequents are already available. [2] also states that a separate algorithm is required to generate these rules (see page 14):

The rules having one-item consequents in step 2 of this algorithm can be found by using a modified version of the preceding

`genrules`

function in which steps 8 and 9 are deleted to avoid the recursive call.

It should also be noted that line 2 of Algorithm 6.2 is simply equivalent to .

Finally, the formula on line 2 of Algorithm 6.3 is misleading, since vertical bars are traditionally used to denote cardinality. (In our case is not the cardinality of set .) I think that the first two lines of Algorithm 6.3 are unnecessary and can be omitted.

Since the book is widely used as a textbook the above problems should be corrected. I have reported the problems to the authors, I hope that they will update the errata of the book accordingly.

Algorithm 6.2 and 6.3 can be modified as follows:

These modified algorithms work as expected and will generate all rules including the ones with 1-item consequents.

## References

- Pang-Ning Tan, Michael Steinbach and Vipin Kumar. Introduction to Data Mining. Addison-Wesley, 2005.
- Rakesh Agrawal and Ramakrishnan Srikant. Fast Algorithms for Mining Association Rules. Proceedings of the 20th International Conference on Very Large Data Bases, Santiago, Chile, September 1994.