Frequent Itemsets, Closed Itemsets, and Association Rules
Let I = {I1, I2, ..., Im} be a set of items. Let D, the task-relevant data, be a set of database
transactions where each transaction T is a set of items such that T ⊆ I. Each transaction
is associated with an identifier, called a TID. Let A be a set of items. A transaction T is
said to contain A if and only if A ⊆ T. An association rule is an implication of the form
A ⇒ B, where A ⊂ I, B ⊂ I, and A ∩ B = ∅. The rule A ⇒ B holds in the transaction set D
with support s, where s is the percentage of transactions in D that contain A ∪ B (i.e., the
union of sets A and B, or say, both A and B). This is taken to be the probability, P(A ∪ B).
The rule A ⇒ B has confidence c in the transaction set D, where c is the percentage of
transactions in D containing A that also contain B. This is taken to be the conditional
probability, P(B | A). That is,
$$\mathrm{support}(A \Rightarrow B) = P(A \cup B) \tag{5.2}$$
$$\mathrm{confidence}(A \Rightarrow B) = P(B \mid A). \tag{5.3}$$
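As a minimal sketch of Equations (5.2) and (5.3), the following Python computes the support and confidence of a rule over a small transaction set. The transaction contents and the function names (`rule_support`, `rule_confidence`) are illustrative assumptions, not part of the text; values are returned as fractions rather than percentages.

```python
# Hypothetical toy transaction set D; the items are illustrative only.
D = [
    frozenset({"computer", "antivirus software"}),
    frozenset({"computer", "printer"}),
    frozenset({"computer", "antivirus software", "printer"}),
    frozenset({"printer"}),
]


def contains(transaction, itemset):
    """A transaction T contains A if and only if A is a subset of T."""
    return frozenset(itemset) <= transaction


def rule_support(a, b, transactions):
    """Fraction of transactions that contain every item of A and of B (Eq. 5.2)."""
    both = frozenset(a) | frozenset(b)
    return sum(1 for t in transactions if contains(t, both)) / len(transactions)


def rule_confidence(a, b, transactions):
    """Fraction of transactions containing A that also contain B (Eq. 5.3)."""
    with_a = [t for t in transactions if contains(t, a)]
    if not with_a:
        raise ValueError("no transaction contains the antecedent A")
    return sum(1 for t in with_a if contains(t, b)) / len(with_a)


s = rule_support({"computer"}, {"antivirus software"}, D)
c = rule_confidence({"computer"}, {"antivirus software"}, D)
print(f"support = {s:.1%}, confidence = {c:.1%}")
```

In this toy data, two of the four transactions contain both items and three contain the antecedent, so the rule has support 2/4 and confidence 2/3.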
Rules that satisfy both a minimum support threshold (min_sup) and a minimum confidence
threshold (min_conf) are called strong. By convention, we write support and confidence
values as percentages between 0% and 100%, rather than as fractions between 0 and 1.0.
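The definition of a strong rule can be sketched as a simple threshold check. This sketch assumes that "satisfy" means greater than or equal to the threshold, and the function name `is_strong` and the numeric values are hypothetical. All arguments are percentages, following the convention above.

```python
def is_strong(support_pct, confidence_pct, min_sup_pct, min_conf_pct):
    """Return True if a rule meets both thresholds (all values in percent).

    Assumption: a threshold is "satisfied" when the value is >= the threshold.
    """
    return support_pct >= min_sup_pct and confidence_pct >= min_conf_pct


# Hypothetical rule with 50% support and 66.7% confidence.
print(is_strong(50.0, 66.7, min_sup_pct=40.0, min_conf_pct=60.0))
print(is_strong(50.0, 66.7, min_sup_pct=40.0, min_conf_pct=70.0))
```

The first call meets both thresholds; the second fails only on confidence, so the rule is not strong under the stricter min_conf.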
A set of items is referred to as an itemset. An itemset that contains k items is a
k-itemset. The set {computer, antivirus software} is a 2-itemset. The occurrence
frequency of an itemset is the number of transactions that contain the itemset. This is
also known, simply, as the frequency, support count, or count of the itemset. Note that
the itemset support defined in Equation (5.2) is sometimes referred to as relative support,
whereas the occurrence frequency is called the absolute support. If the relative support
of an itemset I satisfies a prespecified minimum support threshold (i.e., the absolute
support of I satisfies the corresponding minimum support count threshold), then I is a
frequent itemset. The set of frequent k-itemsets is commonly denoted by L_k.
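To make the distinction between absolute and relative support concrete, here is a brute-force sketch that counts every k-itemset occurring in a toy transaction set and keeps those meeting a minimum support count. It enumerates all item combinations per transaction, so it only illustrates the definitions and is not an efficient mining algorithm. The data, the function name `frequent_k_itemsets`, and the 50% threshold are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy transaction set D; the items are illustrative only.
D = [
    frozenset({"computer", "antivirus software"}),
    frozenset({"computer", "printer"}),
    frozenset({"computer", "antivirus software", "printer"}),
    frozenset({"printer"}),
]


def frequent_k_itemsets(transactions, k, min_sup_count):
    """Return {k-itemset: support count} for itemsets meeting min_sup_count."""
    counts = Counter()
    for t in transactions:
        # Every k-subset of a transaction is a k-itemset contained in it.
        for combo in combinations(sorted(t), k):
            counts[frozenset(combo)] += 1
    return {s: n for s, n in counts.items() if n >= min_sup_count}


# Convert a relative threshold (50%) into the corresponding support count.
min_sup = 0.5
min_sup_count = math.ceil(min_sup * len(D))

L1 = frequent_k_itemsets(D, 1, min_sup_count)
L2 = frequent_k_itemsets(D, 2, min_sup_count)
for itemset, count in L2.items():
    print(sorted(itemset), "count =", count, "relative =", count / len(D))
```

Here `L1` and `L2` play the role of L_1 and L_2 in the notation above. With four transactions, a relative threshold of 50% corresponds to a support count of 2.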
From Equation (5.3), we have
$$\mathrm{confidence}(A \Rightarrow B) = P(B \mid A) = \frac{\mathrm{support}(A \cup B)}{\mathrm{support}(A)} = \frac{\mathrm{support\_count}(A \cup B)}{\mathrm{support\_count}(A)}. \tag{5.4}$$
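Equation (5.4) can be checked numerically with a short sketch. It assumes the support counts are already available in a dictionary keyed by itemset; the counts, the total of four transactions, and the function name `confidence_from_counts` are hypothetical.

```python
import math

# Hypothetical support counts, assumed to come from a set of 4 transactions.
num_transactions = 4
support_count = {
    frozenset({"computer"}): 3,
    frozenset({"antivirus software"}): 2,
    frozenset({"computer", "antivirus software"}): 2,
}


def confidence_from_counts(counts, a, b):
    """confidence(A => B) = support_count(A u B) / support_count(A) (Eq. 5.4)."""
    a = frozenset(a)
    a_union_b = a | frozenset(b)
    return counts[a_union_b] / counts[a]


conf = confidence_from_counts(support_count, {"computer"}, {"antivirus software"})

# The same value follows from relative supports, since the division by the
# number of transactions cancels in the ratio.
rel_ab = support_count[frozenset({"computer", "antivirus software"})] / num_transactions
rel_a = support_count[frozenset({"computer"})] / num_transactions
print(conf, math.isclose(conf, rel_ab / rel_a))
```

No second pass over the transactions is needed: the confidence of a rule is a ratio of two counts that are already known.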
Equation (5.4) shows that the confidence of rule A ⇒ B can be easily derived from the
support counts of A and A ∪ B. That is, once the support counts of A, B, and A ∪ B are

