TY - GEN
T1 - OP-cluster
T2 - 3rd IEEE International Conference on Data Mining, ICDM '03
AU - Liu, Jinze
AU - Wang, Wei
PY - 2003
Y1 - 2003
N2 - Clustering is the process of grouping a set of objects into classes of similar objects. Because the hidden patterns in a data set are not known in advance, the definition of similarity is subtle. Until recently, similarity measures have typically been based on distances, e.g., Euclidean distance and cosine distance. In this paper, we propose a flexible yet powerful clustering model, namely OP-Cluster (Order Preserving Cluster). Under this new model, two objects are similar on a subset of dimensions if the values of these two objects induce the same relative order of those dimensions. Such a cluster might arise when the expression levels of (co-regulated) genes rise or fall synchronously in response to a sequence of environmental stimuli. Hence, discovery of OP-Clusters is essential in revealing significant gene regulatory networks. A deterministic algorithm is designed and implemented to discover all the significant OP-Clusters. Extensive experiments on several real biological data sets demonstrate its effectiveness and efficiency in detecting co-regulated patterns.
AB - Clustering is the process of grouping a set of objects into classes of similar objects. Because the hidden patterns in a data set are not known in advance, the definition of similarity is subtle. Until recently, similarity measures have typically been based on distances, e.g., Euclidean distance and cosine distance. In this paper, we propose a flexible yet powerful clustering model, namely OP-Cluster (Order Preserving Cluster). Under this new model, two objects are similar on a subset of dimensions if the values of these two objects induce the same relative order of those dimensions. Such a cluster might arise when the expression levels of (co-regulated) genes rise or fall synchronously in response to a sequence of environmental stimuli. Hence, discovery of OP-Clusters is essential in revealing significant gene regulatory networks. A deterministic algorithm is designed and implemented to discover all the significant OP-Clusters. Extensive experiments on several real biological data sets demonstrate its effectiveness and efficiency in detecting co-regulated patterns.
UR - http://www.scopus.com/inward/record.url?scp=78149310670&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78149310670&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:78149310670
SN - 0769519784
SN - 9780769519784
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 187
EP - 194
BT - Proceedings - 3rd IEEE International Conference on Data Mining, ICDM 2003
Y2 - 19 November 2003 through 22 November 2003
ER -