TY - GEN
T1 - Biclustering in gene expression data by tendency
AU - Liu, Jinze
AU - Yang, Jiong
AU - Wang, Wei
PY - 2004
Y1 - 2004
N2 - The advent of DNA microarray technologies has revolutionized the experimental study of gene expression. Clustering is the most popular approach to analyzing gene expression data and has proven successful in many applications. Our work focuses on discovering a subset of genes that exhibit similar expression patterns along a subset of conditions in the gene expression matrix. Specifically, we look for Order Preserving clusters (OP-Clusters), in each of which a subset of genes induces a similar linear ordering along a subset of conditions. The pioneering OPSM model [3], which enforces a strict order shared by the genes in a cluster, is included in our model as a special case. Our model is more robust than OPSM because similarly expressed conditions are allowed to form order-equivalent groups, and no restriction is placed on the order within a group. Guided by our model, we design and implement a deterministic algorithm, OPC-Tree, to discover OP-Clusters. An experimental study on two real datasets demonstrates the effectiveness of the algorithm in tissue classification and cell cycle identification. In addition, a large percentage of OP-Clusters exhibit significant enrichment of one or more functional categories, which implies that OP-Clusters carry significant biological relevance.
KW - Biclustering
KW - Gene expression data
KW - Microarray data
KW - Order preserving
UR - http://www.scopus.com/inward/record.url?scp=14044267561&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=14044267561&partnerID=8YFLogxK
M3 - Conference contribution
C2 - 16448012
AN - SCOPUS:14044267561
SN - 0769521940
SN - 9780769521947
T3 - Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004
SP - 182
EP - 193
BT - Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004
T2 - Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004
Y2 - 16 August 2004 through 19 August 2004
ER -