INTERACTIVE CLUSTERING MODEL FOR EXPLORATION OF GENOMIC DATA

    Complete genomic sequences are now available for many organisms, particularly prokaryotes, primitive organisms with relatively small genomes. We describe an iterative clustering approach that supports interactive exploration of patterns in genomic data. Within this approach, we compare different data representations of gene sequences and different distance measures. The effectiveness of each combination of representation and clustering method was evaluated in a study of regulatory sites in archaea. The combination of positional weight matrices, the k-means clustering algorithm, and a visualization tool proved the most effective. Users interact with the system by examining a visualization of the “average” pattern found in each cluster of the sequences under consideration and deciding whether further or modified clustering is needed. We applied this approach to regulatory sites in the genomic data of four different organisms and report findings on gene translation initiation patterns and gene transcription initiation patterns in archaea.
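The clustering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the one-hot encoding, the squared Euclidean distance, the farthest-first seeding, and the names `one_hot`, `kmeans`, and `log_odds` are all assumptions made for illustration. Under this encoding, each k-means centroid holds the per-position base frequencies of its cluster, which is one way to read the “average” pattern, and `log_odds` converts those frequencies into a weight matrix against a uniform background.

```python
import math

ALPHABET = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a flat 0/1 vector, four entries per position."""
    return [1.0 if base == symbol else 0.0 for base in seq for symbol in ALPHABET]

def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def farthest_first(vectors, k):
    """Deterministic seeding (an assumption, not from the source): start at the
    first vector, then repeatedly add the vector farthest from those chosen."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        nxt = max(vectors, key=lambda v: min(squared_distance(v, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids

def kmeans(vectors, k, max_iter=50):
    """Plain k-means: assign each vector to its nearest centroid, then recompute
    each centroid as the mean of its members, until assignments stop changing."""
    centroids = farthest_first(vectors, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def log_odds(centroid, background=0.25, pseudo=0.01):
    """Convert a centroid (per-position base frequencies) into log2-odds weights.
    The uniform background and the pseudocount are illustrative assumptions."""
    rows = []
    for i in range(0, len(centroid), 4):
        col = centroid[i:i + 4]
        total = sum(col) + 4 * pseudo
        rows.append({b: math.log2(((f + pseudo) / total) / background)
                     for b, f in zip(ALPHABET, col)})
    return rows

# Toy sequences, invented for the example.
seqs = ["AAAA", "AAAT", "AAAA", "GGGG", "GGGC", "GGGG"]
labels, centroids = kmeans([one_hot(s) for s in seqs], k=2)
pwm = log_odds(centroids[0])
```

In an interactive setting, a user could inspect each centroid and then rerun `kmeans` on the members of a single cluster, or with a different `k`, to refine the grouping.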

CLUSTERING RESULTS FOR ARCHAEA GENOMIC DATA

INPUT FORMAT

GC Content of Genome:
Number of sequences:
Length of sequences:
Window start site:
Window stop site:
Sequences
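
As a rough illustration of how these fields might fit together, the sketch below holds them in a record, checks them for consistency, and cuts each sequence down to the requested window. The page does not specify units or indexing, so the following are assumptions: GC content is a fraction between 0 and 1, window positions are 1-based and inclusive, and the GC content is used to derive background base frequencies. The names `ClusteringInput`, `background_frequencies`, and `windowed_sequences` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ClusteringInput:
    gc_content: float     # genome GC fraction in [0, 1] (assumed unit)
    num_sequences: int
    sequence_length: int
    window_start: int     # assumed 1-based, inclusive
    window_stop: int      # assumed 1-based, inclusive
    sequences: list

def background_frequencies(gc_content):
    """Split the GC fraction evenly between G and C, and the rest between A and T."""
    at = (1.0 - gc_content) / 2
    gc = gc_content / 2
    return {"A": at, "C": gc, "G": gc, "T": at}

def windowed_sequences(record):
    """Validate the record and return the window slice of every sequence."""
    if len(record.sequences) != record.num_sequences:
        raise ValueError("number of sequences does not match the stated count")
    if any(len(s) != record.sequence_length for s in record.sequences):
        raise ValueError("a sequence does not have the stated length")
    if not 1 <= record.window_start <= record.window_stop <= record.sequence_length:
        raise ValueError("window lies outside the sequences")
    return [s[record.window_start - 1:record.window_stop] for s in record.sequences]

# Toy record, invented for the example.
record = ClusteringInput(0.5, 2, 6, 2, 4, ["ACGTAC", "TTGACA"])
windows = windowed_sequences(record)
```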


How To Cite:

1. Wan, X. 2002. Interactive clustering for exploration of genomic data. (Project report for master's degree in Computer Science)

2. Wan, X., S. M. Bridges, J. A. Boyle, and A. P. Boyle. 2002. Interactive clustering for exploration of genomic data. In Intelligent engineering systems through artificial neural networks, vol. 12, eds. Dagli et al., 753-758. New York, NY: ASME Press.

3. Wan, X., S. M. Bridges, and J. A. Boyle. 2004. Revealing gene transcription and translation initiation patterns in archaea using an interactive clustering model. Accepted for publication in Extremophiles.

 

If you have any questions or comments, please contact bridges@cse.msstate.edu