Bioinformatics Abstracts

DOE Human Genome Program
Contractor-Grantee Workshop VIII
February 27-March 2, 2000, Santa Fe, NM

72. Classification of Multi-Aligned Sequence Using Monotone Linkage Clustering and Alignment Segmentation

Eric Poe Xing, Ilya Muchnik¹, Manfred Zorn, and Sylvia Spengler

Center for Bioinformatics and Computational Genomics, Lawrence Berkeley National Laboratory, Berkeley, CA 94720 and ¹DIMACS, Rutgers University, Piscataway, NJ 08855

EPXing@lbl.gov

Optimal clustering of a set of sequences under an arbitrary set function is often of exponential complexity. In this paper, a low-order polynomial procedure, based on the quasi-concavity of a special type of objective function, was developed to cluster multi-aligned sequences within each of the segments produced by a preceding alignment segmentation step. It clusters sequences according to their degree of similarity to a prespecified reference pattern (i.e., a consensus sequence or a particular organismal sequence of choice). Combining the clusterings from multiple segments yields a fairly fine-grained classification of all the sequences in the alignment. The general pattern is reminiscent of the branching order in a corresponding phylogenetic tree, but carries additional information regarding the assumption of modular evolution. The algorithm can be applied to a broad range of molecular sequence analysis tasks, such as phylogenetic subtree construction or recognition and tree updating and labeling, and can serve as a framework for organizing sequence data in an efficient and easily searchable manner.
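
The abstract does not give the linkage function or the segmentation procedure, so the following is only a minimal sketch of the general idea under stated assumptions. It uses the standard greedy "peeling" procedure for a quasi-concave objective F(H) = min over i in H of pi(i, H), where pi is a monotone linkage function. The particular linkage used here (fractional identity to the reference plus summed identity to the other cluster members), the function names, and the `boundaries` argument standing in for the output of the segmentation step are all hypothetical choices made for illustration, not the authors' definitions.

```python
def identity(a, b):
    """Fraction of aligned positions at which two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def linkage(i, members, seqs, ref):
    """Hypothetical monotone linkage pi(i, H): adding members never lowers it."""
    return identity(seqs[i], ref) + sum(
        identity(seqs[i], seqs[j]) for j in members if j != i
    )


def best_cluster(seqs, ref):
    """Maximize F(H) = min_{i in H} pi(i, H) by repeatedly removing the
    weakest member; polynomial in the number of sequences."""
    members = set(range(len(seqs)))
    best_score, best_set = float("-inf"), set()
    while members:
        scores = {i: linkage(i, members, seqs, ref) for i in members}
        # Ties are broken by the lowest index so the result is deterministic.
        weakest = min(scores, key=lambda i: (scores[i], i))
        if scores[weakest] > best_score:
            best_score, best_set = scores[weakest], set(members)
        members.remove(weakest)
    return best_set, best_score


def classify(seqs, ref, boundaries):
    """Group sequences by their in/out cluster membership across segments.
    `boundaries` is an assumed list of (start, end) column ranges."""
    signatures = {i: [] for i in range(len(seqs))}
    for start, end in boundaries:
        segment = [s[start:end] for s in seqs]
        cluster, _ = best_cluster(segment, ref[start:end])
        for i in signatures:
            signatures[i].append(i in cluster)
    groups = {}
    for i, sig in signatures.items():
        groups.setdefault(tuple(sig), []).append(i)
    return groups
```

Calling `best_cluster(["ACGT", "ACGT", "ACGA", "TTTT"], "ACGT")` peels off the outlying fourth sequence first. `classify` then combines such per-segment results into membership signatures, which is one simple way to read the "combination of clusterings from multiple segments" described above.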

The online presentation of this publication is a special feature of the Human Genome Project Information Web site.