The genetic algorithm scheme for consensus sequences

Joshua W. Gilkerson, Jerzy W. Jaromczyk

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

A consensus sequence is a single sequence that represents the characteristics of a family of sequences. Such synopses are most commonly used in bioinformatics for sequence analysis. For example, algorithms that determine high-quality consensus sequences are useful for constructing a multiple alignment and, consequently, a sequence logo (another representation that attempts to capture the important features of the sequences). The determination of optimal consensus sequences is NP-hard (Gusfield). We present two new algorithms and compare them to earlier, published methods of determining consensus sequences. The first, CONSENSIZE, is an application of the Genetic Algorithm Scheme (GAS). The other is a simple steepest descent search, a technique that is usually not very useful for NP-hard problems but is surprisingly successful for this application. We discuss both algorithms and experimentally compare their accuracy and efficiency with the Simulated Annealing, Multiple Alignment, and Center String approaches. Test results are presented on both synthetic data and biological sequences.
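
The abstract does not spell out the objective function or the neighborhood used by the steepest descent search, so the following is only a minimal sketch of the general idea and not the authors' implementation. It assumes that a consensus is scored by the sum of unit-cost edit distances to every sequence in the family, that neighbors are all sequences one substitution, insertion, or deletion away, and that the alphabet is DNA. The function names (`edit_distance`, `total_cost`, `neighbors`, `steepest_descent`) are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit costs (an assumed scoring choice)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # match or substitute
        prev = cur
    return prev[-1]


def total_cost(candidate: str, family: Sequence[str]) -> int:
    """Assumed objective: sum of edit distances to all family members."""
    return sum(edit_distance(candidate, s) for s in family)


def neighbors(candidate: str, alphabet: str) -> List[str]:
    """All distinct sequences exactly one edit operation away."""
    found = set()
    for i in range(len(candidate) + 1):
        for ch in alphabet:
            found.add(candidate[:i] + ch + candidate[i:])          # insertion
            if i < len(candidate):
                found.add(candidate[:i] + ch + candidate[i + 1:])  # substitution
        if i < len(candidate):
            found.add(candidate[:i] + candidate[i + 1:])           # deletion
    found.discard(candidate)
    return sorted(found)  # sorted so that ties are broken deterministically


def steepest_descent(family: Sequence[str], alphabet: str = "ACGT",
                     start: Optional[str] = None) -> Tuple[str, int]:
    """Move to the best neighbor until no neighbor lowers the cost."""
    # Assumed starting point: the family member with the lowest total cost.
    current = start if start is not None else min(
        family, key=lambda s: total_cost(s, family))
    current_cost = total_cost(current, family)
    while True:
        scored = [(total_cost(n, family), n) for n in neighbors(current, alphabet)]
        best_cost, best = min(scored)
        if best_cost >= current_cost:
            return current, current_cost  # local minimum reached
        current, current_cost = best, best_cost


family = ["ACGT", "ACGA", "ACCT", "TCGT"]
consensus, cost = steepest_descent(family)
print(consensus, cost)
```

Because the cost is a non-negative integer that strictly decreases at every step, the loop always terminates, but only at a local minimum; nothing in this sketch guarantees a globally optimal consensus.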

Original language: English
Title of host publication: 2007 IEEE Congress on Evolutionary Computation, CEC 2007
Pages: 3870-3878
Number of pages: 9
State: Published - 2007
Event: 2007 IEEE Congress on Evolutionary Computation, CEC 2007, Singapore
Duration: Sep 25 2007 to Sep 28 2007

Publication series

Name: 2007 IEEE Congress on Evolutionary Computation, CEC 2007

Conference

Conference: 2007 IEEE Congress on Evolutionary Computation, CEC 2007
Country/Territory: Singapore
Period: 9/25/07 to 9/28/07

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Theoretical Computer Science
