Protein motifs are most commonly extracted from an initial multiple sequence alignment. Sometimes, however, the training sequences are not strictly homologous, or they contain repeats, rearrangements, or other common features that disrupt alignment-based approaches. For nucleic acid sequences, it is usually difficult or impossible to construct an accurate multiple alignment, both because the nucleic acid alphabet is small (four letters) and because non-coding regions evolve rapidly (for example, promoter or enhancer regions in which one might want to find transcription factor binding sites).
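
To make the alignment-based route concrete, the following is a minimal sketch of how a motif can be summarized once an ungapped aligned block is available: count the residues in each column and read off the most frequent one. The function names `column_profile` and `consensus` and the example sites are illustrative assumptions, not part of any particular tool, and the DNA alphabet can be swapped for the 20 amino acids.

```python
from collections import Counter


def column_profile(aligned, alphabet="ACGT"):
    """Return per-column relative frequencies for equal-length aligned sequences.

    Hypothetical helper for illustration; characters outside `alphabet`
    (gaps, unknown residues) are ignored when computing frequencies.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must share one length")

    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned)
        total = sum(counts[letter] for letter in alphabet)
        profile.append(
            {letter: (counts[letter] / total if total else 0.0) for letter in alphabet}
        )
    return profile


def consensus(profile):
    """Pick the most frequent letter in each column (ties go to alphabet order)."""
    return "".join(max(column, key=column.get) for column in profile)


# Made-up example sites, already aligned and of equal length.
sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
profile = column_profile(sites)
print(consensus(profile))   # TATAAT
print(profile[2]["T"])      # 0.75
```

A profile like this only makes sense when the columns are genuinely comparable, which is exactly what fails when the input sequences cannot be aligned reliably.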

Protein motifs are characteristic of protein families and can be used as tools for predicting protein function. The focus of motif analysis appears to be shifting from metabolic enzymes to regulatory and structural proteins, which contain a greater variety of motifs. Taking structural information into account greatly helps to identify motifs and determine their sensitivity. Genome sequencing makes it possible to systematically analyze all the motifs present in a specific organism. CD ComputaBio now offers a protein motif prediction service to help with this work.


Given a set of functionally related sequences, the main purpose of a motif discovery algorithm is to find new, a priori unknown motifs that are frequent, unexpected, or otherwise interesting according to some formal criterion. The methods used to find such motifs follow the same general pattern and fall into two categories: alignment-based methods and methods that search for motifs in unaligned sequences.
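
As a rough illustration of the second category, the sketch below searches unaligned sequences for short exact words (k-mers) that occur in at least a minimum number of input sequences. This is a deliberately simplified stand-in for "frequent" under a formal criterion: the function name `shared_kmers`, the `min_support` threshold, and the example sequences are assumptions made for illustration, and practical motif finders also tolerate mismatches and score statistical significance.

```python
from collections import defaultdict


def shared_kmers(sequences, k, min_support):
    """Return (k-mer, support) pairs for k-mers found in >= min_support sequences.

    Support counts distinct sequences, not total occurrences, so a repeat
    inside one sequence does not inflate the score. Hypothetical helper.
    """
    support = defaultdict(set)
    for index, seq in enumerate(sequences):
        for start in range(len(seq) - k + 1):
            support[seq[start:start + k]].add(index)

    hits = {kmer: len(ids) for kmer, ids in support.items() if len(ids) >= min_support}
    # Highest support first; alphabetical order breaks ties.
    return sorted(hits.items(), key=lambda item: (-item[1], item[0]))


# Made-up, unaligned protein fragments; the shared word sits at different offsets.
seqs = ["MKTCGHGSLLA", "AACGHGTTP", "PLCGHGMMK", "QQRSTVW"]
print(shared_kmers(seqs, k=4, min_support=3))   # [('CGHG', 3)]
```

No alignment is needed here because each sequence is scanned independently, which is why this family of methods remains usable when the inputs contain repeats or rearrangements.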

