Meta-MEME documentation
Meta-MEME is a software toolkit for building and using motif-based hidden Markov models of DNA and proteins. The input to Meta-MEME is a set of similar DNA or protein sequences, as well as a set of motif models discovered by MEME. Meta-MEME combines these models into a single, motif-based hidden Markov model and uses this model to search a sequence database for homologs.
Visit the Meta-MEME home page at the San Diego Supercomputer Center.
The Meta-MEME toolkit consists of four primary programs:
- mhmm: Build a motif-based HMM.
- mhmms: Search a sequence database using a motif-based HMM.
- mhmmscan: Search a sequence database using a motif-based HMM. This program is similar to mhmms, but allows multiple matches per sequence, and arbitrarily long sequences. The algorithm implemented in mhmmscan has also been named MCAST.
- tomtom: Measure the similarity of motifs.
The Meta-MEME software distribution also includes the following auxiliary programs:
- fimo: Scans a sequence database for occurrences of given motifs using a position weight matrix.
- motiph: Scans multiple sequence alignments for occurrences of given motifs, taking into account the phylogenetic tree relating the sequences.
- shadow: Performs phylogenetic shadowing on an alignment, using a given tree.
- mcast: A Perl script which uses mhmmscan to search a sequence database for statistically significant clusters of non-overlapping "hits" to a set of motifs.
- mhmm2html: Convert the text output from one of the primary programs into HTML format.
- mhmme: Randomly generate sequences according to a given Meta-MEME model.
- draw-mhmm: Convert a motif-based HMM into a format suitable for drawing by the graphviz program from AT&T Research.
- fasta-get-markov: Estimate a Markov model from a FASTA file of sequences.
- transfac2meme: Converts a TRANSFAC matrix file to MEME output format.
- clustalw2fasta: Converts a ClustalW multiple alignment into FASTA format.
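To make the idea behind fasta-get-markov concrete, the following Python sketch estimates a Markov model from FASTA-formatted text by counting how often each symbol follows each context. It is an illustration only: the function names `read_fasta` and `estimate_markov`, the `order` parameter, and the dictionary-based model are assumptions made for this example, and the sketch does not reproduce the options, pseudocount handling, or output format of the actual program.

```python
from collections import Counter
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def estimate_markov(sequences, order=1):
    """Return {context: {symbol: probability}} by maximum likelihood.

    Hypothetical helper: plain frequency counts, with no pseudocounts.
    """
    counts = {}
    for seq in sequences:
        for i in range(order, len(seq)):
            context, symbol = seq[i - order:i], seq[i]
            counts.setdefault(context, Counter())[symbol] += 1
    model = {}
    for context, symbol_counts in counts.items():
        total = sum(symbol_counts.values())
        model[context] = {s: n / total for s, n in symbol_counts.items()}
    return model


# Small in-memory example; a real run would open a FASTA file instead.
example = StringIO(">seq1\nACGT\nACGA\n>seq2\nAAAC\n")
sequences = [seq for _, seq in read_fasta(example)]
model = estimate_markov(sequences, order=1)
```

Here `model["A"]["C"]` is the estimated probability that C follows A. Setting `order=0` gives plain symbol frequencies under the empty context `""`.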
Additional documentation is available concerning
- the format for the FASTA sequence files required by mhmms and mhmmscan (with a sample),
- the format for the MEME file required by mhmm (with a sample),
- the format for the background file,
- the format for the TRANSFAC file used by mcast and transfac2meme, and
- how Meta-MEME determines motif order and spacing information from MEME output.
Sample output files are available for
- the motif-based HMMs produced by mhmm, and
- the database search results produced by mhmms.
Meta-MEME was developed by William Stafford Noble in the Department of Genome Sciences at the University of Washington, and by Timothy Bailey at the University of Queensland, with input from Charles Elkan in the Department of Computer Science and Engineering at the University of California, San Diego and Michael Gribskov at the San Diego Supercomputer Center. Meta-MEME is funded by the National Biomedical Computation Resource.
Copyright information. Please send comments and questions to Charles Grant at cegrant@u.washington.edu