LEGGLE - LEGume Gene and Locus Evolution

LEGGLE is a collection of gene families and related analyses and tools, focusing on gene families and phylogenies in the legumes. The families should extend back to the eudicots; that is, they should include sequences (when available) from Arabidopsis and grape, as well as from soybean, Medicago, Lotus, and some other legume transcript sets. Method details:

  • The gene families are based on the Phytozome v.7 "core eudicot"-level families, but have been refined with respect to legume sequences: the current Medicago gene models (v3.5.2) replace the Phytozome v3.0 models, Lotus (v2.5) models have been added, alignments and phylogenies have been recalculated accordingly, and transcriptome assemblies have been added for selected other species. The species included in the core alignments are Arabidopsis thaliana (TAIR release 9), Glycine max (v.1.01), Vitis vinifera (March 2010 release), Medicago truncatula (v3.5.2), and Lotus japonicus (v2.5). Alignments for each gene family were generated using MUSCLE on peptide sequences. HMM profiles were calculated using HMMER 3.0 (hmmbuild), and sequences were realigned to the profiles using hmmalign. Poorly aligning columns (those outside the "match states" in the HMM) were removed from the alignments, and then alignments with too few aligned characters (fewer than 40% of the match-state residues) were removed. Phylogenetic trees were calculated using maximum likelihood, as implemented in RAxML.
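The cleaning step described above can be illustrated with a minimal Python sketch. It is not the LEGGLE pipeline itself. It assumes the aligned sequences follow the usual hmmalign convention, in which match-state columns hold uppercase residues or "-" and insert columns hold lowercase residues or ".". It also assumes the 40% threshold is applied per sequence; the function names and the toy sequences are hypothetical.

```python
def strip_insert_columns(aligned: str) -> str:
    """Keep only match-state characters: uppercase residues and '-' deletions.

    Assumption: insert-state columns are written as lowercase residues or '.'.
    """
    return "".join(c for c in aligned if c.isupper() or c == "-")


def match_state_coverage(match_only: str) -> float:
    """Fraction of match-state columns occupied by a residue (not a gap)."""
    if not match_only:
        return 0.0
    residues = sum(1 for c in match_only if c != "-")
    return residues / len(match_only)


def clean_alignment(seqs: dict[str, str], min_coverage: float = 0.40) -> dict[str, str]:
    """Drop insert columns, then drop sequences below the coverage threshold.

    Assumption: the 40% cutoff is evaluated separately for each sequence.
    """
    cleaned = {}
    for name, aligned in seqs.items():
        match_only = strip_insert_columns(aligned)
        if match_state_coverage(match_only) >= min_coverage:
            cleaned[name] = match_only
    return cleaned


# Toy alignment (hypothetical): 10 match columns and 2 insert columns.
example = {
    "seqA": "MKTLaaVGHWQE",  # 10 of 10 match columns filled
    "seqB": "M--L..V---QE",  # 5 of 10 match columns filled
    "seqC": "M---..----Q-",  # 2 of 10 match columns filled, below 40%
}

kept = clean_alignment(example)
for name, seq in kept.items():
    print(name, seq)
```

With the toy input, seqA and seqB are retained with their insert columns stripped, and seqC is dropped because only 20% of its match-state columns contain a residue.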

Each gene family page contains alignments, gene trees, annotations, genomic locations, and other resources. Search for a gene family by submitting a keyword or phrase, or search for gene families with similarity to a sequence of interest:

Search by keywords

Search for a gene family by submitting a keyword, phrase, or gene symbol. Use * as a wildcard.
e.g. Glyma03g33160, AT1G03370, glucose, Tfl1, homeobox leucine zipper, ...
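As a rough illustration of how a * wildcard behaves, the Python sketch below translates a query into a regular expression in which * matches any run of characters. How the site actually implements its search is not stated here, so the function name, the case-insensitive matching, and the whole-term matching are all assumptions.

```python
import re


def wildcard_match(pattern: str, term: str) -> bool:
    """Return True if term matches pattern, where '*' matches any run of characters.

    Assumptions: matching is case-insensitive and must cover the whole term.
    """
    # Escape every literal piece so that only '*' has a special meaning.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.fullmatch(regex, term, flags=re.IGNORECASE) is not None


print(wildcard_match("Glyma03g*", "Glyma03g33160"))
print(wildcard_match("*leucine*", "homeobox leucine zipper"))
print(wildcard_match("AT1G*", "Glyma03g33160"))
```

Under these assumptions, the first two queries match and the third does not.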

Search by sequence

Use a single sequence (protein or nucleotide) to look for gene families with similar sequences. The query is compared to gene family consensus sequences.