MACSIMS: User Guide
MACSIMS (Multiple Alignment of Complete Sequences Information Management System) is a multiple alignment-based information management system that combines the advantages of knowledge-based and ab initio sequence analysis methods. Structural and functional information is mined automatically from the public databases. In the multiple alignment of complete sequences (MACS), homologous regions are identified, and the mined data is evaluated and propagated from known to unknown sequences within these reliable regions. MACSIMS provides a unique environment that facilitates knowledge extraction and the presentation of the most pertinent information to the biologist.
You can input an alignment in any of the most widely used formats, including MSF, FASTA and ClustalW. For sequences with Swissprot/TrEMBL identifiers, information is automatically mined from the public databases. The retrieved sequence features will be cross-validated and consistent features will be propagated from the characterised sequences to the uncharacterised ones. A number of ab initio predictions are also performed, including low complexity regions, coiled coil regions and transmembrane helices.
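As a rough illustration of the identifier requirement above, the following sketch checks which records in a FASTA file carry a UniProt-style accession number. It is not MACSIMS code. The function name `find_uniprot_accessions` is hypothetical, the pattern is the accession format published by UniProt, and the assumption that the accession appears in the first header token (alone or in a `sp|P69905|HBA_HUMAN` style field) may not match how MACSIMS itself reads identifiers.

```python
import re
from typing import Dict, Iterable

# UniProt accession number format, anchored to match a whole field.
UNIPROT_ACC = re.compile(
    r"^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)


def find_uniprot_accessions(fasta_lines: Iterable[str]) -> Dict[str, bool]:
    """Map each FASTA record name to True if it contains a UniProt-style accession.

    Assumption (not from the MACSIMS documentation): the record name is the
    first whitespace-delimited token of the header, optionally split by '|'.
    """
    result: Dict[str, bool] = {}
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence data, not a header
        tokens = line[1:].split()
        name = tokens[0] if tokens else ""
        fields = name.split("|")
        result[name] = any(UNIPROT_ACC.match(field) for field in fields)
    return result


example = [
    ">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha",
    "MVLSPADKTNVKAAWGKVGA",
    ">my_seq_1",
    "MVLSGEDKSNIKAAWGKIGG",
]
print(find_uniprot_accessions(example))
```

Under these assumptions, records flagged `False` are the ones for which no database information would be mined directly; they could still receive propagated or predicted features.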
The processing may take a few minutes. You can either wait for the results to appear, or bookmark the page and come back to it later. Results will be kept on the server for one week. You can retrieve previous results from the home page by entering the job ID number in the "Load previous session" field.
The MACSIMS results are displayed graphically using the JalView applet. The applet allows you to view the alignment, select the sequence features to be displayed, edit the alignment, build a phylogenetic tree, etc. You can also save the alignment in a number of different output formats.
MACSIMS has been validated in two large-scale tests. All the alignments are available for viewing with JalView here:
MACSIMS takes advantage of the MAO (Multiple Alignment Ontology) to integrate the information mined from the public databases in the framework of the multiple alignment. The current ontology elements annotated by MACSIMS are:
MAO element name | Description | Data source |
aln-name | name of the MACS | user supplied |
aln-score | quality score for the MACS | calculated by NorMD in MACSIMS |
MAO element name | Description | Data source |
seq-name | sequence name | MACS |
accession | sequence accession number | Uniprot/PDB |
nid | sequence identifier | Uniprot/PDB |
definition | sequence function definition | Uniprot/PDB |
goxref | GO cross-reference | Uniprot |
organism | organism | Uniprot/PDB |
taxid | NCBI taxid | Uniprot |
taxon | organism lineage | Uniprot |
hydrophobicity | sequence mean hydrophobicity | calculated in MACSIMS |
Sequence features are associated with a specified segment of a sequence and are defined by their type, start and end positions, and a textual annotation.
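The four attributes named above map naturally onto a small record type. The sketch below is illustrative only and does not reflect the MACSIMS internals: the class name `SequenceFeature`, the field names, and the choice of 1-based inclusive coordinates are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequenceFeature:
    """A feature attached to a segment of one sequence (hypothetical model)."""

    feature_type: str  # e.g. "DOMAIN" or "TRANSMEM"
    start: int         # first residue of the segment (assumed 1-based)
    end: int           # last residue of the segment (assumed inclusive)
    annotation: str    # free-text note

    def __post_init__(self) -> None:
        # Reject segments that cannot exist under the assumed coordinates.
        if self.start < 1 or self.end < self.start:
            raise ValueError(f"invalid segment {self.start}-{self.end}")

    def length(self) -> int:
        """Number of residues covered by the feature."""
        return self.end - self.start + 1

    def overlaps(self, other: "SequenceFeature") -> bool:
        """True if the two segments share at least one residue."""
        return self.start <= other.end and other.start <= self.end


helix = SequenceFeature("TRANSMEM", 10, 30, "potential transmembrane helix")
domain = SequenceFeature("DOMAIN", 25, 120, "example domain")
print(helix.length())          # 21
print(helix.overlaps(domain))  # True
```

An overlap test of this kind is the sort of check that cross-validating features between aligned sequences would need, although the criteria MACSIMS actually applies are not described here.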
Feature type | Description | Data source | Feature class |
DOMAIN | structural or functional domain | Uniprot | text from Uniprot |
PFAM-A | Pfam structural or functional domain | Pfam | PFnnnnn |
PROSITE | Prosite domain or motif pattern | Prosite | PSnnnnn |
STRUCT | Secondary structure element | Uniprot/PDB | HELIX or STRAND |
MODRES | Modified residue | Uniprot | text from Uniprot |
SITE | Active site | Uniprot | text from Uniprot |
VARSPLIC | Splicing variant | Uniprot | text from Uniprot |
VARIANT | Residue variants or mutations | Uniprot | text from Uniprot |
BLOCK | Conserved core block | calculated in MACSIMS | SBLOCK |
REGION | Homologous region | calculated in MACSIMS | REGION |
LOWCOMP | Low complexity segment | calculated in MACSIMS | LOWC |
TRANSMEM | Potential transmembrane helix | calculated in MACSIMS | TM |
COIL | Potential coiled coil | calculated in MACSIMS | COIL |
The feature class is prefixed by PROP_ if the information has been propagated from another sequence; in that case, the source sequence name is also appended to the feature name.
The feature class is prefixed by PRED_ if the information has been predicted by an ab initio method in MACSIMS.
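The prefix rules above can be applied mechanically when post-processing MACSIMS output. The following is a minimal sketch, assuming the feature class is available as a plain string; the function name `classify_feature_class` and the origin labels ("propagated", "predicted", "direct") are hypothetical, and the example class strings are constructed from the rules above rather than copied from real output.

```python
from typing import Tuple


def classify_feature_class(feature_class: str) -> Tuple[str, str]:
    """Split a feature class into (origin, base class) using its prefix.

    "direct" is an assumed label for classes that carry neither prefix.
    """
    for prefix, origin in (("PROP_", "propagated"), ("PRED_", "predicted")):
        if feature_class.startswith(prefix):
            return origin, feature_class[len(prefix):]
    return "direct", feature_class


print(classify_feature_class("PROP_PF00069"))  # ('propagated', 'PF00069')
print(classify_feature_class("PRED_TM"))       # ('predicted', 'TM')
print(classify_feature_class("HELIX"))         # ('direct', 'HELIX')
```

Separating the origin from the base class makes it straightforward to, for example, display only features taken directly from the databases, or to treat propagated annotations with more caution than direct ones.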