Ordalie is dedicated to the analysis of protein multiple sequence alignments. It can read DNA/RNA alignments, but most of its functionality is then disabled; Ordalie can still be used to view or edit such alignments.
Ordalie can read and write the FASTA, MSF, RSF, ClustalW, Macsim/XML and ORD (Ordalie file format) file formats. Once the alignment is loaded, Ordalie tries to recognize whether the sequence names are UniProt, RefSeq, or Protein Data Bank (PDB) accession names. If a sequence name is prefixed by a database identifier (for example, sw for Swiss-Prot, gi for GenInfo Identifier, pdb for PDB), the prefix is removed by default. Thus, the sequence name >sw|P12345 will appear as P12345 in Ordalie. The list of recognized bank prefixes and their separator can be changed through the 'Preferences' menu item.
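The prefix removal described above can be sketched in a few lines of Python. This is only an illustration, not Ordalie's own code: the function name `strip_bank_prefix`, the prefix list and the `|` separator are assumptions standing in for whatever is configured in the 'Preferences' menu item.

```python
# Hypothetical defaults; the real list is set in Ordalie's Preferences.
BANK_PREFIXES = ("sw", "gi", "pdb")
SEPARATOR = "|"


def strip_bank_prefix(name, prefixes=BANK_PREFIXES, separator=SEPARATOR):
    """Return the name without a leading '>' and without a known bank prefix."""
    name = name.lstrip(">").strip()
    # Split once on the separator: "sw|P12345" -> ("sw", "|", "P12345").
    head, sep, rest = name.partition(separator)
    if sep and rest and head.lower() in prefixes:
        return rest
    # Unknown prefix or no separator: leave the name untouched.
    return name


print(strip_bank_prefix(">sw|P12345"))
```

Names whose prefix is not in the list are returned unchanged, which mirrors the idea that only recognized bank prefixes are removed.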
If the sequence names are valid database accessions, Ordalie can then fetch information from these databases.
All the menus are described in detail in section 6 of this manual. In short, the “File” menu manages input/output files, as well as adding sequences and printing. The “View” menu controls the appearance of the user interface. It contains options to toggle parts of the main window on or off, to change the font size, and to toggle full-screen mode. The “Sequence” menu lets the user change the sequence names, browse, edit or retrieve sequence information, search for sequence motifs, compute sequence identity, and access the vectorial representation of the proteins. The “Alignment” menu gives access to all tools linked to the alignment: creation of a Macsim, alignment editor, clustering, phylogenetic tree, features editor, and so on. The “Structure” menu is dedicated to the structural analysis of the alignment, provided that at least one sequence corresponding to a 3D structure is present. This menu gives access to a structure superposition module, the 3D viewer, and a secondary structure coloring scheme based on sequence conservation, and lets the user save PDB files.
Below the menus, the icon bar gives direct access to some of the most useful menu items. When the mouse pointer hovers over a button, a small message box describing the button's action appears.
As previously mentioned, working with an alignment may lead to several trials in terms of sequence clustering or even amino acid alignments. A trial can be saved as a snapshot of the loaded alignment. A given snapshot can also contain a different set of sequences than the original loaded alignment in case of deletion or addition of sequences.
From left to right, the combobox selects a given snapshot. The “Annotation” button shows or hides the annotations of the current snapshot, if any exist. Annotations are created through the “Annotate Alignment” item in the Alignment menu. The “View Zone” button toggles the display of the zone used to cluster the given snapshot, if it has been clustered. The “Info” button pops up a window displaying the information relative to the snapshot. This information is requested when the snapshot is created. The “Reset” button reloads the current snapshot, which erases all changes made so far. The “Overwrite” button saves the current changes to the current snapshot, while the “New” button creates a new snapshot.
The sequence names are displayed on the left part of the frame, the amino acid sequences on the right part.
The sequence names highlighted in red correspond to PDB sequences. If there is information associated with a given sequence (present in Macsim/XML or ORD files, or retrieved on-line, see 6.4.4), a yellow message window containing a description of that sequence appears above the sequence under the mouse pointer. A right-click (mouse button-3) on a given sequence name displays a more detailed message window containing the accession, the bank ID, the organism, the length and the description of the sequence.
Below the sequence names, an entry box allows the user to search for a sequence by its name, or part of its name. After hitting <Return>, the first sequence found is displayed as the top sequence in the window.
The right part of the frame contains the alignment itself (amino acid sequences), the ruler indicating the column positions, the horizontal and vertical scrollbars, and the position counter. Any mouse motion above the amino acid sequences updates the position counter, which shows two positions for the residue below the mouse pointer: the 'seq' position is the position of the residue inside its own sequence, and the 'gen' position is the position of that residue inside the alignment.
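The relation between the 'gen' and 'seq' positions can be illustrated with a short Python sketch. It assumes 1-based positions and that gaps are written as `-` or `.`; both are assumptions made for the example, and the function name `seq_position` is hypothetical.

```python
GAP_CHARS = "-."  # assumed gap symbols


def seq_position(aligned_row, gen_position, gap_chars=GAP_CHARS):
    """Map a 1-based alignment column to the 1-based position in the sequence.

    Returns None when the column holds a gap in this row.
    """
    if not 1 <= gen_position <= len(aligned_row):
        raise ValueError("column outside the alignment")
    if aligned_row[gen_position - 1] in gap_chars:
        return None
    # Count the residues (non-gap characters) up to and including the column.
    return sum(1 for c in aligned_row[:gen_position] if c not in gap_chars)


row = "MK--LV-A"
print(seq_position(row, 5))  # column 5 holds 'L', the third residue
```

Because gaps are skipped when counting, the 'seq' position is never larger than the 'gen' position for the same residue.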
When a given feature is displayed, moving the mouse over the feature displays the note associated with it. For example, in the case of a PFAM domain, the description of the domain is shown. If several features are superposed, the first description corresponds to the top feature.
This frame is not shown by default. When residue conservation has been computed, a score is assigned to each column, both for the whole alignment and for each group. The Scores frame shows these normalized scores (between 0 and 100) for each column: the colour of each score line matches the colour of its group, and the black line corresponds to the whole alignment.
It is also possible to jump from position to position using the numeric keypad and the left and right arrows. For example, by typing '200' + <Right Arrow> key, the window will go 200 positions to the right. Similarly, typing '500' + <Left Arrow> key will scroll the alignment 500 positions to the left.
The Control Panel is at the bottom of the main window. When features are available for the current alignment, this frame contains one button per feature. Pressing a button turns it red and displays the corresponding feature on the alignment.
When the active tool changes, the content of the Control Panel changes accordingly. The content of the Control Panel is described in each tool section.
Features are a central concept in Ordalie. A feature can be defined as a characteristic attached to a sequence, a group of sequences, or the global alignment. A sequence, group or alignment feature can contain several items (for example, a sequence feature can contain several PFAM domains). One of the strengths of Ordalie is its ability to investigate these features in different contexts, for example in the structural context of the protein.
Features are imported into Ordalie through the XML output file of the Macsims program [10], through a dedicated feature file format (see section 7.4), or defined manually using the 'Feature tool'. This tool allows feature creation, editing and deletion (see section 4.5).
In Ordalie, a feature is defined by the sequence(s) it applies to, a start and a stop position, a color, an associated score, a note and a coordinate system ("global" for alignment positions or "local" for sequence-relative positions).
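To make the feature definition more concrete, here is a minimal Python sketch of such a record, together with a conversion from "local" to "global" coordinates. The class, the field names and the helper `to_global` are hypothetical and are not taken from Ordalie's database schema or file formats; positions are assumed to be 1-based and inclusive.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """Illustrative container; field names are assumptions, not Ordalie's schema."""
    sequences: list        # names of the sequences the feature applies to
    start: int             # first position covered (1-based, inclusive)
    stop: int              # last position covered (1-based, inclusive)
    color: str
    score: float
    note: str
    system: str = "local"  # "local" = sequence position, "global" = alignment column


def to_global(feature, aligned_row, gap_chars="-."):
    """Return (start, stop) as alignment columns for one aligned sequence."""
    if feature.system == "global":
        return feature.start, feature.stop
    # columns[i] is the alignment column that holds residue i + 1.
    columns = [col for col, c in enumerate(aligned_row, start=1)
               if c not in gap_chars]
    return columns[feature.start - 1], columns[feature.stop - 1]


domain = Feature(["P12345"], 2, 4, "red", 1.0, "PFAM domain", "local")
print(to_global(domain, "MK--LV-A"))
```

Storing the coordinate system with each feature means that a sequence-relative feature can stay valid when gaps are inserted in the alignment; only the conversion to columns has to be recomputed.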
Ordalie can display, modify or create features and feature items. Features can usually be displayed and selected in all modes (a special mode, the 'Feature mode', is dedicated to feature editing).
Ordalie is organized around tools. To perform an action in Ordalie, the user must enter the corresponding tool. For example, editing an alignment requires entering the 'Editor' tool, computing a phylogenetic tree requires entering the 'Tree' tool, and so on. All tools are described in detail in section 4.
In this manual, the left, middle and right mouse buttons are designated as <B1> or <Button-1>, <B2> or <Button-2>, and <B3> or <Button-3> respectively. Any word enclosed by '<' and '>' refers to the corresponding keyboard key.
Within tools, the selection of sequence names and of amino acid sequence ranges always uses the same mechanisms:
Sequence names can be selected by left-clicking on their names. The selection mechanism obeys standard rules:
Cut/Copy/Paste of sequences is available at any time, and allows the user to duplicate sequences, remove them or change their order.
Several zones can be defined one after the other, by left/right clicks, by feature selection, or by a combination of both.
To manage snapshots, features, 3D structures and other data, Ordalie embeds a SQLite database [3]. This database is lightweight, and can easily be copied or moved around. The Ordalie file format (.ord extension) is in fact the SQLite database itself.
The schema of the database can be found in Appendix 7.2.
In short, the database contains:
As Ordalie files (the SQLite database) contain all the information, the ORD format should be preferred as the default working format.
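Since an .ord file is a SQLite database, it can in principle be inspected with any SQLite client. The following Python sketch lists the tables of such a file using the standard `sqlite3` module. The helper name `list_tables` and the file name in the usage comment are hypothetical, and no table names are assumed here; the actual schema is the one given in Appendix 7.2.

```python
import sqlite3


def list_tables(ord_path):
    """Return the names of the tables stored in an SQLite file."""
    connection = sqlite3.connect(ord_path)
    try:
        # sqlite_master is SQLite's built-in catalogue of schema objects.
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]


# Usage (hypothetical file name):
# print(list_tables("my_alignment.ord"))
```

Note that `sqlite3.connect` creates an empty database if the path does not exist, so it is safer to check that the file is present first and to work on a copy when exploring a file that Ordalie is also using.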