3 Ordalie Basics

This section briefly presents some of the fundamental aspects of Ordalie. Each of the following subsections is treated in more detail in later sections of this manual.

3.1 Alignments and sequence names

Ordalie is dedicated to the analysis of protein multiple sequence alignments. Although it can read DNA/RNA alignments, most of its functionalities will be disabled. Ordalie can still be used to view or edit such alignments.

Ordalie can read and write the FASTA, MSF, RSF, ClustalW, Macsim/XML and ORD (Ordalie file format) file formats. Once the alignment is loaded, Ordalie tries to recognize whether the sequence names are UniProt, RefSeq, or Protein Data Bank (PDB) accessions. If a sequence name is prefixed by a database identifier (for example, sw for Swiss-Prot, gi for Gene Identifier, pdb for PDB), the prefix is removed by default. Thus, the sequence name >sw|P12345 will appear as P12345 in Ordalie. The list of recognized bank prefixes and their separator can be changed through the 'Preferences' menu item.
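The prefix removal described above can be illustrated with a minimal Python sketch. It is not Ordalie's actual code: the function name, the prefix list (limited to the three examples given in this manual) and the '|' separator are assumptions made for illustration, since the real list and separator are set in the Preferences.

```python
# Hypothetical defaults: the manual cites sw, gi and pdb as example prefixes.
# The real list and the separator are configurable in Ordalie's Preferences.
BANK_PREFIXES = ("sw", "gi", "pdb")
SEPARATOR = "|"


def strip_bank_prefix(name, prefixes=BANK_PREFIXES, separator=SEPARATOR):
    """Return a sequence name without its FASTA '>' marker and bank prefix."""
    name = name.lstrip(">").strip()
    # Split once at the first separator: "sw|P12345" -> ("sw", "|", "P12345").
    prefix, found, rest = name.partition(separator)
    if found and rest and prefix.lower() in prefixes:
        return rest
    # No recognized prefix: keep the name unchanged.
    return name


print(strip_bank_prefix(">sw|P12345"))
```

Names without a recognized prefix are returned unchanged, so the function can be applied to every sequence name of an alignment.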

If the sequence names are valid database accessions, Ordalie can then fetch information from these databases.

3.2 The Main Window

The Ordalie main window is divided into several parts, described here from top to bottom (see fig. 2).

Figure 2: Ordalie main window

3.2.1 The Menus and the icons bar

All the menus are described in detail in section 6 of this manual. In short, the “File” menu manages input and output files, as well as adding sequences and printing. The “View” menu controls the appearance of the user interface: it contains options to toggle parts of the main window on or off, to change the font size, and to toggle full-screen mode. The “Sequence” menu allows the user to change sequence names; browse, edit or retrieve sequence information; search for sequence motifs; compute sequence identity; and access the vectorial representation of proteins. The “Alignment” menu gives access to all tools linked to the alignment: creation of a Macsim, alignment editor, clustering, phylogenetic tree, features editor, and so on. The “Structure” menu is dedicated to the structural analysis of the alignment when it contains at least one sequence corresponding to a 3D structure. It gives access to a structure superposition module, the 3D viewer, and a secondary structure coloring scheme according to sequence conservation, and allows the user to save PDB files.

Below the menus, the icon bar gives direct access to some of the most useful menu items. When the mouse pointer hovers over a button, a small message box describing the button's action appears.

3.2.2 The Snapshot bar

As previously mentioned, working with an alignment may lead to several trials in terms of sequence clustering or even amino acid alignments. A trial can be saved as a snapshot of the loaded alignment. A given snapshot can also contain a different set of sequences than the original loaded alignment in case of deletion or addition of sequences.

From left to right, the combobox allows the user to select a given snapshot. The “Annotation” button shows or hides the annotations of the current snapshot, if any exist. Annotations are created through the “Annotate Alignment” item in the Alignment menu. The “View Zone” button toggles the display of the zone used to cluster the given snapshot, if it has been clustered. The “Info” button pops up a window displaying the information relative to the snapshot. This information is requested when the snapshot is created. The “Reset” button reloads the current snapshot, which erases all changes made so far. The “Overwrite” button saves the current changes to the current snapshot, while the “New” button creates a new snapshot.

3.2.3 The Alignment Frame

The sequence names are displayed on the left part of the frame, the amino acid sequences on the right part.


   
Tip: <Mouse-wheel> scrolls names and sequences up and down.
<Control> + <Mouse-wheel> scrolls the amino acid sequences horizontally.
   

3.2.3.1 Sequence names


The sequence names highlighted in red correspond to PDB sequences. If information is associated with a given sequence (present in Macsim/XML or ORD files, or retrieved on-line, see 6.4.4), a yellow message window containing a description of that sequence appears above the sequence under the mouse pointer. A right-click (mouse button 3) on a sequence name displays a more detailed message window containing the accession, the bank ID, the organism, the length and the description of the sequence.

Below the sequence names, an entry box allows the user to search for a sequence by its name, or by part of its name. After <Return> is pressed, the first sequence found is displayed as the top sequence in the window.

3.2.3.2 Amino acid sequences


The right part of the frame contains the alignment itself (the amino acid sequences), the ruler indicating the column positions, the horizontal and vertical scrollbars, and the position counter. Any mouse motion above the amino acid sequences updates the position counter, which shows two positions for the residue under the mouse pointer: the 'seq' position is the position of the residue within its sequence, and the 'gen' position is the position of that residue within the alignment.


   
Tip: The position within the sequence is referred to as the local position, the position within the alignment is referred to as the global position.
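The relationship between local and global positions can be sketched in a few lines of Python. This is an illustration only, not Ordalie's implementation: the function names are hypothetical, positions are assumed to be 1-based, and gaps are assumed to be written as '-' or '.'.

```python
GAP_CHARS = "-."  # assumption: characters used for gaps in an aligned sequence


def global_to_local(aligned_seq, global_pos):
    """Return the 1-based local position of the residue found in the 1-based
    alignment column global_pos, or None if that column holds a gap."""
    if not 1 <= global_pos <= len(aligned_seq):
        raise ValueError("global position outside the alignment")
    if aligned_seq[global_pos - 1] in GAP_CHARS:
        return None
    # Count the residues (non-gap characters) up to and including the column.
    return sum(1 for char in aligned_seq[:global_pos] if char not in GAP_CHARS)


def local_to_global(aligned_seq, local_pos):
    """Return the 1-based alignment column of the residue number local_pos."""
    seen = 0
    for column, char in enumerate(aligned_seq, start=1):
        if char not in GAP_CHARS:
            seen += 1
            if seen == local_pos:
                return column
    raise ValueError("local position outside the sequence")
```

For the aligned sequence "--MK-LV", for instance, the residue L is at local position 3 and at global position 6 under these assumptions.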
   

When a given feature is displayed, moving the mouse over the feature displays the note associated with it. For example, in the case of a PFAM domain, the description of the domain is shown. If several features are superposed, the first description corresponds to the top feature.

3.2.4 The Scores frame

This frame is not shown by default. When residue conservation has been computed, a score is assigned to each column, both for the whole alignment and for each group. The Scores frame shows these normalized scores (between 0 and 100) for each column. The color of each score line corresponds to the color of its group, and the black line corresponds to the whole alignment.

It is also possible to jump from position to position using the numeric keypad and the left and right arrows. For example, by typing '200' + <Right Arrow> key, the window will go 200 positions to the right. Similarly, typing '500' + <Left Arrow> key will scroll the alignment 500 positions to the left.

3.2.5 The Control Panel

The Control Panel is at the bottom of the main window. When features are available for the current alignment, this frame contains one button per feature. Pressing a button turns it red and displays the corresponding feature on the alignment.


   
Tip: The features are displayed in the order in which the buttons are pressed. To place one feature over another, toggle the buttons in the appropriate order.
   

When the active tool changes, the content of the Control Panel changes accordingly. The content of the Control Panel is described in each tool's section.

3.3 Features

Features are a central concept in Ordalie. A feature can be defined as a characteristic attached to a sequence, to a group of sequences, or to the global alignment. A sequence, group or alignment feature can contain several items (for example, a sequence feature can contain several PFAM domains). One of the strengths of Ordalie is its ability to investigate these features in different contexts, for example in the structural context of the protein.

Features are imported into Ordalie through the XML output file of the Macsims program [10], through a dedicated feature file format (see section 7.4), or defined manually using the 'Feature tool'. This tool allows features to be created, edited or deleted (see section 4.5).

In Ordalie, a feature is defined by the sequence(s) it applies to, a start and a stop position, a color, an associated score, a note and a coordinate system ("global" for alignment positions or "local" for sequence-relative positions).
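As an illustration, the attributes listed above could be grouped in a small Python record. This is only a sketch: the class name, field names and types are assumptions, and the actual storage layout is given by the database schema and the feature file format described elsewhere in this manual.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FeatureItem:
    """Hypothetical record holding the attributes of one feature item."""
    sequences: List[str]         # name(s) of the sequence(s) the item applies to
    start: int                   # start position
    stop: int                    # stop position
    color: str                   # display color
    score: float                 # associated score
    note: str                    # free-text note shown when hovering the item
    coordinates: str = "local"   # "global" (alignment) or "local" (sequence)

    def __post_init__(self):
        # Only the two coordinate systems named in the manual are accepted.
        if self.coordinates not in ("global", "local"):
            raise ValueError("coordinates must be 'global' or 'local'")
        if self.start > self.stop:
            raise ValueError("start must not be greater than stop")


domain = FeatureItem(["P12345"], 10, 42, "red", 0.0, "example domain")
```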

Ordalie can display and modify features or feature items, and create new ones. Features can usually be displayed and selected in all modes (a special mode, the 'Feature mode', is dedicated to feature editing).

3.4 Tools

Ordalie is organized around tools. To perform an action in Ordalie, the user must enter the corresponding tool. For example, editing an alignment requires entering the 'Editor' tool, computing a phylogenetic tree requires entering the 'Tree' tool, and so on. All tools are described in detail in section 4.


   
Warning: The user must always leave a tool before entering another one. There are a few exceptions to this rule.
   

3.5 Conventions

3.5.1 Mouse Buttons

In this manual, the left, middle and right mouse buttons are designated <B1> or <Button-1>, <B2> or <Button-2>, and <B3> or <Button-3> respectively. Any other word enclosed in '<' and '>' refers to the corresponding keyboard key.

3.5.2 Selections

Within tools, sequence name selection and amino acid sequence range selection always use the same mechanisms:

3.5.2.1 Sequence names selections


Sequence names can be selected by left-clicking on them. The selection mechanism obeys standard rules:


Table 1: Key combinations for selecting sequence names

Keys                      Action
<Mouse-Left>              Selects the sequence under the mouse pointer
<Control + Mouse-Left>    If the sequence name under the mouse pointer is UNSELECTED, adds this sequence to the current selection
                          If the sequence name under the mouse pointer is SELECTED, removes this sequence from the current selection
<Shift + Mouse-Left>      Adds all sequences from the previously selected one up to the current sequence to the selection
<Control + a>             Selects all sequences
<Control + x>             Cut
<Control + c>             Copy
<Control + v>             Paste
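The mouse rules of Table 1 can be modeled as operations on a set of row indices. The Python sketch below is an illustration with hypothetical function names, not Ordalie's code. It assumes that a plain click replaces the current selection, as in most applications, and that a shift-click extends the selection from the previously selected row (the anchor).

```python
def plain_click(selected, index):
    """<Mouse-Left>: select only the clicked sequence (assumed behavior)."""
    return {index}


def control_click(selected, index):
    """<Control + Mouse-Left>: add the sequence if unselected, else remove it."""
    return selected ^ {index}


def shift_click(selected, anchor, index):
    """<Shift + Mouse-Left>: add every sequence between the previously
    selected row (anchor) and the clicked row, both included."""
    low, high = sorted((anchor, index))
    return selected | set(range(low, high + 1))


selection = plain_click(set(), 2)           # row 2 selected
selection = control_click(selection, 5)     # rows 2 and 5
selection = shift_click(selection, 5, 8)    # rows 2, 5, 6, 7 and 8
selection = control_click(selection, 2)     # row 2 removed again
```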


Cut/Copy/Paste of sequences is available at any time and allows the user to duplicate, remove or reorder sequences.


   
Warning: If a sequence is duplicated using Cut or Copy, then Paste, its name will be suffixed by __<n> where n is the copy number.
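The naming rule for duplicated sequences can be sketched as follows in Python. The function name is hypothetical, and the numbering policy (starting at 1 and taking the first unused number) is an assumption, since this manual only states that n is the copy number.

```python
def duplicate_name(name, existing_names):
    """Return name suffixed by __<n>, using the first n not already taken."""
    n = 1  # assumption: copy numbers start at 1
    while f"{name}__{n}" in existing_names:
        n += 1
    return f"{name}__{n}"


names = ["P12345", "Q67890"]
names.append(duplicate_name("P12345", names))   # first copy
names.append(duplicate_name("P12345", names))   # second copy
```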
   

3.5.2.2 Selecting a residue range


By default, no residue selection or editing is allowed. This is only possible in particular tools, such as the 'Editor', 'Cluster', 'Phylogenetic Tree' or 'Superposition' tools. In these tools, zones of residues are selected as follows:


Table 2: Keys for amino acid sequence selection

Keys                      Action
<Mouse-Left>              Sets the starting point of the zone to be selected
<Mouse-Right>             Sets the end of the zone; the selected zone becomes grey
<Control + Mouse-Right>   Unselects the zone under the mouse pointer
<Control + Mouse-Left>    Selects the feature under the mouse pointer
<Control + Mouse-Right>   Unselects the feature under the mouse pointer



   
Tip: It is possible to select the zone corresponding to a feature item (for example a PFAM domain) by clicking on this feature item with <Control + B1>.
   

Several zones can be defined one after the other, by left/right clicks, by feature selection, or by a combination of both.

3.5.3 The database and the Ordalie file format

To manage snapshots, features, 3D structures and so on, Ordalie internally embeds a SQLite database [3]. This database is lightweight, and can easily be copied or moved around. The Ordalie file format (.ord extension) is in fact the SQLite database itself.

The schema of the database can be found in Appendix 7.2.

In short, the database contains :

Since an Ordalie file (the SQLite database) contains all the information, it should be preferred as the default working format.
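Because an .ord file is a regular SQLite database, it can be inspected with any SQLite client. The sketch below uses Python's standard sqlite3 module to list the tables of a file. The function name is hypothetical, and no table name is assumed here, as the actual tables are those of the schema given in the appendix. The file is opened read-only so that the alignment cannot be modified by accident.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(ord_path):
    """Return the names of the tables stored in an Ordalie (.ord) file."""
    # Build a read-only SQLite URI so the file is never created or altered.
    uri = Path(ord_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as connection:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [row[0] for row in rows]
```

Calling list_tables on the path of an existing .ord file returns the table names in alphabetical order. A missing file raises sqlite3.OperationalError instead of silently creating an empty database.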