AnnotSV is a standalone program for annotating and ranking Structural Variations (SV). It is implemented in Tcl, comes with a ready-to-use installation, and runs from the command line on all platforms.
AnnotSV compiles functionally, regulatory and clinically relevant information and aims to provide annotations useful to i) interpret the potential pathogenicity of SV and ii) filter out potential false-positive SV.
Different types of SV exist, including deletions, duplications, insertions, inversions, translocations and more complex rearrangements. They can be either balanced or unbalanced. When unbalanced and resulting in a gain or loss of material, they are called Copy Number Variations (CNV). CNV can be described by coordinates on one chromosome, with the start and end positions of the SV (deletions, insertions, duplications). Complex rearrangements with breakends can arbitrarily be summarized as a set of novel adjacencies, as described in the Variant Call Format specification VCFv4.3 (May 2020).
AnnotSV takes as input a standard VCF or BED file describing the SV coordinates. The output file reports the overlaps of each SV with relevant genomic features, where genes refer to NCBI RefSeq genes.
AnnotSV provides numerous relevant annotations:
- A genes-based annotation (OMIM, Gene intolerance, Haploinsufficiency…)
- An annotation with features overlapping the SV (DGV, 1000 genomes…)
- An annotation with features overlapped by the SV (pathogenic SV from dbVar, promoters, enhancers, TAD…)
- An annotation of the SV breakpoints (GC content, repeats…)
In addition to these annotations, AnnotSV provides a systematic SV classification, using the same type of categories as those delineated by the American College of Medical Genetics and Genomics (ACMG):
- Class 1 = benign
- Class 2 = likely benign
- Class 3 = VOUS (variant of unknown significance)
- Class 4 = likely pathogenic
- Class 5 = pathogenic
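As an illustration of how these five classes might be handled downstream, the following minimal Python sketch maps each class number to its label and flags the classes usually worth a closer look. The names `ACMG_CLASS_LABELS`, `describe_class` and `is_candidate`, as well as the default threshold of 4, are assumptions made for this example and are not part of AnnotSV.

```python
# Class numbers and labels as listed above.
ACMG_CLASS_LABELS = {
    1: "benign",
    2: "likely benign",
    3: "VOUS (variant of unknown significance)",
    4: "likely pathogenic",
    5: "pathogenic",
}


def describe_class(rank):
    """Return the label for a class number given as an int or a numeric string."""
    return ACMG_CLASS_LABELS[int(rank)]


def is_candidate(rank, minimum=4):
    """True when the class is at or above `minimum` (hypothetical threshold)."""
    return int(rank) >= minimum
```

For instance, `describe_class("5")` returns the label of class 5, and `is_candidate(3)` is false with the default threshold.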
The program is described in detail in the README file.
Thank you for taking the time to use AnnotSV. Your feedback is greatly appreciated.
AnnotSV supports both the VCF (Variant Call Format) and the commonly used BED (Browser Extensible Data) input formats to describe the SV to annotate. This allows the program to be easily integrated into any bioinformatics pipeline dedicated to NGS analysis.
Given a BED or VCF SV file, AnnotSV produces a tab-separated values file that can be easily integrated into bioinformatics pipelines or read directly in a spreadsheet program.
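To make the BED input concrete, here is a minimal Python sketch that writes SV coordinates as a three-column, tab-separated BED file (chromosome, start, end). The helper `write_bed` and the coordinates are invented for illustration. Only the three standard BED columns are shown; any additional columns AnnotSV may accept are described in the README and are not assumed here.

```python
import csv
import io


def write_bed(svs, handle):
    """Write (chrom, start, end) tuples as tab-separated BED lines."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for chrom, start, end in svs:
        # Reject intervals that are negative, empty or reversed.
        if not 0 <= start < end:
            raise ValueError(f"invalid interval: {chrom}:{start}-{end}")
        writer.writerow([chrom, start, end])


# Made-up coordinates, written to an in-memory buffer instead of a real file.
buffer = io.StringIO()
write_bed([("1", 1000, 5000), ("X", 200, 900)], buffer)
bed_text = buffer.getvalue()
```

The same function can write to a real file by passing a handle opened with `open(path, "w", newline="")`.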
A user-friendly web server interface is also available to run AnnotSV.
A typical AnnotSV use would be to first look at the annotation and ranking of each SV as a whole (i.e. “full”) and then focus on the content of that SV. Indeed, there are 2 types of lines produced by AnnotSV (cf the “AnnotSV type” output column):
- An annotation on the “full” length of the SV:
Every SV is reported, even those not covering a gene. This type of annotation gives an estimate of the SV event itself.
- An annotation of the SV “split” by gene:
This type of annotation makes it possible to focus on each gene overlapped by the SV. Thus, when an SV spans several genes, the output contains as many annotation lines as covered genes (cf example in FAQ). This annotation is particularly useful to speed up the identification of mutations implicated in a specific gene.
The annotation columns available in the output file are detailed in the README file.
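The "full" then "split" reading order described above could be sketched in Python as follows. Only the "AnnotSV type" column name comes from this description. The "AnnotSV ID" and "Gene name" columns, the `group_by_sv` helper and the sample rows are assumptions made for this example, so check the README for the actual column names of your AnnotSV version.

```python
import csv
import io

# Invented sample mimicking the tab-separated output (not real AnnotSV results).
SAMPLE_OUTPUT = (
    "AnnotSV ID\tAnnotSV type\tGene name\n"
    "sv1\tfull\tGENE_A/GENE_B\n"
    "sv1\tsplit\tGENE_A\n"
    "sv1\tsplit\tGENE_B\n"
    "sv2\tfull\t\n"
)


def group_by_sv(handle):
    """Return ({sv_id: full_row}, {sv_id: [split_rows]}) from a TSV handle."""
    full, split = {}, {}
    for row in csv.DictReader(handle, delimiter="\t"):
        sv_id = row["AnnotSV ID"]  # assumed identifier column
        if row["AnnotSV type"] == "full":
            full[sv_id] = row
        elif row["AnnotSV type"] == "split":
            split.setdefault(sv_id, []).append(row)
    return full, split


full, split = group_by_sv(io.StringIO(SAMPLE_OUTPUT))

# First review each SV as a whole, then drill down into its genes.
genes_sv1 = [row["Gene name"] for row in split.get("sv1", [])]
```

In this sample, `sv2` has a "full" line but no "split" lines, matching the case of an SV that does not cover any gene.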