RetinoBase

From Wikili
Revision as of 15:55, 27 July 2007 by Ripp
 go to the RetinoBase website   mail to RaviKiran Reddy

What is RetinoBase

RetinoBase is a website and relational database that currently combines 27 sets of microarray experiments in vision research, performed in 4 different organisms.

Datasets in RetinoBase

RetinoBase stores the gene expression profiles obtained from microarray experiments.

The database contains a total of 20 publicly available experiments (GEO data GSE 1816, 4756, 1835, 3791 and 2868) as well as 7 additional experiments that are not yet publicly available but will become accessible in the near future. The experiments were performed under different conditions, such as knockout models, treatments and time series, and on different organisms: mice, rats, zebrafish and humans.

Of these 20 experiments, 2 (experiments 8 and 9) have only partial data, at the level of fold change, because neither raw data (.CEL) nor signal intensity data were available. Data were downloaded via FTP from the Gene Expression Omnibus (GEO) and, after preprocessing, uploaded to RetinoBase using SQL scripts via pgAdmin III.
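The upload step can be pictured as turning a table of preprocessed values into an SQL script that is then run through pgAdmin III. The following is only a minimal sketch of that idea: the table name expression_value, its columns, the probe identifiers and all values are hypothetical and are not taken from the actual RetinoBase schema. A missing signal intensity (as for experiments 8 and 9, which only have fold-change data) is written as NULL.

```python
def sql_literal(value):
    """Render a Python value as an SQL literal (strings are single-quoted)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return repr(value)
    # Escape embedded single quotes by doubling them, as standard SQL requires.
    return "'" + str(value).replace("'", "''") + "'"


def build_insert_script(table, columns, rows):
    """Return one INSERT statement per row, wrapped in a single transaction."""
    lines = ["BEGIN;"]
    for row in rows:
        values = ", ".join(sql_literal(v) for v in row)
        lines.append(
            f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({values});"
        )
    lines.append("COMMIT;")
    return "\n".join(lines)


# Hypothetical rows: (experiment id, probe id, signal intensity, fold change).
# The first row has no signal intensity, only a fold change.
rows = [
    (8, "probe_A", None, 2.0),
    (1, "probe_B", 152.3, 0.5),
]
script = build_insert_script(
    "expression_value",  # hypothetical table name
    ["experiment_id", "probe_id", "signal", "fold_change"],
    rows,
)
print(script)
```

Wrapping the statements in one transaction means a failed row leaves the database unchanged, which is convenient when a whole experiment is loaded at once.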

Data pre-processing

Raw data were obtained in two formats: as .CEL files or as signal intensities. Data obtained as .CEL files were normalized with three different methods (RMA, dChip and MAS5) using the R statistical package (http://www.r-project.org) and Bioconductor. R is an open platform for statistical computation, and Bioconductor is a collection of R packages for microarray data analysis. The resulting signal intensities were integrated into RetinoBase. The fold change in gene expression was calculated as the ratio of the signal intensity of a given gene in the treated (or knockout) sample to that in the control. For experiments performed in replicates, signal intensities were averaged before the ratios were calculated and incorporated into RetinoBase.

All experiments in RetinoBase are clustered with the K-means method, using both Functional And Statistical Analysis of Biological Data (FASABI), software developed in-house, and TM4, a free, open-source system for microarray data management and analysis. They are also clustered with the mixture model method through FASABI. The K-means method in FASABI uses density-of-points clustering, whereas the one in TMEV (the TM4 MultiExperiment Viewer) uses the dot product to determine the distance between gene vectors.
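The fold-change calculation described above (replicates averaged first, then the treated-to-control ratio taken) can be sketched as follows. This is an illustration only, not the code used for RetinoBase: the function name, the input layout and the intensity values are made up for the example.

```python
from statistics import mean


def fold_change(treated, control):
    """Return the ratio of the mean treated signal to the mean control signal.

    `treated` and `control` are lists of replicate signal intensities for a
    single gene (a hypothetical input layout chosen for this sketch).
    """
    if not treated or not control:
        raise ValueError("each condition needs at least one replicate")
    control_mean = mean(control)
    if control_mean == 0:
        raise ValueError("control signal is zero; the ratio is undefined")
    # Replicates are averaged before the ratio is taken.
    return mean(treated) / control_mean


# Made-up intensities for two hypothetical probe sets.
signals = {
    "probe_A": {"treated": [200.0, 220.0, 180.0], "control": [100.0, 90.0, 110.0]},
    "probe_B": {"treated": [50.0, 50.0], "control": [100.0, 100.0]},
}
ratios = {
    gene: fold_change(values["treated"], values["control"])
    for gene, values in signals.items()
}
print(ratios)
```

With these values, probe_A has a ratio of 2.0 (signal doubled in the treated sample) and probe_B a ratio of 0.5 (signal halved). An experiment without replicates is simply the case where each list holds a single value.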

Architecture

The website is powered by an Apache web server, with PHP and JavaScript for dynamic web pages and a PostgreSQL relational database as the backend for data storage. RetinoBase is built on open-source tools.