We defined a 'Chromatin Accessibility Score' for each variation to evaluate the impact of variations (especially non-coding variations) based on chromatin accessibility data. This information can be found on the 'Search for Variation by Gene' page and the 'Search for Variation information by Variation ID' page.
RiceVarMap v2.0 is a comprehensive database of rice genomic variations and their functional annotations. It provides curated information on 17,397,026 genomic variations (including 14,541,446 SNPs and 2,855,580 small INDELs) from sequencing data of 4,726 rice accessions. These variations were identified using GATK software based on the assembly Os-Nipponbare-Reference-IRGSP-1.0. (Note: you can still access RiceVarMap v1.0 to query variations based on the old assembly Nipponbare MSU v6.1.)
High-quality and complete genotype data. The genotypes of all accessions were imputed and evaluated, resulting in an overall missing data rate of <3% and an estimated accuracy greater than 99%. The SNP/INDEL genotypes of all accessions are available for online queries and download. To facilitate population genetic analysis, RiceVarMap also offers ancestral allele information and allele distribution data for subpopulations.
Comprehensive annotations of genomic variations. RiceVarMap now provides more precise variations and annotations. The software packages snpEff, CooVar and PolyPhen-2 were used to evaluate the impact of missense variations based on haplotypes and conservation information. We also collected and integrated chromatin accessibility data generated by ATAC-seq or DNase-seq of representative rice accessions, which can be used to evaluate the possible effects of variations (especially those in non-coding regions) on the regulation of gene expression. Moreover, GWAS results were integrated to curate the possible functions of variants. This information can be queried at this page.
Phenotype data and GWAS results. The database provides geographical details, phenotype images, and agronomic and metabolic traits for some rice accessions. Plant scientists and breeders can also search for significant SNPs associated with various traits to develop useful molecular markers or select candidate genes.
Currently, we have collected sequencing data from three sets of rice germplasm, comprising a total of 4,726 accessions of cultivated rice (Oryza sativa L.):
The first set of germplasm consisted of 533 accessions selected to represent both usefulness in rice improvement and genetic diversity within the cultivated species. We sequenced the 533 accessions using the Illumina HiSeq 2000 in the form of 90-bp paired-end reads to generate high-quality sequences of more than one gigabase per accession (>2.5x per genome, 6.7 billion reads in total). These raw data are available in NCBI under BioProject accession number PRJNA171289. We provide phenotype images and agronomic and metabolic traits for these accessions.
The second set of germplasm comprised 950 rice accessions sequenced by Huang et al. (2012, Nat Genet, 44:32-39). The data were downloaded from the EBI European Nucleotide Archive (accession numbers ERP000106 and ERP000729) and consist of 4.6 billion 73-bp paired-end reads (~1x per genome).
The third set of germplasm comprised 3,243 rice accessions from the 3,000 Rice Genomes Project (2014, GigaScience, 3:7). The data were downloaded from the EBI European Nucleotide Archive (accession number PRJEB6180) and have an average sequencing depth of 14x per genome.
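As a rough consistency check on the figures above, the per-accession yield and sequencing depth of the first two sets can be estimated from the read counts and read lengths. The sketch below is illustrative only: it assumes that the quoted read totals count individual reads, and it uses a reference genome size of about 373 Mb, a value that is not stated on this page. The function and variable names are hypothetical.

```python
# Back-of-the-envelope depth estimate from the read statistics quoted above.
# ASSUMPTION: ~373 Mb reference genome size (not stated on this page).
GENOME_SIZE_BP = 373e6

def bases_per_accession(total_reads, read_length_bp, n_accessions):
    """Average number of sequenced bases per accession."""
    return total_reads * read_length_bp / n_accessions

def mean_depth(total_reads, read_length_bp, n_accessions,
               genome_size_bp=GENOME_SIZE_BP):
    """Average fold coverage per accession under the assumed genome size."""
    per_accession = bases_per_accession(total_reads, read_length_bp, n_accessions)
    return per_accession / genome_size_bp

# Set 1: 533 accessions, 6.7 billion 90-bp reads.
set1_bases = bases_per_accession(6.7e9, 90, 533)
set1_depth = mean_depth(6.7e9, 90, 533)

# Set 2: 950 accessions, 4.6 billion 73-bp reads.
set2_depth = mean_depth(4.6e9, 73, 950)

# The three sets together should account for all accessions in the database.
total_accessions = 533 + 950 + 3243

print(f"Set 1: {set1_bases / 1e9:.2f} Gb per accession, about {set1_depth:.1f}x")
print(f"Set 2: about {set2_depth:.1f}x")
print(f"Total accessions: {total_accessions}")
```

Under these assumptions, the estimates agree with the stated values of more than one gigabase and >2.5x per accession for the first set, roughly 1x for the second set, and 4,726 accessions overall.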
Phenotype data and GWAS results:
At the moment, RiceVarMap provides phenotype data and GWAS results for 13 agronomic traits (including heading date, plant height, and grain weight) and 840 metabolite traits, which were produced by our institute (Xie et al., 2015, Proc Natl Acad Sci USA, 112: E5411-E5419; Chen et al., 2014, Nat Genet, 46:714-721). Phenotype information can be queried at this page.
Data used in PolyPhen-2:
We extracted common missense SNPs (MAF >0.05) for PolyPhen-2 analysis. Searches for homologous proteins were performed against UniProt UniRef100 using BLAST (e-value <1e-3, identity from 0.3 to 0.95).
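The two thresholds above can be illustrated with a minimal sketch. This is not the pipeline used by RiceVarMap; the function names and the input representation (a list of allele calls per site, with None for missing calls) are hypothetical, and treating the identity bounds as inclusive is an assumption.

```python
from collections import Counter

MAF_MIN = 0.05                  # keep SNPs with MAF > 0.05
EVALUE_MAX = 1e-3               # keep BLAST hits with e-value < 1e-3
IDENTITY_RANGE = (0.30, 0.95)   # ASSUMPTION: bounds treated as inclusive

def minor_allele_frequency(alleles):
    """Frequency of the second most common allele; missing calls (None) are ignored."""
    counts = Counter(a for a in alleles if a is not None)
    total = sum(counts.values())
    if len(counts) < 2:
        return 0.0  # monomorphic site or no called alleles
    second_most_common = sorted(counts.values(), reverse=True)[1]
    return second_most_common / total

def is_common_snp(alleles):
    """True if the site passes the MAF > 0.05 threshold."""
    return minor_allele_frequency(alleles) > MAF_MIN

def keep_blast_hit(evalue, identity):
    """True if a homolog hit passes the e-value and identity filters."""
    low, high = IDENTITY_RANGE
    return evalue < EVALUE_MAX and low <= identity <= high

# Hypothetical example: 90 calls of one allele, 10 of another, 5 missing.
site = ["A"] * 90 + ["G"] * 10 + [None] * 5
print(is_common_snp(site))          # MAF is 10/100, above the threshold
print(keep_blast_hit(1e-5, 0.80))   # passes both filters
print(keep_blast_hit(1e-5, 0.99))   # rejected: identity above the upper bound
```

The upper identity bound drops near-identical sequences, which add little conservation information to the alignment that PolyPhen-2 relies on.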
Chromatin accessibility data:
Chromatin accessibility data were downloaded from the PlantDHS database.
The recommended browsers are Chrome, Firefox, Safari, and Edge (IE8 and earlier have poorer support and may give a degraded experience).
Researchers who use RiceVarMap are encouraged to cite our publication:
Zhao H, Yao W, Ouyang Y, Yang W, Wang G, Lian X, Xing Y, Chen L, Xie W. RiceVarMap: a comprehensive database of rice genomic variations. Nucleic Acids Res, 2015, 43: D1018-1022