Preliminary findings of the GenomeIndia project, which sequenced the genomes of 10,074 healthy and unrelated Indians from 85 populations across India (32 tribal and 53 non-tribal), were published in the journal Nature Genetics on Tuesday (April 8, 2025). Blood samples were collected from about 20,000 individuals, and DNA from 10,074 of them was subjected to whole genome sequencing.
About 100 samples were collected from each population, with a median of 159 samples from each non-tribal group and 75 from each tribal group, to estimate the relatively rare variants that are important for understanding complex diseases.
With two populations excluded, the preliminary findings are based on the genetic information of 9,772 individuals: 4,696 male and 5,076 female participants. The genome sequence data are deposited at the Indian Biological Data Centre (IBDC), housed at the Regional Centre for Biotechnology in Faridabad, Haryana.
Among tribal populations, the genomes of Tibeto-Burman, Indo-European, Dravidian, and Austro-Asiatic groups and of a continentally admixed outgroup were sequenced. Among non-tribal populations, the genomes of Tibeto-Burman, Indo-European, and Dravidian groups were sequenced.
180 million genetic variants
In all, 180 million genetic variants were found when the genomes of 9,772 individuals were sequenced. Of these, 130 million are in the non-sex chromosomes (the 22 pairs of autosomes) and 50 million are in the sex chromosomes X and Y. Some of the variants are associated with diseases, some are rare, some are unique to India, and some are unique to particular communities or small populations.
“We are now trying to find out the implications of these variants,” said Dr. Kumarasamy Thangaraj of the Centre for Cellular and Molecular Biology (CSIR-CCMB), Hyderabad, one of the corresponding authors of the Comment piece. “We are looking for variants which are functionally relevant — related to diseases, those associated with therapeutic responses or no responses, and those that are causing adverse effects to therapeutic agents.” Efforts will be directed at constructing a panel of variants that can be used to fill in missing data in future small-scale genotyping or low-depth sequencing studies. The panel would also be useful for studying disease-genetics associations in the Indian population.
Elaborating further, Dr. Thangaraj said, “Some of the variants may be associated with individuals susceptible to infectious diseases, while some variants might be responsible for resistance to infectious diseases. It is also possible that some variants will be associated with adaptations to particular environments such as high altitudes and low oxygen concentration.”
Explaining how the genome data will be put to real-world use, Dr. Thangaraj said the information on variants associated with specific diseases can be utilised for developing low-cost diagnostic kits and for personalised medicine. “In-depth analyses of 9,772 diverse genomes along with the blood biochemistry and anthropometry data will improve disease diagnostics, predict the genetic basis of drug responses, and kickstart precision medicine efforts in India,” the authors write. A detailed paper will be published in the next couple of months.
GenomeIndia is a collaborative effort of 20 institutions. Genome sequencing was carried out by the Centre for Brain Research at IISc Bangalore, the Centre for Cellular and Molecular Biology in Hyderabad, the Institute of Genomics & Integrative Biology in Delhi, the National Institute of Biomedical Genomics in Kolkata, and the Gujarat Biotechnology Research Centre in Gandhinagar.
Published – April 08, 2025 08:22 pm IST