This project not only opens up a valuable database for the research community but also lays the foundation for biomedical research and precision medicine in Vietnam. This success also plays a vital role in bringing early disease diagnosis closer to the average Vietnamese in the future.

This is the first genome-wide dataset that is representative of the Vietnamese population as a whole, consistent with its geographical distribution (North 37 percent; Central 22 percent; South 41 percent) and gender distribution. The genomic dataset is fully annotated for biological functions as well as disease risks.

The MASH Portal is open for public access, providing users with references and a trial version of rapid whole human genome data analysis tools.

VinBigData initiated the project in December 2018 with the participation of more than 40 scientists, including internal and local experts and senior consultants from leading research organizations such as Harvard University, the University of Chicago, and the University of Texas MD Anderson Cancer Center, along with hundreds of experts and volunteers in Viet Nam and around the world.

Over the past three years, the research team has sequenced and analyzed the whole genomes of over 1,000 healthy individuals between 35 and 55 years of age, biologically unrelated, with sufficient phenotypic and demographic information.

The team has also partially analyzed the genomes of more than 4,000 cases related to common diseases and drug reactions. As a result, more than 40 million genetic variants, including nearly 2 million common and unique genetic variants of the Vietnamese population, were detected.

The analysis pipeline used by VinBigData is recognized as the world’s most advanced. Sample processing is carried out at Vinmec International Hospital’s laboratory, which meets the international standard ISO 15189 (a quality and competence standard for medical laboratories).

Large amounts of data are processed using modern technology from Google, Illumina (the world’s leading technology company in gene sequencing), and NVIDIA (an American multinational technology corporation).

The 1,000 Vietnamese genomes project was developed by more than 40 leading scientists and engineers at VinBigData.

VinBigData also partners with approximately 20 leading domestic and international research organizations and hospitals, such as Johns Hopkins University (US), the Golden Helix Foundation (UK), Hanoi Medical University, and Military Central Hospital 108.

“The majority of research and clinical practice in Viet Nam is currently based on genetic databases of other populations in the world. This affects the results of scientific publications as well as the quality of medical examination and treatment for Vietnamese people,” said Professor Vu Ha Van – scientific director of VinBigData.

“With the completion of a Vietnamese genetic variation database, VinBigData expects to open up new directions, paving the way for the growth of precision medicine in Vietnam.”

With a total investment of over US$4.5 million, the scale of Vingroup’s human genome project is the largest in the country so far. VinBigData has become the pioneer in sequencing the whole Vietnamese genome.

Assessing the significance of the study, Professor Ta Thanh Van (Chairman of the Council of Hanoi Medical University) said: “Currently, doctors mainly rely on clinical, biochemical tests or medical history to examine and make treatment decisions.”

“If genomic data is available, this will be valuable information, helping to increase the accuracy and efficiency of disease diagnosis and treatment. On a national scale, the Vietnamese genome database is expected to be a premise to promote the development of preventive medicine and precision medicine practice, thereby contributing to public health improvement.”

After completion, part of the project’s data has been opened for public access through the Management Analysis Sharing Harmonization Portal (MASH Portal). Users around the world can use it as a reference for their research. The system also provides a trial version of rapid whole genome data analysis tools (from 30 minutes to 1 hour).

The project’s results will be applied directly to genetic products designed specifically for Vietnamese people, with outstanding accuracy and speed. The study lays the groundwork for exploring an individual’s physiological and psychological characteristics, predicting the risk of common diseases and recessive genetic diseases, and assessing likely future drug responses.

According to