Nearly 100,000 highly diverse whole genome sequences are now available through the National Institutes of Health’s All of Us Research Program. About 50% of the data is from individuals who identify with racial or ethnic groups that have historically been underrepresented in research. This data will enable researchers to address previously unanswerable questions about health and disease, leading to new breakthroughs and advancing discoveries to reduce persistent health disparities.
“Until now, over 90% of participants from large genomics studies have been of European descent. The lack of diversity in research has hindered scientific discovery,” said Josh Denny, M.D., chief executive officer of the All of Us Research Program. “All of Us participants are leading the way toward more equitable representation in medical research through their involvement. And this is just the beginning. Over time, as we expand our data and add new tools, this dataset will become an indispensable resource for health research.”
The genomic data is available via a cloud-based platform, the All of Us Researcher Workbench, and also includes genotyping array data from 165,000 participants. Whole genome sequencing provides information about almost all of an individual’s genetic makeup, while genotyping arrays, the more commonly used genetic testing approach, capture a specific subset of the genome.
In addition to the genomic data, the Workbench contains information from many of the participants’ electronic health records, Fitbit devices and survey responses. The platform also links to data from the Census Bureau’s American Community Survey to provide more details about the communities where participants live. This combination of data will allow researchers to better understand how genes can cause or influence diseases in the context of other health determinants. The ultimate goal is to enable more precise approaches to health care for all populations. To protect participants’ privacy, the program has removed all direct identifiers from the data and upholds strict requirements for researchers seeking access.
“There is a unique depth and dimensionality to the All of Us platform that sets it apart from other resources in the field. It’s also designed with team science in mind, allowing researchers to explore topics in an open and collaborative way,” said Gail Jarvik, M.D., Ph.D., head of the Division of Medical Genetics at the University of Washington School of Medicine, Seattle. “As the Researcher Workbench matures, it will create nearly endless possibilities for discovery to understand the role of genes and variants, as well as many other factors that combine to affect health and disease.”
The Researcher Workbench is made possible through the generous contributions of All of Us participants. Beyond making genomic data available for research, All of Us participants have the opportunity to receive personal DNA results at no cost to them. So far, the program has offered genetic ancestry and trait results to more than 100,000 participants. Plans are underway to begin sharing health-related DNA results on hereditary disease risk and medication-gene interactions later this year.
With this release of genomic data, All of Us now ranks among the large genomic research efforts worldwide, including the UK Biobank, the Million Veteran Program and the NIH’s Trans-Omics for Precision Medicine (TOPMed) program.
All of Us works with a consortium of partners across the U.S., including community organizations, medical centers and others, to help reach participants and collect data and samples. The Researcher Workbench is managed by Vanderbilt University Medical Center in collaboration with the Broad Institute of MIT and Harvard and Verily. The program’s genome centers generate the genomic data and process about 5,000 participant samples each week. These centers include Baylor College of Medicine, Johns Hopkins University, the Broad Institute, the Northwest Genomics Center at the University of Washington and partners. Color, a health technology company, works with the program to return personalized results to participants on genetic ancestry and traits, as well as the forthcoming health-related genetic results.