Biomedical Research Hub selected as key partner for international genomic data standards initiative

The Biomedical Research Hub, developed by the University of Chicago Center for Translational Data Science, is one of ten new Driver Projects for the Global Alliance for Genomics and Health (GA4GH).

The Global Alliance for Genomics and Health (GA4GH) has named the Biomedical Research Hub one of its ten newest Driver Projects, all of them genomic data initiatives with clinical connections. The collaborations will allow genomic data standards to make new inroads into medicine and biomedical research, including the application of machine learning to data from diverse regions around the globe.

GA4GH is a not-for-profit alliance of more than 500 leading organizations in healthcare, life science research, and information technology. It builds technical standards, policy frameworks, and tools that expand the responsible, voluntary, and secure use of genomic and other related health data. Its Driver Projects are real-world initiatives that help build and implement GA4GH standards, tools, and frameworks. They give voice to the broader genomics community and ensure GA4GH products serve real needs.

The Biomedical Research Hub (BRH) is what is sometimes called a data fabric: a collection of multiple data commons, data repositories, and cloud-based computational tools for exploring and analyzing data from multiple projects and research communities. It is designed so that an organization can easily set up a data commons containing its own data, then decide who can access that data and how the commons maintains security with respect to the other data platforms in the fabric. It also supports research via federation, in which computational analyses run remotely where the data resides, rather than requiring a researcher to download or directly access the raw data.
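
To make the idea of federation concrete, here is a minimal sketch in Python of a federated average. It is an illustration only, not the BRH or Gen3 implementation: the site names, the measurements, and the functions `summarize_locally` and `federated_mean` are all hypothetical. Each data commons computes a small summary of its own data, and only those summaries are combined centrally.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LocalSummary:
    """Aggregate statistics a site shares in place of raw records."""
    count: int
    total: float


def summarize_locally(values: List[float]) -> LocalSummary:
    # Hypothetical step that runs inside a data commons;
    # the raw values are never returned to the caller.
    return LocalSummary(count=len(values), total=sum(values))


def federated_mean(summaries: List[LocalSummary]) -> float:
    # Hypothetical coordinator step; it sees only per-site aggregates.
    n = sum(s.count for s in summaries)
    if n == 0:
        raise ValueError("no observations reported by any site")
    return sum(s.total for s in summaries) / n


# Made-up sites and measurements, for illustration only.
commons = {
    "commons_a": [5.0, 7.0],
    "commons_b": [3.0],
    "commons_c": [9.0, 6.0, 6.0],
}

summaries = [summarize_locally(values) for values in commons.values()]
result = federated_mean(summaries)
print(result)
```

The coordinator arrives at the same mean it would get from pooling every record, yet it only ever handles one count and one sum per site. Real federated analyses add access control, auditing, and privacy safeguards that this sketch leaves out.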

The BRH was developed by the University of Chicago Center for Translational Data Science (CTDS), a hub for developing the discipline of data science and its applications to problems in biology, medicine, healthcare, and the environment. Led by Robert L. Grossman, PhD, Frederick H. Rawson Distinguished Service Professor in Medicine and Computer Science and the Jim and Karen Frank Director of the CTDS, the Center operates large-scale data platforms that support research with collaborators around the world and enable the use of machine learning and AI within those platforms.

The BRH is based upon the Gen3 Data Platform, an open-source platform developed by the CTDS for working with sensitive biomedical or healthcare data. It is one of the first large-scale data fabrics to span data from multiple National Institutes of Health (NIH) systems, connecting multiple Gen3 data commons and providing access to research data from more than 400,000 research participants.

“The Biomedical Research Hub is one of the largest and most diverse biomedical data fabrics and we are very happy that we have been selected as a GA4GH Driver Project,” Grossman said. “It will give us a great opportunity for collaborating on standards for data fabrics, ranging from standards that support federated machine learning across data commons from around the world to standards that make it easier to connect environmental sensors and other new sources of data to data commons to understand the interactions of the environment with health.”

Six other projects have joined GA4GH as new Driver Projects, including the Human Pangenome Project (HPP), Immunotherapy Centers of Research Excellence (imCORE®), the International Precision Child Health Partnership (IPCHiP), NHLBI BioData Catalyst® (BDC), the NIH Cloud Platform Interoperability (NCPI) effort, and the Repository of the International Fetal Genomics Consortium (RIFGC). Three more are expected to join in the coming weeks in the areas of cancer research and national infrastructure: the European Genomic Data Infrastructure (GDI), Qatar Genome Program of Qatar Foundation, and European Open Science Cloud for Cancer (EOSC4Cancer).

“Together, all GA4GH Driver Projects provide access to more than 3.2 million genomes and 15 petabytes of real data. That kind of data power will drive an enormous transformation in human health,” said Ewan Birney, Chair of GA4GH, and Deputy Director General of the European Molecular Biology Laboratory, in a press release from the group. “It’s fantastic to see the wide range of initiatives that have joined GA4GH as new Driver Projects and are committing to ramping up responsible genomic data use.”