The Global Alliance for Genomics and Health (GA4GH) has named the Biomedical Research Hub one of its ten newest Driver Projects, all of them genomic data initiatives with clinical connections. The collaborations will allow genomic data standards to make new inroads into medicine and biomedical research, including applying machine learning to data from diverse regions around the globe.
GA4GH is a not-for-profit alliance of more than 500 leading organizations in healthcare, life science research, and information technology. It builds technical standards, policy frameworks, and tools to expand the responsible, voluntary, and secure use of genomic and related health data. Its Driver Projects are real-world initiatives that help build and implement GA4GH standards, tools, and frameworks. They give voice to the broader genomics community and ensure GA4GH products serve real needs.
The Biomedical Research Hub (BRH) is what is sometimes called a data fabric: a set of data commons, data repositories, and cloud-based computational tools for exploring and analyzing data from multiple projects and research communities. It is designed so that different organizations can easily set up data commons containing their own data, then decide who can access that data and how the commons maintains security with other data platforms in the fabric. It also supports research via federation, in which computational analyses are run on data remotely rather than having a researcher download or directly access the raw data.
The BRH was developed by the University of Chicago Center for Translational Data Science (CTDS), a hub for developing the discipline of data science and its applications to problems in biology, medicine, healthcare, and the environment. Led by Robert L. Grossman, PhD, Frederick H. Rawson Distinguished Service Professor in Medicine and Computer Science and the Jim and Karen Frank Director of the CTDS, the Center operates large-scale data platforms that support research with collaborators around the world and enable the use of machine learning and AI on the data they hold.
The BRH is based on the Gen3 Data Platform, an open-source platform developed by the CTDS for working with sensitive biomedical or healthcare data. It is one of the first large-scale data fabrics to span data from multiple National Institutes of Health (NIH) systems, connecting multiple Gen3 data commons and providing access to research data from more than 400,000 research participants.
“The Biomedical Research Hub is one of the largest and most diverse biomedical data fabrics and we are very happy that we have been selected as a GA4GH Driver Project,” Grossman said. “It will give us a great opportunity for collaborating on standards for data fabrics, ranging from standards that support federated machine learning across data commons from around the world to standards that make it easier to connect environmental sensors and other new sources of data to data commons to understand the interactions of the environment with health.”