New statistical method identifies hidden gene programs linked to poor survival in aggressive pancreatic cancer

Researchers from UChicago built a statistical method to better understand the complexity of pancreatic cancer.

Pancreatic ductal adenocarcinoma (PDAC) is one of the deadliest forms of cancer, known for its aggressive behavior and resistance to treatment. Treating PDAC is challenging because the tumors are extraordinarily complex, with a chaotic mix of cells with different behaviors, vulnerabilities, and gene expression patterns.

In a study published in Nature Genetics, researchers from the University of Chicago developed a powerful statistical method called Generalized Binary Covariance Decomposition (GBCD) to better understand this complexity. This method enables researchers to analyze massive single-cell RNA sequencing datasets and uncover recurring patterns of gene activity across diverse patient tumors.

By applying GBCD to thousands of individual cancer cells from multiple studies, the team discovered previously hidden gene expression programs, particularly a stress-response signature that is linked to poor survival outcomes. This new method could be a powerful tool for identifying high-risk patients and developing more personalized treatment strategies.

Understanding tumors’ mixed make-up through variations in gene expression

Genetic and epigenetic alterations are major drivers of fatal diseases like cancer. These alterations impact cellular function by changing gene expression and leading to transcriptional heterogeneity, a phenomenon in which patterns of gene activity differ across cells. Every tumor contains a complicated mix of cell types and states. Some cells grow rapidly, others remain dormant; some respond to drugs, others are resistant to treatment.

“A big question in cancer is, how does gene expression predict or causally impact the way a tumor develops or how patients respond to therapy?” said senior author Matthew Stephens, PhD, FRS, Professor and Chair of the Department of Statistics and Professor of Human Genetics at UChicago. “If we understood how transcriptional variation predicts prognosis and therapy response, then we could improve therapies because transcriptional variation is relatively easy to measure.”

Sometimes it is not just a single gene, but a set of genes that exhibit coordinated transcriptional changes. These are known as gene expression programs, which can characterize cancer molecular subtypes, influence tumor progression, and affect therapy response.

Revealing layers within tumor complexity

Historically, most transcriptomic data came from bulk RNA sequencing, which measures gene expression in a group of cells rather than in individual cells. The limitation of bulk transcriptomic data is that it provides only an aggregate measurement at the patient level, which obscures differences between cells. In contrast, single-cell RNA sequencing (scRNA-seq) data measures gene expression at single-cell resolution, revealing transcriptional heterogeneity among cells from the same patient.
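To make the contrast between bulk and single-cell measurement concrete, here is a minimal toy sketch in Python. The expression values, patient labels, and function names are invented for illustration and do not come from the study. The sketch only shows how two samples with very different cell-level patterns can look the same once their cells are averaged together.

```python
from statistics import mean

# Hypothetical expression levels of one gene in individual cells (arbitrary units).
# Patient A: every cell expresses the gene at the same moderate level.
patient_a_cells = [5.0, 5.0, 5.0, 5.0]
# Patient B: half of the cells express the gene strongly, half not at all.
patient_b_cells = [10.0, 10.0, 0.0, 0.0]


def bulk_measurement(cells):
    """Mimic a bulk-style readout: one averaged value per sample."""
    return mean(cells)


def cell_spread(cells):
    """Mimic a single-cell view: the range of expression across cells."""
    return max(cells) - min(cells)


bulk_a = bulk_measurement(patient_a_cells)
bulk_b = bulk_measurement(patient_b_cells)
spread_a = cell_spread(patient_a_cells)
spread_b = cell_spread(patient_b_cells)

# The averaged values match even though the cell-level pictures differ.
print("bulk:", bulk_a, bulk_b)
print("spread across cells:", spread_a, spread_b)
```

In this toy case the averaged readout cannot tell the two patients apart, while the cell-level values show that patient B's tumor contains two distinct groups of cells.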

Modern technologies like scRNA-seq allow researchers to measure the activity of thousands of genes in individual cancer cells, offering much higher resolution of the transcriptional heterogeneity within tumors. However, the resulting massive datasets pose a challenge: distinguishing meaningful signals from noise.

In the current study, the team focused on analyzing transcriptional variation using scRNA-seq data. “Studying single-cell data allows us to see much more structure than what we can observe with bulk RNA-seq,” said Yusha Liu, PhD, a former postdoctoral researcher in the Stephens lab who led the study. Unified analysis of scRNA-seq data from multiple studies and patients can identify recurrent patterns of transcriptional variation related to cancer etiology, such as molecular subtypes. There may also be other biological activities or cellular processes occurring in tumor cells that aren’t directly tied to known molecular subtypes but still have significant implications for patient outcomes.

“Although tumor subtype identification is helpful in guiding treatment choices, the molecular landscape is very complex,” Stephens said. “We are trying to identify gene expression programs using Generalized Binary Covariance Decomposition (GBCD), a statistical method designed to analyze transcriptional heterogeneity in single-cell RNA data and detect patterns beyond known subtypes.” This method allows researchers to break down the complex variation in gene activity into distinct, interpretable components.

Stress response signature offers clues to pancreatic cancer survival

To test the new GBCD method, researchers analyzed 35,000 cancer cells from 59 patients with PDAC. GBCD not only identified the well-known “classical” and “basal” PDAC subtypes but also uncovered a previously underappreciated stress-response program strongly associated with poor survival.

They found that many critical genes in this stress-response program are regulated by ATF4, a key transcription factor involved in the “integrated stress response,” a survival mechanism exploited by cancer cells under harsh conditions. Importantly, this stress program predicted worse outcomes in cancer patients independent of tumor stage or known subtype. Identifying this signature could help explain PDAC’s aggressive nature and inform treatment strategies.

“The benefit of our approach is that we can perform a unified or integrative analysis of single-cell RNA-seq data from multiple samples across studies,” Liu said. “This greatly increases the number of patient samples and cells, enhancing our power to identify shared transcriptional patterns. The challenge, however, is the high degree of variability between patients, which can mask these subtle shared patterns, and that is what GBCD is designed to tackle.”

According to the researchers, the GBCD approach offers a deeper characterization of the transcriptional landscape in pancreatic cancer and could be applied to other cancer types. They hope the tool will advance understanding of cancer biology and etiology, while offering valuable insights into tumor progression and metastasis.

The study “Dissecting tumor transcriptional heterogeneity from single-cell RNA-seq data by generalized binary covariance decomposition” was supported by grants from the National Institutes of Health, the University of Chicago Medicine Comprehensive Cancer Center, the Leona M. and Harry B. Helmsley Charitable Trust, the Neuroendocrine Tumor Research Foundation, an Ullman Family Dream Team Award, and the Government of Ontario.

Additional authors included Scott A. Oakes and Kay F. Macleod from the University of Chicago, and Jason Willwerscheid from Providence College, RI.