Data Analysis and Deep Learning
Annice Najafi
Graduate Research Assistant
Texas A&M University
Houston, Texas, United States
Mohit Jolly
Assistant Professor
Indian Institute of Science, India
Jason George
Assistant Professor
Texas A&M University, United States
Abstract:
Breast cancer is the second most common cancer and the second leading cause of cancer-related morbidity in women in the US. Metastasis is the primary cause of breast cancer mortality and morbidity. The Epithelial-Mesenchymal Transition (EMT) and its reverse process, the Mesenchymal-Epithelial Transition (MET), are well-known contributors to cancer metastasis. Recent experimental advances have generated temporally resolved single-cell transcriptomic analyses of cancer, but computational methods for inferring EMT states and trajectories from single-cell RNA sequencing data are lacking. Here, we introduce our novel computational framework, Cell-line specific Optimization Method of EMT Trajectories (COMET) [1], and apply it to single-cell RNA sequencing data from breast cancer cell lines. COMET reveals the temporal dynamics and transition rates associated with EMT and MET in each cell line. In addition, COMET provides insights into the complexities of EMT, especially in less-studied cell lines. We show that COMET is a reliable method for predicting cancer treatment response through its association with EMT and MET status.
Introduction:
EMT is a dynamic cellular process underlying cancer metastasis. Previously thought of as a binary process, it is now well established that cells can stably exist in a hybrid state with features falling on a spectrum between the epithelial and mesenchymal states. The hybrid state is characterized by enhanced phenotypic plasticity and environmental adaptation and is often associated with enhanced tumor aggressiveness. Therefore, predicting EMT time-course trajectories from empirical data provides a more nuanced characterization of tumor metastasis and patient outcomes.
We applied COMET to time-course single-cell RNA sequencing data from the MCF10A cell line treated with TGFB and validated the inferred EMT states and trajectories by comparison with flow cytometry data. We then used the identified states to label several single-cell RNA sequencing datasets from breast cancer cell lines, and applied COMET to sequencing data from cell lines treated with therapeutic drugs. Lastly, we interrogated post-transcriptional modifications that co-occur with EMT, such as alternative polyadenylation, and identified several cases that may drive tumor progression and metastasis.
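To make the notion of EMT states and transition rates concrete, the following is a minimal illustrative sketch, not the COMET implementation. It assumes a hypothetical three-state linear chain (epithelial E, hybrid H, mesenchymal M) with constant transition rates, and integrates the population fractions over time. The function names and rate values are placeholders chosen for illustration only; they are not taken from [1].

```python
# Illustrative only: a three-state E <-> H <-> M population model with
# constant rates. The rates below are made-up placeholders, not values
# estimated by COMET.

def simulate_fractions(rates, initial, t_end, dt=0.01):
    """Forward-Euler integration of the state fractions (E, H, M)."""
    k_eh, k_he, k_hm, k_mh = rates
    e, h, m = initial
    steps = int(round(t_end / dt))
    for _ in range(steps):
        flow_eh = k_eh * e - k_he * h  # net flux from E to H
        flow_hm = k_hm * h - k_mh * m  # net flux from H to M
        e -= dt * flow_eh
        h += dt * (flow_eh - flow_hm)
        m += dt * flow_hm
    return e, h, m


def stationary_fractions(rates):
    """Long-time fractions of a linear chain, obtained from flux balance."""
    k_eh, k_he, k_hm, k_mh = rates
    w_e = 1.0
    w_h = w_e * k_eh / k_he  # balance between E and H
    w_m = w_h * k_hm / k_mh  # balance between H and M
    total = w_e + w_h + w_m
    return w_e / total, w_h / total, w_m / total


if __name__ == "__main__":
    rates = (0.5, 0.1, 0.3, 0.05)  # hypothetical (k_EH, k_HE, k_HM, k_MH)
    print(simulate_fractions(rates, (1.0, 0.0, 0.0), t_end=5.0))
    print(stationary_fractions(rates))
```

In this toy setting, larger forward rates (E to H, H to M) shift the population toward the mesenchymal state, while larger reverse rates correspond to MET. Fitting such rates to observed time-course state fractions is one way a trajectory could be inferred, under the assumptions stated above.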
Results:
COMET reliably identified EMT states from single-cell RNA sequencing data, consistent with the literature and with other well-known EMT metrics. We found that highly variable EMT genes are more likely to be shared within the group of metastatic cell lines or within the group of non-metastatic cell lines than between the two groups. In addition, our analysis revealed that drug-treated breast cancer cell lines that undergo MET are associated with better treatment response. Furthermore, our analysis indicated that post-transcriptional modifications, such as the alternative polyadenylation of S100A2, may co-occur with EMT and result in tumor proliferation.
Conclusions:
We have developed the first stochastic computational framework to infer and track the dynamics of EMT from single-cell RNA sequencing data. Here, we applied our method to data from drug-treated breast cancer cell lines and showed that MET is associated with treatment response. In addition, we tracked post-transcriptional processes and showed that EMT can contribute to tumor aggressiveness and metastasis through various non-genetic mechanisms.
[1] Najafi A, Jolly MK, George JT. Population Dynamics of EMT Elucidates the Timing and Distribution of Phenotypic Intra-tumoral Heterogeneity. iScience 2023, in press. doi: https://doi.org/10.1016/j.isci.2023.106964.