Ovarian cancer (OC) is the second most common cause of cancer-related death in women, with an estimated 239,000 new cases and 152,000 deaths each year 1, 2. The highest rate of OC is reported in Europe 1. An overwhelming majority of OC cases are diagnosed at a late or metastatic stage of the disease 3. The majority of patients undergo chemotherapy and surgical removal of the tumor, but the disease eventually relapses as a resistant tumor, with a 5-year survival rate of 30% 4. The immune system has been shown to play an important role in tumor development and progression 5, and immune infiltration has been proposed as one of the hallmarks of cancer 6. Recently, immunotherapy (PD-L1 inhibitors) has been emerging as an effective option for the treatment of advanced OC 7. Immunotherapy, which increases the number of tumor-infiltrating T cells and thereby helps clear tumor cells, is effective in OC because of the immuno-active nature of these tumors 8, 9, 10. Melanomas with high expression of immune-related genes have been shown to respond better to immunotherapy 11, 12. Even though immunotherapy has proved effective in cancer treatment, the response rate of patients is very low 13. Thus, it is important to identify the patients with high immunological activity for immunotherapy. The correlation between lymphocyte infiltration and patient survival has been well established in OC 14. The level of lymphocyte infiltration can be assessed by gene expression profiling of immune-related genes, which may prove useful in predicting patient prognosis. However, the molecular nature of the tumor-immune system interaction has been poorly exploited for its prognostic potential in OC. A number of studies have developed gene expression-based signatures for prognosis prediction and stratification of OC patients 15. Unfortunately, none of them uses the expression of immune-related genes to predict survival and immune infiltration, which could inform immunotherapy decisions.
In this work, we used two datasets with 682 patients to develop and validate an Immune Prognostic Signature (IPS) for OC. We used various statistical techniques to show the robustness of the IPS in the prognostication of OC patients. We also show that the five component genes of the IPS are differentially expressed in tumor compared to normal samples and are involved in immune modulation. Most importantly, we show that patients with high IPS have lower immune activity and higher activity of several oncogenic pathways.
Materials and methods
We used data from 682 patients for this study. Expression and clinical data for GSE9899 were downloaded using Oncomine and used as the training cohort 16. Clinical data from TCGA were downloaded using Oncomine, whereas expression data were downloaded from the TCGA website. TCGA patient data were used as the validation cohort. The training cohort has a similar age distribution (median age 59 years) and follow-up time (median follow-up time 28.5 months) to the validation cohort (median age 59 years, median follow-up time 29.4 months).
Cox regression analysis and development of IPS
Cox regression analysis was performed using the survival package in R (https://CRAN.R-project.org/package=survival). In the Cox analysis, we identified five genes significantly associated with survival. We then performed multivariate Cox regression analysis to obtain the regression coefficient of each gene. The Immune Prognostic Score was calculated for each patient using the following formula:
IPS = (0.231 × C1QTNF3 expression) + (0.272 × ALK expression) + (0.437 × ADA expression) + (-0.588 × CASP8 expression) + (-0.231 × C6 expression)
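The score is a weighted sum of the five expression values, which can be sketched in Python as follows. The coefficients are copied from the formula; the function name, the dictionary layout and the example expression values are hypothetical and only illustrate the arithmetic, not the pipeline actually used in the study.

```python
# Coefficients copied from the IPS formula in the text.
IPS_COEFFICIENTS = {
    "C1QTNF3": 0.231,
    "ALK": 0.272,
    "ADA": 0.437,
    "CASP8": -0.588,
    "C6": -0.231,
}


def immune_prognostic_score(expression):
    """Return the IPS for one patient from a gene -> expression mapping."""
    missing = [gene for gene in IPS_COEFFICIENTS if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    # Weighted sum of the five component genes.
    return sum(coef * expression[gene] for gene, coef in IPS_COEFFICIENTS.items())


# Hypothetical expression values, for illustration only.
example_patient = {"C1QTNF3": 1.0, "ALK": 1.0, "ADA": 1.0, "CASP8": 1.0, "C6": 1.0}
example_score = immune_prognostic_score(example_patient)
```

Because CASP8 and C6 carry negative coefficients, higher expression of these two genes lowers the score, while higher expression of C1QTNF3, ALK and ADA raises it.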
Kaplan-Meier and Multivariate analysis
Patients were divided at the median IPS to perform the Cox regression analysis in GraphPad Prism 7.0. Multivariate analysis was performed in SPSS version 25 using the forward conditional method. Stage and grade were used as categorical variables in the analysis.
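The median split and the Kaplan-Meier estimate named in this section's heading can be illustrated with a short pure-Python sketch. It is not the GraphPad Prism or SPSS procedure used in the study; the function names and the toy data are hypothetical, and assigning scores equal to the median to the low group is an assumption, since the text does not say how ties were handled.

```python
from statistics import median


def split_at_median(scores):
    """Label each score 'high' if above the cohort median, else 'low'.

    Assumption: scores equal to the median are placed in the 'low' group.
    """
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  -- follow-up time per patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Toy data, for illustration only.
groups = split_at_median([0.1, 0.5, 0.9, 1.3])
curve = kaplan_meier([5, 8, 12, 20], [1, 0, 1, 1])
```

In practice the curve would be estimated separately for the high and low IPS groups and the two curves compared.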
Differential expression analysis and GSEA analysis
We used the MiPanda software (http://www.mipanda.org) for the differential expression analysis of the five component genes. Q-values were obtained from the MiPanda database. Correlation coefficients were calculated using the CORREL function in Excel 2016. The correlation heat map was plotted in GraphPad Prism version 7.0. For GO analysis, we used the GOrilla database, and its output was used as input for the REVIGO software 17. The REVIGO output was plotted in R 18. For GSEA, differentially expressed genes were identified by performing a t-test between the high and low IPS groups. Significantly differentially expressed genes were then ranked and used for Preranked GSEA 19. Enrichment plots are shown.
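The construction of a ranked gene list for Preranked GSEA can be sketched as below. Several choices here are assumptions rather than the study's documented settings: Welch's form of the t-test, a hypothetical |t| cutoff standing in for a p-value filter, and ranking by the difference in mean log-scale expression (equivalent to a log fold change). Gene names and values are toy data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def preranked_list(high, low, t_cutoff=2.0):
    """Rank genes by mean difference (high minus low IPS group), largest first.

    high, low -- gene -> list of log-scale expression values per group
    t_cutoff  -- hypothetical |t| threshold standing in for a p-value filter
    """
    ranked = []
    for gene in high:
        t = welch_t(high[gene], low[gene])
        if abs(t) >= t_cutoff:
            ranked.append((gene, mean(high[gene]) - mean(low[gene])))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Toy log-scale expression values, for illustration only.
high_group = {"GENE_A": [5.0, 6.0, 7.0], "GENE_B": [2.0, 3.0, 4.0], "GENE_C": [4.0, 5.0, 6.0]}
low_group = {"GENE_A": [1.0, 2.0, 3.0], "GENE_B": [6.0, 7.0, 8.0], "GENE_C": [4.0, 5.5, 5.5]}
ranking = preranked_list(high_group, low_group)
```

Genes at the top of the list are higher in the high IPS group and genes at the bottom are higher in the low IPS group; a gene set concentrated at either end produces a positive or negative enrichment score, respectively.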
In this study, we included a total of 683 patient samples with a median age of 59 years (range 22-87). We used 279 patients with a median age of 58.5 years (range 22-80.5) from the GSE9899 dataset as the training cohort and 404 patients with a median age of 59 years (range 30-87) from the TCGA dataset as the validation cohort. The GSE9899 dataset had stage and grade data available, whereas the TCGA samples had only grade data available. The clinicopathological data of patients from both cohorts are given in Table 1. There was no significant difference in patient age between the two cohorts (Table 1).
Development of Immune prognostic signature (IPS)
First, to develop the IPS, we extracted the expression levels of immune genes (n=824) downloaded from http://www.innatedb.com. The expression data were then subjected to Cox regression analysis using the survival package in R to identify the immune-related genes whose expression correlated significantly with survival. We identified five immune-related genes with a highly significant correlation with survival. The expression of these five genes was then used to calculate a risk score, the Immune Prognostic Score (IPS), for each patient using the risk score formula (Materials and methods, Supplementary Table 1). The expression pattern of all five genes with increasing IPS is shown in the heatmap (Figure 1A). Samples were divided into high IPS and low IPS groups at the median. The pattern of IPS distribution is given in Figure 1B. Patients with high IPS showed significantly more deaths than low IPS patients (Fisher's exact test, p-value<0.01) (Figure 1C). Most importantly, high IPS patients showed significantly lower survival than low IPS patients (HR=2.75, CI=1.89-4.01, p-value<0.01) (Figure 1C). Further, we divided patients into three groups (33% of samples each) based on the IPS and showed the presence of an intermediate population whose survival differed significantly from that of both the high IPS and low IPS patients (p-value<0.01, Supplementary Figure 1).
Validation of prognostic ability of IPS in OC
We showed that the IPS is a significant predictor of survival in OC. Further, we validated the prognostic ability of the IPS in a cohort of 404 patient samples obtained from TCGA. We downloaded the RNA-seq and clinical data from the TCGA database and calculated the read counts and FPKM values. We then extracted the FPKM values for the five genes that constitute the IPS. The expression values were used to calculate the IPS in the validation cohort using the same formula as in the training cohort. Patients were divided into high IPS and low IPS groups at the median.
The expression pattern of these five genes is shown in the heat map (Figure 2A). The distribution of IPS in the validation cohort is shown in Figure 2B. Correlation of survival with IPS showed that patients belonging to the low IPS group tend to survive longer (Figure 2C), which was then validated in a Kaplan-Meier analysis (Figure 2C). KM analysis showed that patients with high IPS had significantly lower survival than patients with low IPS (HR=1.52, CI=1.16-1.98, p-value<0.01).
IPS is an independent predictor of survival
To show that the IPS is an independent predictor of survival, we performed univariate and multivariate Cox regression analysis in the training cohort with IPS and the clinical markers age, grade and stage as variables. In univariate analysis, only IPS and age were predictors of prognosis in OC patients (IPS- HR: 2.74, CI: 2.03-3.69, p-value<0.01; Age- HR: 1.03, CI: 1.01-1.05, p-value=0.02) (Figure 3A). In a multivariate analysis with IPS and age as variables, the IPS was an independent predictor of survival in the training cohort (IPS- HR: 2.63, CI: 1.95-3.55, p-value<0.01; Age- HR: 1.02, CI: 1.01-1.04, p-value=0.04) (Figure 3A and B). Further, to validate that the IPS is an independent predictor of survival in the validation cohort, we performed univariate and multivariate Cox regression analysis using age, grade and IPS as variables. In univariate analysis, we found that, as in the training cohort, only age and IPS were significant predictors of survival in the validation cohort. Multivariate analysis with age and IPS showed that both were independent predictors of survival in the validation cohort.
Component genes of IPS are differentially regulated and affect immune system
First, to check the regulation of the IPS component genes, we used the MiPanda expression database. Interestingly, we found that C1QTNF3, ADA, ALK and CASP8 are overexpressed in OC compared to normal, whereas C6 is downregulated in OC compared to normal.
The expression pattern of these genes suggests an important role in OC development and progression. Further, we found that the correlation among the expression of these genes was very poor (Figure 4B), suggesting that these genes contribute to prognosis prediction independently and are not co-regulated with each other. To understand the functions regulated by these genes, we identified the list of genes co-regulated with the IPS component genes. This gene list was then used to identify enriched GO terms using the GOrilla database. The output was then used as input for REVIGO analysis. REVIGO analysis identified immune system process as one of the top enriched pathways, suggesting a role of the IPS component genes in immune regulation. Other enriched pathways were response to other organism, microtubule-based process, anatomical structure development, vascular process in circulatory system, localization, regulation of biological quality, and transport. Pathways regulated by genes correlating with C1QTNF3, ALK, ADA, C6 and CASP8 are given in Supplementary Figures 1, 2, 3, 4 and 5, respectively.
Immune activity of the patients with high IPS
The previous analysis showed that patients with high IPS had significantly lower survival than patients with low IPS. To understand the molecular mechanism behind the different survival of these groups, we performed a differential expression analysis between the high and low IPS groups. The differentially expressed genes were then ranked in decreasing order of fold change and used as input for Preranked GSEA using C1 Hallmark as the gene set (Figure 5A). Interestingly, we found that patients with high IPS were significantly negatively enriched for interferon alpha and interferon gamma response and significantly positively enriched for epithelial mesenchymal transition, hypoxia and KRAS signalling (Figure 5A, B, C, D, E, F), suggesting the immune-inactive and highly oncogenic nature of high IPS patients.
Discussion
Ovarian cancer (OC) is the leading cause of gynaecological cancer-related deaths 1, 2. As with many other cancers, late detection of OC is the primary hurdle in therapy 20. Although advances in surgical technologies and chemotherapy have made a significant difference, the survival of OC patients has changed only moderately 21. The majority of women affected by OC show relapse of the disease, and more than 50% of these patients die within five years 20. Recently, immunotherapy has emerged as an important option in cancer treatment 10. Immunotherapy enhances the tumor-suppressive nature of the immune system. Immunotherapy has recently been approved by the FDA for the treatment of melanoma, non-small cell lung cancer, renal cell carcinoma, bladder cancer and classical Hodgkin lymphoma 10. The two most important factors responsible for immunotherapy response are 1) the availability of immune cells in the tumor environment and 2) the nature of the immune checkpoint pathway in antitumor immunity. Ovarian cancer patients are good candidates for immunotherapy because of the high activity of immune cells in the tumor milieu 8 1. However, no studies have been done to identify patients with poor prognosis based on immune system activity in the tumor. In this work, we used expression data available on GEO and TCGA to develop and validate an immune prognostic score (IPS). We identified five genes, C1QTNF3, ALK, ADA, C6 and CASP8, as prognostically important genes in OC. The IPS, a score generated from the expression of these genes, is an independent prognostic factor in the training and validation cohorts. Patients divided into low and high IPS groups show a significant difference in survival in both cohorts. A detailed in-silico study of the components of the IPS showed that all five genes are differentially expressed in OC compared to normal samples.
Further, we showed that there was no correlation among the five genes, suggesting that all the genes predict survival independently and were not selected because of co-regulation. GO terms associated with the co-regulated genes indicated, among others, the immune-related function of these genes. Interestingly, many of the five genes have already been shown to be involved in cancer. ALK is one of the most important genes in NSCLC 22. ALK is a tyrosine kinase and a driver genetic aberration in NSCLC 22. ALK inhibitors have been approved for the treatment of EML4-ALK fusion-positive tumors 22. ALK has also been proposed as an important target for ovarian cancer treatment 23. ADA is an enzyme involved in the hydrolytic deamination of adenosine to inosine. Various mutations of this gene have been associated with many different kinds of disease. Genetic variation in the ADA gene has been shown to be involved in uterine leiomyomas 24. The ADA gene has been proposed as an important prognostic marker in patients with malignant pleural effusion 25. The C6 protein is an important factor in the complement cascade and is involved in the membrane attack complex 26. The C6 genotype rs9200 has been shown to be associated with hepatocellular carcinoma recurrence 27. The C6 gene is also underexpressed in oesophageal carcinoma 28. CASP8 is a member of the caspase family and is involved in apoptosis. CASP8 has been shown to be involved in triggering and sensing DNA damage in liver cancer, a non-apoptotic function 29. C1QTNF3 is not well studied in the context of cancer, but it is an important gene in type II diabetes 30. Most importantly, in this study, we identified various pathways activated in the different IPS groups. GSEA using differentially expressed genes between the high and low IPS groups showed that the high IPS group has activated EMT, hypoxia and KRAS signalling pathways, which may explain why this group has more aggressive tumors with poor survival.
Also, we found that the immune-related pathways interferon alpha response and interferon gamma response are negatively enriched, indicating that this group of patients has less immune activity and may not be an ideal candidate for immunotherapy.
A limitation of our study is its retrospective nature. Although we included two completely different datasets as training and testing cohorts, further validation of the signature would be required to establish its usefulness. Also, we have not performed molecular biology laboratory work to show the function of the IPS component genes.
In summary, we have developed and validated a robust immune prognostic signature for the prognostication of ovarian tumors. We also showed that the component genes are associated with immune modulation. Importantly, we also showed that patients with poor survival also have poor immune activation in the tumor milieu.