A new study published in Genome Medicine and led by the Koohy Group has developed a novel deep learning-based workflow to predict the immune targets of CD8+ T-cells.
The immune response is orchestrated by both innate and adaptive immune cells. Among these are T-cells, which are important for fighting diseases like cancer and infections. Finding the specific parts of proteins (called epitopes) that T-cells recognize is a challenging but important task.
T-cells protect against cancer and pathogens.
CD8+ T-cells, also known as cytotoxic T-cells, act like elite soldiers defending against infections and diseases. Their primary mission is to seek out and destroy cells that have been infected by viruses or bacteria, or that have become cancerous, by recognizing specific markers called antigens on the surface of those cells.
Precise antigen recognition is essential for several reasons. For example, it helps the body eliminate the source of an infection, preventing it from spreading further. It also enables the immune system to 'remember' the specific antigens associated with a particular pathogen, facilitating long-term immunity. If the same pathogen attacks again, CD8+ T-cells can quickly recognize it and mount a rapid defence, a concept utilized in vaccine development.
Identifying immune targets of CD8+ T-cells
In this study, led by Dr Chloe Hyun-Jung Lee and Professor Hashem Koohy (MRC HIU), the authors developed a robust deep learning-based workflow called TRAP (T-cell recognition potential of HLA-I presented peptides) to better detect the antigens recognized by cytotoxic T-cells.
Accurate epitope prediction has posed a significant challenge, given the limited experimental data, dissimilarities in antigen sequences and genetic factors that influence T-cell binding. The three-way peptide-MHC-TCR complexes that underpin T-cell recognition of antigens further complicate the development of predictive models. To address this, the authors used several novel strategies to make TRAP more accurate and reliable.
- They focused on specific positions within the peptide sequences that are involved in T-cell binding, to avoid bias caused by genetic factors such as the HLA molecules that present antigens on the surface of infected cells. This allowed the algorithm to be trained on a larger dataset and made it applicable to a broader population.
- They used advanced language models to encode peptide sequences, allowing them to consider the chemical and physical characteristics of amino acids in protein space.
- They used a convolutional neural network to identify recognition patterns for T-cells and also included information about the strength of binding between peptides and MHC molecules (a crucial aspect of T-cell recognition).
- They developed a mechanism to report low confidence predictions to avoid unreliable results.
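Two of the ideas in the list above, restricting the input to a subset of peptide positions and withholding a call when the model is unsure, can be illustrated with a minimal Python sketch. This is not TRAP's code: the function names, the choice of which positions to drop, and the 0.2 margin around a 0.5 cutoff are all assumptions made purely for illustration.

```python
def central_region(peptide: str, skip_start: int = 2, skip_end: int = 1) -> str:
    """Return the central residues of a peptide.

    Assumption: the first `skip_start` and last `skip_end` residues are
    treated as HLA-facing and dropped. The positions TRAP actually uses
    are not stated in this article.
    """
    if len(peptide) <= skip_start + skip_end:
        raise ValueError("peptide is too short to trim")
    return peptide[skip_start:len(peptide) - skip_end]


def label_prediction(score: float, margin: float = 0.2) -> str:
    """Turn a score in [0, 1] into a label, abstaining near the cutoff.

    Assumption: a fixed 0.5 cutoff with a symmetric margin. This is a
    hypothetical stand-in for a low-confidence mechanism, not TRAP's.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if abs(score - 0.5) < margin:
        return "low confidence"
    return "positive" if score > 0.5 else "negative"


# "ACDEFGHIK" is an arbitrary 9-mer, not a known epitope.
print(central_region("ACDEFGHIK"))   # DEFGHI
for score in (0.95, 0.55, 0.05):     # made-up scores
    print(score, label_prediction(score))
```

Abstaining near the cutoff trades coverage for reliability: fewer peptides receive a call, but the calls that are made sit further from the decision boundary.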
Overall, TRAP represents a significant advancement in predicting CD8+ T-cell epitopes, offering more accurate and robust results compared to previous algorithms. TRAP has been developed as a user-friendly web application and can be accessed at https://github.com/ChloeHJ/TRAP.
The study authors demonstrated the application of TRAP by identifying therapeutic targets against glioblastoma. Identifying cancer neoepitopes (epitopes within tumour cells) has been regarded as a ‘needle in a haystack’ problem, because an extremely small number of true epitopes must be found among many candidate peptides. In this scenario, the authors prioritized minimizing the loss of true epitopes from the candidate list over maximizing accuracy, accepting more false positives in exchange for losing fewer true positives. They demonstrated that TRAP not only outperformed existing algorithms but also allowed the candidate list to be tuned to minimize the loss of likely antigens.
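The trade-off described above, keeping as many true epitopes as possible at the price of a longer candidate list, amounts to choosing a score cutoff by recall rather than by accuracy. The sketch below shows one simple way to do that. The function names, scores and labels are invented for illustration and do not come from the study.

```python
import math


def threshold_for_recall(scores, labels, target_recall=0.95):
    """Highest cutoff that still keeps at least `target_recall` of the positives."""
    if not 0.0 < target_recall <= 1.0:
        raise ValueError("target_recall must be in (0, 1]")
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("need at least one positive example")
    # Number of positives that must score at or above the cutoff.
    needed = math.ceil(target_recall * len(positives))
    return positives[needed - 1]


def precision_recall(scores, labels, threshold):
    """Precision and recall when every peptide scoring >= threshold is kept."""
    kept = [y for s, y in zip(scores, labels) if s >= threshold]
    true_pos = sum(kept)
    total_pos = sum(labels)
    precision = true_pos / len(kept) if kept else 0.0
    recall = true_pos / total_pos if total_pos else 0.0
    return precision, recall


# Made-up model scores and ground-truth labels (1 = true epitope).
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 1, 0]

cutoff = threshold_for_recall(scores, labels, target_recall=1.0)
print(cutoff)                                    # 0.3
print(precision_recall(scores, labels, cutoff))  # (0.6, 1.0)
```

With these toy numbers, lowering the cutoff to 0.3 retains all three true epitopes but also admits two non-epitopes, which is the kind of exchange the passage describes.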
In a parallel study, the authors used TRAP to systematically evaluate the impact of mutations that give rise to variants of concern. Emerging pathogens such as coronaviruses have posed a significant threat in recent years, and new variants and pathogens are expected to emerge in the coming years. It is therefore critical to monitor variants of concern and assess their pandemic potential. TRAP has proven highly effective in evaluating the immune potential of all theoretical SARS-CoV-2 variants and identifying harmful mutations. An iterative process of refining the training data and model architecture and validating predictions will greatly facilitate efforts to mitigate the impact of another pandemic.
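One way to picture a systematic scan of theoretical variants is to enumerate every single amino-acid substitution of a peptide and pass each variant to a predictor. The sketch below covers only the enumeration step. It assumes that "theoretical variants" can be approximated by single substitutions, which this article does not specify, and it uses an arbitrary peptide rather than a real viral sequence.

```python
# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(peptide: str):
    """Yield (mutation_name, variant) for every single-residue substitution.

    Mutation names follow the common <original><position><new> style,
    with 1-based positions, e.g. "A1C".
    """
    for index, original in enumerate(peptide):
        for replacement in AMINO_ACIDS:
            if replacement != original:
                variant = peptide[:index] + replacement + peptide[index + 1:]
                yield f"{original}{index + 1}{replacement}", variant


# "ACDEFGHIK" is an arbitrary 9-mer used only to show the shape of the output.
variants = list(single_substitutions("ACDEFGHIK"))
print(len(variants))   # 171, i.e. 9 positions x 19 alternative residues
print(variants[0])     # ('A1C', 'CCDEFGHIK')
```

Each variant could then be scored by a model and compared with the score of the unmutated peptide to see which substitutions change the prediction.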
Dr Chloe Hyun-Jung Lee, the first author of the paper, said: “Importantly, this study can pave the way for a better understanding of the immune response and the improved detection of therapeutic targets for the development of new treatments and vaccines for cancer, autoimmune diseases, and infectious diseases.”
Professor Hashem Koohy said: “This study, underpinned by the innovative model TRAP, seeks to delve deeper into the complexities of the T-cell immune response. Despite the remarkable strides made in single-cell technologies and the revolutionary advancements in machine learning and artificial intelligence, fully unravelling the intricate rules governing T-cell recognition of antigens remains as a challenge. A pivotal move towards addressing this is predicting the antigens that will trigger an effective T-cell response and pinpointing the hallmarks of antigen immunogenicity. TRAP, a state-of-the-art deep learning workflow, stands out in its ability to predict CD8 T cell epitopes. Tools like TRAP are invaluable, not only for identifying cancer neoantigens and/or viral antigens – crucial for the design of more potent vaccines – in a context-aware manner, but also for shedding light on the source of T-cell autoreactivity in autoimmune and inflammatory diseases. I would like to thank the MRC for funding this study.”
Read the full paper here.