Sahakyan Group – Integrative Computational Biology and Machine Learning
Combining computational biology, computational chemistry, and machine learning techniques with biological big data to unravel the higher genomic code of life.
About the Research
In the Sahakyan Group, we strive to make computational biology maximally independent from empirical experimental data by basing our models and predictions on genomic sequences and core biological mechanisms. We achieve this by devising specific machine learning methodologies (supervised and unsupervised) that are physics- and structure-“aware”, and we apply our ab initio modelling approaches to better understand gene regulation and mutations, and to spot driver DNA alterations in multigenic diseases (such as cancer, cardiomyopathies, and autism).
Besides the direct benefits, our approach also lets us understand the part of the biology that we cannot predict, i.e. the remainder: cell-specific factors not tightly interlinked with our genomic blueprint. This may help us better characterise the genome-invariant factors involved in cell differentiation.
We seek enthusiastic individuals to join us and pursue a DPhil degree. Applicants with interests in any or all of the genome, transcriptome, and proteome layers of information processing in life are welcome. We are particularly keen to decipher the higher genomic code of differential DNA damage susceptibility and repair efficiency, cross-mapping our conclusions against the known mutation sites involved in cancer and other multigenic diseases. The work will proceed in close collaboration with the group of Prof. Peter McHugh.
The post will particularly suit computationally inclined individuals with enthusiasm and passion for life sciences and computers, coming from diverse backgrounds (computer science, chemistry, physics, engineering, biology). For inquiries and additional details, please contact email@example.com.
Students will benefit from close supervision and a vibrant, multidisciplinary working environment. They will gain valuable knowledge and hands-on experience in machine learning, biological sequence analyses, advanced computational biology techniques, evolutionary data analyses and computer programming. They will be exposed to modern genomics technologies, combining in-house and public experimental datasets for advanced model development. Students’ work will be finalised in first-author publications, with further opportunities to present at local and international scientific conferences.
Students will be enrolled on the MRC WIMM DPhil Course, which takes place in the autumn of their first year. Running over several days, this course helps students to develop basic research and presentation skills, and introduces them to a wide range of scientific techniques and principles, ensuring that students have the opportunity to build a broad-based understanding of differing research methodologies.
Generic skills training is offered through the Medical Sciences Division's Skills Training Programme. This programme offers a comprehensive range of courses covering many important areas of researcher development: knowledge and intellectual abilities, personal effectiveness, research governance and organisation, and engagement, influence and impact. Students are actively encouraged to take advantage of the training opportunities available to them.
As well as the specific training detailed above, students will have access to a wide range of seminars and training opportunities through the many research institutes and centres based in Oxford.
All MRC WIMM graduate students are encouraged to participate in the successful mentoring scheme of the Radcliffe Department of Medicine, which is the host department of the MRC WIMM. This mentoring scheme provides an additional possible channel for personal and professional development outside the regular supervisory framework. The RDM also holds an Athena SWAN Silver Award in recognition of our efforts to build a happy and rewarding environment where all staff and students are supported to achieve their full potential.
Sahakyan et al., “Machine learning model for sequence-driven DNA G-quadruplex formation”, Sci. Rep., 7:14535, 2017.
Sahakyan et al., “G-quadruplex structures within the 3’ UTR of LINE-1 elements stimulate retrotransposition”, Nature Str. Mol. Biol., 24:243-247, 2017.
Sahakyan and Balasubramanian, “Single genome retrieval of context-dependent variability in mutation rates for human germline”, BMC Genomics, 18:81, 2017.
Kwok et al., “rG4-seq reveals widespread formation of G-quadruplex structures in the human transcriptome”, Nature Meth., 13:841-844, 2016.
Sahakyan and Balasubramanian, “Long genes and genes with multiple splice variants are enriched in pathways linked to cancer and other multigenic diseases”, BMC Genomics, 17:225, 2016.
Sahakyan et al., “Structure-based prediction of methyl chemical shifts in proteins”, J. Biomol. NMR, 50:331-346, 2011.