Associate Professor of Genome Biology
Molecular biology and the biological sciences in general have undergone a technical revolution over the last decade, founded upon the ability to sequence and reconstruct an organism's genomic blueprint in its entirety. Subsequent technical advances, such as expression or tiled genomic microarrays and now high-throughput sequencing (HTS) technologies, allow us to investigate, on the scale of the whole genome, how and in what situations particular parts of that blueprint are actually used. Although the biological questions remain the same as those asked at the level of an individual gene or genomic locus, the methods used to generate, analyse and combine these whole-genome data types are different and require specialist approaches and skills.
One of the most fundamental questions in molecular biology is how specific parts of the genomic blueprint are used in specific situations, given that the same underlying genomic sequence is used whether a cell becomes a neuronal cell or a blood cell. The most basic expression of a genome's activity is the RNA it produces, or "expresses"; in the form of mRNA, this goes on to determine which proteins are produced in the cell. It has also become clear that RNA which does not produce protein (non-coding RNA, or ncRNA) has a vital and complex regulatory function within the genome.
One of the main research interests of the Genome Biology group is to study the processes that determine whether or not RNA is produced from a genomic locus as cells develop into red blood cells (erythropoiesis), and which factors determine the rate at which it is produced. We employ most of the current genome-wide methods to determine which parts of the genome are being transcribed into RNA (RNA-seq), investigating both the stable fraction (mRNA) and the raw output of the genome (nascent RNA). We correlate this transcriptional activity with changes in the distribution and chemical modifications of the nucleosomal proteins associated with genomic DNA (DNase-seq and ChIP-seq) and with the binding of regulatory proteins to the DNA (transcription factor ChIP-seq), in an effort to determine how these changes regulate RNA expression (Figures 1 and 2).
Although we use many existing methodologies, the group also develops novel assays where needed to address current deficiencies in our ability to assess genome behaviour. One of the most difficult problems when trying to understand gene regulation, whether on the scale of the genome or at individual genes, is to determine which regions of the genome control the expression pattern and level of any particular gene. To address this problem we developed Capture-C, a chromosome conformation capture (3C) method that allows us to interrogate the regulatory landscapes of hundreds of genes in a single experiment (Figure 3). We are now using the Capture-C method, in combination with our genomics and transcriptomics data, to link genes and regulatory elements en masse in the erythroid system.
We are at present trying to determine which parts of the surrounding genome are functionally required to regulate the transcription of a particular gene or transcript (cis-regulatory elements), and how the molecular events at these regions lead to the production of RNA at a remote gene promoter. This represents a fundamental gap in our current understanding of gene regulation, and filling it is a necessary step towards a complete understanding of this process.
Due to the size and complexity of the datasets produced, the group is heavily reliant on bioinformatics to analyse and correlate these data. It has extensive experience in using and developing these types of tools, both in its own right and through a strong collaboration with the Oxford Computational Biology Research Group (CBRG). The group is very collaborative in structure and works closely with other groups within the MHU department in particular and the WIMM as a whole. This structure allows observations derived from genome-wide data to be functionally tested in well-understood paradigms of gene regulation, such as the α globin locus, and facilitates the genome-wide analysis of concepts gained from the careful interrogation of these model loci.
NGseqBasic - a single-command UNIX tool for ATAC-seq, DNaseI-seq, Cut-and-Run, and ChIP-seq data mapping, high-resolution visualisation, and quality control
Telenius J. et al, (2018)
HoxC5 and miR-615-3p target newly evolved genomic regions to repress hTERT and inhibit tumorigenesis.
Yan T. et al, (2018), Nat Commun, 9
Robust detection of chromosomal interactions from small numbers of cells using low-input Capture-C.
Oudelaar AM. et al, (2017), Nucleic Acids Res, 45
Low-input Capture-C: A Chromosome Conformation Capture Assay to Analyze Chromatin Architecture in Small Numbers of Cells.
Oudelaar AM. et al, (2017), Bio Protoc, 7
Sasquatch: predicting the impact of regulatory SNPs on transcription factor binding from cell- and tissue-specific DNase footprints.
Schwessinger R. et al, (2017), Genome Res, 27, 1730 - 1742