Inside each of the cells in your body is an entire instruction manual containing all the information required to build an entire human being. Yet it isn’t just the words in that manual that are important: you have to read the right chapters, and in the right order. To build one particular part of a human, sometimes the end of one paragraph will redirect you to a different part of the book – but how do cells get redirected to the right bit? Complicated interactions between different parts of the instruction manual (otherwise known as your DNA) underlie the fascinating complexity of the human body, but understanding when, where and how they occur remains a fundamental challenge in biological research. In this blog post, Marieke Oudelaar, a DPhil student in Jim Hughes’ lab at the WIMM, describes a new tool developed in the Hughes lab that promises to help decipher this complex code.
Our bodies are composed of trillions of cells. Each of these cells has its own specific job, such as converting food to energy, detecting light or fighting off infections. The functional units in our cells that perform these tasks are called proteins, and the instructions that are required for the production of these proteins are written in our DNA.
Because all the cells in our bodies are derived from the same fertilised egg, they all contain exactly the same DNA. But if our cells all contain the same instructions, then how can they each have such different jobs? How do they know whether they’re supposed to digest food, detect light or kill off bacteria?
We now know that different cell types use different parts of their DNA to make the precise set of proteins that are required to carry out their specific job. Or, in scientific terms: they ‘express’ different ‘genes’. A gene is a section of DNA that contains the instructions to make a specific protein, and expression is the process by which those instructions are read and turned into the protein itself.
But the regulation of gene expression is an extremely complex process, and at the moment we have only a limited understanding of how a cell knows when to express which genes, and so how it selectively produces the specific set of proteins that allows it to do its job.
This is partly because only 2% of all of our DNA (our genome) is actually made up of genes. The remaining 98% of our genome contains seemingly non-functional elements, and has therefore also been referred to as ‘junk’ DNA.
However, this term more likely reflects our difficulty in interpreting the complete sequence of the genome. Hidden in this ‘junk’ are sections of DNA that act as switches, which play a critical role in telling a cell which genes to turn on (and express, therefore producing proteins) and which to turn off (or repress, stopping those particular proteins from being manufactured). These switches are often located at a great distance from the gene itself – imagine your DNA as a piece of string, with the switch at one end and the gene at the other.
So how does this work? How can a switch influence whether a gene is on or off, if it’s located so far away from it? We know that our DNA strings are flexible and able to bend, allowing the switch to physically interact with the gene itself, telling it whether to switch on or off. But exactly how these switches work is still an open question.
To resolve this issue, our first challenge is to find where these switches are located in the genome. This is somewhat like trying to find a needle in a haystack, because our genome is incredibly big. If you were to put all the DNA strings in one of your cells end to end, together they would be about 2 metres long! To fit that into a space of about 0.01 mm (which is the size of an average cell nucleus, where our DNA is stored) requires some serious packaging.
This brings us to the second challenge. Even if we manage to locate all the switches in our genome, it’s hard to link them to the genes they regulate, because of the complex way the DNA is folded in our cells. Though we know the linear sequence of our DNA, we don’t know its three-dimensional structure, and thus it’s hard to predict which genes will be close enough to a particular switch to be able to form an interaction that influences the expression of the gene.
One strategy to resolve this is to use Chromosome Conformation Capture (3C) techniques to analyse the three-dimensional organisation of our genome. With 3C, it used to be possible only to either look at a large part of our genome in low resolution – like a very badly pixelated picture of a landscape – or to zoom in and get a sharp image of only a few trees in that landscape.
But in 2014, researchers in the WIMM – led by Prof Jim Hughes – developed Capture-C, which allows us to look in high resolution at the entire interacting landscape of hundreds of genes of interest in a single experiment. Dr James Davies has now further developed this technique into Next Generation Capture-C, published recently in Nature Methods, which has vastly improved sensitivity for detecting interactions in our DNA.
Next Generation Capture-C is able to pick up very rare and weak long-range interactions that span a huge distance in our DNA, and it can look at these interactions in very high resolution. It can see interactions that occur over a distance of more than 1.5 million base pairs (the letters in our DNA sequence) with a resolution of about 1000 base pairs. To put this into context, let’s imagine that our cell nuclei are not 0.01 mm but 1 metre in size. On this scale, Next Generation Capture-C is able to see interactions between segments in our DNA that are over 100 metres apart from each other with a precision of 7 cm!
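For readers who like to check the sums, here is a minimal sketch of how the two scaled-up figures relate to each other. It uses only the numbers quoted above, and assumes (our simplification, not a claim from the paper) that the scaled-up picture stretches every base pair by the same amount, so that distance and resolution scale in proportion.

```python
# Figures quoted in the post.
span_bp = 1_500_000      # longest interaction distance, in base pairs
resolution_bp = 1_000    # resolution, in base pairs
span_scaled_m = 100.0    # the same span in the scaled-up picture, in metres

# Assumption: the analogy is a simple linear stretch, so every base pair
# corresponds to the same length in the scaled-up picture.
metres_per_bp = span_scaled_m / span_bp

# The resolution in the scaled-up picture, converted to centimetres.
resolution_scaled_cm = resolution_bp * metres_per_bp * 100

print(f"{resolution_scaled_cm:.1f} cm")  # prints "6.7 cm"
```

In other words, 1000 base pairs is one 1500th of 1.5 million base pairs, and one 1500th of 100 metres is just under 7 cm.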
The high resolution and sensitivity that Next Generation Capture-C provides can help us to analyse the three-dimensional structure of our genome in much greater detail than ever before, and to study how this structure contributes to the regulation of gene expression.
And this isn’t just a cool technique to look at the shape of our DNA. It is thought that natural variation in DNA switches contributes greatly to our chances of getting certain diseases. With Next Generation Capture-C it is possible to identify the genes that are affected by the different DNA variants in these switches, and thus to learn more about the mechanism of the associated diseases and to identify new potential drug targets to treat these conditions.
Post edited by Bryony Graham, Jim Hughes and James Davies.