Analysis of sequence variation underlying tissue-specific transcription factor binding and gene expression
Lower KM., De Gobbi M., Hughes JR., Derry CJ., Ayyub H., Sloane-Stanley JA., Vernimmen D., Garrick D., Gibbons RJ., Higgs DR.
Although mutations causing monogenic disorders most frequently lie within the affected gene, sequence variation in complex disorders is more commonly found in noncoding regions. Furthermore, recent genome-wide studies have shown that common DNA sequence variants in noncoding regions are associated with "normal" variation in gene expression resulting in cell-specific and/or allele-specific differences. The mechanism by which such sequence variation causes changes in gene expression is largely unknown. We have addressed this by studying natural variation in the binding of key transcription factors (TFs) in the well-defined, purified cell system of erythropoiesis. We have shown that common polymorphisms frequently directly perturb the binding sites of key TFs, and detailed analysis shows how this causes considerable (~10-fold) changes in expression from a single allele in a tissue-specific manner. We also show how a SNP, located at some distance from the recognized TF binding site, may affect the recruitment of a large multiprotein complex and alter the associated chromatin modification of the variant regulatory element. This study illustrates the principles by which common sequence variation may cause changes in tissue-specific gene expression, and suggests that such variation may underlie an individual's propensity to develop complex human genetic diseases. © 2013 WILEY PERIODICALS, INC.