Title: Discovering Gapped Binding Sites of Yeast Transcription Factors
The transcription of genes is mainly controlled
by interaction between transcription factors
(TFs) and their recognized binding sites (TFBSs).
Identifying TFBSs is challenging because they are
usually short and degenerate. A gapped
TFBS contains one or more highly degenerate
positions. Discovering gapped motifs is
difficult, because allowing highly degenerate
positions in a motif greatly enlarges the search
space and complicates the discovery process.
Here, we propose a new method for discovering
TFBSs, especially gapped motifs. Empirical tests
on 32 known yeast TFBSs show that the new method
is highly accurate in identifying gapped motifs,
outperforming current methods, and it also works
well on un-gapped motifs. Predictions on an
additional 54 TFs successfully discover 11 gapped
and 38 un-gapped motifs supported by the literature.
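
To give a rough sense of why degenerate positions enlarge the search space, the sketch below counts candidate patterns of a fixed width with and without a wildcard symbol N. The counting model and the function name `count_patterns` are illustrative assumptions; they do not describe the search space used by the method itself.

```python
def count_patterns(width: int, allow_wildcard: bool) -> int:
    """Count distinct patterns of a given width over A, C, G, T.

    Assumption for illustration: a degenerate position is modelled as a
    single extra wildcard symbol N, so the alphabet grows from 4 to 5.
    """
    alphabet_size = 5 if allow_wildcard else 4
    return alphabet_size ** width


# Compare the two pattern spaces for a few motif widths.
for width in (6, 10, 17):
    exact = count_patterns(width, allow_wildcard=False)
    gapped = count_patterns(width, allow_wildcard=True)
    print(f"width={width}: exact={exact}, with N={gapped}, ratio={gapped / exact:.1f}")
```

Even under this simplified model, the ratio between the two counts grows exponentially with motif width.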
Figure: The GAL4 motif contains CGG and CCG in
its two flanking regions, respectively, while the
positions in between are degenerate.
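
As a minimal sketch of what such a gapped motif looks like in practice, the following code scans a DNA string for CGG and CCG flanks separated by a fixed number of unconstrained positions. The gap length of 11 (the commonly cited CGG-N11-CCG consensus for GAL4) is an assumption not stated in the caption, and the function name and example sequence are hypothetical. This is plain pattern matching, not the discovery method proposed in the paper.

```python
import re

# Assumption: 11 degenerate positions between the flanks (CGG-N11-CCG);
# the caption above does not state the gap length.
GAP_LENGTH = 11

# A lookahead is used so that overlapping sites are also reported.
GAL4_PATTERN = re.compile(rf"(?=(CGG[ACGT]{{{GAP_LENGTH}}}CCG))")


def find_gapped_sites(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched site) for every match, including overlaps."""
    return [(m.start(), m.group(1)) for m in GAL4_PATTERN.finditer(sequence.upper())]


# Hypothetical sequence made up for illustration, not real yeast promoter data.
example = "TTACGGAGCACTCTCGTCCGAAT"
print(find_gapped_sites(example))
```

Because the reverse complement of CGG-N11-CCG has the same flanks, scanning one strand is sufficient for this particular pattern; a non-palindromic gapped motif would need both strands scanned.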
Related paper: Chien-Yu Chen, Huai-Kuang Tsai,
Chen-Ming Hsu, Mei-Ju May Chen, Hao-Geng Hung,
Grace Tzu-Wei Huang, and Wen-Hsiung Li,
Discovering Gapped Binding Sites of Yeast
Transcription Factors, Proceedings of the
National Academy of Sciences of the United States
of America, Vol. 105 (7), pp. 2527-2532, 2008.