ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

  • 1
    Electronic Resource
    Berkeley, Calif. : Berkeley Electronic Press (now: De Gruyter)
    Statistical applications in genetics and molecular biology 4.2005, 1, art1 
    ISSN: 1544-6115
    Source: Berkeley Electronic Press Academic Journals
    Topics: Biology
    Notes: Transcription factors and many other DNA-binding proteins recognize more than one specific sequence. Among sequences recognized by a given DNA-binding protein, different positions exhibit varying degrees of conservation. The reason is that base pairs that are more extensively contacted by the protein tend to be more conserved. This observation can be used in the discovery of transcription factor binding sites. Here we present a rigorous means to accomplish this. In particular, we constrain the order of the information (entropy) in the columns of the position specific weight matrix (PWM) which characterizes the motif being sought. We then show how to compute the maximum likelihood estimate of a PWM under such order restrictions. This computation is easily integrated with the EM algorithm or the Gibbs sampler to enhance performance in the search for motifs in unaligned sequences. We demonstrate our method on a well-known data set of binding sites of the transcription factor Crp in E. coli.
    Type of Medium: Electronic Resource
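As an illustration of the quantity the abstract above proposes to constrain, the following minimal sketch computes the per-column information content of a PWM and checks whether it follows a given order. It assumes a uniform background over A, C, G, T, so that information equals 2 bits minus the column entropy. The function names and the example matrix are hypothetical and not taken from the paper, and the sketch does not compute the order-restricted maximum likelihood estimate itself.

```python
import math


def column_information(column, eps=1e-12):
    """Information content in bits of one PWM column: 2 - entropy.

    Assumes a uniform background over the four bases (an assumption
    made for this sketch, not stated in the abstract).
    """
    entropy = -sum(p * math.log2(p) for p in column if p > eps)
    return 2.0 - entropy


def satisfies_order(pwm, order):
    """Return True if information is non-increasing along `order`.

    `order` is a list of column indices, most conserved first.
    """
    info = [column_information(col) for col in pwm]
    return all(info[a] >= info[b] for a, b in zip(order, order[1:]))


# Hypothetical 3-position PWM; each row holds P(A), P(C), P(G), P(T).
example_pwm = [
    [1.0, 0.0, 0.0, 0.0],      # fully conserved position
    [0.5, 0.5, 0.0, 0.0],      # partially conserved position
    [0.25, 0.25, 0.25, 0.25],  # uninformative position
]

print([column_information(col) for col in example_pwm])
print(satisfies_order(example_pwm, [0, 1, 2]))
```

A check of this kind could be used to verify that an estimated PWM respects a prior ordering of positions by how extensively the protein contacts them.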
  • 2
    Electronic Resource
    Berkeley, Calif. : Berkeley Electronic Press (now: De Gruyter)
    Statistical applications in genetics and molecular biology 2.2003, 1, art5 
    ISSN: 1544-6115
    Source: Berkeley Electronic Press Academic Journals
    Topics: Biology
    Notes: Identification of transcription factor binding sites (regulatory motifs) is a major interest in contemporary biology. We propose a new likelihood-based method, COMODE, for identifying structural motifs in DNA sequences. Commonly used methods (e.g. MEME, Gibbs motif sampler) model binding sites as families of sequences described by a position weight matrix (PWM) and identify PWMs that maximize the likelihood of observed sequence data under a simple multinomial mixture model. This model assumes that the positions of the PWM correspond to independent multinomial distributions with four cell probabilities. We address supervising the search for DNA binding sites using the information derived from structural characteristics of protein-DNA interactions. We extend the simple multinomial mixture model to a constrained multinomial mixture model by incorporating constraints on the information content profiles or on specific parameters of the motif PWMs. The parameters of this extended model are estimated by maximum likelihood using a nonlinear constrained optimization method. Likelihood-based cross-validation is used to select model parameters such as motif width and constraint type. The performance of COMODE is compared with existing motif detection methods on simulated data that incorporate real motif examples from Saccharomyces cerevisiae. The proposed method is especially effective when the motif of interest appears as a weak signal in the data. Some of the transcription factor binding data of Lee et al. (2002) were also analyzed using COMODE and biologically verified sites were identified.
    Type of Medium: Electronic Resource
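The simple multinomial mixture model mentioned in the abstract above can be sketched as follows: each fixed-width sequence segment is assumed to come either from the motif, with independent positions given by the PWM, or from a background base distribution. This is a minimal illustration of the unconstrained model only. The function names, the mixing weight `motif_weight`, and the example values are assumptions made for this sketch, and it is not the COMODE implementation.

```python
import math

BASES = "ACGT"


def site_probability(site, pwm):
    """Probability of `site` under a PWM with independent positions."""
    prob = 1.0
    for base, column in zip(site, pwm):
        prob *= column[BASES.index(base)]
    return prob


def background_probability(site, background):
    """Probability of `site` if every base is drawn from `background`."""
    prob = 1.0
    for base in site:
        prob *= background[BASES.index(base)]
    return prob


def mixture_log_likelihood(sites, pwm, background, motif_weight):
    """Log-likelihood of segments under a two-component mixture."""
    total = 0.0
    for site in sites:
        mixed = (motif_weight * site_probability(site, pwm)
                 + (1.0 - motif_weight) * background_probability(site, background))
        total += math.log(mixed)
    return total


# Hypothetical width-2 motif "AC" and a uniform background.
example_pwm = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
uniform = [0.25, 0.25, 0.25, 0.25]
print(mixture_log_likelihood(["AC", "GT"], example_pwm, uniform, 0.5))
```

A constrained variant of the kind the abstract describes would maximize this likelihood subject to restrictions on the PWM columns, which requires a constrained optimizer and is not shown here.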