| Quorum | Minimum percentage of genes that must contain the motifs in their promoters. |
| Family significance level cutoff | Maximum p-value a motif may have to be used when generating the families of motifs. |
| Substitutions | Maximum number of substitutions allowed for the corresponding box of the structured motif. |
| Minimum/maximum length | Minimum/maximum length of the corresponding box of the structured motif. |
| Add one box/remove last box | Click to add one box to, or remove the last box from, the structured motif. |
| Minimum/maximum spacer length | Minimum/maximum length of the spacer between consecutive boxes of the structured motif. |
Figure 4 shows how to add boxes to (and remove boxes from) the structured motif.
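To make the Quorum and Substitutions parameters concrete, the sketch below checks whether a single box reaches a given quorum in a set of promoter sequences. It is only an illustration of what the parameters mean, not the motif finder's actual search: the function names (`hamming`, `occurs`, `meets_quorum`) are hypothetical, only one box is handled, and spacers are ignored.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs(motif: str, promoter: str, max_subs: int) -> bool:
    """True if some window of the promoter matches the motif
    with at most max_subs substitutions."""
    k = len(motif)
    return any(
        hamming(motif, promoter[i:i + k]) <= max_subs
        for i in range(len(promoter) - k + 1)
    )


def meets_quorum(motif, promoters, max_subs, quorum_pct):
    """True if at least quorum_pct percent of the promoters contain the motif."""
    if not promoters:
        return False
    hits = sum(occurs(motif, p, max_subs) for p in promoters)
    return 100.0 * hits / len(promoters) >= quorum_pct


# Toy promoter set (made-up sequences, for illustration only).
promoters = ["ACGTACGT", "TTTTACGA", "GGGGGGGG", "CCACGTCC"]
print(meets_quorum("ACGT", promoters, max_subs=0, quorum_pct=50))
print(meets_quorum("ACGT", promoters, max_subs=1, quorum_pct=75))
```

Raising the number of allowed substitutions lets more promoters count as containing the box, so a motif can reach a higher quorum at the cost of specificity.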
An email will be sent to the given address once the algorithm finishes. This email contains information about the input parameters and a link to a web page, as shown in Figure 5.
A sample output page is shown in Figure 6, below. This page contains a link to the motif finder's output and a table listing the families of motifs. Each entry contains a logo depicting the PWM (Position Weight Matrix) of the family, the p-value of the root and a link to a file containing the list of motifs in the family and the PWM itself.
The families obtained are spread over several pages, which can be browsed using the links shown in Figure 7, below.
Each of the PWMs in this output file can be compared to the TFBS contained in the YEASTRACT database, by following the steps depicted in Figure 8.
The default metric is "Sum of the Squared Distances", and the input PWM can also be trimmed. Trimming removes the columns at the edges of the PWM that have an information content below the selected threshold.
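A minimal sketch of edge trimming follows. It assumes the PWM is a list of columns holding A, C, G, T probabilities and that information content is measured in bits against a uniform background (2 + Σ p·log2 p); the page does not state which formula the tool uses, and the names `information_content` and `trim_pwm` are hypothetical.

```python
import math


def information_content(column):
    """Information content in bits of one column of A, C, G, T probabilities,
    assuming a uniform background (an assumption, not stated in the source)."""
    return 2.0 + sum(p * math.log2(p) for p in column if p > 0)


def trim_pwm(pwm, threshold):
    """Drop columns from both edges while their information content
    is below the threshold; interior columns are always kept."""
    start, end = 0, len(pwm)
    while start < end and information_content(pwm[start]) < threshold:
        start += 1
    while end > start and information_content(pwm[end - 1]) < threshold:
        end -= 1
    return pwm[start:end]


uniform = [0.25, 0.25, 0.25, 0.25]
pwm = [uniform, [1, 0, 0, 0], uniform, [0, 0, 1, 0], uniform]
trimmed = trim_pwm(pwm, threshold=0.5)
print(len(trimmed))
```

Only the edges are trimmed: the uninformative column between the two conserved positions stays, because removing it would change the spacing of the motif.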
The comparison of the input PWM with the TFBS of the YEASTRACT database follows the procedure described in [2]. First, the TFBS of the YEASTRACT database are converted to PWMs, using the IUPAC rules and assuming equiprobability between the nucleotides that each symbol allows. A few examples are shown in Figure 9. The input PWM is then locally aligned (using the Smith-Waterman local alignment algorithm) with each of the TFBS PWMs, using the selected column distance metric to score the alignment. Four distance metrics were implemented, among them the sum of the squared distances and the average log-likelihood ratio.
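The conversion of an IUPAC consensus into a PWM can be sketched as follows. The mapping table is the standard IUPAC nucleotide code; the function name `iupac_to_pwm` and the column order A, C, G, T are choices made for this illustration, not taken from YEASTRACT.

```python
# Standard IUPAC nucleotide codes and the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def iupac_to_pwm(site):
    """Convert an IUPAC string to a list of columns [pA, pC, pG, pT],
    splitting probability equally among the bases each symbol allows."""
    pwm = []
    for symbol in site.upper():
        bases = IUPAC[symbol]
        pwm.append([1.0 / len(bases) if b in bases else 0.0 for b in "ACGT"])
    return pwm


for column in iupac_to_pwm("ARN"):
    print(column)
```

Here `R` (A or G) becomes a column with 0.5 for A and G, and `N` becomes a uniform column of 0.25.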
For the average log-likelihood ratio distance metric, the nucleotide background frequencies were corrected for the GC-content (38%) of S. cerevisiae promoter DNA.
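As an illustration of the background correction, the sketch below derives nucleotide background frequencies from a GC-content of 38% and plugs them into one common formulation of the average log-likelihood ratio. Several details are assumptions rather than facts from this page: both columns are weighted equally (the PWMs here hold frequencies, not counts), a small pseudocount avoids taking the logarithm of zero, and the names `background_from_gc` and `allr` are hypothetical.

```python
import math


def background_from_gc(gc_content):
    """Background frequencies in A, C, G, T order for a given GC-content."""
    at = (1.0 - gc_content) / 2.0
    gc = gc_content / 2.0
    return [at, gc, gc, at]


def allr(col_i, col_j, background, pseudo=0.01):
    """Average log-likelihood ratio of two frequency columns with equal weights.
    The pseudocount value is an arbitrary choice for this sketch."""
    def smooth(col):
        return [(f + pseudo * p) / (1.0 + pseudo) for f, p in zip(col, background)]

    fi, fj = smooth(col_i), smooth(col_j)
    term_ij = sum(b * math.log(a / p) for a, b, p in zip(fi, fj, background))
    term_ji = sum(a * math.log(b / p) for a, b, p in zip(fi, fj, background))
    return 0.5 * (term_ij + term_ji)


background = background_from_gc(0.38)  # roughly A = T = 0.31, C = G = 0.19
only_a = [1.0, 0.0, 0.0, 0.0]
only_c = [0.0, 1.0, 0.0, 0.0]
print(allr(only_a, only_a, background) > 0)  # identical columns score positively
print(allr(only_a, only_c, background) < 0)  # conflicting columns score negatively
```

With an AT-rich background, a conserved C or G is more surprising than a conserved A or T, so matching C or G columns contribute more to the score.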
Each of these metrics compares two columns, evaluating their similarity numerically and either favouring or penalizing their alignment. After the input PWM has been aligned with all the TFBS, the alignments are ordered by score, and the twenty top-scoring alignments are displayed in the results table.
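The scoring and ranking steps can be sketched with a small Smith-Waterman implementation over PWM columns. This is an illustration under stated assumptions, not the YEASTRACT implementation: the column score is taken as 1 minus the sum of squared distances (so identical columns score +1 and fully conflicting columns score -1), the linear gap penalty of 1.0 is arbitrary, and the names `column_score`, `local_alignment_score` and `top_matches` are hypothetical.

```python
def column_score(a, b):
    """Similarity of two columns: 1 minus the sum of squared distances."""
    return 1.0 - sum((x - y) ** 2 for x, y in zip(a, b))


def local_alignment_score(pwm_a, pwm_b, gap=1.0):
    """Best Smith-Waterman local alignment score between two PWMs."""
    rows, cols = len(pwm_a), len(pwm_b)
    h = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    best = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            h[i][j] = max(
                0.0,  # start a new local alignment
                h[i - 1][j - 1] + column_score(pwm_a[i - 1], pwm_b[j - 1]),
                h[i - 1][j] - gap,  # gap in pwm_b
                h[i][j - 1] - gap,  # gap in pwm_a
            )
            best = max(best, h[i][j])
    return best


def top_matches(query, database, n=20):
    """Score the query against every PWM and return the n best (name, score) pairs."""
    scored = [(local_alignment_score(query, pwm), name) for name, pwm in database.items()]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(name, score) for score, name in scored[:n]]


A, C, G, T = [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]
query = [A, C, G]
database = {"exact": [T, A, C, G, T], "partial": [A, C, T], "none": [T, T, T]}  # toy data
print(top_matches(query, database, n=2))
```

Because the alignment is local, the query can match a sub-region of a longer binding site, which is why the five-column entry still receives the full score of 3.0.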
An example of the output obtained when a PWM of the family of motifs is compared with the YEASTRACT TFBS is shown in Figure 10. The results table contains TFBS that were found to be similar to the input PWM. Each row contains the TFBS, the TF it belongs to, whether the input PWM aligns on the forward or the reverse strand of the TFBS and the alignment of the input PWM with the PWM of the TFBS. An example of one local alignment is shown in Figure 11.
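To check the reverse strand, a PWM can be reverse-complemented before alignment. The sketch below assumes columns are stored in A, C, G, T order, in which case reversing a column swaps A with T and C with G; the function name `reverse_complement_pwm` is hypothetical and the page does not describe how the tool performs this step.

```python
def reverse_complement_pwm(pwm):
    """Reverse the column order and complement each column.
    With A, C, G, T ordering, reversing a column swaps A<->T and C<->G."""
    return [list(reversed(column)) for column in reversed(pwm)]


forward = [[1, 0, 0, 0], [0, 1, 0, 0]]  # consensus "AC"
print(reverse_complement_pwm(forward))  # columns for consensus "GT"
```

Aligning both the forward PWM and its reverse complement against a TFBS, and keeping the better score, is one straightforward way to report the strand of the match.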
[1] Mendes ND, Casimiro AC, Santos PM, Sá-Correia I, Oliveira AL, Freitas AT (2006) MUSA: a parameter free algorithm for the identification of biologically significant motifs. Bioinformatics, 22: 2996-3002.
[2] Mahony S, Auron PE, Benos PV (2007) DNA familial binding profiles made easy: comparison of various motif alignment and clustering strategies. PLoS Comput Biol, 3(3): e61. doi:10.1371/journal.pcbi.0030061