| Pattern | Matches |
|---|---|
| TATATAAG | TATATAAG |
| TATAWAAM | TATAAAAA, TATAAAAC, TATATAAA, TATATAAC |
| TATA[GC]AA[AT] | TATAGAAT, TATAGAAA, TATACAAT, TATACAAA |
This query requires a list of patterns to be searched for and, optionally, a list of genes. The patterns to be searched for must be at least four bases long.
If both a list of patterns and a list of genes are entered, the default is to search for the patterns in the promoter regions of those genes only. If, however, the user selects the Check for all Genes option, the promoter regions of all genes in the database are searched for the input patterns. This option is aimed at identifying genes that are subject to the same regulatory signaling.
It is also possible to allow for one or two nucleotide substitutions in
the input pattern, by selecting the corresponding value in the box
labelled Substitutions.
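As an illustration of what allowing substitutions means, the following minimal Python sketch slides the pattern along a sequence and keeps every window that differs from it in at most a given number of positions. It is not the YEASTRACT implementation; the function name `matches_with_substitutions` and the example promoter string are hypothetical.

```python
def matches_with_substitutions(sequence, pattern, max_substitutions=0):
    """Return (position, matched_text, substitutions) for every window of
    `sequence` that differs from `pattern` in at most `max_substitutions`
    positions. Insertions and deletions are not considered."""
    hits = []
    width = len(pattern)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Count the positions where the window and the pattern disagree.
        mismatches = sum(1 for a, b in zip(window, pattern) if a != b)
        if mismatches <= max_substitutions:
            hits.append((start, window, mismatches))
    return hits


# Hypothetical promoter fragment, used only for illustration.
promoter = "GGTATATAAGCCTATAAAAGTT"

print(matches_with_substitutions(promoter, "TATATAAG", 0))
print(matches_with_substitutions(promoter, "TATATAAG", 1))
```

With no substitutions only the exact occurrence is reported; allowing one substitution also reports TATAAAAG, which differs from the pattern at a single position.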
The search returns a list of the genes in whose promoters the patterns were found, including the number of occurrences in each promoter. Clicking an entry in the column named View displays the forward and reverse strands of the corresponding promoter, with the regions that matched the query highlighted. The patterns that matched the promoter sequences and their locations in the promoters are also displayed.
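The shape of such a result can be sketched in Python as follows. This is only an illustration under stated assumptions: the gene names and promoter sequences are made up, the reverse strand is taken to be the reverse complement of the forward strand, and positions on the reverse strand are counted from the start of that reverse complement. How the database itself stores promoters and reports coordinates is not described here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse strand, read 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]


def find_all(sequence, pattern):
    """Start positions of every exact occurrence, overlapping ones included."""
    positions = []
    start = sequence.find(pattern)
    while start != -1:
        positions.append(start)
        start = sequence.find(pattern, start + 1)
    return positions


def search_promoters(promoters, patterns):
    """Map each gene with at least one hit to a list of
    (pattern, strand, position) tuples; genes without hits are omitted."""
    results = {}
    for gene, forward in promoters.items():
        reverse = reverse_complement(forward)
        hits = []
        for pattern in patterns:
            hits += [(pattern, "forward", p) for p in find_all(forward, pattern)]
            hits += [(pattern, "reverse", p) for p in find_all(reverse, pattern)]
        if hits:
            results[gene] = hits
    return results


# Hypothetical genes and promoter fragments, for illustration only.
promoters = {
    "GENE_A": "CCTATATAAGGG",
    "GENE_B": "ACGTACGTACGT",
    "GENE_C": "GGCTTATATACC",
}

for gene, hits in search_promoters(promoters, ["TATATAAG"]).items():
    print(gene, len(hits), hits)
```

The number of occurrences per promoter is simply the length of each gene's hit list. Restricting the search to a user-supplied gene list would amount to filtering the `promoters` dictionary before calling `search_promoters`.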
Simple nucleotide sequences are strings that consist exclusively of the four characters that represent the DNA nucleotides: A, T, G and C. A search for a given simple nucleotide sequence only returns sequences that match the query string exactly.
Standard IUPAC Nucleotide code is used to describe ambiguous sites in a given DNA sequence motif, where a single character may represent more than one nucleotide. The code is shown in the table below.
| IUPAC Code | Meaning | Origin of Description |
|---|---|---|
| G | G | Guanine |
| A | A | Adenine |
| T | T | Thymine |
| C | C | Cytosine |
| R | G or A | puRine |
| Y | T or C | pYrimidine |
| M | A or C | aMino |
| K | G or T | Ketone |
| S | G or C | Strong interaction |
| W | A or T | Weak interaction |
| H | A or C or T | not-G, H follows G in the alphabet |
| B | G or T or C | not-A, B follows A in the alphabet |
| V | G or C or A | not-T (not-U), V follows U in the alphabet |
| D | G or A or T | not-C, D follows C in the alphabet |
| N | G or A or T or C | aNy |
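To make the code concrete, the sketch below transcribes the table into a Python dictionary and enumerates every plain nucleotide sequence that an IUPAC pattern stands for. The names `IUPAC` and `expand_iupac` are illustrative and not part of any YEASTRACT interface.

```python
from itertools import product

# One entry per row of the IUPAC table: code -> nucleotides it may represent.
IUPAC = {
    "G": "G", "A": "A", "T": "T", "C": "C",
    "R": "GA", "Y": "TC", "M": "AC", "K": "GT", "S": "GC", "W": "AT",
    "H": "ACT", "B": "GTC", "V": "GCA", "D": "GAT", "N": "GATC",
}


def expand_iupac(pattern):
    """List every simple nucleotide sequence matched by an IUPAC pattern.

    Raises KeyError if the pattern contains a character outside the code."""
    choices = [IUPAC[symbol] for symbol in pattern.upper()]
    return ["".join(combination) for combination in product(*choices)]


print(expand_iupac("TATAWAAM"))
```

For TATAWAAM, W expands to A or T and M to A or C, so the pattern stands for four sequences. The number of expansions grows quickly with the number of ambiguous positions, so enumeration like this is only practical for short patterns.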
A regular expression is a pattern containing characters and syntactic elements that matches a set of strings. The regular expression characters permitted in the searches for DNA motifs are those included in the IUPAC nucleotide code as well as the following syntactic element:
[] – Matches one of the characters contained in the brackets.
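One way to evaluate such patterns is to translate them into an ordinary regular expression. The sketch below does this with Python's `re` module, assuming that IUPAC codes may also appear inside brackets; the function name `pattern_to_regex` is hypothetical and the translation is not taken from the YEASTRACT source.

```python
import re

# IUPAC code -> nucleotides it may represent.
IUPAC = {
    "G": "G", "A": "A", "T": "T", "C": "C",
    "R": "GA", "Y": "TC", "M": "AC", "K": "GT", "S": "GC", "W": "AT",
    "H": "ACT", "B": "GTC", "V": "GCA", "D": "GAT", "N": "GATC",
}


def pattern_to_regex(pattern):
    """Compile a motif made of IUPAC codes and [] groups into a regex."""
    parts = []
    inside_brackets = False
    for symbol in pattern.upper():
        if symbol == "[":
            if inside_brackets:
                raise ValueError("nested '[' in pattern")
            inside_brackets = True
            parts.append("[")
        elif symbol == "]":
            if not inside_brackets:
                raise ValueError("']' without matching '['")
            inside_brackets = False
            parts.append("]")
        elif symbol in IUPAC:
            letters = IUPAC[symbol]
            # Inside brackets the letters join the enclosing character class;
            # outside, an ambiguous code becomes its own character class.
            if inside_brackets or len(letters) == 1:
                parts.append(letters)
            else:
                parts.append("[" + letters + "]")
        else:
            raise ValueError("character not permitted: " + repr(symbol))
    if inside_brackets:
        raise ValueError("'[' without matching ']'")
    return re.compile("".join(parts))


regex = pattern_to_regex("TATA[GC]AA[AT]")
for candidate in ["TATAGAAT", "TATACAAA", "TATATAAT"]:
    print(candidate, bool(regex.fullmatch(candidate)))
```

Note that `regex.finditer` reports non-overlapping matches only, so counting overlapping occurrences in a promoter would need a different scanning strategy.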
[1] Biochem J. 1985 July 15; 229(2): 281–286. (PubMed)