| Example | Input | Output |
|---|---|---|
| Example 1 | ATAT TATA | ATAT TATA |
| Example 2 | AAAT AAAA | AAAW |
| Example 3 | AAAT TAAA | AAAT TAAA |
| Example 4 | TCCGCGGA TCCGTGGA TCCACGGA TCCGCGCA TCCGCGGG | TCCGCGCA TCCACGGA TCCGTGGA TCCGCGGR |

The utility requires a set of sequences of the same length.
The output is the most compressed representation of the input set of DNA sequences. This compaction differs from the usual probabilistic compaction of aligned DNA sequences, which is based on the position-specific frequency of bases. Because the IUPAC code generation implemented in YEASTRACT does not consider the probability of a base occurring at a particular position, it adds no information beyond the given sequences. Decompressing the generated code therefore yields exactly the original set of sequences, so no detail is lost. The result is usually a smaller set of strings with one or more ambiguous bases that describes, concisely but precisely, a set of similar or related DNA strings, such as TF binding sites.
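
The lossless property described above can be checked by expanding a compressed result back into plain sequences. The following is a minimal sketch, not the YEASTRACT implementation; the names `IUPAC`, `expand` and `expand_all` are hypothetical helpers introduced for illustration.

```python
from itertools import product

# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}

def expand(pattern):
    """Return the set of plain DNA strings described by one IUPAC pattern."""
    return {"".join(bases) for bases in product(*(IUPAC[c] for c in pattern))}

def expand_all(patterns):
    """Return the union of the expansions of several IUPAC patterns."""
    result = set()
    for pattern in patterns:
        result |= expand(pattern)
    return result

# Example 4 from this page: five input sequences, four output patterns.
original = {"TCCGCGGA", "TCCGTGGA", "TCCACGGA", "TCCGCGCA", "TCCGCGGG"}
compressed = ["TCCGCGCA", "TCCACGGA", "TCCGTGGA", "TCCGCGGR"]
print(expand_all(compressed) == original)  # True
```

Decompression returns exactly the five input sequences: `TCCGCGGR` stands for `TCCGCGGA` and `TCCGCGGG`, and the other three patterns stand for themselves.
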
IUPAC code generation is an adaptation, by Nuno Mendes and David Nunes [2], of the ESPRESSO tool for multiple-valued logic minimization by Richard Rudell and Alberto Sangiovanni-Vincentelli [1].
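
To give a feel for this kind of compaction, here is a deliberately simple greedy sketch. It is not ESPRESSO and not the YEASTRACT code: it only merges two patterns when they differ at exactly one position, which keeps the result lossless but does not guarantee a minimal result in general. The names `IUPAC`, `CODE_FOR`, `merge` and `compact` are hypothetical.

```python
# Base sets for each IUPAC code, and the reverse lookup from base set to code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}
CODE_FOR = {frozenset(bases): code for code, bases in IUPAC.items()}

def merge(p, q):
    """Merge two equal-length patterns that differ at exactly one position.

    Returns the merged pattern, or None if they differ at zero or several positions.
    """
    diff = [i for i, (a, b) in enumerate(zip(p, q)) if a != b]
    if len(diff) != 1:
        return None
    i = diff[0]
    union = frozenset(IUPAC[p[i]]) | frozenset(IUPAC[q[i]])
    return p[:i] + CODE_FOR[union] + p[i + 1:]

def compact(sequences):
    """Greedily merge patterns until no pair differs at exactly one position."""
    patterns = sorted(set(sequences))
    if len({len(p) for p in patterns}) > 1:
        raise ValueError("all sequences must have the same length")
    while True:
        # Find the first mergeable pair in sorted order, if any.
        pair = next(
            ((p, q) for i, p in enumerate(patterns) for q in patterns[i + 1:]
             if merge(p, q) is not None),
            None,
        )
        if pair is None:
            return patterns
        p, q = pair
        patterns = sorted((set(patterns) - {p, q}) | {merge(p, q)})

print(compact(["AAAT", "AAAA"]))  # ['AAAW']
print(compact(["AAAT", "TAAA"]))  # ['AAAT', 'TAAA']
```

On the five sequences of Example 4 this sketch also returns four patterns, but it may group them differently from YEASTRACT (for instance by merging `TCCACGGA` with `TCCGCGGA` rather than `TCCGCGGA` with `TCCGCGGG`), since several equally compact lossless answers exist.
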
The standard IUPAC nucleotide code is used to describe ambiguous sites in a DNA sequence motif, where a single character may represent more than one nucleotide. The code is shown in the table below.

| IUPAC Code | Meaning | Origin of Description |
|---|---|---|
| G | G | Guanine |
| A | A | Adenine |
| T | T | Thymine |
| C | C | Cytosine |
| R | G or A | puRine |
| Y | T or C | pYrimidine |
| M | A or C | aMino |
| K | G or T | Ketone |
| S | G or C | Strong interaction |
| W | A or T | Weak interaction |
| H | A or C or T | not-G, H follows G in the alphabet |
| B | G or T or C | not-A, B follows A in the alphabet |
| V | G or C or A | not-T (not-U), V follows U in the alphabet |
| D | G or A or T | not-C, D follows C in the alphabet |
| N | G or A or T or C | aNy |

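
One practical use of the code table is to turn an IUPAC motif into a regular expression for matching against plain DNA strings. The sketch below assumes upper-case input and uses a hypothetical helper named `iupac_to_regex`; it illustrates the table and is not part of YEASTRACT.

```python
import re

# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}

def iupac_to_regex(pattern):
    """Translate an IUPAC motif into an equivalent regular expression string."""
    parts = []
    for code in pattern:
        bases = IUPAC[code]
        # Unambiguous codes stay literal; ambiguous ones become a character class.
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)

motif = re.compile(iupac_to_regex("TCCGCGGR"))
print(motif.pattern)                      # TCCGCGG[AG]
print(bool(motif.fullmatch("TCCGCGGG")))  # True
print(bool(motif.fullmatch("TCCGCGGC")))  # False
```
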
[1] R. Rudell and A. L. Sangiovanni-Vincentelli. Multiple-Valued Minimization for PLA Optimization. IEEE Transactions on Computer-Aided Design, CAD-6:727-750, September 1987.
[2] N. Mendes and D. Nunes. Geração de Código IUPAC. INESC-ID Tec. Rep. 16/2004, July 2004.
[3] Biochem J. 229(2): 281–286, July 15, 1985.