Bioinformatics Toolbox™
SeqAA = nt2aa(SeqNT)
SeqAA = nt2aa(..., 'Frame', FrameValue, ...)
SeqAA = nt2aa(..., 'GeneticCode', GeneticCodeValue, ...)
SeqAA = nt2aa(..., 'AlternativeStartCodons', AlternativeStartCodonsValue, ...)
| SeqNT | One of the following:
Valid characters include: |
| FrameValue | Integer or string specifying a reading frame in the nucleotide sequence. Choices are 1, 2, 3, or 'all'. Default is 1. If FrameValue is 'all', then SeqAA is a 3-by-1 cell array. |
| GeneticCodeValue | Integer or string specifying a genetic code number or code name from the table Genetic Code. Default is 1 or 'Standard'. |
| AlternativeStartCodonsValue | Controls the translation of alternative start codons. Choices are true (default) or false. |
| SeqAA | Amino acid sequence specified by a character string of single-letter codes. |
SeqAA = nt2aa(SeqNT) converts a nucleotide sequence, specified by SeqNT, to an amino acid sequence, returned in SeqAA, using the standard genetic code.
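As a minimal sketch of the basic call, the following translates a short made-up sequence. The sequence and the variable names seqNT and seqAA are illustrative assumptions, not taken from the toolbox. Each codon can be checked by hand against the table Standard Genetic Code.

```matlab
% Hypothetical 9-nucleotide sequence, chosen only for illustration.
seqNT = 'ATGGCTTAA';

% Translate in the default reading frame (1) with the Standard genetic code.
seqAA = nt2aa(seqNT);

% By the Standard Genetic Code table: ATG -> M, GCT -> A, TAA -> * (stop).
disp(seqAA)
```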
SeqAA = nt2aa(SeqNT, ...'PropertyName', PropertyValue, ...) calls nt2aa with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:
SeqAA = nt2aa(..., 'Frame', FrameValue, ...) converts a nucleotide sequence for a specific reading frame to an amino acid sequence. Choices are 1, 2, 3, or 'all'. Default is 1. If FrameValue is 'all', then output SeqAA is a 3-by-1 cell array.
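The 'Frame' property might be used as sketched below. The sequence and variable names are hypothetical. The comparison at the end assumes that the cells of the 'all' result are ordered by frame number, which this page does not state explicitly.

```matlab
% Hypothetical sequence, chosen only for illustration.
seqNT = 'ATGGCTTAAGCT';

% Translate starting at the second nucleotide.
frame2AA = nt2aa(seqNT, 'Frame', 2);

% Translate all three reading frames in one call.
% The result is a 3-by-1 cell array, one translation per cell.
allFramesAA = nt2aa(seqNT, 'Frame', 'all');
size(allFramesAA)

% Assumption: cells are ordered by frame, so cell 2 should match frame2AA.
isequal(allFramesAA{2}, frame2AA)
```

Requesting 'all' is convenient when the correct reading frame is not known in advance, since the three candidate translations can then be inspected side by side.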
SeqAA = nt2aa(..., 'GeneticCode', GeneticCodeValue, ...) specifies a genetic code to use when converting a nucleotide sequence to an amino acid sequence. GeneticCodeValue can be an integer or string specifying a code number or code name from the table Genetic Code. Default is 1 or 'Standard'. The amino acid to nucleotide codon mapping for the Standard genetic code is shown in the table Standard Genetic Code.
Tip: If you use a code name, you can truncate it to its first two letters.
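To illustrate how the choice of genetic code changes the result, here is a small sketch with a hypothetical sequence containing the codon TGA. The table Standard Genetic Code lists TGA as a stop codon. That the Vertebrate Mitochondrial code reads TGA as tryptophan (W) comes from the NCBI genetic code tables rather than from this page. Variable names are illustrative.

```matlab
% Hypothetical sequence containing TGA, chosen only for illustration.
seqNT = 'ATGTGAGCT';

% Standard code (default): TGA is a translation stop.
standardAA = nt2aa(seqNT);

% Vertebrate Mitochondrial code, selected by number and by name.
mitoByNumber = nt2aa(seqNT, 'GeneticCode', 2);
mitoByName   = nt2aa(seqNT, 'GeneticCode', 'Vertebrate Mitochondrial');

% Both ways of selecting the code should give the same translation.
isequal(mitoByNumber, mitoByName)

% The two codes are expected to differ at the second codon (TGA).
disp(standardAA)
disp(mitoByNumber)
```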
SeqAA = nt2aa(..., 'AlternativeStartCodons', AlternativeStartCodonsValue, ...) controls the translation of alternative start codons. By default, AlternativeStartCodonsValue is set to true, and if the first codon of a sequence is a known alternative start codon, the codon is translated to methionine.
If this option is set to false, then an alternative start codon at the start of a sequence is translated to its corresponding amino acid in the genetic code that you specify, which is not necessarily methionine. For example, in the human mitochondrial genetic code, AUA and AUU are known to be alternative start codons.
For more information about alternative start codons, see:
www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=t#SG1
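The effect of 'AlternativeStartCodons' can be sketched with a short hypothetical sequence that begins with ATT. The expected behavior follows the ND2 note on this page: under the Vertebrate Mitochondrial code, a leading ATT is translated to M by default and to I when the option is false. The sequence and variable names are assumptions made for illustration.

```matlab
% Hypothetical sequence that starts and ends with ATT.
seqNT = 'ATTGCTATT';

% Default (true): the leading ATT is treated as an alternative start codon.
withAlt = nt2aa(seqNT, 'GeneticCode', 2);

% false: the leading ATT is translated like any other ATT codon.
withoutAlt = nt2aa(seqNT, 'GeneticCode', 2, 'AlternativeStartCodons', false);

% Only the first residue should differ between the two translations.
disp(withAlt)
disp(withoutAlt)
```

Setting the option to false can be useful when the input is an internal fragment rather than the true start of a coding sequence, because the first codon then receives no special treatment.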
Genetic Code
| Code Number | Code Name |
|---|---|
| 1 | Standard |
| 2 | Vertebrate Mitochondrial |
| 3 | Yeast Mitochondrial |
| 4 | Mold, Protozoan, Coelenterate Mitochondrial, and Mycoplasma/Spiroplasma |
| 5 | Invertebrate Mitochondrial |
| 6 | Ciliate, Dasycladacean, and Hexamita Nuclear |
| 9 | Echinoderm Mitochondrial |
| 10 | Euplotid Nuclear |
| 11 | Bacterial and Plant Plastid |
| 12 | Alternative Yeast Nuclear |
| 13 | Ascidian Mitochondrial |
| 14 | Flatworm Mitochondrial |
| 15 | Blepharisma Nuclear |
| 16 | Chlorophycean Mitochondrial |
| 21 | Trematode Mitochondrial |
| 22 | Scenedesmus Obliquus Mitochondrial |
| 23 | Thraustochytrium Mitochondrial |
Standard Genetic Code
| Amino Acid Name | Amino Acid Code | Nucleotide Codon |
|---|---|---|
| Alanine | A | GCT GCC GCA GCG |
| Arginine | R | CGT CGC CGA CGG AGA AGG |
| Asparagine | N | AAT AAC |
| Aspartic acid (Aspartate) | D | GAT GAC |
| Cysteine | C | TGT TGC |
| Glutamine | Q | CAA CAG |
| Glutamic acid (Glutamate) | E | GAA GAG |
| Glycine | G | GGT GGC GGA GGG |
| Histidine | H | CAT CAC |
| Isoleucine | I | ATT ATC ATA |
| Leucine | L | TTA TTG CTT CTC CTA CTG |
| Lysine | K | AAA AAG |
| Methionine | M | ATG |
| Phenylalanine | F | TTT TTC |
| Proline | P | CCT CCC CCA CCG |
| Serine | S | TCT TCC TCA TCG AGT AGC |
| Threonine | T | ACT ACC ACA ACG |
| Tryptophan | W | TGG |
| Tyrosine | Y | TAT TAC |
| Valine | V | GTT GTC GTA GTG |
| Asparagine or Aspartic acid (Aspartate) | B | Random codon from D and N |
| Glutamine or Glutamic acid (Glutamate) | Z | Random codon from E and Q |
| Unknown amino acid (any amino acid) | X | Random codon |
| Translation stop | * | TAA TAG TGA |
| Gap of indeterminate length | - | --- |
| Unknown character (any character or symbol not in table) | ? | ??? |
Converting the ND1 Gene
Use the getgenbank function to retrieve the nucleotide sequence for the human mitochondrion from the GenBank database.
mitochondria = getgenbank('NC_001807', 'SequenceOnly', true);
Extract the sequence for the ND1 gene from the nucleotide sequence.
ND1gene = mitochondria(3308:4261);
Convert the ND1 gene in the human mitochondrial genome to an amino acid sequence using the Vertebrate Mitochondrial genetic code.
protein1 = nt2aa(ND1gene, 'GeneticCode', 2);
Use the getgenpept function to retrieve the same amino acid sequence from the GenPept database.
protein2 = getgenpept('NP_536843', 'SequenceOnly', true);
Use the isequal function to compare the two amino acid sequences.
isequal(protein1, protein2)
ans =
     1
Converting the ND2 Gene
Use the getgenbank function to retrieve the nucleotide sequence for the human mitochondrion from the GenBank database.
mitochondria = getgenbank('NC_001807', 'SequenceOnly', true);
Extract the sequence for the ND2 gene from the nucleotide sequence.
ND2gene = mitochondria(4471:5511);
Convert the ND2 gene in the human mitochondrial genome to an amino acid sequence using the Vertebrate Mitochondrial genetic code.
protein1 = nt2aa(ND2gene, 'GeneticCode', 2);
Note: In the ND2gene nucleotide sequence, the first codon is ATT, which is translated to M, while the subsequent ATT codons are translated to I. If you set 'AlternativeStartCodons' to false, then the first ATT codon is translated to I, the corresponding amino acid in the Vertebrate Mitochondrial genetic code.
Use the getgenpept function to retrieve the same amino acid sequence from the GenPept database.
protein2 = getgenpept('NP_536844', 'SequenceOnly', true);
Use the isequal function to compare the two amino acid sequences.
isequal(protein1, protein2)
ans =
     1
Bioinformatics Toolbox functions: aa2nt, aminolookup, baselookup, codonbias, dnds, dndsml, geneticcode, revgeneticcode, seqtool
© 1984-2009 The MathWorks, Inc.