nt2aa - Convert nucleotide sequence to amino acid sequence

Syntax

SeqAA = nt2aa(SeqNT)

SeqAA = nt2aa(..., 'Frame', FrameValue, ...)
SeqAA = nt2aa(..., 'GeneticCode', GeneticCodeValue, ...)
SeqAA = nt2aa(..., 'AlternativeStartCodons', AlternativeStartCodonsValue, ...)

Arguments

SeqNT

One of the following:

Valid characters include:

  • A

  • C

  • G

  • T

  • U

  • hyphen (-)

      Note   A hyphen is valid only if the codon to which it belongs represents a gap, that is, the codon consists entirely of hyphens. Example: ACT---TGA

      Tip   Do not use a sequence with hyphens if you specify 'all' for FrameValue.

FrameValue

Integer or string specifying a reading frame in the nucleotide sequence. Choices are 1, 2, 3, or 'all'. Default is 1.

If FrameValue is 'all', then SeqAA is a 3-by-1 cell array.

GeneticCodeValue

Integer or string specifying a genetic code number or code name from the table Genetic Code. Default is 1 or 'Standard'.

    Tip   If you use a code name, you can truncate it to its first two letters.

AlternativeStartCodonsValue

Controls the translation of alternative start codons. Choices are true (default) or false.

Return Values

SeqAA

Amino acid sequence specified by a character string of single-letter codes.

Description

SeqAA = nt2aa(SeqNT) converts a nucleotide sequence, specified by SeqNT, to an amino acid sequence, returned in SeqAA, using the standard genetic code.
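A minimal sketch of the basic call, assuming Bioinformatics Toolbox is installed. Both input sequences are made up for illustration and are not taken from a real gene. The second one contains a gap codon, following the note on hyphens in the SeqNT argument. The comments give the codon-to-residue mapping from the table Standard Genetic Code.

```matlab
% Illustrative input (hypothetical sequence).
% GCT -> A, AAA -> K, TGG -> W, TAA -> stop (*)
seqNT = 'GCTAAATGGTAA';
seqAA = nt2aa(seqNT)

% A codon made up entirely of hyphens is treated as a gap.
% ACT -> T, --- -> gap (-), TGA -> stop (*)
gappedAA = nt2aa('ACT---TGA')
```

Neither sequence starts with a start codon, so the first residue is translated like any other codon.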

SeqAA = nt2aa(SeqNT, ...'PropertyName', PropertyValue, ...) calls nt2aa with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:


SeqAA = nt2aa(..., 'Frame', FrameValue, ...) converts a nucleotide sequence for a specific reading frame to an amino acid sequence. Choices are 1, 2, 3, or 'all'. Default is 1. If FrameValue is 'all', then output SeqAA is a 3-by-1 cell array.
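The 'Frame' property might be used as in the following sketch. The input sequence is hypothetical and chosen only for illustration. The example translates one frame on its own, then all three frames at once, and prints each result from the returned cell array.

```matlab
% Illustrative 12-nucleotide sequence (hypothetical, not a real gene).
seqNT = 'GCTAAATGGTAA';

% Translate a single reading frame (translation starts at position 2).
frame2 = nt2aa(seqNT, 'Frame', 2);

% Translate all three frames at once; the result is a 3-by-1 cell array.
allFrames = nt2aa(seqNT, 'Frame', 'all');

% Print the translation for each frame.
for k = 1:numel(allFrames)
    fprintf('Frame %d: %s\n', k, allFrames{k});
end
```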

SeqAA = nt2aa(..., 'GeneticCode', GeneticCodeValue, ...) specifies a genetic code to use when converting a nucleotide sequence to an amino acid sequence. GeneticCodeValue can be an integer or string specifying a code number or code name from the table Genetic Code. Default is 1 or 'Standard'. The amino acid to nucleotide codon mapping for the Standard genetic code is shown in the table Standard Genetic Code.
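To illustrate how the choice of genetic code changes the result, the sketch below translates one made-up sequence with the Standard code and with the Vertebrate Mitochondrial code, selected first by number and then by name. The sequence is hypothetical. The codon assignments in the comments are those of the two codes as published by NCBI.

```matlab
% Illustrative sequence: GCT, ATA, TGA (hypothetical, not a real gene).
seqNT = 'GCTATATGA';

% Standard code (default): GCT -> A, ATA -> I, TGA -> stop (*)
aaStandard = nt2aa(seqNT);

% Vertebrate Mitochondrial code, selected by number:
% GCT -> A, ATA -> M, TGA -> W
aaMitoByNumber = nt2aa(seqNT, 'GeneticCode', 2);

% The same code selected by name.
aaMitoByName = nt2aa(seqNT, 'GeneticCode', 'Vertebrate Mitochondrial');

% Both ways of selecting the code should give the same translation.
isequal(aaMitoByNumber, aaMitoByName)
```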

SeqAA = nt2aa(..., 'AlternativeStartCodons', AlternativeStartCodonsValue, ...) controls the translation of alternative start codons. By default, AlternativeStartCodonsValue is set to true, and if the first codon of a sequence is a known alternative start codon, the codon is translated to methionine.

If this option is set to false, then an alternative start codon at the start of a sequence is translated to its corresponding amino acid in the genetic code that you specify, which is not necessarily methionine. For example, in the human mitochondrial genetic code, AUA and AUU are known to be alternative start codons.

For more information about alternative start codons, see:

www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=t#SG1
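The effect of 'AlternativeStartCodons' can be sketched with a made-up sequence whose first codon is ATT (AUU in RNA notation), one of the alternative start codons named above for the human mitochondrial code. The sequence is hypothetical. The example assumes that nt2aa recognizes ATT as an alternative start codon for genetic code 2, as the description above suggests.

```matlab
% Illustrative sequence whose first codon is ATT (hypothetical input).
seqNT = 'ATTGCTGCT';

% Default (true): a recognized alternative start codon in the first
% position is translated to methionine (M).
aaWithAltStart = nt2aa(seqNT, 'GeneticCode', 2);

% false: the first codon is translated like any other codon in the
% selected genetic code (ATT -> I in the Vertebrate Mitochondrial code).
aaNoAltStart = nt2aa(seqNT, 'GeneticCode', 2, ...
    'AlternativeStartCodons', false);

% The two results are expected to differ only in the first residue.
disp(aaWithAltStart(1));
disp(aaNoAltStart(1));
```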

Genetic Code

Code Number  Code Name
1            Standard
2            Vertebrate Mitochondrial
3            Yeast Mitochondrial
4            Mold, Protozoan, Coelenterate Mitochondrial, and Mycoplasma/Spiroplasma
5            Invertebrate Mitochondrial
6            Ciliate, Dasycladacean, and Hexamita Nuclear
9            Echinoderm Mitochondrial
10           Euplotid Nuclear
11           Bacterial and Plant Plastid
12           Alternative Yeast Nuclear
13           Ascidian Mitochondrial
14           Flatworm Mitochondrial
15           Blepharisma Nuclear
16           Chlorophycean Mitochondrial
21           Trematode Mitochondrial
22           Scenedesmus Obliquus Mitochondrial
23           Thraustochytrium Mitochondrial

Standard Genetic Code

Amino Acid Name                                             Amino Acid Code  Nucleotide Codon
Alanine                                                     A                GCT GCC GCA GCG
Arginine                                                    R                CGT CGC CGA CGG AGA AGG
Asparagine                                                  N                AAT AAC
Aspartic acid (Aspartate)                                   D                GAT GAC
Cysteine                                                    C                TGT TGC
Glutamine                                                   Q                CAA CAG
Glutamic acid (Glutamate)                                   E                GAA GAG
Glycine                                                     G                GGT GGC GGA GGG
Histidine                                                   H                CAT CAC
Isoleucine                                                  I                ATT ATC ATA
Leucine                                                     L                TTA TTG CTT CTC CTA CTG
Lysine                                                      K                AAA AAG
Methionine                                                  M                ATG
Phenylalanine                                               F                TTT TTC
Proline                                                     P                CCT CCC CCA CCG
Serine                                                      S                TCT TCC TCA TCG AGT AGC
Threonine                                                   T                ACT ACC ACA ACG
Tryptophan                                                  W                TGG
Tyrosine                                                    Y                TAT TAC
Valine                                                      V                GTT GTC GTA GTG
Asparagine or Aspartic acid (Aspartate)                     B                Random codon from D and N
Glutamine or Glutamic acid (Glutamate)                      Z                Random codon from E and Q
Unknown amino acid (any amino acid)                         X                Random codon
Translation stop                                            *                TAA TAG TGA
Gap of indeterminate length                                 -                ---
Unknown character (any character or symbol not in table)    ?                ???
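The codon mappings for other genetic codes can be inspected with the geneticcode function listed under See Also. The sketch below assumes that geneticcode returns a structure with one field per codon (for example, a TGA field). It compares a few codons between the Standard and Vertebrate Mitochondrial codes.

```matlab
% Retrieve the codon-to-amino-acid maps (assumed to be structures with
% one field per codon).
stdMap  = geneticcode(1);   % Standard
mitoMap = geneticcode(2);   % Vertebrate Mitochondrial

% Compare how the two codes translate the same codons.
codons = {'ATA', 'TGA', 'AGA'};
for k = 1:numel(codons)
    c = codons{k};
    fprintf('%s: standard = %s, vertebrate mitochondrial = %s\n', ...
        c, stdMap.(c), mitoMap.(c));
end
```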

Examples

Converting the ND1 Gene

  1. Use the getgenbank function to retrieve the nucleotide sequence for the human mitochondrion from the GenBank database.

    mitochondria = getgenbank('NC_001807', 'SequenceOnly', true);
    
  2. Extract the sequence for the ND1 gene from the nucleotide sequence.

    ND1gene = mitochondria(3308:4261);
    
  3. Convert the ND1 gene on the human mitochondrial genome to an amino acid sequence using the Vertebrate Mitochondrial genetic code.

    protein1 = nt2aa(ND1gene,'GeneticCode', 2);
    
  4. Use the getgenpept function to retrieve the same amino acid sequence from the GenPept database.

    protein2 = getgenpept('NP_536843', 'SequenceOnly', true);
    
  5. Use the isequal function to compare the two amino acid sequences.

    isequal(protein1, protein2)
    
    ans =
    
         1

Converting the ND2 Gene

  1. Use the getgenbank function to retrieve the nucleotide sequence for the human mitochondrion from the GenBank database.

    mitochondria = getgenbank('NC_001807', 'SequenceOnly', true);
    
  2. Extract the sequence for the ND2 gene from the nucleotide sequence.

    ND2gene = mitochondria(4471:5511);
    
  3. Convert the ND2 gene on the human mitochondrial genome to an amino acid sequence using the Vertebrate Mitochondrial genetic code.

    protein1 = nt2aa(ND2gene,'GeneticCode', 2);
    

  4. Use the getgenpept function to retrieve the same amino acid sequence from the GenPept database.

    protein2 = getgenpept('NP_536844', 'SequenceOnly', true);
    
  5. Use the isequal function to compare the two amino acid sequences.

    isequal(protein1, protein2)
    
    ans =
    
         1

See Also

Bioinformatics Toolbox functions: aa2nt, aminolookup, baselookup, codonbias, dnds, dndsml, geneticcode, revgeneticcode, seqtool

  


 © 1984-2009 The MathWorks, Inc.