seqlogo
Display sequence logo for nucleotide or amino acid sequences
Syntax
seqlogo(Seqs)
seqlogo(Profile)
WgtMatrix = seqlogo(...)
[WgtMatrix, Handle] = seqlogo(...)
seqlogo(..., 'Displaylogo', DisplaylogoValue, ...)
seqlogo(..., 'Alphabet', AlphabetValue, ...)
seqlogo(..., 'Startat', StartatValue, ...)
seqlogo(..., 'Endat', EndatValue, ...)
seqlogo(..., 'SSCorrection', SSCorrectionValue, ...)
Input Arguments
Seqs | Set of pairwise or multiply aligned nucleotide or amino acid sequences, represented by any of the following:
|
Profile | Sequence profile distribution matrix with the frequency of nucleotides or amino acids for every column in the multiple alignment, such as returned by the seqprofile function. The size of the frequency distribution matrix is:
If gaps were included, |
DisplaylogoValue | Controls the display of a sequence logo. Choices are true (default) or false. |
AlphabetValue | Character vector or string specifying the type of sequence (nucleotide or amino acid). Choices are 'NT' (default) or 'AA'. |
StartatValue | Positive integer that specifies the starting position for the sequences in Seqs. Default is 1. |
EndatValue | Positive integer that specifies the ending position for the sequences in Seqs. Default is the maximum length of the sequences in Seqs. |
SSCorrectionValue | Controls the use of small sample correction in the estimation of the number of bits. Choices are true (default) or false. |
Output Arguments
WgtMatrix | Cell array containing the symbol list in Seqs or Profile and the weight matrix used to graphically display the sequence logo. |
Handle | Handle to the sequence logo figure. |
Description
seqlogo(Seqs) displays a sequence logo for Seqs, a set of aligned sequences. The logo graphically displays the sequence conservation at a particular position in the alignment of sequences, measured in bits. The maximum sequence conservation per site is log2(4) bits for nucleotide sequences and log2(20) bits for amino acid sequences. If the sequence conservation value is zero or negative, no logo is displayed in that position.
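The conservation measure described above can be sketched in plain MATLAB as the maximum possible entropy minus the observed entropy at each position. This is a minimal illustration only: the sequences and variable names are hypothetical, no small sample correction is applied, and seqlogo's internal computation may differ in detail.

```matlab
% Hypothetical aligned DNA sequences (illustrative data only).
seqs = ['ACGT'; 'ACGA'; 'ACTA'; 'AGTC'];
alphabet = 'ACGT';
numSeqs = size(seqs, 1);
numPos = size(seqs, 2);

% Relative frequency of each symbol at each alignment position.
freq = zeros(numel(alphabet), numPos);
for k = 1:numel(alphabet)
    freq(k, :) = sum(seqs == alphabet(k), 1) / numSeqs;
end

% Shannon entropy per position, treating 0*log2(0) as 0.
terms = freq .* log2(freq);
terms(freq == 0) = 0;
H = -sum(terms, 1);

% Conservation in bits: maximum entropy log2(4) minus observed entropy.
bits = log2(numel(alphabet)) - H;
disp(bits)
```

A fully conserved position (the first column here) reaches the maximum of log2(4) = 2 bits, while more variable positions score lower.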
seqlogo(Profile) displays a sequence logo for Profile, a sequence profile distribution matrix with the frequency of nucleotides or amino acids for every column in the multiple alignment, such as returned by the seqprofile function.
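As a hedged sketch of the profile-based call, the following builds a profile with seqprofile and passes it to seqlogo. It assumes Bioinformatics Toolbox is installed; the sequences are illustrative data, and the 'Alphabet' option is passed to seqprofile explicitly so that the input is treated as nucleotides.

```matlab
% Requires Bioinformatics Toolbox. Sequences are illustrative data only.
S = {'ATTATAGCAAACTA', ...
     'AACATGCCAAAGTA', ...
     'ATCATGCAAAAGGA'};

% Build a nucleotide frequency profile, one column per alignment position.
P = seqprofile(S, 'Alphabet', 'NT');

% Display the logo from the profile instead of the raw sequences.
seqlogo(P)
```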
Color Code for Nucleotides
Nucleotide | Color |
---|---|
A | Green |
C | Blue |
G | Yellow |
T, U | Red |
Other | Purple |
Color Code for Amino Acids
Amino Acid | Chemical Property | Color |
---|---|---|
G S T Y C Q N | Polar | Green |
A V L I P W F M | Hydrophobic | Orange |
D E | Acidic | Red |
K R H | Basic | Blue |
Other | — | Tan |
WgtMatrix = seqlogo(...) returns a cell array of unique symbols in the sequence Seqs or Profile, and the information weight matrix used to graphically display the logo.
[WgtMatrix, Handle] = seqlogo(...) returns a handle to the sequence logo figure.
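The following sketch shows one way to retrieve the weight matrix without drawing the figure. It assumes Bioinformatics Toolbox is installed and, as an assumption not stated on this page, that the symbol list is stored in the first cell of WgtMatrix and the weight matrix in the second. The sequences are illustrative data.

```matlab
% Requires Bioinformatics Toolbox. Sequences are illustrative data only.
S = {'ATTATAGCAAACTA', ...
     'AACATGCCAAAGTA', ...
     'ATCATGCAAAAGGA'};

% Compute the weight matrix but suppress the logo figure.
W = seqlogo(S, 'Displaylogo', false);

% Assumed cell layout: symbols first, weight matrix second.
symbols = W{1};
weights = W{2};
disp(symbols)
disp(size(weights))
```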
seqlogo(..., 'PropertyName', PropertyValue, ...) calls seqlogo with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:
seqlogo(..., 'Displaylogo', DisplaylogoValue, ...) controls the display of a sequence logo. Choices are true (default) or false.
seqlogo(..., 'Alphabet', AlphabetValue, ...) specifies the type of sequence (nucleotide or amino acid). Choices are 'NT' (default) or 'AA'.
Note
If you provide amino acid sequences to seqlogo, you must set Alphabet to 'AA'.
seqlogo(..., 'Startat', StartatValue, ...) specifies the starting position for the sequences in Seqs. Default starting position is 1.
seqlogo(..., 'Endat', EndatValue, ...) specifies the ending position for the sequences in Seqs. Default ending position is the maximum length of the sequences in Seqs.
seqlogo(..., 'SSCorrection', SSCorrectionValue, ...) controls the use of small sample correction in the estimation of the number of bits. Choices are true (default) or false.
Note
A simple calculation of bits tends to overestimate the conservation at a particular location. To compensate for this overestimation, when SSCorrection is set to true, a rough estimate is applied as an approximate correction. This correction works better when the number of sequences is greater than 50.
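To give a sense of scale for such a correction, the sketch below evaluates the approximate term e(n) = (s - 1) / (2 ln(2) n) associated with the sequence logo method of Schneider and Stephens, where s is the alphabet size and n the number of sequences. Whether seqlogo applies exactly this formula is an assumption; the sample sizes are hypothetical.

```matlab
% Approximate small sample correction term e(n) = (s - 1) / (2*ln(2)*n).
% Assumption: seqlogo's internal correction may not match this exactly.
s = 4;                     % alphabet size (nucleotides)
n = [5 10 50 100 500];     % hypothetical numbers of aligned sequences
e = (s - 1) ./ (2 * log(2) * n);

% A perfectly conserved position has log2(s) bits before correction.
corrected = log2(s) - e;
for k = 1:numel(n)
    fprintf('n = %3d  correction = %.4f bits  corrected max = %.4f bits\n', ...
        n(k), e(k), corrected(k));
end
```

The term shrinks as the number of sequences grows, so the correction matters most for small alignments.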
Examples
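The following is a minimal sketch of an amino acid logo restricted to a range of positions, using the 'Alphabet', 'Startat', and 'Endat' properties described above. It assumes Bioinformatics Toolbox is installed; the sequences are illustrative data.

```matlab
% Requires Bioinformatics Toolbox. Sequences are illustrative data only.
S2 = {'LSGGQRQRVAIARALAL', ...
      'LSGGEKQRVAIARALMN', ...
      'LSGGQIQRVLLARALAA', ...
      'LSGGERRRLEIACVLAL', ...
      'FSGGEKKKNELLQMLML', ...
      'LSGGERRRLEIACVLAL'};

% Amino acid input requires 'Alphabet' set to 'AA'.
% Only alignment positions 2 through 10 are shown in the logo.
seqlogo(S2, 'Alphabet', 'AA', 'Startat', 2, 'Endat', 10)
```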
References
[1] Schneider, T.D., and Stephens, R.M. (1990). Sequence Logos: A new way to display consensus sequences. Nucleic Acids Research 18, 6097–6100.
Version History
Introduced before R2006a