gcrma

Perform GC Robust Multi-array Average (GCRMA) background adjustment, quantile normalization, and median-polish summarization on Affymetrix microarray probe-level data

Syntax

ExpressionMatrix = gcrma(PMMatrix, MMMatrix, ProbeIndices, AffinPM, AffinMM)
ExpressionMatrix = gcrma(PMMatrix, MMMatrix, ProbeIndices, SequenceMatrix)
ExpressionMatrix = gcrma(..., 'ChipIndex', ChipIndexValue, ...)
ExpressionMatrix = gcrma(..., 'OpticalCorr', OpticalCorrValue, ...)
ExpressionMatrix = gcrma(..., 'CorrConst', CorrConstValue, ...)
ExpressionMatrix = gcrma(..., 'Method', MethodValue, ...)
ExpressionMatrix = gcrma(..., 'TuningParam', TuningParamValue, ...)
ExpressionMatrix = gcrma(..., 'GSBCorr', GSBCorrValue, ...)
ExpressionMatrix = gcrma(..., 'Normalize', NormalizeValue, ...)
ExpressionMatrix = gcrma(..., 'Verbose', VerboseValue, ...)

Input Arguments

PMMatrix

Matrix of intensity values where each row corresponds to a perfect match (PM) probe and each column corresponds to an Affymetrix® CEL file. (Each CEL file is generated from a separate chip. All chips should be of the same type.)

Tip

You can use the PMIntensities matrix returned by the celintensityread function.

MMMatrix

Matrix of intensity values where each row corresponds to a mismatch (MM) probe and each column corresponds to an Affymetrix CEL file. (Each CEL file is generated from a separate chip. All chips should be of the same type.)

Tip

You can use the MMIntensities matrix returned by the celintensityread function.

ProbeIndices

Column vector containing probe indices. Probes within a probe set are numbered 0 through N - 1, where N is the number of probes in the probe set.

Tip

You can use the affyprobeseqread function to generate this column vector.
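
The numbering scheme described above can be illustrated with a small base-MATLAB sketch. The vector probeIdxSketch and the other variable names are hypothetical and serve only as illustration; real index vectors come from affyprobeseqread or from a data file.

```matlab
% Hypothetical index vector for three probe sets with 3, 2, and 4 probes.
probeIdxSketch = [0; 1; 2; 0; 1; 0; 1; 2; 3];

% A new probe set starts wherever the index resets to 0.
setStarts = find(probeIdxSketch == 0);
numSets = numel(setStarts);

% Number of probes in each probe set.
probesPerSet = diff([setStarts; numel(probeIdxSketch) + 1]);
```

Because each row of ExpressionMatrix corresponds to a probe set, the number of zeros in ProbeIndices should match the number of rows in the output.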

AffinPM

Column vector of PM probe affinities.

Tip

You can use the affyprobeaffinities function to generate this column vector.

AffinMM

Column vector of MM probe affinities.

Tip

You can use the affyprobeaffinities function to generate this column vector.

SequenceMatrix

An N-by-25 matrix of sequence information for the perfect match (PM) probes on the Affymetrix GeneChip® array, where N is the number of probes on the array. Each row corresponds to a probe, and each column corresponds to one of the 25 sequence positions. Nucleotides in the sequences are represented by one of the following integers:

  • 0 — None

  • 1 — A

  • 2 — C

  • 3 — G

  • 4 — T

Tip

You can use the affyprobeseqread function to generate this matrix. If you have this sequence information in letter representation, you can convert it to integer representation using the nt2int function.
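
The integer coding listed above can be reproduced in base MATLAB as a rough sketch. The two 25-mer sequences below are made up for illustration, and seqMatrixSketch is a hypothetical name. This sketch maps every character other than A, C, G, or T to 0, which may differ from how nt2int treats ambiguous nucleotide codes.

```matlab
% Hypothetical 25-mer probe sequences, one probe per row (illustrative only).
probeSeqs = ['ACGTACGTACGTACGTACGTACGTA'; ...
             'TTGCAATTGCAATTGCAATTGCAAT'];

% Map A, C, G, T to 1, 2, 3, 4; any other character maps to 0 (None).
[~, seqMatrixSketch] = ismember(upper(probeSeqs), 'ACGT');

% seqMatrixSketch is a 2-by-25 numeric matrix, one row per probe.
disp(size(seqMatrixSketch))
```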

ChipIndexValue

Positive integer specifying the column index in MMMatrix of the chip whose intensity data is used to compute probe affinities. Default is 1.

OpticalCorrValue

Controls the use of optical background correction on the PM and MM intensity values in PMMatrix and MMMatrix. Choices are true (default) or false.

CorrConstValue

Value that specifies the correlation constant, rho, for background intensity for each PM/MM probe pair. Choices are any value ≥ 0 and ≤ 1. Default is 0.7.

MethodValue

Character vector or string that specifies the method to estimate the signal. Choices are 'MLE', a faster, ad hoc Maximum Likelihood Estimate method, or 'EB', a slower, more formal, empirical Bayes method. Default is 'MLE'.

TuningParamValue

Value that specifies the tuning parameter used by the estimate method. This tuning parameter sets the lower bound of signal values with positive probability. Must be a positive value. Default is 5 (MLE) or 0.5 (EB).

Tip

For information on determining a setting for this parameter, see Wu et al., 2004.

GSBCorrValue

Specifies whether to perform gene-specific binding (GSB) correction using probe affinity data. Choices are true (default) or false. If there is no probe affinity information, this property is ignored.

NormalizeValue

Controls whether quantile normalization is performed on background adjusted data. Choices are true (default) or false.

VerboseValue

Controls the display of a progress report showing the number of each chip as it is completed. Choices are true (default) or false.

Output Arguments

ExpressionMatrix

Matrix of log2 expression values where each row corresponds to a gene (probe set) and each column corresponds to an Affymetrix CEL file, which represents a single chip.

Description

ExpressionMatrix = gcrma(PMMatrix, MMMatrix, ProbeIndices, AffinPM, AffinMM) performs GCRMA background adjustment, quantile normalization, and median-polish summarization on Affymetrix microarray probe-level data using probe affinity data. ExpressionMatrix is a matrix of log2 expression values where each row corresponds to a gene (probe set) and each column corresponds to an Affymetrix CEL file, which represents a single chip.

Note

There is no column in ExpressionMatrix that contains probe set or gene information.

ExpressionMatrix = gcrma(PMMatrix, MMMatrix, ProbeIndices, SequenceMatrix) performs GCRMA background adjustment, quantile normalization, and Robust Multi-array Average (RMA) summarization on Affymetrix microarray probe-level data using probe sequence data to compute probe affinity data. ExpressionMatrix is a matrix of log2 expression values where each row corresponds to a gene (probe set) and each column corresponds to an Affymetrix CEL file, which represents a single chip.

Note

If AffinPM and AffinMM affinity data and SequenceMatrix sequence data are not available, you can still use the gcrma function by entering an empty matrix for these inputs in the syntax.
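
As a minimal sketch of the note above, the call below passes an empty matrix in place of SequenceMatrix. It assumes the four-argument syntax is the one being used and that the prostatecancerrawdata MAT-file from the Examples section is available; expNoAffinity is a hypothetical variable name.

```matlab
% Load the example data set shipped with the toolbox.
load prostatecancerrawdata

% Pass an empty matrix where SequenceMatrix would go, for the case where
% neither affinity data nor sequence data is available.
expNoAffinity = gcrma(pmMatrix, mmMatrix, probeIndices, []);
```

With no affinity information, properties that depend on it, such as 'GSBCorr', are ignored as described for that property.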

ExpressionMatrix = gcrma(..., 'PropertyName', PropertyValue, ...) calls gcrma with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotes and is case insensitive. These property name/property value pairs are as follows:

ExpressionMatrix = gcrma(..., 'ChipIndex', ChipIndexValue, ...) computes probe affinities from MM probe intensity data from the chip with the specified column index in MMMatrix. Default ChipIndexValue is 1. If AffinPM and AffinMM affinity data are provided, this property is ignored.

ExpressionMatrix = gcrma(..., 'OpticalCorr', OpticalCorrValue, ...) controls the use of optical background correction on the PM and MM intensity values in PMMatrix and MMMatrix. Choices are true (default) or false.

ExpressionMatrix = gcrma(..., 'CorrConst', CorrConstValue, ...) specifies the correlation constant, rho, for background intensity for each PM/MM probe pair. Choices are any value ≥ 0 and ≤ 1. Default is 0.7.

ExpressionMatrix = gcrma(..., 'Method', MethodValue, ...) specifies the method to estimate the signal. Choices are 'MLE', a faster, ad hoc Maximum Likelihood Estimate method, or 'EB', a slower, more formal, empirical Bayes method. Default is 'MLE'.

ExpressionMatrix = gcrma(..., 'TuningParam', TuningParamValue, ...) specifies the tuning parameter used by the estimate method. This tuning parameter sets the lower bound of signal values with positive probability. Must be a positive value. Default is 5 (MLE) or 0.5 (EB).

Tip

For information on determining a setting for this parameter, see Wu et al., 2004.

ExpressionMatrix = gcrma(..., 'GSBCorr', GSBCorrValue, ...) specifies whether to perform gene-specific binding (GSB) correction using probe affinity data. Choices are true (default) or false. If there is no probe affinity information, this property is ignored.

ExpressionMatrix = gcrma(..., 'Normalize', NormalizeValue, ...) controls whether quantile normalization is performed on background adjusted data. Choices are true (default) or false.

ExpressionMatrix = gcrma(..., 'Verbose', VerboseValue, ...) controls the display of a progress report showing the number of each chip as it is completed. Choices are true (default) or false.
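
The property name/property value pairs described above can be combined in one call. The following is a sketch only: it assumes the prostatecancerrawdata MAT-file from the Examples section is available, and expEB is a hypothetical variable name. The values shown are the documented defaults for the 'EB' method, apart from 'Verbose'.

```matlab
% Load the example data set shipped with the toolbox.
load prostatecancerrawdata

% Use the empirical Bayes estimate with its documented default tuning
% parameter, the default correlation constant, and no progress report.
expEB = gcrma(pmMatrix, mmMatrix, probeIndices, seqMatrix, ...
              'Method', 'EB', 'TuningParam', 0.5, ...
              'CorrConst', 0.7, 'Verbose', false);
```

Because 'EB' is described as the slower method, this call may take noticeably longer than one using the default 'MLE'.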

Examples

  1. Load the MAT-file, included with the Bioinformatics Toolbox™ software, that contains Affymetrix data from a prostate cancer study. The variables in the MAT-file include seqMatrix, a matrix containing sequence information for PM probes; pmMatrix and mmMatrix, matrices containing PM and MM probe intensity values; and probeIndices, a column vector containing probe indexing information.

    load prostatecancerrawdata

  2. Compute the Affymetrix PM and MM probe affinities from their sequences and MM probe intensities.

    [apm, amm] = affyprobeaffinities(seqMatrix, mmMatrix(:,1),...
                 'ProbeIndices', probeIndices);

  3. Perform GCRMA background adjustment, quantile normalization, and Robust Multi-array Average (RMA) summarization on the Affymetrix microarray probe-level data and create a matrix of expression values.

    expdata = gcrma(pmMatrix, mmMatrix, probeIndices, seqMatrix);

The prostatecancerrawdata.mat file used in this example contains data from Best et al., 2005.
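
The example above passes seqMatrix to gcrma, so the affinities computed in step 2 are not used directly. A variation that supplies them through the first syntax might look like the following sketch, which repeats the loading and affinity steps so that it runs on its own; expdataAffin is a hypothetical variable name.

```matlab
% Load the example data set and compute probe affinities from the
% sequences and the MM intensities of the first chip.
load prostatecancerrawdata
[apm, amm] = affyprobeaffinities(seqMatrix, mmMatrix(:,1), ...
                 'ProbeIndices', probeIndices);

% Pass the precomputed affinities instead of the sequence matrix.
% 'ChipIndex' is ignored in this form because affinities are provided.
expdataAffin = gcrma(pmMatrix, mmMatrix, probeIndices, apm, amm);
```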

References

[1] Wu, Z., Irizarry, R.A., Gentleman, R., Murillo, F.M., and Spencer, F. (2004). A Model Based Background Adjustment for Oligonucleotide Expression Arrays. Journal of the American Statistical Association 99(468), 909–917.

[2] Wu, Z., and Irizarry, R.A. (2005). Stochastic Models Inspired by Hybridization Theory for Short Oligonucleotide Arrays. Proceedings of RECOMB 2004. J Comput Biol. 12(6), 882–93.

[3] Wu, Z., and Irizarry, R.A. (2005). A Statistical Framework for the Analysis of Microarray Probe-Level Data. Johns Hopkins University, Biostatistics Working Papers 73.

[4] Speed, T. (2006). Background models and GCRMA. Lecture 10, Statistics 246, University of California Berkeley.

[5] Best, C.J.M., Gillespie, J.W., Yi, Y., Chandramouli, G.V.R., Perlmutter, M.A., Gathright, Y., Erickson, H.S., Georgevich, L., Tangrea, M.A., Duray, P.H., Gonzalez, S., Velasco, A., Linehan, W.M., Matusik, R.J., Price, D.K., Figg, W.D., Emmert-Buck, M.R., and Chuaqui, R.F. (2005). Molecular alterations in primary prostate cancer after androgen ablation therapy. Clinical Cancer Research 11, 6823–6834.

Version History

Introduced in R2007a