chi2gof

Chi-square goodness-of-fit test

Syntax

h = chi2gof(x)

h = chi2gof(x,Name,Value)

[h,p] = chi2gof(___)

[h,p,stats] = chi2gof(___)

Description

h = chi2gof(x) returns a test decision for the null hypothesis that the data in vector x comes from a normal distribution with a mean and variance estimated from x, using the chi-square goodness-of-fit test. The alternative hypothesis is that the data does not come from such a distribution. The result h is 1 if the test rejects the null hypothesis at the 5% significance level, and 0 otherwise.

h = chi2gof(x,Name,Value) returns a test decision for the chi-square goodness-of-fit test with additional options specified by one or more name-value pair arguments. For example, you can test for a distribution other than normal, or change the significance level of the test.

[h,p] = chi2gof(___) also returns the p-value p of the hypothesis test, using any of the input arguments from the previous syntaxes.

[h,p,stats] = chi2gof(___) also returns the structure stats, containing information about the test statistic.
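
The three outputs can be inspected together. The following is a minimal sketch, not part of the original examples: it reuses the standard normal sample from the examples below and assumes that the test rejects whenever p does not exceed the significance level. The variable names decisionMatchesP and hasAllFields are illustrative.

```matlab
rng default;  % for reproducibility
pd = makedist('Normal');
x = random(pd,100,1);

% Request the decision, the p-value, and the stats structure together
[h,p,stats] = chi2gof(x);

% Assumption: h is 1 exactly when p is at or below the 5% level
decisionMatchesP = (h == (p <= 0.05));

% Fields that the Poisson example on this page displays for stats
hasAllFields = all(isfield(stats,{'chi2stat','df','edges','O','E'}));
```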

Examples

Test for Normal Distribution

Create a standard normal probability distribution object. Generate a data vector x using random numbers from the distribution.

pd = makedist('Normal');
rng default;  % for reproducibility
x = random(pd,100,1);

Test the null hypothesis that the data in x comes from a population with a normal distribution.

h = chi2gof(x)

h = 0

The returned value h = 0 indicates that chi2gof does not reject the null hypothesis at the default 5% significance level.

Test Hypothesis at Different Significance Level

Create a standard normal probability distribution object. Generate a data vector x using random numbers from the distribution.

pd = makedist('Normal');
rng default;  % for reproducibility
x = random(pd,100,1);

Test the null hypothesis that the data in x comes from a population with a normal distribution at the 1% significance level.

[h,p] = chi2gof(x,'Alpha',0.01)

h = 0

p = 0.3775

The returned value h = 0 indicates that chi2gof does not reject the null hypothesis at the 1% significance level.

Test for Weibull Distribution Using Probability Distribution Object

Load the light bulb lifetime sample data.

load lightbulb

Create a vector from the first column of the data matrix, which contains the lifetime in hours of the light bulbs.

x = lightbulb(:,1);

Test the null hypothesis that the data in x comes from a population with a Weibull distribution. Use fitdist to create a probability distribution object with A and B parameters estimated from the data.

pd = fitdist(x,'Weibull');
h = chi2gof(x,'CDF',pd)

h = 1

The returned value h = 1 indicates that chi2gof rejects the null hypothesis at the default 5% significance level.

Test for Poisson Distribution

Create six bins, numbered 0 through 5, to use for data pooling.

bins = 0:5;

Create a vector containing the observed counts for each bin and compute the total number of observations.

obsCounts = [6 16 10 12 4 2];
n = sum(obsCounts);

Fit a Poisson probability distribution object to the data and compute the expected count for each bin. Use the transpose operator .' to convert bins and obsCounts from row vectors to column vectors.

pd = fitdist(bins.','Poisson','Frequency',obsCounts.');
expCounts = n * pdf(pd,bins);

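Test the null hypothesis that the data in obsCounts comes from a Poisson distribution with a lambda parameter equal to the value estimated by fitdist and stored in pd. Set 'NParams' to 1 because one parameter (lambda) was estimated from the data.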

[h,p,st] = chi2gof(bins,'Ctrs',bins,...
                        'Frequency',obsCounts, ...
                        'Expected',expCounts,...
                        'NParams',1)

h = 0

p = 0.4654

st = struct with fields:
    chi2stat: 2.5550
          df: 3
       edges: [-0.5000 0.5000 1.5000 2.5000 3.5000 5.5000]
           O: [6 16 10 12 6]
           E: [7.0429 13.8041 13.5280 8.8383 6.0284]

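The returned value h = 0 indicates that chi2gof does not reject the null hypothesis at the default 5% significance level. The vector E contains the expected counts for each bin under the null hypothesis, and O contains the observed counts for each bin. Although six bins were supplied, the output has five: chi2gof pooled the last two bins (observed counts 4 and 2) into a single bin with edges 3.5 and 5.5 so that the expected count in every bin is at least 5. With five bins and one estimated parameter, the test has 5 - 1 - 1 = 3 degrees of freedom.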
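
The reported statistic and p-value can be checked by hand from O and E. The sketch below is illustrative rather than a description of how chi2gof is implemented: it copies the rounded values printed above, applies the usual Pearson formula, and uses chi2cdf with the 'upper' option for the tail probability. The names chi2manual, dfManual, and pManual are hypothetical.

```matlab
% Observed and expected counts copied from the stats output above (rounded)
O = [6 16 10 12 6];
E = [7.0429 13.8041 13.5280 8.8383 6.0284];

% Pearson chi-square statistic: sum over bins of (O - E)^2 / E
chi2manual = sum((O - E).^2 ./ E);

% Degrees of freedom: bins - 1 - number of estimated parameters
dfManual = numel(O) - 1 - 1;

% Upper-tail probability of the chi-square distribution
pManual = chi2cdf(chi2manual,dfManual,'upper');
```

Up to rounding of the printed counts, chi2manual, dfManual, and pManual agree with the chi2stat, df, and p values shown above.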

Test for Normal Distribution Using Function Handle

Use the cumulative distribution function normcdf as a function handle in the chi-square goodness-of-fit test (chi2gof).

Test the null hypothesis that the sample data in the input vector x comes from a normal distribution with parameters µ and σ equal to the mean (mean) and standard deviation (std) of the sample data, respectively.

rng('default') % For reproducibility
x = normrnd(50,5,100,1);
h = chi2gof(x,'cdf',{@normcdf,mean(x),std(x)})

h = 0

The returned result h = 0 indicates that chi2gof does not reject the null hypothesis at the default 5% significance level.

Input Arguments

`x` — Sample data
vector

Sample data for the hypothesis test, specified as a vector.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'NBins',8,'Alpha',0.01 pools the data into eight bins and conducts the hypothesis test at the 1% significance level.
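
The two calling forms described above should give the same result. A minimal sketch, assuming R2021a or later for the Name=Value form; the sample x and the variable sameResult are illustrative and not part of the original page.

```matlab
rng default;        % for reproducibility
x = randn(100,1);   % illustrative sample

% Name=Value form (R2021a and later)
[h1,p1] = chi2gof(x,NBins=8,Alpha=0.01);

% Quoted, comma-separated form (also valid in earlier releases)
[h2,p2] = chi2gof(x,'NBins',8,'Alpha',0.01);

% Both calls specify the same test
sameResult = isequal(h1,h2) && isequal(p1,p2);
```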

`NBins` — Number of bins
`10` (default) | positive integer value

Number of bins to use for the data pooling, specified as the comma-separated pair consisting of 'NBins' and a positive integer value. If you specify a value for NBins, do not specify a value for Ctrs or Edges.

Example: 'NBins',8
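
To see how many bins the test actually used, inspect the stats output. This is a hedged sketch with an illustrative sample: because chi2gof can pool sparsely populated bins, as in the Poisson example on this page, the number of bins used may be smaller than the requested NBins. The name nBinsUsed is hypothetical.

```matlab
rng default;        % for reproducibility
x = randn(100,1);   % illustrative sample

% Request eight bins and keep only the stats structure
[~,~,stats] = chi2gof(x,'NBins',8);

% Bins remaining after any pooling; edges has one more element than O
nBinsUsed = numel(stats.O);
```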