1508 - Curve Fitting Guide


  1. Introduction
  2. Curve Fitting Features in MathWorks Products
    1. Curve Fitting Toolbox
    2. MATLAB Built-in Functions and Other Add-ons (Toolboxes) with Curve Fitting Capabilities
    3. Linear Curve Fitting
    4. Nonlinear Curve Fitting
  3. Weighted Curve Fitting Methods
    1. Curve Fitting Toolbox
    2. Statistics Toolbox
    3. Optimization Toolbox
  4. Improving Curve Fitting Results with the Curve Fitting Toolbox
  5. Other Related Material

Section 1: Introduction

MATLAB has both add-on products and built-in capabilities for solving many commonly encountered curve fitting problems. This Technical Note describes several methods that can be used to fit a curve to a given set of data. It also explains how to perform weighted curve fitting, how to fit a curve to complex data, and other related issues. Typical examples illustrate each curve fitting method.

Section 2: Curve Fitting Features in MathWorks Products

MATLAB has built-in functions that can be used for curve fitting, and several MathWorks toolboxes provide additional curve fitting functions. These methods can be used for both linear and nonlinear curve fitting. There is also a dedicated toolbox, the Curve Fitting Toolbox, that can be used for parametric as well as nonparametric curve fitting. This section details the features of the Curve Fitting Toolbox and other toolboxes, as well as various MATLAB built-in functions, that can be used for curve fitting.

    1. Curve Fitting Toolbox

      The Curve Fitting Toolbox is designed specifically for fitting curves to data sets. This toolbox is a collection of graphical user interfaces (GUIs) and M-file functions built using MATLAB.

      • Parametric fitting is performed by using toolbox library equations (such as linear, quadratic, and higher order polynomials) or by using custom equations (limited only by the user's imagination). Use a parametric fit when you want to find the regression coefficients and the physical meaning behind them.
      • Nonparametric fitting is performed by using a smoothing spline or various interpolants. Use nonparametric fitting when the regression coefficients hold no physical significance and are not desired.

      The Curve Fitting Toolbox also provides functionality for

      • Data preprocessing, such as sectioning and smoothing
      • Standard linear least squares, nonlinear least squares, weighted least squares, constrained least squares, and robust fitting procedures
      • Fitting statistics required to determine how good the fit is, such as R-Square and Sum of Squares Due to Error (SSE)

      Be sure to review the demos pertaining to the Curve Fitting Toolbox.

    2. MATLAB Built-in Functions and Other Toolboxes with Curve Fitting Capabilities

      Apart from the Curve Fitting Toolbox, MATLAB and other toolboxes also provide several built-in features that can be used for linear and nonlinear curve fitting. This section lists and explains a few of these features.

    3. Linear Curve Fitting:

      MATLAB Built-in Functions

      Function Description
      POLYFIT Fits a polynomial to data. POLYFIT(X,Y,N) finds the coefficients of a polynomial P(X) of degree N that fits the data in a least-squares sense, so that P(X(I)) is approximately equal to Y(I).
      \ Backslash or matrix left division. If A is a square matrix, A\B is roughly the same as inv(A)*B, except that it is computed in a different way. If A has more rows than columns, A\B is the least-squares solution of A*X = B, which is the case that arises in curve fitting.
      POLYVAL Evaluates a polynomial at given points.
      CORRCOEF Computes the correlation coefficients of two vectors. Used with the POLYFIT and POLYVAL functions, it gives the correlation between the actual data and the output of a fitted curve. The square of this correlation is the R-Square value.

      Here is an example that uses the CORRCOEF function to compute the R-Square value:

      load census
      [p, s] = polyfit(cdate, pop, 2);
      Output = polyval(p,cdate);
      Correlation = corrcoef(pop, Output);
      % R-Square is the square of the off-diagonal correlation
      Rsquare = Correlation(1,2)^2;

      Correlation is a 2-by-2 matrix. Its diagonal elements are 1 because pop is perfectly correlated with itself, as expected, and Output is also perfectly correlated with itself. The off-diagonal element is the correlation between pop and Output, and its square is the R-Square value. This value is very close to 1, so the result of the fit agrees closely with the actual data.
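The link between the correlation coefficient and R-Square can also be checked against the sums of squares. The following is a minimal sketch, assuming made-up data (the vectors x and y are hypothetical and are not the census data). For a least-squares polynomial fit that includes a constant term, the squared correlation and 1 - SSE/SST should agree to within round-off.

```matlab
% Hypothetical data: a quadratic trend with a deterministic perturbation
x = (1:20)';
y = 0.5*x.^2 + 3*x + 2 + 5*cos(x);

% Quadratic least-squares fit and fitted values
p = polyfit(x, y, 2);
yfit = polyval(p, x);

% R-Square as the squared correlation between data and fitted values
R = corrcoef(y, yfit);
rsq_corr = R(1,2)^2;

% R-Square from the sums of squares
SSE = sum((y - yfit).^2);        % sum of squares due to error
SST = sum((y - mean(y)).^2);     % total sum of squares about the mean
rsq_sse = 1 - SSE/SST;

fprintf('R-Square (corrcoef):  %.6f\n', rsq_corr);
fprintf('R-Square (1-SSE/SST): %.6f\n', rsq_sse);
```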

      For additional examples using the backslash operator and the POLYFIT function to do regression and curve fitting, see the Regression and Curve Fitting section in the Using MATLAB documentation.
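As an illustration of how the backslash operator and POLYFIT relate, the sketch below fits the same quadratic both ways. The data are hypothetical, and the design matrix A is built by hand with one column per polynomial term; this is one possible arrangement, not the only one.

```matlab
% Hypothetical sample data: a quadratic plus a small perturbation
x = (0:0.5:5)';
y = 2*x.^2 - 3*x + 1 + 0.01*sin(7*x);

% Design matrix whose columns match the terms x^2, x, and 1
A = [x.^2, x, ones(size(x))];

% Backslash returns the least-squares solution of the
% overdetermined system A*c = y
c_backslash = A \ y;

% polyfit solves the same problem and returns the coefficients
% in the same order (highest power first)
c_polyfit = polyfit(x, y, 2)';

% The two coefficient vectors should agree to within round-off
max_difference = max(abs(c_backslash - c_polyfit));
fprintf('Largest coefficient difference: %g\n', max_difference);
```

The backslash form is convenient when the model is linear in its coefficients but is not a polynomial, because any set of basis functions can be placed in the columns of A.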

      Optimization Toolbox

      Function Description
      LSQLIN Constrained linear least squares
      LSQNONNEG Linear least squares with nonnegativity constraints

      These methods should be used instead of POLYFIT or backslash (\) when there are constraints. For examples using these two functions, consult the Optimization Toolbox documentation.
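To show the effect of a constraint, the sketch below compares an unconstrained backslash fit with LSQNONNEG on hypothetical data. The basis functions and the response are invented for illustration, and the response is chosen so that the unconstrained fit produces a negative coefficient.

```matlab
% Hypothetical data: a decaying exponential with a negative offset
t = (0:0.25:5)';
C = [exp(-t), ones(size(t))];   % basis: exponential term and constant term
d = 3*exp(-t) - 0.5;            % noise-free response for illustration

% Unconstrained least squares: the constant coefficient is negative
c_free = C \ d;

% Least squares subject to all coefficients being nonnegative
c_nonneg = lsqnonneg(C, d);

disp('Unconstrained coefficients:'); disp(c_free');
disp('Nonnegative coefficients:');   disp(c_nonneg');
```

With the constraint active, the constant term is held at zero and the exponential coefficient is re-estimated, so the constrained residual is larger than the unconstrained one.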

    4. Nonlinear Curve Fitting:

      MATLAB Built-in Functions

      Function Description
      FMINBND Scalar bounded nonlinear function minimization. This function can be used to minimize a function of only one variable on a fixed interval.
      FMINSEARCH Multidimensional unconstrained nonlinear minimization (Nelder-Mead method). This function finds the minimum of a scalar function of several variables, starting at an initial estimate. This is generally referred to as unconstrained nonlinear optimization.

      Below is a short example showing how to use FMINSEARCH.

      1. First create the data.
        t=0:.1:10;
        t=t(:); % To make t a column vector
        % Add random noise to the data
        Data=40*exp(-.5*t)+rand(size(t));
      2. Next, write a function M-file (myfit.m) that accepts curve parameters as inputs and then outputs the fitting error.
        function sse=myfit(params,Input,Actual_Output)
        A=params(1);
        lambda=params(2);
        Fitted_Curve=A.*exp(-lambda*Input);
        Error_Vector=Fitted_Curve - Actual_Output;
        % When curve fitting, a typical quantity to
        % minimize is the sum of squares error
        sse=sum(Error_Vector.^2);
        % You could also write sse as
        % sse=Error_Vector(:)'*Error_Vector(:);
      3. Now call FMINSEARCH.
        Starting=rand(1,2);
        options=optimset('Display','iter');
        Estimates=fminsearch(@myfit,Starting,options,t,Data)
        
        % To check the fit
        plot(t,Data,'*')
        hold on
        plot(t,Estimates(1)*exp(-Estimates(2)*t),'r')

        Estimates will be a vector containing parameter estimates for the original data set.

      FMINSEARCH can often handle discontinuities, particularly if they do not occur near the solution. It may return only a local solution. FMINSEARCH minimizes only over the real numbers; that is, the parameters must be real and the function being minimized must return a real value. When the domain of interest includes complex variables, they must be split into real and imaginary parts.
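FMINBND, listed in the table above, covers the case of a single unknown parameter on a fixed interval. The following is a minimal sketch, assuming that the amplitude of the exponential is already known and that only the decay rate is fitted. The data are noise-free and hypothetical, and the search interval [0, 2] is an assumption made for this illustration.

```matlab
% Hypothetical noise-free data from 40*exp(-0.5*t)
t = (0:0.1:10)';
Data = 40*exp(-0.5*t);

% Assume the amplitude is known, leaving only the decay rate to estimate
A = 40;

% Sum of squares error as a function of the single unknown parameter
sse = @(lambda) sum((A*exp(-lambda*t) - Data).^2);

% Search for the decay rate on the interval [0, 2]
lambda_hat = fminbnd(sse, 0, 2);
fprintf('Estimated decay rate: %.4f\n', lambda_hat);
```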

      MATLAB Figure Window: The Basic Fitting Interface and Data Statistics Tools

      MATLAB also supports basic curve fitting through the Basic Fitting Interface. Using this interface, you can quickly perform many curve fitting tasks within the same easy-to-use environment. The interface is designed to

      • Fit data using a spline interpolant, a Hermite interpolant, or up to a tenth-degree polynomial
      • Plot multiple fits simultaneously for a given data set
      • Plot the fit residuals
      • Examine the numerical results of a fit
      • Evaluate (interpolate or extrapolate) a fit
      • Annotate the plot with the numerical fit results and the norm of residuals
      • Save the fit and evaluated results to the MATLAB workspace

      Depending on your specific curve fitting application, you can use the Basic Fitting interface, the command line functionality, or both. You can use the Basic Fitting Interface only with 2-D data. However, if you plot multiple data sets as a subplot, and at least one data set is 2-D, then the interface is enabled.

      The Basic Fitting interface can be activated by doing the following:

      1. Plot some data
      2. Select Basic Fitting from the Tools menu of the figure window.

      For more information on the Basic Fitting interface, consult the Using MATLAB documentation.
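Several of the quantities that the Basic Fitting interface reports can also be obtained at the command line. The sketch below is one way to do this on hypothetical data, using POLYFIT for a polynomial fit together with SPLINE and PCHIP for the spline and Hermite interpolants. It is an approximation of the workflow and not a description of how the interface itself is implemented.

```matlab
% Hypothetical data
x = (0:1:10)';
y = sin(x/2) + 0.1*x;

% Cubic polynomial fit; the second output holds the norm of residuals
[p, S] = polyfit(x, y, 3);
resid = y - polyval(p, x);          % fit residuals
fprintf('Norm of residuals: %.4f\n', S.normr);

% Spline and Hermite interpolants evaluated on a finer grid
xi = (0:0.25:10)';
y_spline = spline(x, y, xi);
y_pchip  = pchip(x, y, xi);

% Evaluate (extrapolate) the polynomial fit at a new point
y_at_11 = polyval(p, 11);
```

The vectors xi, y_spline, and y_pchip can then be passed to PLOT alongside the original data to compare the fits.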

      Note: For the HP, IBM, and SGI platforms, the Basic Fitting Interface is not supported for MATLAB 6.0 (R12.0) or MATLAB 6.1 (R12.1).

      The Data Statistics Interface can be used to calculate basic statistics about the central tendency and variability of data plotted in a graph. The Data Statistics Interface can be activated by doing the following:

      1. Plot some data
      2. Select Data Statistics from the Tools menu of the figure window.

      When you select Data Statistics, MATLAB calculates the statistics for each data set plotted in the graph and displays the results in the Data Statistics dialog box.
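Measures of central tendency and variability of this kind can also be computed directly at the command line. The sketch below uses a hypothetical data vector, and the structure name stats and its field names are chosen here only for illustration.

```matlab
% Hypothetical data set
y = [2 4 4 5 7 9 11]';

% Basic statistics on central tendency and variability
stats.min    = min(y);
stats.max    = max(y);
stats.mean   = mean(y);
stats.median = median(y);
stats.std    = std(y);
stats.range  = max(y) - min(y);

disp(stats)
```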

      Optimization Toolbox

      Function Description
      LSQNONLIN Solves nonlinear least squares problems, including nonlinear data fitting problems
      LSQCURVEFIT Solves nonlinear data fitting problems

      Below are examples of how to use these two functions.

      LSQNONLIN: This function requires the function being minimized to be continuous, and it may return only a local solution. The example below illustrates how to use LSQNONLIN to fit the function

      f = A + B*exp(C*x) + D*exp(E*x)

      to the x and y data sets, where y is the expected output given x. To do this, create the following function named fit_simp.m which uses the x and y data, both of which are passed into LSQNONLIN as optional input arguments. Use the x data to calculate values for f and subtract the original y data from this. The results are the differences between the experimental data and the calculated values. The LSQNONLIN function minimizes the sum of the squares of these differences:

      function diff = fit_simp(x,X,Y)
      % This function is called by LSQNONLIN.
      % x is a vector which contains the coefficients of the
      % equation. X and Y are the optional data sets that were
      % passed to lsqnonlin.
       
      A=x(1);
      B=x(2);
      C=x(3);
      D=x(4);
      E=x(5);
      diff = A + B.*exp(C.*X) + D.*exp(E.*X) - Y;

      The following script is an example of how to use fit_simp.m, the function declared above.

      % Define the data sets that you are trying to fit the
      % function to
      X=0:.01:.5;
      Y=2.0.*exp(5.0.*X)+3.0.*exp(2.5.*X)+1.5.*rand(size(X));
       
      % Initialize the coefficients of the function
      X0=[1 1 1 1 1]';
       
      % Set an options file for LSQNONLIN to use the
      % medium-scale algorithm
      options = optimset('Largescale','off');
       
      % Calculate the new coefficients using LSQNONLIN
      x=lsqnonlin(@fit_simp,X0,[],[],options,X,Y);
       
      % Plot the experimental data and the fitted curve
      Y_new = x(1) + x(2).*exp(x(3).*X)+x(4).*exp(x(5).*X);
      plot(X,Y,'+r',X,Y_new,'b')

      Note: LSQNONLIN can only handle real variables. To fit a curve to a complex data set, or to fit complex parameters, split the complex quantities into real and imaginary parts. Below is an example of how to perform a least squares fit using complex parameters.

      To fit complex parameters, you need to break them into their real and imaginary parts and pass them into the function that is called by LSQNONLIN as a single input. First, separate the real and imaginary parts into two vectors. Then, concatenate the two vectors such that the first half is the real part and the second half is the imaginary part. In the MATLAB function, reassemble the complex parameters and evaluate the complex equations that you would like to fit. Split the output into its real and imaginary parts, and concatenate the two parts to form a single output vector that is passed back to LSQNONLIN. Below is an example of how to fit real X and Y data to two complex exponentials.

      Create the following function:

      function zero = fit2(x,X,Y)
       
      % Rebuild the complex parameters from the real input x
      cmpx = x(1:4) + 1i.*x(5:8);
       
      % Evaluate the function using the complex parameters
      zerocomp = cmpx(1) .* exp(cmpx(2) .* X) + ...
      cmpx(3) .* exp(cmpx(4) .* X) - Y;
       
      % Convert the answer into a single real vector such that
      % the first half is the real part, and the second half is
      % the imaginary part.
      lx = length(X);
      zero = zeros(1,2*lx); % Preallocate the output
      zero(1:lx) = real(zerocomp); % Real part
      zero(lx+1:2*lx) = imag(zerocomp); % Imaginary part

      To evaluate this function, you need the X and Y data sets. LSQNONLIN will fit for the parameters a, b, c, and d in

      Y = a*exp(b*X) + c*exp(d*X)

      where a, b, c, and d are complex numbers.

      X=0:.1:5; % X data
      Y=sin(X); % Y data
      Y=Y+0.1.*rand(size(Y))-.05; % Add some noise to Y
      cmpx0=[1 i 2 2*i]; % Complex initial conditions
       
      % Convert initial conditions to a real part
      x0(1:4)=real(cmpx0);
       
      % Concatenate the imaginary part to the real part.
      x0(5:8)=imag(cmpx0);
       
      % Call lsqnonlin to perform the fit (no bounds, default options)
      x=lsqnonlin(@fit2,x0,[],[],[],X,Y);
      cmpx = x(1:4) + i.*x(5:8); % Convert answer to complex
       
      % Plot the results
      Y1 = real(cmpx(1).*exp(cmpx(2).*X)+ ...
      cmpx(3).*exp(cmpx(4).*X));
      plot(X,Y1,'r');
      hold on
      plot(X,Y,'+') ;

      LSQCURVEFIT: Use this function to solve nonlinear curve fitting (data fitting) problems in the least squares sense. That is, given the input data xdata and the observed output ydata, find the coefficients x that best fit the vector-valued function F(x, xdata) to ydata. LSQCURVEFIT uses the same algorithm as LSQNONLIN. Its purpose is to provide an interface designed specifically for data fitting problems.

      There is no difference in practice between 2, 3, or any N-dimensional parametric fitting. Below is an example of 3-D parametric fitting to the following function:

      z = a1*y.*x.^2 + a2*sin(x) + a3*y.^3

      The function myfun.m is created with the following code:

      function F = myfun(a,data)
      x=data(1,:);
      y=data(2,:);
       
      F = a(1)*y.*x.^2 + a(2)*sin(x) + a(3)*y.^3;

      The following example script illustrates how to use the above function.

      xdata = [3.6 7.7 9.3 4.1 8.6 2.8 1.3 7.9 10.0 5.4];
      ydata = [16.5 150.6 263.1 24.7 208.5 9.9 2.7 163.9 325.0 54.3];
      zdata = [95.09 23.11 60.63 48.59 89.12 76.97 45.68 1.84 82.17 44.47];
       
      data=[xdata;ydata];
       
      a0 = [10, 10, 10]; % Starting guess
      [a,resnorm] = lsqcurvefit(@myfun,a0,data,zdata)

      The output from this script is

      a = 0.0074  -19.9749  -0.0000

      resnorm = 2.1959e+004

      Statistics Toolbox

      Function Description
      Nonlinear regression:
        nlinfit Nonlinear least squares data fitting by the Gauss-Newton method
      Linear regression:
        lscov Least squares estimates with known covariance matrix
        regress Multiple linear regression
        regstats Regression diagnostics
        ridge Ridge regression
        rstool Multidimensional Response Surface Visualization (RSV)
        stepwise Interactive tool for stepwise regression

Section 3: Weighted Curve Fitting Methods

When fitting data that contains random variations, there are two major assumptions that are usually made about the error:

  • The error exists only in the response data and not in the predictor data.
  • The errors are random and follow a normal (Gaussian) distribution with zero mean and a constant variance.

The second assumption, that of constant variance, can be seriously affected by the presence of outliers. Because outliers lie far away from the true pattern of the data, they can pull the fitted curve away from the true fit. To improve the quality of the fit, you can scale the contribution of each data point by an additional factor, known as a weight. Giving the outliers small weights reduces their influence, and thus a better fit can be obtained.

You can use the following three toolboxes to implement weights for the fits.

  1. Curve Fitting Toolbox

    This toolbox has a generalized least squares regression capability that can be used to fit both linear and nonlinear models to data.

    The weight is part of the options to the fit and is supplied using the function FITOPTIONS. In the Curve Fitting Toolbox, the weight can be any vector of weights associated with the data.

  2. Statistics Toolbox

    The ROBUSTFIT function available with the Statistics Toolbox fits data by robust regression, using iteratively reweighted least squares, and outputs the regression coefficients. At each iteration the weights are recomputed from the residuals of the previous fit, so that points far from the fit have less influence.

  3. Optimization Toolbox

    You can also use LSQNONLIN and LSQCURVEFIT, the least-squares solvers in the Optimization Toolbox, to perform a weighted least-squares fit. To do weighted least-squares fitting with LSQNONLIN or LSQCURVEFIT, you need an equation to which you want to fit your data. An example of this can be found in Solution 27840.
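The effect of weights can be seen in a small linear example. The sketch below uses hypothetical data with one outlier and is not taken from the solution referenced above. It scales each row of the problem by the square root of its weight, which is a common way to turn a weighted least-squares problem into an ordinary one, and it compares the result with LSCOV called with a weight vector. The same scaling idea can be applied to the residual vector returned to a least-squares solver such as LSQNONLIN, since the solver minimizes the sum of squares of that vector.

```matlab
% Hypothetical data on the line y = 2*x + 1, with one outlier at the end
x = (1:10)';
y = 2*x + 1;
y(10) = 60;

% Design matrix for a straight line: intercept and slope
A = [ones(size(x)), x];

% Weights: down-weight the suspected outlier (the value 0.01 is arbitrary)
w = ones(size(x));
w(10) = 0.01;

% Ordinary least squares, for comparison
b_ols = A \ y;

% Weighted least squares: scale each row by the square root of its weight
sw = sqrt(w);
b_wls = (A .* [sw, sw]) \ (sw .* y);

% LSCOV with a weight vector minimizes the same weighted sum of squares
b_lscov = lscov(A, y, w);

fprintf('Slope, ordinary fit: %.4f\n', b_ols(2));
fprintf('Slope, weighted fit: %.4f\n', b_wls(2));
```

The weighted slope lies much closer to the underlying value of 2 because the outlier contributes little to the weighted sum of squares.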

Section 4: Improving Curve Fitting Results with the Curve Fitting Toolbox

Many issues can influence curve fitting. The following is a list of tips that can help improve the quality of a fit:

  • Model Selection: Model selection, either from our curve fitting library or custom equations, is the most prominent issue. Try to fit different models to your data.
  • Data Preprocessing: Data preprocessing prior to fitting the curve is also very useful. This may include:
    • Transforming the response data
    • Removing Infs, NaNs, and outliers
    More details on this can be found in the Curve Fitting Toolbox documentation.
  • Rational Models: A rational fit may have discontinuities at singularities, where the denominator approaches zero and the fitted value goes to infinity. More information on this can be found in the Curve Fitting Toolbox documentation.
  • The fitting process is more likely to converge if you supply as much information as possible about the coefficients being estimated. The following tips are useful for increasing the speed and accuracy of the fit:
    • Make intelligent guesses as starting values. If you have an idea about likely coefficient values, then use those as starting values.
    • In the absence of information about starting values, try a variety of starting values.
    • Try constraining the parameters. For example, if you know that a parameter has to be positive, then placing its lower bound at 0 may lead the iterative process towards a solution that it otherwise might not find.
    • Adjust various fitting options, namely
      • Try different algorithms.
      • Increase the number of iterations or function evaluations allowed.
      • Reduce the convergence tolerance.
    • Break the data into smaller subsets and fit different curves to different subsets.
  • Sophisticated problems are best solved by an evolutionary approach, whereby a problem with a smaller number of independent variables is solved first. Solutions from lower order problems can generally be used as starting points for higher order problems by using appropriate mapping. For example, if your model is best described as
    y = c + a*exp(b*x) + d*sin(f*x)
    then it is usually better to fit one term at a time, starting with the more important terms. You might want to fit
    y = c1 + a1*exp(b1*x)
    first, and then use the resulting coefficients c1, a1, and b1 as starting points for c, a, and b when fitting the whole equation:
    y = c + a*exp(b*x) + d*sin(f*x)
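The staged approach described above can be sketched with FMINSEARCH, which needs no toolbox. The data below are hypothetical and noise-free. The starting values for the first stage, and the guesses d = 0 and f = 3 for the second stage, are assumptions made for this illustration.

```matlab
% Hypothetical noise-free data generated from the full model
x = (0:0.1:5)';
y = 1 + 2*exp(-0.8*x) + 0.5*sin(3*x);

% Stage 1: fit y = c1 + a1*exp(b1*x); parameters q = [c1 a1 b1]
sse1 = @(q) sum((q(1) + q(2)*exp(q(3)*x) - y).^2);
q1 = fminsearch(sse1, [0 1 -1]);

% Stage 2: fit the full model; parameters q = [c a b d f]
sse2 = @(q) sum((q(1) + q(2)*exp(q(3)*x) + q(4)*sin(q(5)*x) - y).^2);

% Start from the stage 1 coefficients; d = 0 and f = 3 are assumed guesses
q0 = [q1, 0, 3];
q2 = fminsearch(sse2, q0);

fprintf('SSE after stage 1: %g\n', sse1(q1));
fprintf('SSE after stage 2: %g\n', sse2(q2));
```

Because the second stage starts with d = 0, its initial error equals the final error of the first stage, so the full fit begins from a point that is already reasonable and can only improve on it.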

Section 5: Other Related Material

The MATLAB News and Notes February 2002 article on Curve Fitting, Atmospheric Carbon Di-Oxide Modeling and the Curve Fitting Toolbox, might also be useful to you.
