4.43478

4.4 | 25 ratings Rate this file 186 downloads (last 30 days) File Size: 51.61 KB File ID: #10065

Polyfitn

by John D'Errico

 

20 Feb 2006 (Updated 17 Dec 2009)

Code covered by the BSD License  

N-d polynomial regression model

Download Now | Watch this File

File Information
Description

Yes, I know. This is yet another in the long list of polynomial regression modeling tools. What does polyfitn do that the others don't? You can supply any list of terms, with any set of exponents: positive, negative, or real. It works in n-d as well as 1-d. And it has a simple, clean interface to specify your terms.

Simple statistics are generated, R^2, rmse, standard deviations and variances of the coefficients. I've supplied an evaluation tool to use the model.

x = -2:.1:2;
y = cos(x);
p = polyfitn(x,y,'constant x^2 x^4 x^6');

% Evaluation
polyvaln(p,[0 .5 1])
ans =
      0.99996
       0.8776
      0.54031

% Conversion to a sympoly
polyn2sympoly(p)
A scalar sympoly object
    0.99996 - 0.49968*x^2 + 0.041242*x^4 - 0.0012079*x^6

This last operation employs my own sympoly toolbox also found on the file exchange, although it is not necessary for use of polyfitn and polyvaln.

http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=9577&objectType=FILE

As another example, fit a complete 2-d model with all terms up to a second order model:
x = randn(100,2);
y = randn(100,1);
p = polyfitn(x,y,2);

% Display it as a sympoly, just makes it easy to read:
polyn2sympoly(p)

A scalar sympoly object
    0.19175*X1^2 + 0.052769*X1*X2 + 0.093648*X1 + 0.039723*X2^2 - 0.04943*X2 - 0.19188

% Only want a linear model in 2-d, but with no constant term? Its easy to specify:
uv = rand(100,2);
w = sin(sum(uv,2));
p = polyfitn(uv,w,'u, v');

polyn2sympoly(p)
A scalar sympoly object
    0.74604*u + 0.68452*v

Acknowledgements

The author wishes to acknowledge the following in the creation of this submission:
Symbolic Polynomial Manipulation
This submission has inspired the following:
gapolyfitn

MATLAB release MATLAB 7.0.1 (R14SP1)
Zip File Content  
Published MATLAB Files polyfitn_demo
Other Files
license.txt,
PolyfitnTools/demo/html/polyfitn_demo.png,
PolyfitnTools/demo/html/polyfitn_demo_01.png,
PolyfitnTools/demo/html/polyfitn_demo_02.png,
PolyfitnTools/demo/polyfitn_demo.m,
PolyfitnTools/doc/understanding_polyfitn.rtf,
PolyfitnTools/polyfitn.m,
PolyfitnTools/polyn2sym.m,
PolyfitnTools/polyn2sympoly.m,
PolyfitnTools/polyvaln.m,
PolyfitnTools/ReadMe.rtf,
PolyfitnTools/test/test_main.m
Tags for This File  
Everyone's Tags
Tags I've Applied
Add New Tags Please login to tag files.
Comments and Ratings (41)
03 Mar 2006 Duane Hanselman

Powerful general tool for N-D curvefitting. Great extension to polyfit/polyval. Flexible input syntax. Scales data to minimize numerical problems.

12 Apr 2006 Vassili Pastushenko

I wanted to compare the speed of my SURFIT with that of POLYFITN - POLYVALN. My problem is that POLYFITN apparently accepts only a vector as dependent variable, whereas I have to get a polynomial approximation of a matrix. How should I do that?

An example from my work: imagine that I have a data matrix Z (512 by 511) which I want to approximate by a polynomial of 2-nd degree in vertical direction and 0-th degree in horizontal direction. This means I want to use a bipolynomial of degrees [2 0]. With SURFIT the problem is solved easily:

B=surfit(Z,[2 0]); %B = my bipolynomial approximation of Z (regression)

What should I do with POLYFITN-POLYVALN to get B?

12 Apr 2006 Vassili Pastushenko

Finally I have succeeded to compare the speed, which is surprisingly almost 40 times slower by POLYFITN-POLYVALN than by SURFIT. I cannot understand this because I know how low does John rating other programs if they are with some examples even 2% slower than possible. Here is my comparison made for a bipolynomial of 2-nd degree in both directions (I was not able to resolve my case of degrees [2 0]). Much more surprising is higher accuracy of SURFIT. I should vote for deletion of POLYFITN were I sure that my syntax of input to it indeed corresponds to bipolynomial of degrees [2 2].

tic,[X,Y]=meshgrid(1:size(Z,2),1:size(Z,1));
mod=polyfitn([X(:),Y(:)],Z(:),2);
B=polyvaln(mod,[X(:),Y(:)]);
B=reshape(B,size(Z,1),size(Z,2));toc
Elapsed time is 1.203000 seconds.

tic,B2=surfit(Z,2);toc
Elapsed time is 0.031000 seconds.

sum((Z(:)-B(:)).^2)
ans = 76181

sum((Z(:)-B2(:)).^2)
ans = 75534

13 Apr 2006 Vassili Pastushenko

Yes, the comparison was done correctly. I have noticed that the ratio of errors depends on data. Occasionally, I used real data for comparison, which allowed to notice an inacceptably big difference in accuracy. Below I have included a model of a piece of these data as default.
 

[B,B2]=john;
Elapsed time is 0.063000 seconds. %POLY...N
Elapsed time is 0.000000 seconds. %SURFIT

JV = 1.1931e+005 95907 %Errors

 function [B,B2]=john(Z)
%======================
if nargin<1
y=zeros(100,160);
y(1:30,20:60)=-8;
y(80:100,40:120)=-8;
Z=y+randn(size(y));
end

[R,C]=size(Z);
[X,Y]=meshgrid((1:C)/C,(1:R)/R);
z=[X(:),Y(:)];
ZZ=Z(:);
tic,
mod=polyfitn(z,ZZ,2);
B=polyvaln(mod,z);
toc
B=reshape(B,R,C);

tic,B2=surfit(Z,2);toc
JV=sum([ZZ-B(:),ZZ-B2(:)].^2)

17 Apr 2006 Tony Flipper

did what I needed it to do - find the same regression equation that Excel gave me

06 Dec 2006 g hunter

does what i need it to.

note that Vassili Pastushenko wrote SURFIT.

27 Feb 2007 ye zhang

Thanks a lot!!
it is really helpful commmand...

16 Mar 2007 Jean-Luc Dellis

Hi all,

Good job John as usually you do. I planed to use bootstrap approach for some educational experimental works because uncertainties can touch both x and y variables. In that goal i compared your program with my own bootstrap program using polyfit. That gave similar results except a) as expected bootstrap was slower, but that is not very important in my case, b) bootstrap failed when data points were under around 10 with a polyfit warning message stating an ill-conditioned matrix. This is due to replicated points in the bootstrap samples.

Maybe i could manage bootstrap in a more correct way. Anyway your program succeed more quickly and with reasonable uncertainties even when there was noise in the independant variable x.

So, thanks a lot John. I know that if i have some clever task to do, i before have to check if John has done it.

11 May 2007 Oh-ig Kwoun

This is exactly what I needed.
It works great.
Thank you, John

28 Sep 2007 Simon Ren

This is excellent work. Thanks John.
I have a question. I have a series points (x, y,z), but no any equation yet. These data was obtained from test. I am not sure if I can get the function z=f(x,y) with polyfitn? Could you help me? Thank you very much.

20 Nov 2007 guido spinola  
14 Feb 2008 Rajneesh Kumar

It's very easy to use and works excellent. Has been very useful to my research. Thank you.

29 Mar 2008 Vincent Morio

Very powerful and efficient tool. I have used it to approximate the nonlinear entries of my linearized model by multivariate polynomials. Then, it is rather easy to put it under LFT form in order to synthesize an LPV controller. Many thanks.

18 Apr 2008 Muhammad Ihsan Salim

Well, I need to try it out first, so i gave it four star. If it works on fitting my DIC displacement distribution, I gave you the whole star you want. But after all, thank you any way..

24 Apr 2008 Rebecca Bendick

When I try to use polyfitn, I get lots of errors and no answer:

??? Error using ==> class
The CLASS function must be called from a class constructor.

Error in ==> sympoly at 92
    P=class(P,'sympoly');

Error in ==> horzcat at 12
  sp=sympoly(sp);

Error in ==> repmat at 45
    siz = [M N];

Error in ==> polyfitn>buildcompletemodel at 241
    modelterms = [modelterms;[repmat(k,nt,1),t]];

Error in ==> polyfitn at 148
  modelterms = buildcompletemodel(modelterms,p);

25 Apr 2008 John D'Errico

This error is incomplete, since it does not show the calling syntax. As such, I cannot say why the problem arose. Please send me a direct e-mail with any questions, and I will resolve them immediately.

19 Nov 2008 Richard Crozier  
21 Nov 2008 Richard Crozier

Hello, the software is excellent, but there is a very minor error, it does not check for that the modelterm matrix passed in is not empty, resulting in an mtimes error in this case. I discovered this because of my poor coding, so I suppose I shouldn't expect polyfitn to check this for me!

21 Nov 2008 John D'Errico

Sorry. Polyfitn should check this for you. It will do so in my next revision.

30 Jan 2009 Mohsen Nosratinia

Many thanks! just what I needed. saved me a lot of coding

17 Feb 2009 Gavrilo Bozovic

Thanks a lot!

20 May 2009 Agustin Pimstein

John hello,

Even running this example of yours,

n = 1000;
x = rand(n,2);
y = exp(sum(x,2)) + randn(n,1)/100;
p = polyfitn(x,y,3)

...I'm getting this error:
??? Undefined function or method 'polyfitn' for input arguments of type 'double'.

Any idea of what the problem might be???

Thanks,

Agustin

20 May 2009 John D'Errico

Agustin found his problem - this is always a path problem. Something got put where MATLAB does not look.

09 Jun 2009 Michal Kvasnicka

John, could you add to the polyfitn package description file mentioned in your ReadMe file or any relevant reference on multidimensional polynomial regression.

"In additions to these m-files, there are several other files of interest. For those who have an interest in understanding how polyfitn works (quite simple really) you should look to the file understanding_polyfitn.rtf, in the doc subdirectory."

The underlying theory will be very helpful for me. Thanks in advance ...

11 Aug 2009 Thomas Clark

Fit for it's purpose ;)

Thanks John, I had to chop and modify polyvaln for my application, but it was really good to have a way of generating the polynomial coefficients for a given n and order.

27 Aug 2009 Chao Xu

Hallo John,
Nice work!
I am a newbie in matlab and still got some questions about 3d fit.
I have a 3d dataset F(128,128,16) and would like to get the polynomial function F(x,y,z), how should I do it?

thanks!

27 Aug 2009 John D'Errico

Assuming that you have an array, that contains only w(x,y,z) for each element of the array, use ndgrid to generate the combinations of (x,y,z). Then call polyfitn with the data.

11 Oct 2009 Andreas

Great work !
Thank you very much. It is really easy to use, allows a lot of user specifications and is quite fast imho.

21 Oct 2009 James

Hi John,

I am new to all this stuff. actually to MATLAB itself.
I have 2D data I want to fit (to a 2-variable function) with 3 parameters: z = A + B*x +C*y. But that is scarce matrix, i.e. with holes. I do not have data (or z if you prefer) for each and every combination of x and y, but only for a sall number of them.

Any way around this ?

I appreciate your help in that matter.

James

22 Oct 2009 John D'Errico

James,

You don't need every possible combination of (x,y,z) to be known to estimate a model, certainly not a first order model like this. To estimate the 3 unknown parameters in this model, in theory, you need at least 3 data points at a minimum. In practice, you won't get very good estimates of those parameters with only 3 data points.

In the past, I often had clients ask me how much data they needed to obtain. After all, data is often scarce, expensive, hard to obtain. It costs money, and time. You can't have too much data, at least if it is good data.

In practice, if you end up with 10 or 20 points to estimate these three parameters, that will (probably) be entirely adequate. More data will give you higher quality estimates. Even 6 points, carefully chosen, if the noise is low enough, will often be adequate. Only you know the quality of your data, and how much you have, as well as how accurately you need the model to be estimated.

If you have further questions, I can only suggest you send me an e-mail, with your data, and any other information you have on the problem.

John

15 Nov 2009 Alexandros alex

The function polyfitn will probably help me but I don't know how to install this function.
I downloaded the file and I pasted it in the toolbox subfolder of Program files/Matlab....
However when I run this
uv = rand(100,2);
w = sin(sum(uv,2));
p = polyfitn(uv,w,'u, v');
if exist('sympoly') == 2
  polyn2sympoly(p)
end
if exist('sym') == 2
  polyn2sym(p)
end

it returns
??? Undefined function or method 'polyfitn' for input arguments of type 'double'.

Error in ==> testxk at 3
p = polyfitn(uv,w,'u, v');

What's wrong?

16 Nov 2009 John D'Errico

If you are trying to place downloaded files into the Matlab/toolbox directories, this is generally wrong. Those directories are cached by matlab when it starts up, so any files that you put in those directories will not be seen until you restart matlab. Also, anytime you make any changes to files that you put in those directories, the changes will not be recognized by matlab.

Better is to create your own new (and separate) directory for downloaded files. Add that directory on your search path, using the pathtool function, or using addpath and then savepath.

16 Nov 2009 Alexandros alex

So if I want to run this script
uv = rand(100,2);
w = sin(sum(uv,2));
p = polyfitn(uv,w,'u, v');
if exist('sympoly') == 2
  polyn2sympoly(p)
end
if exist('sym') == 2
  polyn2sym(p)
end

do I have to "call" the script "polyfitn" with a specific command?
Generaly what should I do in order to run the above script while polyfitn is a custom script of a user?

16 Nov 2009 John D'Errico

As I explained the first time, you need to put polyfitn into a directory that is on your search path.

help addpath
help savepath
help pathtool

Your particular test problem gives me the results:

uv = rand(100,2);
w = sin(sum(uv,2));
p = polyfitn(uv,w,'u, v');

polyn2sympoly(p)
ans =
    0.76152*u + 0.73572*v

Nothing special needs be done except putting it in a directory where matlab can find it.

16 Dec 2009 Toby

Hello John,

Thank you for the more than useful polyfitn function. However, I do think I found a bug.

Please run the working example below:

__________________________________
clear all
close all

x = ...
    [1.15,1.1,1.05,1,0.91,1.15,1.1,1.05,1,0.91,1.15,1.1, ...
    1.05,1,0.91,1.15,1.1,1.05,1,0.91;];
y = ...
    [8,8,8,8,8,11,11,11,11,11,15,15,15,15,15,17,17,17,17,17;];
z = ...
    [10,7,5,3,2,12,8,6,4,3,12,9,7,5,3,12,9,7,5,3;];

fitdata = polyfitn...
    ([x(:),y(:)], z(:),'constant x^2 y^2 x*y x^3 y^3 x^4 y^4 ')
% fitdata = polyfitn...
% ([x(:),y(:)], z(:),'constant x^2 y^2 x*y x^3 y^3 x^4 y^4')

x_min = min(min(x));
x_max = max(max(x));
y_min = min(min(y));
y_max = max(max(y));

N=10;
L = linspace(x_min,x_max,N);
P = linspace(y_min,y_max,N);
[ l, p ] = meshgrid(L,P);
zz = polyvaln(fitdata,[l(:),p(:)])
zz = reshape(zz,length(L),length(P))
surf(p,l,zz)
__________________________________________

Run it again, but uncomment the commented part.
The first time the solution is not smooth, while the second time it is. The only difference is the last input string in the polyfitn.
I haven't tried it on different x,y,z inputs though.

Toby

17 Dec 2009 John D'Errico

Toby - you found a bug in the model parser. Repaired now. Thanks, John

12 Jan 2010 Toby  
12 Feb 2010 Hrishikesh Karvir

Great function, really easy to use! I am not sure how it compares to the new surface fitting toolbox from Matlab (R2009).

07 May 2010 Jose

John, help me please!
I need to know the straight line equation for two 3D points. For the 2D space I used work with the traditional polyfit function using a first order polynomial, but I dont kown how to perform this using your polyfitn function! Could you help me?

07 May 2010 John D'Errico

If you have data that falls on a straight line in three dimensions, then you need to use a tool designed to solve that problem. Polyfitn is not that tool.

I presume that you wish to find the line that passes through these points, assuming that each variable has noise on it. This is called a total least squares problem. The solution is commonly formulated using a singular value decomposition. That is, assume that your data is in an nx3 array, X, where each row of the array X corresponds to a single data point in three dimensions.

A line in N dimensions (here N = 3) is defined by the equation

   P(t) = P0 + P1*t

where P0 and P1 are vectors of length 3. Any point on that line is given by some value of the parameter t. This would be called a parametric form of the line. It has the advantage that is any "slopes" of the line might be zero or infinite, we never have a singularity.

How can we compute the vectors P0 and P1? P0 is simple. We simply compute the mean of our data.

   P0 = mean(X,1);

Note that P0 is not unique, as ANY point on the line would suffice to define a line. To estimate the line equation using the svd, the mean of your data is best. P1 is also easy enough to generate. It is the right singular vector that corresponds to the maximum singular value of the matrix with P0 subtracted off. Lets try it out...

Here is some data that falls along a straight line in three dimensions.

n = 100;
X = bsxfun(@plus,rand(n,1)*[2 3 5],randn(1,3)) + randn(n,3)/25;
plot3(X(:,1),X(:,2),X(:,3),'.')

P0 = mean(X,1)
P0 =
       2.3547 0.11682 0.85517

[U,S,V] = svd(bsxfun(@minus,X,P0),0);

diag(S)
ans =
       18.253
      0.40999
      0.35826

The straight line is a good fit when the maximum singular value is significantly larger than the other singular values. Here, that maximum singular value is roughly 44 times as large as the next larges singular value. The right singular vectors are the columns of V.

V
V =
        0.322 0.87967 -0.34999
        0.489 -0.47108 -0.73414
      0.81068 -0.065251 0.58185

The corresponding right singular vector is the first column of V. I'll transpose it to be consistent in shape with P0.

P1 = V(:,1).'
P1 =
        0.322 0.489 0.81068

Is P1 close to being correct? It turns out that we can scale P1 by any constant and it will be equally valid. I'll scale it here so the first element is 2. If we look back at the way I generated the data, it would originally have been [2 3 5] had there been no noise applied. How well did I do?

P1*(2/P1(1))
ans =
            2 3.0373 5.0353

I'd say that was pretty close.

30 Aug 2010 Martin Kaszynski

Great tool, especially in combination with "gapolyfitn".

Please login to add a comment or rating.
Updates
24 Feb 2006

Fixed a bug for when depvar is a row vector. Also cleaned up the documentation some more, and the demos. Added HTML.

01 Aug 2007

Trap for number of data points too few to estimate parameter variances. Also removed the .DS_store file from the zip.

08 Aug 2007

Bug fixed - linear 1-d models with no intercept requested, could still try to estimate the intercept.
Doc fix - added a better description of the standard form for a linear polynomial regression.

08 Aug 2007

Bug fixed - linear 1-d models with no intercept requested, could still try to estimate the intercept.
Doc fix - added a better description of the standard form for a linear polynomial regression.

26 Sep 2007

Repair a bug in polyn2sympoly. The result was always "correct", but the Coefficients field was transposed in shape. This prevented some sympoly operations from working properly.

20 Mar 2008

Repaired a problem with polyfitn, when a model with fewer independent variables than those supplied is estimated. Also repaired a glitch in the help for polyvaln.

17 Dec 2009

Repair the model parser for strings with a space at the end

Tag Activity for this File
Tag Applied By Date/Time
approximation John D'Errico 22 Oct 2008 08:16:19
interpolation John D'Errico 22 Oct 2008 08:16:19
polynomial John D'Errico 22 Oct 2008 08:16:19
multinomial John D'Errico 22 Oct 2008 08:16:19
linear regression John D'Errico 22 Oct 2008 08:16:19
modeling John D'Errico 22 Oct 2008 08:16:19
surface John D'Errico 22 Oct 2008 08:16:19
curve John D'Errico 22 Oct 2008 08:16:19
surface Matthew 03 Feb 2009 15:35:04
fit Thomas Clark 11 Aug 2009 09:11:55
coefficients Thomas Clark 11 Aug 2009 09:11:55
approximation Alea 28 Apr 2010 06:02:32

Contact us at files@mathworks.com