Weighting Data in curve fitting reduces uncertainty in coefficients

Hello,

I have a general theory question regarding uncertainty estimations in Igor Pro.

When I fit a user-defined function with and without weighting, I get smaller uncertainties on the coefficients for the weighted fit than for the unweighted one.

It is my understanding that adding weighting to the y-data is equivalent to changing the assumption that you know your y-data perfectly to the assumption that there is some uncertainty in your y-data. So why would increasing the uncertainty in your y-data make the curve fit better? What am I missing?

Thanks,

Eric
Perhaps a simple example will help. Suppose that you have a data set with one "outlier" data point and you fit without weighting. All points are given equal weight in the chi-squared minimization, so the one outlier will skew your results. Now weight that outlier as being highly uncertain (having a high variance). It then counts for less in the overall fit, and your results are now "tighter". Finally, remove that outlier, replace it with a point that is exactly where you would expect it to be, remove all weighting, and re-fit. Your results will likely be tighter still.
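
The three cases in this example can be tried numerically. The following is only a minimal sketch in plain Python, not Igor code: the straight-line model, the made-up data, the outlier offset of 15, and the helper name weighted_line are all assumptions chosen for illustration. It looks only at the fitted coefficients, not at their reported uncertainties.

```python
def weighted_line(x, y, sigma):
    """Return (intercept, slope) minimizing sum(((y - a - b*x) / sigma)**2)."""
    w = [1.0 / s**2 for s in sigma]            # weight = 1 / variance
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    num = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    den = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = num / den
    return ybar - slope * xbar, slope

# Hypothetical data: an exact line y = 1 + 2x with the last point pushed up by 15.
x = [float(i) for i in range(10)]
y_true = [1.0 + 2.0 * xi for xi in x]
y_outlier = y_true[:-1] + [y_true[-1] + 15.0]

equal = [1.0] * len(x)                 # every point trusted equally
downweighted = equal[:-1] + [100.0]    # outlier declared highly uncertain

fit_equal = weighted_line(x, y_outlier, equal)
fit_down = weighted_line(x, y_outlier, downweighted)
fit_clean = weighted_line(x, y_true, equal)

for label, fit in [("equal weights, outlier present", fit_equal),
                   ("outlier downweighted", fit_down),
                   ("outlier replaced, equal weights", fit_clean)]:
    print(f"{label}: intercept={fit[0]:.4f}, slope={fit[1]:.4f}")
```

With equal weights the single outlier pulls the slope to roughly 2.8 instead of 2; downweighting it, or replacing it with the expected value, brings the fit back to the underlying line.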

--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAHuntsville
eschiesser wrote:
It is my understanding that adding weighting to the y-data is equivalent to changing the assumption that you know your y-data perfectly to the assumption that there is some uncertainty in your y-data.


Nope. If you don't specify weights then Igor will just use some default weight for every data point, typically 1.0.
Quote:
It is my understanding that adding weighting to the y-data is equivalent to changing the assumption that you know your y-data perfectly to the assumption that there is some uncertainty in your y-data. So why would increasing the uncertainty in your y-data make the curve fit better? What am I missing?

741 is correct, but not complete. When you don't provide a weight wave, the fit assumes that the *model* is correct (that is, it reflects the unknown "perfect" data correctly). It also assumes that the differences between the data and the model (the residuals) are samples from an unknown Gaussian distribution with zero mean and the same variance for every data point. That variance is estimated from the residuals, and the reported coefficient sigmas are based on that estimate.

If you provide weighting, you are saying that you actually have a priori information about your measurement errors. It also allows you to tell the fit that your measurement errors do not have uniform variance. In this case, the sigmas are computed based on your weights. The chi-square value then gives you a way to compute a statistic that tells you how good the model is: it is assumed that "excess" residuals beyond what is predicted by the weighting are a result of a model that doesn't completely fit the underlying "perfect" data.
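
In symbols, the two cases can be sketched with the standard least-squares expressions. The notation here is an assumption and does not come from the thread: N data points, p fit coefficients with best-fit values \hat{a}, model f, Jacobian J of the model with respect to the coefficients at the best fit, and supplied standard deviations \sigma_i. These are textbook forms, not a statement of what Igor computes internally.

```latex
% Unweighted fit: the noise variance is estimated from the residuals
\hat{\sigma}^2 = \frac{1}{N - p} \sum_{i=1}^{N} \bigl( y_i - f(x_i; \hat{a}) \bigr)^2 ,
\qquad
\mathrm{Cov}(\hat{a}) = \hat{\sigma}^2 \, (J^{T} J)^{-1}

% Weighted fit: the \sigma_i are supplied, so the covariance ignores the size of the residuals
\chi^2 = \sum_{i=1}^{N} \left( \frac{y_i - f(x_i; \hat{a})}{\sigma_i} \right)^2 ,
\qquad
\mathrm{Cov}(\hat{a}) = (J^{T} W J)^{-1} ,
\quad
W = \mathrm{diag}\!\left( 1 / \sigma_i^2 \right)

% Consistency check: close to 1 when the model and the supplied \sigma_i agree with the data
\chi^2_\nu = \frac{\chi^2}{N - p}
```

A reduced chi-square well above 1 means the residuals are larger than the supplied \sigma_i predict, which is the "excess" residual situation described above.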

Apparently, the weights you are supplying predict residuals smaller than the actual residuals, an indication that the model doesn't completely fit the shape of the data, or that you have underestimated the actual measurement errors.
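
A small numerical sketch can make this concrete. It is plain Python rather than Igor code, and everything in it is an assumption for illustration: a straight-line model, a fixed made-up scatter of about 0.4 around the line, a claimed measurement error of 0.1, and the helper name line_fit. The unweighted branch scales the coefficient errors by the residual scatter; the weighted branch takes them from the supplied errors alone.

```python
import math

def line_fit(x, y, sigma=None):
    """Least-squares fit of y = a + b*x.

    sigma=None : unit weights; coefficient errors are scaled by the
                 variance estimated from the residuals.
    sigma=list : weights 1/sigma**2; coefficient errors come from the
                 supplied sigmas only.
    Returns (a, b, sigma_a, sigma_b, chi_square).
    """
    n = len(x)
    w = [1.0] * n if sigma is None else [1.0 / s**2 for s in sigma]
    S = sum(w)
    Sx = sum(wi * xi for wi, xi in zip(w, x))
    Sy = sum(wi * yi for wi, yi in zip(w, y))
    Sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    Sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = S * Sxx - Sx**2
    a = (Sxx * Sy - Sx * Sxy) / delta
    b = (S * Sxy - Sx * Sy) / delta
    chi2 = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    var_a, var_b = Sxx / delta, S / delta
    if sigma is None:
        scale = chi2 / (n - 2)      # residual variance, 2 fitted coefficients
        var_a, var_b = var_a * scale, var_b * scale
    return a, b, math.sqrt(var_a), math.sqrt(var_b), chi2

# Hypothetical data: line 2 + 3x plus a fixed, made-up scatter.
x = [float(i) for i in range(10)]
noise = [0.5, -0.4, 0.3, -0.6, 0.2, 0.4, -0.3, 0.6, -0.5, -0.2]
y = [2.0 + 3.0 * xi + ni for xi, ni in zip(x, noise)]

unweighted = line_fit(x, y)
claimed_sigma = [0.1] * len(x)      # smaller than the actual scatter
weighted = line_fit(x, y, claimed_sigma)
reduced_chi2 = weighted[4] / (len(x) - 2)

print(f"unweighted slope error: {unweighted[3]:.4f}")
print(f"weighted slope error:   {weighted[3]:.4f}")
print(f"reduced chi-square:     {reduced_chi2:.2f}")
```

In this sketch the weighted fit reports the smaller coefficient errors, and the reduced chi-square comes out well above 1. For uniform weights the ratio of the two slope errors equals the square root of the reduced chi-square, so the smaller weighted errors reflect the optimistic claimed error rather than a better fit.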

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com