Linear Least Squares Fitting with uncertainty

Hello,

I have a set of known values (X) and a set of measured values (Y), plus uncertainties (+/- dY).

How do I do a least-squares linear fit while taking the uncertainties in Y into account?

Regards,
Ali
The uncertainties should be in the form of Standard Error. You simply need to select the wave containing the uncertainties as the weighting wave for the fit. If you are using Quick Fit, apply that wave as the error bar wave in the graph of the data; you will also need to select the Quick Fit menu item that uses the error bars as the weighting wave. If you are using the Curve Fit dialog, go to the Data Options tab and select the uncertainty wave there as the weighting wave.
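For reference, the command-line equivalent would be something like this (a sketch only; xdata, ydata, and yerr are placeholder names for your data and uncertainty waves):

CurveFit line, ydata /X=xdata /W=yerr /I=1 /D	// /I=1 tells CurveFit the weight wave holds standard deviations

If the data are already in a graph, attaching the uncertainties as error bars is what lets Quick Fit pick them up:

ErrorBars ydata Y,wave=(yerr,yerr)	// here ydata is the trace name in the top graph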

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
Thanks for your answer. I am not sure if this is what I want.

I have 9 measurements of spectra, each one in the form x +/- dx, e.g. 4023 +/- 0.23 Angstrom.

The standard error (= stdev/sqrt(n)) does not take the dx into account. It simply takes the standard deviation and divides it by the square root of the number of measurements. I think what I have is more like "error bars" on each measurement. I would like these error bars to be included in the linear fitting algorithm.
If you have errors in both X and Y, then you need ODR fitting. To learn more, execute this Igor command:

DisplayHelpTopic "Errors in Variables: Orthogonal Distance Regression"

In ODR fitting, the weighting waves are very important.
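As a rough sketch of what that looks like on the command line (placeholder wave names again, and please verify the flags and the /ODR mode value against that help topic rather than taking my line as gospel):

// xdata, ydata, xerr, yerr are placeholder wave names
// /ODR=2 is intended to request orthogonal distance regression and /XW to supply the X weighting wave;
// check the help topic above for the exact mode values and flags
CurveFit/ODR=2 line, ydata /X=xdata /W=yerr /I=1 /XW=xerr /D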

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
John, I think your first answer was more on the nose... it seems like Ali's second post was using dx as a general stand-in for any error, not an indication of uncertainty in the independent variable. That said, from what I can tell:

Ali, I think you're confusing terminology here. The standard error formula you're citing is a decent way to report error bars for data where the number you're interested in is the average of the measurements. Say you're trying to measure a single number (forget fitting functions for a second) and you take N identical measurements and get N different values. The measured standard deviation is a measurement (itself subject to measurement error) of the uncertainty of a *single* measurement. But the measured mean of those N measurements has an uncertainty sqrt(N) smaller than the uncertainty of each individual measurement, so the standard error makes for a better error bar than the standard deviation. Better in that it improves (as it should) as you take more data points, which is especially important if you have different data reported by averaging different numbers of measurements.

Even better (generally agreed) for error bars is to calculate a proper confidence interval using the Student's t distribution and a selected confidence level (90%, 95%, and 99% are all common choices), but the Student's t factor approaches a constant value for large N, so the standard error is not a terrible stand-in. None of that, however, is what you actually need to do for least-squares fitting.
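In Igor terms, the distinction looks like this (a quick sketch; repeatedMeas is a placeholder for a wave holding repeated measurements of one quantity):

WaveStats/Q repeatedMeas	// repeatedMeas is a placeholder wave name
Print "std dev (uncertainty of one measurement):", V_sdev
Print "standard error of the mean:", V_sdev/sqrt(V_npnts)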

Back to least squares fitting: Igor's algorithm is looking for your best estimate of the inherent uncertainty on each Y point that you input. If you don't have any a priori knowledge of that uncertainty, you can estimate it from the individual measurements. If you take 10 measurements of Y at the SAME X value and enter all 10 data points into the fit as X,Y pairs (that's how I would do it... the least squares fitting algorithm doesn't care if you repeat X values), the uncertainty on each data point should be the standard deviation: your estimate of the uncertainty for a single measurement of Y. The improved mean of those 10 measurements is weighted more heavily simply because there are 10 input data points rather than 1.
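A sketch of that first approach, with placeholder wave names: xAll and yAll hold every individual measurement (repeated X values and all), and ySigma holds your standard-deviation estimate for each point:

CurveFit line, yAll /X=xAll /W=ySigma /I=1 /D	// xAll, yAll, ySigma are placeholders; /I=1: weights are standard deviations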

Alternatively, you can pre-average the Y values collected at the same X value and enter them as single pairs of X, mean(Y). Then the uncertainty you enter should be the standard error of the mean: you have less uncertainty in the mean of N Y measurements than in a single Y measurement. The improved mean is weighted more heavily because its standard error goes down with an increasing number of measurements.
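And a sketch of the pre-averaged version, again with placeholder names: xUnique holds each distinct X once, yMean the averaged Y values, and ySEM the standard error of each mean:

CurveFit line, yMean /X=xUnique /W=ySEM /I=1 /D	// xUnique, yMean, ySEM are placeholders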

Either method should give about the same fit result. It's also possible you have come up with uncertainty estimates by some entirely different route than the individual data point measurements: from theory, or from previous characterizations of the instrument. The assumption is still that you're inputting the uncertainty on each individual data point used in the fit, and if you've estimated correctly you should get a reduced chi-square value close to 1. If your uncertainties are all off by a constant scaling factor, you'll still get the same fit results, but the reported uncertainties on the fit parameters will be different and the reduced chi-square will be significantly larger or smaller than 1.
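If you want to check that, CurveFit leaves V_chisq and V_npnts behind, so for a two-coefficient line fit the reduced chi-square is just the following bit of bookkeeping (not something the fit reports directly, as far as I recall):

Print "reduced chi-square:", V_chisq/(V_npnts - 2)	// 2 = number of fit coefficients for a line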
Thanks, Ian. I second everything you said there.

I might add also that doing the fit with all the data as XY pairs will give the same answer as doing the fit using pairs of averaged Y values at each X.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
Ian - Thanks for the great insight!

--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAHuntsville
johnweeks wrote:

I might add also that doing the fit with all the data as XY pairs will give the same answer as doing the fit using pairs of averaged Y values at each X.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com


One advantage of averaging Y values (that share neighboring X values) is that if the distribution of individual errors is not Gaussian, then least-squares fitting is not correct (one should do maximum likelihood, for example). However, the Central Limit Theorem implies that the average TENDS to a Gaussian as the number N of averaged points goes to infinity. If the original distribution is not too pathological, the convergence can be reasonably rapid. So fitting to averaged values is more likely to put you in a limit where least-squares fits are valid and, as a bonus, you get an estimate (the SEM) of the weighting for each point.
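As a toy illustration of that convergence (everything here is made up: exponentially distributed "errors" and 25-point averages), one could paste something like this into a procedure window and compare the skewness of single draws with the skewness of the means:

Function DemoCLTAveraging()
	// Compare the skewness of strongly non-Gaussian single "measurements"
	// with the skewness of 25-point averages drawn from the same distribution.
	Make/O/N=1000 singleMeas = expnoise(1)	// exponentially distributed, strongly skewed
	Make/O/N=1000 avgMeas
	Variable i
	for (i = 0; i < numpnts(avgMeas); i += 1)
		Make/FREE/N=25 tmp = expnoise(1)
		avgMeas[i] = mean(tmp)
	endfor
	WaveStats/Q singleMeas
	Print "skewness of single measurements:", V_skew
	WaveStats/Q avgMeas
	Print "skewness of 25-point means:", V_skew
End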

Of course, if the X values are too different, then you are averaging points whose means vary too much, and that can smooth out features in the data.


John Bechhoefer
Department of Physics
Simon Fraser University
Burnaby, BC, Canada