How to make a batch fit program run on multiple cores?

sleepingawake86
Posts: 34
Joined: 2010-06-28
Location: United States

Hello,
Below I have pasted code for a batch fitting program. It fits a bunch of waves to a Gaussian. I am analyzing large data sets, 500+ waves each over 200 pts, and am running into the program taking like 10mins to complete. I am using an intel dual core i5 processor and I am wondering if there is anyway to work the code to run on multiple cores so that it finishes faster? Perhaps some way to split the waves in half and run each set on a different core in parallel?

```function BatchGauss(Nmax)
variable Nmax
variable ii
wave W_coef,W_sigma

make /N=(Nmax) W_A
make /N=(Nmax) W_Aerr
make /N=(Nmax) W_width
make /N=(Nmax) W_widtherr
make /N=(Nmax) W_y0
make /N=(Nmax) W_y0err

for(ii=0;ii<Nmax;ii+=1)

string loopwave="wave"+num2str(ii)

CurveFit/M=2/W=0 gauss, \$loopwave/D

W_A[ii]=W_coef[1]
W_Aerr[ii]=W_sigma[1]
W_width[ii]=W_coef[0]
W_widtherr[ii]=W_sigma[0]
W_y0[ii]=W_coef[3]
W_y0err[ii]=W_sigma[3]

endfor

end```

ChrLie
Posts: 92
Joined: 2009-01-06
Location: Switzerland

Did you try how much improvement you get using the /NTHR = 0 flag on CurveFit?

thomas_braun
Posts: 94
Joined: 2009-10-07
Location: Germany

Hi,

I guess there is not so much profit from doing each curve fit in parallel with only 200 points.
The fact that you have 500+ fits to perform suggest that you want to do multiple curvefits in parallel, each on its own core.

So you might want to have a look at

`DisplayHelpTopic "ThreadSafe Functions and Multitasking"`

which is a bit involved.

For your current appraoch I would also add the flags "/N/Q" to CurveFit.
Have you checked that single precision coefficients are precious enough?
If not you might want to add "/D" to the Make calls to use double precision.

Hope that helps,
Thomas

Posts: 836
Joined: 2007-06-29
Location: United States

Yeah, especially with a built-in fit function you won't get any help from multiple cores on a single fit. In fact, the overhead of threading would probably make it take longer.

The solution would be to do multiple fits in parallel. We ship an example experiment that does just that. You can probably adapt it to your situation fairly easily. To view it, select File->Example Experiments->Curve Fitting->MultipleFitsInThreads.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com

sleepingawake86
Posts: 34
Joined: 2010-06-28
Location: United States

I used the recommendations to use multiple threads and I've been trying to get it to work, but keep getting errors about the thread index/ID.

Exact error is: "while executing a user function, the following error occurred: Invalid Thread Group ID or index"

Here is the code

```ThreadSafe Function lingaussFit(coefwave,data)
wave data, coefwave

FuncFit/NTHR=0 linguass coefwave  data /D

End

function BatchlinGauss(Nmax)
variable Nmax
variable ii,jj
wave W_coef1,W_sigma1
wave W_coef2,W_sigma2
Variable Nmid=Nmax/2   //split total number in half to run on separate cores

//Create batch coeficent waves
make /N=(Nmax) W_A
make /N=(Nmax) W_Aerr
make /N=(Nmax) W_width
make /N=(Nmax) W_widtherr
make /N=(Nmax) W_y0
make /N=(Nmax) W_y0err
make /N=(Nmax) W_B
make /N=(Nmax) W_Berr
make /N=(Nmax) W_c
make /N=(Nmax) W_cerr

for(ii=0;ii<Nmid;ii+=1)

jj=ii+Nmid

string loopwave1="wave"+num2str(ii)
string loopwave2="wave"+num2str(jj)

//Parallel on different cores

W_A[ii]=W_coef1[0]
W_Aerr[ii]=W_sigma1[0]
W_width[ii]=W_coef1[3]
W_widtherr[ii]=W_sigma1[3]
W_y0[ii]=W_coef1[4]
W_y0err[ii]=W_sigma1[4]
W_B[ii]=W_coef1[1]
W_Berr[ii]=W_sigma1[1]
W_c[ii]=W_coef1[2]
W_cerr[ii]=W_sigma1[2]

W_A[jj]=W_coef2[0]
W_Aerr[jj]=W_sigma2[0]
W_width[jj]=W_coef2[3]
W_widtherr[jj]=W_sigma2[3]
W_y0[jj]=W_coef2[4]
W_y0err[jj]=W_sigma2[4]
W_B[jj]=W_coef2[1]
W_Berr[jj]=W_sigma2[1]
W_c[jj]=W_coef2[2]
W_cerr[jj]=W_sigma2[2]

endfor

end

Function linguass(w,x) : FitFunc
Wave w
Variable x

//CurveFitDialog/ These comments were created by the Curve Fitting dialog. Altering them will
//CurveFitDialog/ make the function less convenient to work with in the Curve Fitting dialog.
//CurveFitDialog/ Equation:
//CurveFitDialog/ f(x) = a*x+b*exp(-(x-c)^2/d^2)+y_0
//CurveFitDialog/ End of Equation
//CurveFitDialog/ Independent Variables 1
//CurveFitDialog/ x
//CurveFitDialog/ Coefficients 5
//CurveFitDialog/ w[0] = a
//CurveFitDialog/ w[1] = b
//CurveFitDialog/ w[2] = c
//CurveFitDialog/ w[3] = d
//CurveFitDialog/ w[4] = y_0

return w[0]*x+w[1]*exp(-(x-w[2])^2/w[3]^2)+w[4]
End

<igor>```

Posts: 906
Joined: 2007-06-21
Location: United States

I think you need to call ThreadGroupWait at the end of the for loop, just before the endfor. Otherwise you will be trying to start a thread while it is still running from the previous iteration of the loop.

Something like this:

```	// Wait for all threads to finish before starting another batch of threads
do

741
Posts: 187
Joined: 2009-03-22
Location: Belgium

Here's how I would do this (note - untested)

```Function MultiThreadIt(Nmax)
variable nMax

Make /N=(Nmax) /D /O W_A, W_Aerr, W_width, W_widtherr, W_y0, w_y0err

Make /N=(Nmax) /FREE W_Dummy
// W_Dummy exists only so the MultiThread keyword can be used.
// since MultiThread expects a wave assignment, we give it one,
// even though its contents will be discarded

// pack the data waves into a 'wave wave'. I use a wave wave simply
// because I like it, but it's entirely optional
Make /WAVE /FREE W_DataWaves
variable i
for (i = 0; i < Nmax; i+=1)
W_DataWaves[i] = \$("wave" + num2istr(i))
endfor

MultiThread W_Dummy = RunAFit(W_DataWaves, W_A, W_Aerr, W_width, W_widtherr, W_y0, w_y0err, p)
End

ThreadSafe Function RunAFit(dataWaves, W_A, W_Aerr, W_width, W_widtherr, W_y0, w_y0err, index)
wave /WAVE dataWaves
wave W_A, W_Aerr, W_width, W_widtherr, W_y0, w_y0err
variable index

wave thisData = dataWaves[index]
CurveFit/M=2/W=0 gauss, thisData /D

wave W_coef
wave W_sigma

W_A[index]=W_coef[1]
W_Aerr[index]=W_sigma[1]
W_width[index]=W_coef[0]
W_widtherr[index]=W_sigma[0]
W_y0[index]=W_coef[3]
W_y0err[index]=W_sigma[3]

return 0    // make the multithread wave assignment happy
End```

[ last edited June 1, 2012 - 12:23 ]
sleepingawake86
Posts: 34
Joined: 2010-06-28
Location: United States

Thank you all for your comments. I have gotten the program to work with the threads fairly well, but now am having an issue with the error wave.

I have a different coefficient wave for each thread but the fits try to write to the same error wave and I keep getting problems from that.

Is there a way to specify a different error wave when fitting? I know I can specify the coefficient wave, but the error wave I cant figure out.

Posts: 906
Joined: 2007-06-21
Location: United States

You specify the weighting wave using the /W flag. From the CurveFit help:

Quote:
/W=wghtwaveName: wghtwaveName contains weighting values applied during the fit, and must have the same length as waveName. These weighting values can be either the reciprocal of the standard errors, or the standard errors. See the /I parameter above for details.

But I suspect by "error wave" you mean "destination wave". For that you need /D=destwaveName.

Or perhaps you mean "residual wave". For that you need /R=residwaveName.

[ last edited May 31, 2012 - 18:16 ]
Posts: 836
Joined: 2007-06-29
Location: United States

I expect he means the W_sigma wave. There is no option to write that information to a different wave. Under ordinary circumstances, I would tell you to copy W_sigma to a new wave after the fit is done in order to preserve it. I confess that I haven't thought about implications for curve fitting.

I *think* that if you do curve fits in separate threads using the (cumbersome) thread group coding, then the curve fit will take place in its own virtual data folder, and W_sigma will be distinct for each thread. I'm not sure how this applies to using the Multithread keyword, though. I would try adding a sigma wave to the input parameters of 741's RunAFit function, and inside that function copy W_sigma to that wave. To access W_sigma, you must add *after the call to CurveFit* a WAVE statement to connect with W_sigma.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com

Posts: 906
Joined: 2007-06-21
Location: United States

Quote:
I would try adding a sigma wave to the input parameters of 741's RunAFit function, and inside that function copy W_sigma to that wave.

It is permitted to write to a thread worker input function's parameter wave but you can not redimension it.

It's up to the programmer to make sure that each thread gets a separate wave to write to.

I don't see how you can do this with Multithread.

741
Posts: 187
Joined: 2009-03-22
Location: Belgium

Just a note that I corrected some typos in my code sample above. It now actually compiles.

hrodstein wrote:
I don't see how you can do this with Multithread.

For W_sigma RunAFit() would either have to passed a 2D wave so that each iteration can access its own row or column in the wave, or it could take a wave wave. I'd probably use a 2D wave.

[ last edited June 1, 2012 - 12:27 ]
mtaylor
Posts: 88
Joined: 2010-04-13
Location: United States

As a quick 'n' dirty workaround, I simply load multiple instances of Igor and ran each fit in the batch in its own instance. This works quite well unless you have a larger batch than you have cores, at that point, it's advisable to use the multithreader project here: http://www.igorexchange.com/project/multithreader , which will run your large batch a few at a time in parallel.

sleepingawake86
Posts: 34
Joined: 2010-06-28
Location: United States

So using all the suggestions I got code that is "working". What I mean is that when I run the code using a small sample set(10-50 waves) the code executes perfectly with no problems. Every time I try to run it on the larger set(500+) that I need it for igor closes to desktop with no warning. It doesn't freeze or that, it just closes. If it froze up I would just assume I don't have enough free memory, but it doesn't do that, just closes to the desktop.

I am running the latest version of igor on a Macbook pro OS X 10.7, 2.8ghz i7, with 4GB of ram.

Posts: 563
Joined: 2007-03-01
Location: United States

sleepingawake86 wrote:
So using all the suggestions I got code that is "working". What I mean is that when I run the code using a small sample set(10-50 waves) the code executes perfectly with no problems. Every time I try to run it on the larger set(500+) that I need it for igor closes to desktop with no warning. It doesn't freeze or that, it just closes. If it froze up I would just assume I don't have enough free memory, but it doesn't do that, just closes to the desktop.

If you can reproduce the crash, please email support@wavemetrics.com with detailed instructions we need to reproduce the crash and any Igor experiments, procedures, etc. we need to follow your directions. If you can't reproduce the crash, please at least send us the Macintosh crash log. This might give us some clues as to what is happening.

Given that you are using Igor's multithreading features, you must follow the "rules" with regards to what you do with data in threads, etc. Without at least seeing your code it is impossible to help you solve your problem.

Also, you say you are using the latest version of Igor. If you are not using the nightly build available at http://www.wavemetrics.net/Downloads/latest/ then you are not actually using the latest version. If necessary please download the latest build and check that you still get a crash. We have fixed many bugs since the latest official release of Igor (6.22A).