Beginners multithreading in fit functions

Hi all,
Apologies for what is probably an elementary question and for subjecting readers to what is probably offensively clumsy code. I have cobbled together a fit function to simultaneously fit surfaces to the 4 layers of 3D wave. These surfaces are defined by signal sources which have 6 parameters each and are a function of 2 independent variables, and the number of these sources can be defined by setting the length of the coef wave fed to the FitFuncMD function. All this to say that it would be nice to be able to use the multithreading abilities of FitFuncMD to speed things up, but I find that if I declare dipolexyz2() to be threadsafe, it runs quickly and without error but gives answers that are clearly different from when I run it as is.

Apart from overly verbose and possibly misguided coding, what am I misunderstanding about multithreading? Am I wrong in thinking that FitFuncMD should be able to use multiple processors if the called fit function uses thread safe operations/functions and doesn't change the waves that it's operating on (broadly speaking)? I don't see that I have broken those commandments. I do use a text wave for the fitting constraints, but it seemed that that was legal as long as the function that called FitFuncMD wasn't threaded. I must be misunderstanding something fundamental, but I don't know where to start looking.

What I believe is the offending code is as follows. There are two more functions that I don't reproduce here called in the primary fit function, but they are identical to dipolex_quad() except that the formula varies.
function dipolexyz2(w,xa,ya,za) : FitFunc
    wave w
    variable xa,ya,za
    variable sourceno, result=0
    for(sourceno=0; sourceno<numpnts(w)/6; sourceno+=1)
        if (za==0)
            result += dipolex_quad(w,xa,ya, sourceno)
        endif
        if (za==1)
            result += dipoley_quad(w,xa,ya, sourceno)
        endif
        if (za==2)
            result += dipolez_quad(w,xa,ya, sourceno)
        endif  
        if (za==3)
            result += sqrt((dipolex_quad(w,xa,ya, sourceno))^2+(dipoley_quad(w,xa,ya, sourceno))^2+(dipolez_quad(w,xa,ya, sourceno))^2)
        endif  
    endfor
   
    return result
end

function dipolex_quad(w,xa,ya, sourceno)
    wave w
    variable xa,ya, sourceno
    variable mx=w[sourceno*6],my=w[sourceno*6+1],mz=w[sourceno*6+2],xb=w[sourceno*6+3],yb=w[sourceno*6+4],h=w[sourceno*6+5]
    variable xij,yij,zij,rij
    variable bx_quad, delx, dely, theta, i1, i2
    nvar/z diam=root:diam, source_int=root:source_int, thick=root:thick, sensor=root:sensor
    if (!nvar_exists(source_int))
        variable/g root:diam=0, root:source_int=1, root:thick=0, root:sensor=0
    endif

    zij=h
    bx_quad=0
    switch(source_int)
        case 1:                 // Dipole (point) source
            xij=xa-xb
            yij=ya-yb
            rij=sqrt(xij^2 + yij^2 + zij^2)
// Formula in following line (and equivalents below) will vary for dipoley_quad and dipolez_quad
            bx_quad=3*xij*xij*mx/(4*pi*rij^5) - mx/(4*pi*rij^3) + 3*xij*yij*my/(4*pi*rij^5) + 3*(xij*h)*mz/(4*pi*rij^5)
            break
        case 2:                 // 7-point quadrature on a disk with radius = diam/2
            delx=0
            dely=0
            xij=(xa-delx)-xb
            yij=(ya-dely)-yb
            rij=sqrt(xij^2 + yij^2 + zij^2)
            bx_quad+=0.25*(3*xij*xij*mx/(4*pi*rij^5) - mx/(4*pi*rij^3) + 3*xij*yij*my/(4*pi*rij^5) + 3*(xij*h)*mz/(4*pi*rij^5))
            for (theta=0; theta<6; theta+=1)
                delx=cos(theta*pi/3)*sqrt(2/3)*(diam/2)
                dely=sin(theta*pi/3)*sqrt(2/3)*(diam/2)
                xij=(xa-delx)-xb
                yij=(ya-dely)-yb
                rij=sqrt(xij^2 + yij^2 + zij^2)
                bx_quad+=(0.125)*(3*xij*xij*mx/(4*pi*rij^5) - mx/(4*pi*rij^3) + 3*xij*yij*my/(4*pi*rij^5) + 3*(xij*h)*mz/(4*pi*rij^5))
            endfor
            break
        case 3:                 // 9-point quadrature on a square of size diam or cube with height thick
        case 4:
            make/o/n=3 weights = {5/9,8/9,5/9}, delta = {-sqrt(3/5), 0, sqrt(3/5)}
            weights/=2
            delta*=(diam/2)
            for (i1=0; i1<3; i1+=1) // Square plane or top layer of cube
                for (i2=0; i2<3; i2+=1)
                    xij=(xa-delta[i1])-xb
                    yij=(ya-delta[i2])-yb
                    rij=sqrt(xij^2 + yij^2 + zij^2)
                    bx_quad+=weights[i1]*weights[i2]*(3*xij*xij*mx/(4*pi*rij^5) - mx/(4*pi*rij^3) + 3*xij*yij*my/(4*pi*rij^5) + 3*(xij*h)*mz/(4*pi*rij^5)) 
                endfor
            endfor
    endswitch
end

Thanks for your help!
Nathan
So what you have is a fit function in *three* independent variables, not two, plus a few functions to implement sections of the work. I don't find your coding bad at all.

So let get this straight- you have a 3-dimensional wave, which you feed to FuncFitMD. If you add the Threadsafe keyword to the fit function (and the sub-functions, since a threadsafe function can't call something that's not threadsafe) then you get a different answer. What happens if you mark the function Threadsafe, but you use /NTHR=1 with FuncFitMD? Does that change the result? It should work just like the non-threadsafe version.

I see references to global variables in your dipolex_quad() function. Do they get written to during the fit? If so, you should be aware that threaded code doesn't guarantee a particular order of running. Writing and reading global variables can be bad, because you don't know when one thread accesses them in relation to other threads. If you're only reading them, I think it should be OK. You might check to make sure you are getting the expected values out of the globals.

In general, you should avoid global variables. They produce dependencies in your code that can be difficult to track down. From the names I'm guessing that you're trying to fit antenna models to measurements, and the globals are feeding in information about the antenna configuration. That shouldn't change during the fit, so you should be OK, as long as you are getting the right values of the globals inside the functions in the first place.

Are the runs really different, or just a little bit different? Are you careful to use the same initial guesses for both cases?

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
Thanks for getting back to me. I regretfully must say that the difference I observed before was probably due to the test data I was using. Very sorry. I managed to confuse myself and eventually came to the conclusion that the switch-case clauses weren't working properly during multithreading, but I since realise that I created some circular reasoning when I created the synthetic data I was testing with.

That said, I do find that the multithreading fits are more stable when fitting multiple sets of parameters (sources). I consistently get singular matrix errors for at least some initial values using either /NTHR=1 or non multithread-enabled functions, while enabling multiple threads does converge to a reasonable answer. The starting conditions are always exactly the same as the initial values of the coef wave are copied from 2 other waves which are not subsequently changed. I don't know if the multithread fit is expected to follow different routes to the minima, but I can consistently reproduce it.

Thanks, too, for the advice on global variables. I suspect I'm not using best practice as far as interacting with the user but implemented them in the hope of preventing errors due to calling unset local variables. I shall keep your suggestions in mind in future.

Thanks again, and sorry for getting you to take the time on what was, in the end, muddle-headed thinking. If you want more details, however, about fits that work with multithread functions and not with single thread, I can provide them.

Nathan
If /NTHR=1 works differently from /NTHR=0 I most definitely want to know about it because that would be a bug (in most cases). If that is really the situation please create an Igor experiment file that has your fitting function and a data set illustrating the problem and send it to support@wavemetrics.com.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
Nathan did as I asked and sent an example that I could run. I should have noticed before that his dipolex_quad() function depends on global variables to select to communicate information to the fit function. Global variables cause big problems in threaded code. I advised Nathan to replace the global variables with fit coefficients that are held. It is a bit kludgey, but it is a good way to communicate constants to a fit function.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com