Extract one point from many waves

Hello!

Say I have 300 2D waves, each of dimensions 400x500, called wave_0, wave_1...wave_299. I would like to create a 1D wave with 300 points (one point for each of the 300 waves) consisting of the value at a single [i][j] coordinate for each wave. So: beamwave[] = wave_p[i][j], where p is a point in beamwave. I would like 400x500=200000 waves of length 300 - one wave for each coordinate (although they don't necessarily all have to exist at the same time - one at time may be ok). Normally I initially combine all 2D waves into a 3D wave of dimensions 400x500x300 and extract the beams at each coordinate point-by-point. I then process each beam and store the results in a new 400x500x300 wave. This generally works ok except that with a lot data I get out-of-memory errors...and also this way is not that fast.

I can reduce memory problems by, say, overwriting the 3D wave... or by not building a 3D wave and instead getting each point one wave at a time, which I guess avoids memory fragmentation or something... For example, the following works but is too slow, and still requires one large 3D wave:

function tooslow()

variable i, j, k
make/n=300 beamwave
make/n=(400,500,300) testout

        for(i=0;i<400;i+=1)
            for(j=0;j<500;j+=1)
                for(k=0;k<300;k+=1)
                    wave blah=$("wave_"+num2str(k))
                    beamwave[k] = blah[i][j]
                endfor
                dosomethingto(beamwave)
                testout[i][j][]=beamwave[r]
            endfor
        endfor
end


Is there a better way? Thanks!
Your problem has two sides:
1) Limited memory.
2) Performance.

I think the best way to increase the performance of your code is to combine the 300 input waves into a single 3D input wave, that can then be manipulated using faster primitives such as ImageTransform or MatrixOP. But the payoff is that you need to have sufficient memory to hold a copy of the input data.

So I don't have a good answer to problem 1. I think you could look into Igor64 if you're using Windows and have sufficient RAM in your computer to make it worthwhile. However, note that you could do everything in-place if you could live with the reduced performance.

Otherwise, here is what I would to to increase the performance. First combine the input waves into a single 3D wave:
Function CombinePlanes(inputBaseName, startIndex, endIndex)
    string inputBaseName
    variable startIndex, endIndex
   
    variable nPlanes = endIndex - startIndex + 1
   
    Make /O/N=(400, 500, nPlanes)/D M_CombinedInput
    // you can conserve memory here by not adding /D if you don't need
    // the precision
   
    variable i
    for(i = startIndex; i <= endIndex; i+=1)
        wave blah=$("wave_"+num2istr(i))   
        ImageTransform /P=(i) /D=blah setPlane, M_CombinedInput
        KillWaves /Z blah   // warning: kills input wave. Will free memory
                            // and we have a copy now (but verify that the code is correct!)
    endfor
End


Then we have the loop that processes the beams:
Function ProcessBeams(inputWave)
    wave inputWave
   
    variable nRows = DimSize(inputWave, 0)
    variable nCols = DimSize(inputWave, 1)
    variable nPlanes = DimSize(inputWave, 2)
   
    variable i, j
    for (j = 0; j < nCols; j+=1)
        for (i = 0; i < nRows; i+=1)
            ImageTransform /BEAM={(i),(j)} getBeam, inputWave
            wave W_Beam
           
            // do something to beam here
           
            // store the result. This overwrites the input,
            // if you have enough memory you can fill a copy
            // instead
            ImageTransform /BEAM={(i),(j)} setBeam, imageWave
        endfor
    endfor
End
One might also ask why you need the 20000 1D waves- depending on what you want to do with the data, I wonder if you need all of them, or if perhaps the 3D wave might be useable without going through the step of extracting all those 1D waves.

So- what is the next step?

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
Thanks for your quick replies!

I don't need all 200000 waves at the same time. John, do you mean there could be a way to perform my operations on a 3D wave without point-by-point copying to a 1D wave? That would also be useful!

I've been doing something similar to 741's suggestion: I build a 3D wave from the original data, get each 1D wave, perform some operation like an FFT or interpolation, then store the result in a (usually complex) 3D wave, overwriting the 1D wave each time. (I can overwrite the original 3D wave too, but it would be nicer to be able to keep it.)

This all works, but if I use 3D waves I generally have to interpolate the data to fit it in memory. So I tried using a big pile of 2D waves instead of a 3D wave, similar to my example above. I found that this can reduce memory problems and interpolation, but it's obviously a lot slower to extract the 1D waves.

In my example I used a 3D wave to store the result, but ultimately I was also wondering about using another big pile of 2D output waves, (or indeed 200000 1D waves, if that's possible!) in order to avoid 3D waves altogether.
A 400x500x300 wave works out to about 640 MB with double precision. That's not bad; I've done much worse. You're not even approaching the theoretical 4 GB limit available to a 32-bit process. I'm surprised that you cannot fit two of these into memory, but it depends on two aspects: how much RAM is available, and how fragmented it is.

The latter is probably why you can make 300 400x500 waves but cannot make a single 400x500x300 wave.

If I was in your shoes, I would save myself trouble by looking into getting more free memory. Beyond simply adding more RAM, also read DisplayHelpTopic "Memory Management" to see if Windows is limiting you to a subset of the theoretical 4GB address space.

But if you're forced to make do with what you have, here is what comes to mind (note – advanced topics):
Storing individual planes in separate waves works around fragmentation. However, in addition to being intrinsically slower than working with 3D waves, the current Igor also isn't particularly friendly to having lots of waves. Wave lookups are pretty slow, and Igor will choke if you have lots of waves (so forget about the 20000 waves).

But there's an exception to this – if you use wave waves, that contain only references to free waves, you essentially avoid the wave lookup overhead, while also making it possible to have tons of waves, simply because free waves require less housekeeping by Igor.

The idea here is that you could have a 300-point wave wave that contains refs to the 300 2D waves. You would then pass and use this wave in your calculations instead of the $"wave_" + num2istr(i) lookups. 300 waves isn't too bad – they could even be regular waves. This will save you the wave lookup overhead, but don't expect this to make a night-and-day difference.

In much the same way you could conceivably have 20000 1D waves contained in a 20000-point wave wave. But only if they are free waves!

See DisplayHelpTopic "Free Waves" and DisplayHelpTopic "Wave References".

After that there's always multithreading. Your problem looks pretty well suited to that. But this post is long enough as it is.

cammy wrote:

I don't need all 200000 waves at the same time. John, do you mean there could be a way to perform my operations on a 3D wave without point-by-point copying to a 1D wave? That would also be useful!

Well, it depends what you want to do with it. Curve fitting, for instance, can use any arbitrary subrange of a wave as input, and you can make a graph with any arbitrary sub-set of a wave. But whether or not you can use a portion of a wave as input to an analysis operation depends on which operation you're using.

I like 741's suggestion of wave waves.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com