Multithread ImageLoad and ImageRegistration

I have a particularly large dataset consisting of 10,000 32-bit TIFF images where each is about 1000x1000 in dimension. The task is to calculate a single averaged image where the averaging operation is done after each image is aligned to the very first image.

As expected, this process takes several hours due to slow ImageLoad (that needs hard disk access) and ImageRegistration (that has heavy computation) operations. I have thought about using several threads to speed up the operation, but it seems that neither ImageLoad nor ImageRegistration supports ThreadSafe functions.

Any suggestions on how to speed up this task?
I would begin by lowering the "quality" (bits per pixel) used for image alignment, or by using only one channel. You could even carefully rescale the images (e.g. to 200x200). What works will depend on the scale and properties of the features you align on. As to memory vs. disk: unless you have 10000*1000*1000*4 bytes (about 40 GB) of memory, you'd have to figure out a good way to load and unload images from disk into memory.

I cannot comment on ImageLoad being slow.

best,
_sk
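
To put rough numbers on the memory point above, here is a back-of-the-envelope sketch in Python (it is not Igor code). It takes the image count, size and bit depth from the original post. The 64-bit accumulator and the function names are assumptions made only for illustration.

```python
# Back-of-the-envelope memory estimate for the dataset described in the thread.
N_IMAGES = 10_000
WIDTH = 1000
HEIGHT = 1000
BYTES_PER_PIXEL = 4  # 32-bit pixels


def all_in_memory_bytes():
    """Bytes needed to hold every image in memory at once."""
    return N_IMAGES * WIDTH * HEIGHT * BYTES_PER_PIXEL


def streaming_bytes(accumulator_bytes_per_pixel=8):
    """Bytes needed when images are loaded one at a time into a running sum.

    Counts the current image, the reference image used for alignment, and
    one accumulator (assumed here to be 64-bit per pixel).
    """
    one_image = WIDTH * HEIGHT * BYTES_PER_PIXEL
    accumulator = WIDTH * HEIGHT * accumulator_bytes_per_pixel
    return 2 * one_image + accumulator


if __name__ == "__main__":
    print(f"all at once: {all_in_memory_bytes() / 1e9:.1f} GB")
    print(f"streaming:   {streaming_bytes() / 1e6:.1f} MB")
```

Under these assumptions, keeping a running sum needs only a few image-sized buffers, so memory is a constraint only if all images are held at once.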
Thanks for the suggestion. Unfortunately the feature in the image is quite small (I'm aligning some discrete set of points) so the coarsest flag I could use is /FLVL=2. I guess I may try rescaling the image before calling ImageRegistration and see whether rescaling is an expensive operation or not.

Regarding memory usage, I have to load and kill one image at a time. Maybe that's the speed bottleneck?
What kind of disk are the images stored on: a conventional hard drive or an SSD?

An SSD might give some speed increase in loading.

Andy
FWIW, ImageRegistration is thread-safe in IP7, but that is unlikely to help you because the most probable bottleneck is ImageLoad.

As most of the work here is single threaded you could split your image collection into multiple disk folders and then run multiple instances of Igor to process the data.

A.G.
WaveMetrics, Inc.
Quote:
I have a particularly large dataset consisting of 10,000 32-bit TIFF images

It is not clear if these are in separate files or one file.

ImageLoad is not thread-safe. You can run multiple instances of Igor and have each instance work on a different set of files. Or, if they are in one file, use the ImageLoad /S and /C flags to control which images are loaded by which instance of Igor.
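
As a rough illustration of giving each Igor instance its own set of files, the Python sketch below splits a file list into near-equal contiguous chunks. It only does the bookkeeping outside Igor. The function name and the file names are hypothetical. Because every image is aligned to the very first image, each instance would also need access to that reference image.

```python
def split_evenly(files, n_instances):
    """Split `files` into n_instances contiguous chunks.

    Chunk sizes differ by at most one, and the original order is preserved.
    """
    if n_instances < 1:
        raise ValueError("n_instances must be at least 1")
    base, extra = divmod(len(files), n_instances)
    chunks = []
    start = 0
    for i in range(n_instances):
        # The first `extra` chunks take one additional file each.
        size = base + (1 if i < extra else 0)
        chunks.append(files[start:start + size])
        start += size
    return chunks


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    files = [f"img_{i:05d}.tif" for i in range(10_000)]
    for k, chunk in enumerate(split_evenly(files, 4)):
        print(k, len(chunk), chunk[0], chunk[-1])
```

Each chunk could then be moved or copied into its own folder, one folder per instance.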

According to the help browser, ImageRegistration is not thread-safe in Igor6 but it is thread-safe in Igor7, so you could write a multi-threaded Igor procedure to do the registration. However, if you are going to use multiple instances for ImageLoad, you may as well do it for ImageRegistration also, and then merge the results at the end.
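
One way to picture the "merge the results at the end" step is sketched below in Python. It assumes each instance saves a per-pixel sum of its aligned images together with the number of images it processed; that convention and the function name are assumptions for illustration, not something prescribed in this thread.

```python
def merge_partial_sums(partials):
    """Combine (pixel_sum, image_count) pairs into one averaged image.

    Each pixel_sum is a flat list of per-pixel sums produced by one instance.
    """
    total_count = sum(count for _, count in partials)
    if total_count == 0:
        raise ValueError("no images were accumulated")
    n_pixels = len(partials[0][0])
    total = [0.0] * n_pixels
    for pixel_sum, _ in partials:
        if len(pixel_sum) != n_pixels:
            raise ValueError("partial sums have different sizes")
        for i, value in enumerate(pixel_sum):
            total[i] += value
    # Divide once by the overall image count to get the final average.
    return [value / total_count for value in total]


if __name__ == "__main__":
    # Two hypothetical instances: one summed 3 images, the other 1 image.
    partials = [([3.0, 6.0], 3), ([5.0, 2.0], 1)]
    print(merge_partial_sums(partials))
```

Keeping sums and counts, rather than per-instance averages, means instances that processed different numbers of images are weighted correctly.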

I agree that an SSD drive may help, but with ImageLoad only.
Thanks all for the suggestions. The 10,000 images are in separate TIFF files and I was using an SSD. Nonetheless, the maximum disk read speed during ImageLoad in a single Igor instance seems to be capped at 40 MB/sec, even though the SSD is benchmarked at a read speed of more than 100 MB/sec.

It seems that the best solution is to run multiple Igor instances.
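
For scale, the Python sketch below estimates how long the raw reading alone would take at the two rates quoted above. It assumes each file is about 1000x1000x4 bytes (uncompressed, headers ignored) and that the rates hold for the whole run. Both are assumptions, and the function name is hypothetical.

```python
def read_time_seconds(n_files, bytes_per_file, rate_bytes_per_s):
    """Time to read n_files of the given size at a constant read rate."""
    return n_files * bytes_per_file / rate_bytes_per_s


BYTES_PER_FILE = 1000 * 1000 * 4  # assumption: uncompressed 32-bit pixels
MB = 1e6

observed = read_time_seconds(10_000, BYTES_PER_FILE, 40 * MB)
benchmark = read_time_seconds(10_000, BYTES_PER_FILE, 100 * MB)

if __name__ == "__main__":
    print(f"at  40 MB/s: {observed:.0f} s ({observed / 60:.1f} min)")
    print(f"at 100 MB/s: {benchmark:.0f} s ({benchmark / 60:.1f} min)")
```

If these assumptions hold, reading at the observed rate comes to well under an hour, so it may be worth timing ImageLoad and ImageRegistration separately to see where the several hours actually go.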
alfredzong wrote:
Regarding memory usage, I have to load and kill one image at a time. Maybe that's the speed bottleneck?


Maybe, or something like it.

Also worth pointing out are GBLoadWave and FBinRead. Maybe you can get around some of the overhead incurred by ImageLoad.
Can you post a representative image?

best,
_sk
As for "overhead": neither GBLoadWave nor FBinRead is free of it. It is possible that ImageLoad is relatively slower because it performs multiple disk seeks/reads per file; these could be replaced with a single read into a wave followed by multiple wave accesses to extract the relevant data (this might work if the images are not compressed).

Except for the case of stacked TIFF images, where LibTIFF code is amazingly bad, I would not expect any improvement in trying to roll your own image reading routines. Also, FWIW, the latest IP7 should not be affected by LibTIFF stack performance limitations.

A.G.
WaveMetrics, Inc.
I would also say that rolling your own image loading functions will probably not gain you anything.

I found a TIFF file here, http://www.fileformat.info/format/tiff/sample/, and timed loading it on my machine:

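Function DoStuff()

    // StopMSTimer(-2) returns a microsecond counter; store the start value
    Variable refTime = StopMSTimer(-2)

    ImageLoad "C:Users:thomas:AppData:Local:Temp:MARBLES.TIF"

    // Print the elapsed time in seconds
    printf "%g\r", (StopMSTimer(-2) - refTime) / 1e6
End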


This takes between 40 and 70 ms. What do you get?

It is loaded from an SSD. Are your TIFF files compressed?