Looking for advice on writing an IGOR loader/plotter for very large files

Hi,

My employer recently purchased a new piece of equipment, and I have been volunteered to write the IGOR routines to load/plot the data contained in the log files it generates. The machine takes a reading of ~60 different parameters every second. The length of the files can vary: in principle anywhere from several hours of readings (~10,000 lines) up to two weeks, although I think a more typical range will be 1 or 2 days (~160,000 lines). In the sample log file I have, each second gets its own line and each parameter its own column, with the columns separated by tabs. I think it is possible to output different formats for the log file, but I am not positive about this.

I would say I am an intermediate-level IGOR programmer and have written a fair number of procedures for data loading/plotting/calculating on "reasonably" sized data sets. I have never worried much about processing time in my previous work, but with this project I think processing time will have a big impact on how usable my procedures end up being.

From a quick search around the forums, I am thinking I may need to look into using SQL or structures, but I have never delved into either subject before and don't understand their pros and cons. Does anyone have advice on a good place to start for writing code to handle these log files? Generally, I am hoping to have something that can load in a given log file and plot selected parameters as a function of time.

Thanks,

Brandon
Hello Brandon,

At two weeks we have 60 × 14 × 86,400 data points (60 parameters × 14 days × 86,400 seconds/day), which is roughly 72M points. This should not be a problem, but you might want to look at optimizing your storage to the real dynamic range of the variables. A tab-delimited file is not as efficient as a binary file, but it might have an advantage in case of partial corruption of the data.

I like the idea of using SQL in such an application. IGOR ships with the SQL XOP (Igor Pro folder:More Extensions:Utilities). Feel free to contact me directly if you want to discuss this.

A.G.
WaveMetrics, Inc.
I recommend that you write some procedures without regard to the size of the files. You can later decide if some other storage mechanism is worthwhile. I don't recommend going up the SQL learning curve before you have experience in dealing with your data, although it is very worthwhile for its own sake.

Once you have a basic plotter, I think you will be in a better position to decide if it is worthwhile to store the data in a database.

I also don't think you need to use structures, at least initially. In this case, structures would simply be a way to package multiple function parameters into one parameter. This may or may not turn out to be useful in your case.

If you intend to read data from the log while the equipment is writing to it, that is another problem. You will have to devise some way to keep track of what you have already read so you can read only new data.
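One common approach is to remember how many data lines you have already consumed and use LoadWave's /L flag to skip them on the next read. Below is a minimal sketch of that bookkeeping, assuming a headerless tab-delimited log; the function and global variable names are my own invention, and appending the newly loaded rows onto your master waves (e.g., with Concatenate) is left to the caller.

Function LoadNewLogLines(pathName, fileName)
	String pathName, fileName	// symbolic path name and file name

	// Persistent count of data lines consumed so far (hypothetical global)
	NVAR/Z gLinesRead = root:gLinesRead
	if (!NVAR_Exists(gLinesRead))
		Variable/G root:gLinesRead = 0
		NVAR gLinesRead = root:gLinesRead
	endif

	// /L={nameLine, firstLine, numLines, firstCol, numCols}: start at the
	// first unread line; numLines=0 means "read to the end of the file"
	LoadWave/J/D/O/Q/L={0, gLinesRead, 0, 0, 0}/P=$pathName fileName
	if (V_flag > 0)
		WAVE w = $StringFromList(0, S_waveNames)
		gLinesRead += DimSize(w, 0)	// the loaded waves hold only the new rows
	endif
End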

I would start by writing a routine to load the file (LoadWave/G if it is all numeric, LoadWave/J if it contains strings or date/time values). You should load each file into its own data folder.
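As a starting point, a loader along those lines might look like the following sketch; the function name is mine, and it assumes a tab-delimited file whose header line carries the parameter names.

Function LoadLogFile(pathName, fileName)
	String pathName		// name of an Igor symbolic path, e.g. created with NewPath
	String fileName

	DFREF dfSav = GetDataFolderDFR()

	// One data folder per log file, named after the file
	String dfName = CleanupName(ParseFilePath(3, fileName, ":", 0, 0), 0)
	NewDataFolder/O/S root:$dfName

	// /J = delimited text, /D = double precision, /W = take wave names
	// from the header line, /Q = suppress history output
	LoadWave/J/D/W/O/Q/P=$pathName fileName
	Printf "Loaded %d columns from %s\r", V_flag, fileName

	SetDataFolder dfSav
End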

Next come up with a routine to plot columns specified by a string list parameter ("pressure;temp;current") over a range of times specified by two numeric parameters.
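A sketch of such a routine, assuming the log's time column was loaded into a wave I am calling timeStamp in the current data folder:

Function PlotLogColumns(colList, tStart, tEnd)
	String colList			// e.g. "pressure;temp;current"
	Variable tStart, tEnd	// time range for the bottom axis

	WAVE/Z timeW = timeStamp
	if (!WaveExists(timeW))
		Abort "No time wave found in the current data folder"
	endif

	Display/W=(50, 50, 600, 400)
	Variable i
	for (i = 0; i < ItemsInList(colList); i += 1)
		WAVE/Z w = $StringFromList(i, colList)
		if (WaveExists(w))
			AppendToGraph w vs timeW
		endif
	endfor
	SetAxis bottom tStart, tEnd
End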

Next come up with a user interface that allows the user to choose the columns (using a listbox) and range of times through a control panel. For user entry of date/time values in a control panel, my date control snippet might be of use.
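A bare-bones version of such a panel might start like this; the parameter names are placeholders, and in practice you would build the list wave from the waves in the log file's data folder:

Function MakeLogPanel()
	Make/O/T paramNames = {"pressure", "temp", "current"}
	Make/O/B/U/N=(numpnts(paramNames)) paramSel = 0	// listbox selection wave

	NewPanel/N=LogPanel/W=(100, 100, 330, 380)
	ListBox paramList, pos={10, 10}, size={210, 200}, mode=4	// disjoint multi-select
	ListBox paramList, listWave=paramNames, selWave=paramSel
	Button plotBtn, pos={10, 220}, size={210, 25}, title="Plot Selected"
	Button plotBtn, proc=LogPanelButtonProc
End

Function LogPanelButtonProc(ba) : ButtonControl
	STRUCT WMButtonAction &ba
	if (ba.eventCode == 2)		// mouse up
		WAVE/T paramNames
		WAVE paramSel
		String colList = ""
		Variable i
		for (i = 0; i < numpnts(paramSel); i += 1)
			if (paramSel[i] & 1)	// bit 0 is set for selected rows
				colList += paramNames[i] + ";"
			endif
		endfor
		Print colList		// hand this list to the plotting routine
	endif
	return 0
End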

Igor wrote:

At two weeks we have 60 × 14 × 86,400 data points, which is roughly 72M points.


Assuming each point is a double (8 bytes), that equates to roughly 580 MB per log file.
hrodstein wrote:

Next come up with a user interface that allows the user to choose the columns (using a listbox) and range of times through a control panel. For user entry of date/time values in a control panel, my date control snippet might be of use.


hrodstein - Thanks for the snippet and the advice. I think I will get some basic plotting stuff written up and see where that takes me.

Igor wrote:

At two weeks we have 60 × 14 × 86,400 data points, which is roughly 72M points. This should not be a problem, but you might want to look at optimizing your storage to the real dynamic range of the variables.


A.G. - Thanks for letting me know that the number of points won't be a problem; I was a bit concerned about the large file size. I will do a bit more reading on SQL and may contact you if I decide it is something I need.

Into IGOR I go!

-Brandon
andyfaff wrote:
Igor wrote:

At two weeks we have 60 × 14 × 86,400 data points, which is roughly 72M points.


Assuming each point is a double (8 bytes), that equates to roughly 580 MB per log file.


While large, that is still within the limits of a 32-bit application.

I don't know what data are collected in this case, but many common measurements do not require 64 bits of representation, so in this application it would make sense to store the data in separate waves, with each wave's type chosen to fit the dynamic range of the corresponding parameter. The same argument applies when storing the data in a database.
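As a sketch of that idea (the wave names here are invented): storing a channel as single precision halves the ~580 MB double-precision footprint, and a 16-bit integer channel quarters it.

Function ShrinkLogWaves()
	WAVE status, temperature
	Redimension/W status		// 16-bit signed integer, 2 bytes/point
	Redimension/S temperature	// single-precision float, 4 bytes/point
End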

As far as I can see, the only difficulty here is in managing and accessing more than one data set at a time. If that is a requirement for the OP, then SQL is a good choice.

A.G.
WaveMetrics, Inc.