[Help needed] Igor Crash on program involving sockit

Dear Igor Community,

I am facing a problem which I have not been able to solve on my own so far. A quick overview of what my program does:

I use two computers that are running an experiment together. The two parts are completely independent, but I need communication between the computers to ensure they are repeatedly taking data at the same time. My Igor program operates as follows:

Open a sockit-connection and start the joint measurement.
Listen to the port for a trigger signal (via processor function) to tell you to start your part of the measurement.
If all measurement points have been processed, close the sockit connection and save the data.

This works nicely for some undefined number of complete runs, but at some point Igor crashes. Using the debugger, I came to the conclusion that the crash seems to happen after initializing the measurement, but before the processor function is called for the first time. This leaves me puzzled, since nothing but waiting for a sockit message should be happening.
Are there common mistakes I am not aware of?

Thanks for your help!
This looks like a fairly difficult problem to diagnose, mainly because it might be difficult to reproduce for someone outside of your lab. You may want to try to come up with a minimal example that causes the issue, and that is as portable as possible.

The first question to address is where the crash is occurring. Sockit would seem like the likely candidate but it is difficult to be sure. This information should be contained in crash logs, especially in the stack back trace, which should tell you the chain of functions leading up to the crash.

I'm going to copy some text from the XOP toolkit manual that describes this:
XOP Toolkit Manual wrote:

On Mac OS X, the crash log is stored at the following locations:
10.4 ~/Library/Logs/CrashReporter/Igor Pro.crash.log
10.5 ~/Library/Logs/CrashReporter/Igor Pro__.crash
10.6 ~/Library/Logs/DiagnosticReports/Igor Pro__.crash

On Windows XP the crash log file is named “drwtsn32.log”. It is a plain text file that you can view in a text editor. The log file is normally located at:
C:\Documents and Settings\All Users\Application Data\Microsoft\Dr Watson\

With the introduction of Windows VISTA, crash logs are no longer easily available. Microsoft has made it so complicated to get a crash log that it is useless. If you want to get an idea of what is involved, search for “.mdmp files not being created” at http://social.msdn.microsoft.com/Forums.


I'm not sure how one would obtain that information in Vista and above. It seems odd for Microsoft to not include a way to get this information, so I suppose there might be some way but I'm unaware of it.

Your best bet is probably to contact the author of Sockit, providing him with as much information as possible, and preferably a minimal test case.
Dear Daniel,
I am the author of SOCKIT (and am on holiday for the next couple of weeks), hopefully you find it useful.

Daniel G wrote:

This works out quite nicely for some undefined number of complete runs, but at some point Igor crashes. By use of the debugging feature I came to the conclusion that it seems to happen after initializing the measurement, but before the processor function is called for the first time.


When you say Igor crashes, do you mean that the entire Igor program crashes, or that the code you are trying to run stops working? The first is of more immediate concern, the second less so. I ask because you seem to be able to use the debugger to identify which section of code is causing the problem. Are you able to reproduce the crash in a given section of code?

It would help if you could post your code with the relevant sockit pieces, if it's not too convoluted.

If it is causing Igor to crash in a major way - and you are on OS X - there is a stack trace produced when the program crashes. This is available from the window that pops up that asks if you want to send a report to Apple. If you have this, then this will be immeasurably useful in diagnosing the problem as this will tell me what line in the plugin is causing the problem.

Could you also test if the problem occurs with the latest version of the plugin (http://www.igorexchange.com/project_download/files/projects/SOCKIT-IGOR… - try using the plugin from /win or /mac, not the installer as I'm not positive this is up to date)?

A.
Thank you two for your replies. At the moment I started programming a less potent workaround, but once I have at least something working I will come back and investigate the problem more thoroughly, following the advice you sent me. I will keep you updated!

Anyway, Andyfaff, thanks for providing the Sockit plugin, without it there would be no thoughts about even having this 'synchronized' measurement. I appreciate its existence a lot!
I should point out that I use this XOP on a 24/7 basis (with relatively high comms load) to control an instrument. As such I take all bug reports seriously, as I need high reliability for instrument operation.

Things to note:
1) It's not a good idea to call any SOCKIT functions/operations from a processor function. This is because the processor function is called from SOCKIT itself. Surrounding this call are mutexes that protect the internal state of all the communications. Thus, if SOCKIT functions/ops are called from the processor function, they are likely to cause SOCKIT to stall.
2) This warning also applies to calling DoXOPIdle from the processor function.
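
To make the warning concrete, here is a minimal sketch of the pattern to avoid. The processor signature (text buffer wave plus row index) and the message being in column 0 of the buffer wave are assumed from the SOCKIT help file, so check them against your installed version. riskyProcessor, root:gSockNum and the "finished" message text are hypothetical names used only for illustration.

```igor
// ANTI-PATTERN sketch: a SOCKIT call made from inside the processor function.
// Assumed signature: SOCKIT passes the text buffer wave and the row just filled.
Function riskyProcessor(bufferWave, entry)
	Wave/T bufferWave
	Variable entry

	NVAR sockNum = root:gSockNum	// hypothetical global holding the socket number
	if (stringmatch(bufferWave[entry][0], "*finished*"))
		// Avoid this: SOCKIT is still inside its own call to riskyProcessor and
		// holds its mutexes, so calling back into SOCKIT here may stall it.
		Variable err = sockitcloseconnection(sockNum)
	endif
	return 0
End
```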
Hey Andyfaff,

While working on a minimal crashing code example, it didn't crash anymore. So I put some hours into rewriting the code piece by piece, but (presumably) due to the different approach, it no longer crashes the whole of Igor.
Btw: with the latest sockit .xop version, the program made Igor crash at the end of every spectrum.
With respect to your last post, the most likely reason for the crashes was that I closed the sockit connection from the processor function in response to a termination message. The new approach keeps the connection open all the time, and it is only shut by the user. Is there a smart way to close the connection in response to a 'finished' message received via the sockit connection?
Apart from that manual connection closing process, everything works nicely.

Again, thanks to both of you for helping me out and pointing me towards the solution.
I might rethink some of the design behind the XOP, w.r.t calling SOCKIT from the processor function, but I can't see how to do things differently if it's also necessary to have threadsafe operations in the XOP.

There are several approaches to this problem:

1) Use Execute/P/Q "sockitcloseconnection(" + num2istr(mysocknum)+")" in the processor function to add the sockitcloseconnection call to the operation queue, which will execute at the first available opportunity, when nothing else is happening. This is probably the quickest option.

2) Start a CtrlNamedBackground task as a 'cleanup/scheduler'. In the processor function, set a global variable that indicates that the socket should be closed. In the background task, test whether the variable is set; if it is, close the socket.

3) Do all communications with the instrument in an IGOR thread(s): see the attached code for an example. See also:
DisplayHelptopic "ThreadSafe Functions and Multitasking"

This option is quite nice as the main thread of IGOR is kept free. However, it does involve learning how to program IGOR threads. I think this will be a good option if you can do it.
threadedcomms.ipf
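
Options 1 and 2 above could be sketched roughly as follows. This is an illustration under assumptions, not tested code: the processor signature and the message being in column 0 of the buffer wave are assumed from the SOCKIT help file, and all function names, the globals root:gSockNum and root:gCloseRequested, the task name sockCleanup and the "finished" message text are hypothetical.

```igor
// Option 1: queue the close so it runs after the processor function has returned.
Function queueingProcessor(bufferWave, entry)
	Wave/T bufferWave
	Variable entry

	NVAR sockNum = root:gSockNum	// hypothetical global holding the socket number
	if (stringmatch(bufferWave[entry][0], "*finished*"))
		// Execute/P posts the command to the operation queue; /Q keeps it quiet.
		Execute/P/Q "sockitcloseconnection(" + num2istr(sockNum) + ")"
	endif
	return 0
End

// Option 2: the processor function only raises a flag ...
Function flaggingProcessor(bufferWave, entry)
	Wave/T bufferWave
	Variable entry

	if (stringmatch(bufferWave[entry][0], "*finished*"))
		Variable/G root:gCloseRequested = 1
	endif
	return 0
End

// ... and a named background task closes the socket outside the processor call.
Function cleanupTask(s)
	STRUCT WMBackgroundStruct &s

	NVAR/Z closeRequested = root:gCloseRequested
	NVAR/Z sockNum = root:gSockNum
	if (NVAR_Exists(closeRequested) && NVAR_Exists(sockNum) && closeRequested)
		Variable err = sockitcloseconnection(sockNum)
		closeRequested = 0
	endif
	return 0	// returning 0 keeps the background task running
End

Function startCleanupTask()
	// period is in ticks (1/60 s), so the flag is checked about once per second
	CtrlNamedBackground sockCleanup, period=60, proc=cleanupTask
	CtrlNamedBackground sockCleanup, start
End
```

In both variants the processor function itself makes no SOCKIT call; the close happens later from the operation queue or from the background task. Use only one of the two processor functions for a given connection.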
Thank you a lot,

I will read through this as soon as I have some time for reading and reprogramming again!

Best wishes