I have a set of data from which I wish to create the pdf and cdf. The data are NOT normally distributed. I ran the KS test (StatsKStest/ALPH/T=0 srcwavename,distwavename), which does not depend on the distribution, and it was fine. I would also like to create the corresponding pdf and cdf. But something is wrong: the pdf doesn’t start from 0 (on the Y axis) and, accordingly, the cdf doesn’t reach 1. I’ve used the commands below:
It would be helpful if you attached the actual data and results, and pointed out the specific problem you have.
I tried it with my own synthetic data set:
The integration ends with 0.999, which seems wrong but is a numerical artifact. Your Integrate command uses /METH=1, which chooses trapezoidal integration. While this is commonly the best choice for real data that represent some underlying smooth curve, I think it could be argued that it is not the best choice for a histogram, which represents the counts between the bin edge values. As such, each point in the histogram really represents a rectangular area, and /METH=0 or 2 would be better. Using /METH=0 gives a final value of 1.000 in my test case, but a non-zero starting value. /METH=2 adds one extra point to the output, has zero as the starting value and 1 as the ending value.
To understand these differences, read the documentation for the Integrate command, especially the Details section. Take with a grain of salt the statement there that says, "Trapezoidal integration is a more accurate method of computing the integral than rectangular integration." As I said above, the best method depends on what the data actually represent.
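The effect can also be reproduced outside Igor. Below is a minimal Python sketch, using only the standard library, that builds a normalized histogram from synthetic exponential data and accumulates it in three ways: a rectangular running sum, the same sum with a leading zero point, and a trapezoidal sum. The function names and the synthetic data are my own illustration, and the three functions are stand-ins for the conventions described above, not a reimplementation of Igor's Integrate.

```python
import random
from itertools import accumulate

def normalized_histogram(samples, n_bins):
    """Return (left_edge, bin_width, densities); densities * bin_width sum to 1."""
    lo, hi = min(samples), max(samples)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in samples:
        # Clamp the index so the largest sample lands in the last bin.
        i = min(int((x - lo) / width), n_bins - 1)
        counts[i] += 1
    n = len(samples)
    return lo, width, [c / (n * width) for c in counts]

def rect_running_sum(density, width):
    # One output point per bin: running total of the rectangular bin areas.
    return list(accumulate(d * width for d in density))

def rect_with_leading_zero(density, width):
    # Same running total, with an extra zero point in front (one point longer).
    return [0.0] + rect_running_sum(density, width)

def trapezoid(density, width):
    # Treats the densities as point samples of a smooth curve, so half of the
    # first bin's area and half of the last bin's area are never counted.
    out = [0.0]
    for a, b in zip(density, density[1:]):
        out.append(out[-1] + 0.5 * (a + b) * width)
    return out

random.seed(0)
samples = [random.expovariate(1.0) for _ in range(5000)]  # synthetic, non-normal
lo, width, pdf = normalized_histogram(samples, 50)

cdf_rect = rect_running_sum(pdf, width)
cdf_rect0 = rect_with_leading_zero(pdf, width)
cdf_trap = trapezoid(pdf, width)

for name, cdf in (("rectangular", cdf_rect),
                  ("rectangular + leading zero", cdf_rect0),
                  ("trapezoidal", cdf_trap)):
    print(f"{name}: first={cdf[0]:.4f} last={cdf[-1]:.4f} points={len(cdf)}")
```

The rectangular running sum ends at 1 but starts above zero, the version with a leading zero spans 0 to 1 at the cost of one extra point, and the trapezoidal sum starts at zero but falls short of 1. How far short depends on how much of the data sits in the first and last bins.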
I have posted an Igor experiment file with my example.
Thank you very much for your reply, it was a great help.
I have now tried creating the pdf in different ways; it seems the best is to set B=1 instead of 4. I understand that you say the output shouldn’t start at zero. But if I use the P flag, that means I normalize the histogram, doesn’t it? “Normalizes the histogram as a probability distribution function, and shifts wave scaling so that data correspond to the bin centers.” So I expect the data to start from zero and end at zero, meaning that I probably have no values smaller or larger than those points.
It also says that I should use /METH=0 or 1, but not 2, in the next step: “When using the results with Integrate, you must use /METH=0 or 1.” If I use /METH=0 instead of 1, the data indeed reach 1 but do not start from zero. The cumulative probability function should be between 0 and 1, shouldn’t it?
I have attached the corresponding Igor file. I would like to compare the two ISI distributions (wave0, wave1) with the KS test, and also to compare the two corresponding pdfs and cdfs. I have also attached the graphs of the pdf and cdf that were created in MATLAB from the same data set.
Thank you very much for your help,
I see, thanks for the correction!
I've tried the final version on other pairs of data sets and it seems to work well.
Thank you very much for your help!