Replace duplicates with average values: remove data points and then insert?

Hello,
I have time series measurements with occasional duplicate timestamps. There are about 49,000 data points, including about 2,000 pairs of duplicates.
I can find where the duplicates are, but I don't know how to average each pair of duplicate measurements, remove one of the two rows, and replace the remaining row with the averaged value. I tried DeletePoints, but the point numbers change after each removal, which makes it hard to identify the next pair of duplicates. Is there a good solution to this problem, other than doing arithmetic on the point number in the loop after each removal? Can I avoid deleting points altogether and use a different approach?

Thank you!
A fairly good solution to the index problem (when using DeletePoints) is to go backwards through the index (N-1 ... 0 rather than 0 ... N-1). It can also be slightly faster, because less data has to be moved after each deletion, depending on the data structure.

Are the timestamps monotonic (I would guess so, but better to be sure)? Are there triplets?
If the timestamps are monotonic, I'd probably crawl through the data from the last point to the first and check whether the 'next' (actually previous) point has the same timestamp. If so, average them, store the result in the higher-index point, and delete the lower-index one. Repeat until you reach index 0. Be careful with multiplets here: averaging pair by pair does not give the true mean of three or more values.
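The backward crawl could be sketched roughly as follows. This is only an illustration, not tested code from this thread: the function name AverageDuplicatesBackward and its local variable names are made up, it assumes the X wave is sorted so that duplicates are adjacent, and it averages each whole run at once (keeping the lowest index of the run rather than the highest) so that triplets are weighted correctly.

```igor
// Hypothetical helper (not from the thread): average runs of equal X values in place,
// working from the last point toward the first. Assumes xWave is sorted.
Function AverageDuplicatesBackward(xWave, yWave)
    Wave xWave, yWave

    Variable runEnd = numpnts(xWave) - 1    // Highest index of the run being examined
    Variable runStart, k, ySum
    do
        if (runEnd < 1)
            break                           // Zero or one point left to examine
        endif
        // Walk down to the lowest index that shares the X value at runEnd
        runStart = runEnd
        do
            if (runStart == 0)
                break
            endif
            if (xWave[runStart - 1] != xWave[runEnd])
                break
            endif
            runStart -= 1
        while(1)
        if (runStart < runEnd)
            // Average the whole run so multiplets get the true mean
            ySum = 0
            for(k = runStart; k <= runEnd; k += 1)
                ySum += yWave[k]
            endfor
            yWave[runStart] = ySum / (runEnd - runStart + 1)
            // Only points above runStart shift down, and those were already processed
            DeletePoints runStart + 1, runEnd - runStart, xWave, yWave
        endif
        runEnd = runStart - 1               // Continue with the next lower run
    while(1)
    return numpnts(xWave)
End
```

Because deletions only ever shift points that have already been handled, the indices still to be visited never change, which is the point of going backwards.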

HJ
I have to wonder whether a method exists to avoid looping (backwards). Perhaps a clever combination of FindDuplicates, with the resultant wave + source wave blended through MatrixOP using implicit indexing or in-line logical testing?

--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAH
Each time you call DeletePoints in a loop all points after the deleted point must be moved in memory. If you are dealing with a large wave and have to do a large number of deletions, that will be slow.

Another approach, which may or may not be faster for a given set of input data, is to loop through the entire input dataset and copy the required data from a pair of input waves to a pair of output waves. Here is a function that I wrote to do this. It does not use DeletePoints but instead calls Redimension once. I have tested it somewhat but don't claim it to be foolproof. It also should work for more than two consecutive identical X values but I have not tested that.

Function RemoveDuplicatesXY(xWave, yWave)
    Wave xWave, yWave
   
    Duplicate/FREE xWave, xWaveCopy
    Duplicate/FREE yWave, yWaveCopy
   
    Variable numPointsIn = numpnts(xWave)
    Variable numPointsOut = 0
    Variable previousX = xWaveCopy[0]
    Variable previousY = yWaveCopy[0]
    Variable currentX, currentY
    Variable numPointsWithThisX = 1
    Variable sumOfYValuesWithThisX = previousY
    Variable i
    for(i=1; i<numPointsIn; i+=1)       // Point 0 seeds the first run, so start at point 1 and continue through the last point
        currentX = xWaveCopy[i]
        currentY = yWaveCopy[i]    
        if (currentX != previousX)
            xWave[numPointsOut] = previousX
            yWave[numPointsOut] = sumOfYValuesWithThisX / numPointsWithThisX
            numPointsOut += 1
            numPointsWithThisX = 0
            sumOfYValuesWithThisX = 0
        endif
        numPointsWithThisX += 1
        sumOfYValuesWithThisX += currentY
        previousX = currentX   
        previousY = currentY   
    endfor

    // Write out the final run. After the loop, previousX always holds the last
    // X value seen, including when there is only one point and the loop never ran.
    xWave[numPointsOut] = previousX
    yWave[numPointsOut] = sumOfYValuesWithThisX / numPointsWithThisX
    numPointsOut += 1
   
    Redimension/N=(numPointsOut) xWave, yWave
   
    // For debugging only
    #if 0
        Printf "Number of input points=%d, number of output points=%d\r", numPointsIn, numPointsOut
    #endif
   
    return numPointsOut
End


I am attaching the experiment that I used to test this function. The experiment includes another function for timing how long the function takes on a particular XY pair.
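For anyone without the attachment, a small check along these lines may be useful. This is not the attached experiment, just a minimal sketch: the function name DemoRemoveDuplicatesXY and the waves xDemo and yDemo are made up for illustration, and the data include one pair and one triplet of identical X values. It times the call with startMSTimer and stopMSTimer.

```igor
// Hypothetical test harness (not the attached experiment)
Function DemoRemoveDuplicatesXY()
    // Small sorted data set with one pair (x=1) and one triplet (x=3)
    Make/O/D xDemo = {0, 1, 1, 2, 3, 3, 3, 4}
    Make/O/D yDemo = {10, 20, 30, 40, 1, 2, 3, 50}

    Variable timerRefNum = startMSTimer
    Variable numOut = RemoveDuplicatesXY(xDemo, yDemo)
    Variable microseconds = stopMSTimer(timerRefNum)    // Elapsed time in microseconds

    Printf "%d points remain, elapsed time %g us\r", numOut, microseconds
    Print xDemo
    Print yDemo
End
```

With this input the function should leave five points, with X values 0, 1, 2, 3, 4 and Y values 10, 25, 40, 2, 50. For a realistic timing, substitute waves of about the size mentioned in the question.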